Issue #20101 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #20101: rb_file_open and rb_io_fdopen don't perform CRLF -> LF conversion when encoding is set
https://bugs.ruby-lang.org/issues/20101
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Assignee: kjtsanaktsidis (KJ Tsanaktsidis)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When opening a file with `File.open`, as long as `'b'` is not set in the mode, Ruby will perform CRLF -> LF conversion on Windows when reading text files - i.e. CRLF line endings on disk get converted to Ruby strings with only "\n" in them. If you explicitly set the encoding with `IO#set_encoding`, this still works properly.
If you open the file in C with either the `rb_io_fdopen` or `rb_file_open` APIs in text mode, CRLF -> LF conversion also works. However, if you then call `IO#set_encoding` on this file, the CRLF -> LF conversion stops happening.
Concretely, this means that the conversion doesn't happen in the following circumstances:
* When loading Ruby files with `require` (which calls `rb_io_fdopen`)
* When parsing Ruby files with `RubyVM::AbstractSyntaxTree` (which calls `rb_file_open`).
This then causes the ErrorHighlight tests to fail on Windows if git has checked them out with CRLF line endings: the error messages being tested wind up with literal \r\n sequences in them because the iseq text from the parser contains strings whose newlines were not converted.
This seems to happen because, in `File.open`, the file's encflags get the flag `ECONV_DEFAULT_NEWLINE_DECORATOR` in `rb_io_extract_modeenc`; however, this method isn't called for `rb_io_fdopen` or `rb_file_open`, so `encflags` doesn't get set to `ECONV_DEFAULT_NEWLINE_DECORATOR`. Without that flag, the underlying file descriptor's mode gets changed to binary mode by the `NEED_NEWLINE_DECORATOR_ON_READ_CHECK` macro.
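The conversion at issue can be observed on any platform by requesting text mode explicitly. The following is a minimal sketch, not a reproduction of the bug itself (which needs the C API on Windows). The helper name `read_modes` is hypothetical, and the sketch relies on the `t` mode flag enabling universal newline conversion.

```ruby
require "tempfile"

# Hypothetical helper: writes the given bytes to a temp file, then reads
# them back once in text mode ("rt") and once in binary mode ("rb").
def read_modes(content)
  Tempfile.create("crlf-demo") do |f|
    f.binmode          # store the bytes exactly as given
    f.write(content)
    f.flush
    {
      text:   File.read(f.path, mode: "rt"), # universal newline: CRLF -> LF
      binary: File.read(f.path, mode: "rb"), # bytes returned untouched
    }
  end
end

result = read_modes("a\r\nb\r\n")
p result[:text]
p result[:binary]
```

The text-mode read is the behaviour `File.open` gives on Windows by default; the binary-mode read corresponds to what the report describes once the file descriptor has been switched to binary mode.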
--
https://bugs.ruby-lang.org/
Issue #19965 has been reported by mame (Yusuke Endoh).
----------------------------------------
Feature #19965: Make the name resolution interruptible
https://bugs.ruby-lang.org/issues/19965
* Author: mame (Yusuke Endoh)
* Status: Open
* Priority: Normal
----------------------------------------
## Problem
Currently, Ruby name resolution is not interruptible.
```
$ cat /etc/resolv.conf
nameserver 198.51.100.1
$ ./local/bin/ruby -rsocket -e 'Addrinfo.getaddrinfo("www.ruby-lang.org", 80)'
^C^C^C^C
```
If you set a non-responsive IP as the nameserver, you cannot stop `Addrinfo.getaddrinfo` by pressing Ctrl+C. Note that `Timeout.timeout` does not work either.
This is because there is no way to cancel `getaddrinfo(3)`.
## Proposal
I wrote a patch to make `getaddrinfo(3)` run in a separate pthread.
https://github.com/ruby/ruby/pull/8695
Whenever it needs name resolution, it creates a worker pthread, and executes `getaddrinfo(3)` in it.
The caller thread waits for the worker to complete.
When an interrupt occurs, the caller thread stops waiting and leaves the worker pthread detached.
The detached worker pthread will exit after `getaddrinfo(3)` completes (or name resolution times out).
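The control flow described above can be sketched at the Ruby level. This is only an analogy: the actual patch does this in C with pthreads, and here a wait limit stands in for the interrupt. The helper name `call_in_worker` is hypothetical.

```ruby
# Hypothetical helper: runs a blocking call in a worker thread while the
# caller waits. If the wait limit elapses first, the caller gives up and
# the worker is left to finish on its own.
def call_in_worker(wait_limit, &blocking_call)
  worker = Thread.new(&blocking_call)
  worker.report_on_exception = false # errors surface through #value instead
  if worker.join(wait_limit)         # join returns nil when the limit elapses
    worker.value                     # worker finished: hand back its result
  else
    :abandoned                       # stop waiting; the worker keeps running
  end
end

p call_in_worker(1.0) { :resolved }
p call_in_worker(0.05) { sleep 0.5; :late }
```

The design point is that the caller never tries to cancel the blocking call; it only stops waiting for it.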
## Evaluation
By applying this patch, name resolution is now interruptible.
```
$ ./local/bin/ruby -rsocket -e 'pp Addrinfo.getaddrinfo("www.ruby-lang.org", 80)'
^C-e:1:in `getaddrinfo': Interrupt
from -e:1:in `<main>'
```
As a drawback, name resolution performance will be degraded.
```
10000.times { Addrinfo.getaddrinfo("www.ruby-lang.org", 80) }
# Before patch: 2.3 sec.
# After patch: 3.0 sec.
```
However, I think that name resolution time is typically short relative to an application's total runtime. For example, the difference is small for the performance of `URI.open`.
```
100.times { URI.open("https://www.ruby-lang.org").read }
# Before patch: 3.36 sec.
# After patch: 3.40 sec.
```
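To put the two measurements on the same scale, the per-call overhead can be worked out from the figures above. This is a small arithmetic sketch; the helper name `overhead` is hypothetical and the numbers are copied from the two benchmarks.

```ruby
# Figures copied from the two benchmarks above (seconds and iteration counts).
BENCHMARKS = {
  getaddrinfo: { before: 2.3,  after: 3.0,  runs: 10_000 },
  uri_open:    { before: 3.36, after: 3.40, runs: 100 },
}

# Hypothetical helper: per-call overhead in microseconds and the relative
# slowdown in percent, both rounded to one decimal place.
def overhead(before:, after:, runs:)
  diff = after - before
  { per_call_us: (diff / runs * 1_000_000).round(1),
    percent:     (diff / before * 100).round(1) }
end

BENCHMARKS.each { |name, figures| puts "#{name}: #{overhead(**figures)}" }
```

This works out to about 70 microseconds per `getaddrinfo` call (roughly 30% slower), but only about 1.2% for the `URI.open` loop, where the network round trip dominates.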
## Alternative approaches
I previously proposed using c-ares to resolve this issue (#19430). However, there was a concern that c-ares does not respect each platform's own name resolution mechanism.
## Room for improvement
* Currently, this patch works only when pthread is available.
* It might be possible to forcibly stop the worker threads by using `pthread_cancel`. However, using `pthread_cancel` with `getaddrinfo(3)` still seems premature; glibc appears to have had a bug in this area until recently: https://bugzilla.redhat.com/show_bug.cgi?id=1405071 and https://sourceware.org/bugzilla/show_bug.cgi?id=20975
* It would be more efficient to pool worker pthreads instead of creating them each time.
Issue #19908 has been reported by nobu (Nobuyoshi Nakada).
----------------------------------------
Feature #19908: Update to Unicode 15.1
https://bugs.ruby-lang.org/issues/19908
* Author: nobu (Nobuyoshi Nakada)
* Status: Assigned
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* Target version: 3.3
----------------------------------------
Unicode 15.1 has been released.
The current enc-unicode.rb seems to fail because of the `Indic_Conjunct_Break` properties, which have values.
I'm not sure how these properties should be handled.
Should it be `/\p{InCB_Linker}/` or `/\p{InCB=Linker}/`, as in the comments in that file?
https://github.com/nobu/ruby/tree/unicode-15.1 implements the former.
Issue #19409 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19409: Object's shape is reset after a ractor move
https://bugs.ruby-lang.org/issues/19409
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I believe an object should have the same shape after being moved from one Ractor to another.
```ruby
class Obj
  attr_accessor :a, :b, :c, :d
  def initialize
    @a = 1
    @b = 2
    @c = 3
  end
end
r = Ractor.new do
  obj = receive
  #p RubyVM::Shape.of(obj)
  obj.d = 4
  p obj.a, obj.b, obj.c, obj.d # gets wrong values due to object shape id being reset on object
end
obj = Obj.new
#p RubyVM::Shape.of(obj)
r.send(obj, move: true)
r.take
```
Issue #19411 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19411: GC issue with moved objects
https://bugs.ruby-lang.org/issues/19411
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* ruby -v: 3.2.0
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
This crashes:
```ruby
class Obj
  def initialize
    @obj = 3
  end
end
GC.stress = true
r = Ractor.new do
  obj = receive
  p obj
end
obj = Obj.new
r.send(obj, move: true)
r.take
```
It only crashes with nested objects; if you remove the ivar set in `initialize`, it works fine. Maybe a missing `RB_GC_GUARD`?
Issue #20054 has been reported by sawa (Tsuyoshi Sawada).
----------------------------------------
Feature #20054: Replace the use of `def` in endless method definitions with a new sigil
https://bugs.ruby-lang.org/issues/20054
* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
----------------------------------------
I propose removing the keyword `def` from the syntax of endless method definition and introducing a new sigil in its place. There are several possibilities for which character to use as the sigil, but the one that seems most promising to me at this point is the colon. So, instead of:
```rb
def foo = method_body
```
I propose to write
```rb
:foo = method_body
```
There are a few reasons to dispense with `def` in endless method definition.
First, the current syntax for endless method definition looks too similar to conventional method definition. Without endless method definition, we could already define a method in a single line:
```rb
def foo; method_body end
```
and compared to this, all the endless method definition does is save you from typing the `end` keyword by replacing the semicolon with an equal sign. This never made much sense to me. Just saving the keyword `end` looks like too small a change to justify introducing new syntax. For endless method definition syntax to be justified (as a shorthand for conventional method definition), it needs to save more typing.
Second, in #19392, some people are asking to change the precedence involving endless method definition. I agree with Matz and other developers who support the current precedence in which:
```rb
def foo = bar and baz
```
is interpreted as:
```rb
(def foo = bar) and baz
```
and I understand that the controversy is due to the look and feel of the keyword `def`. `def` has lower precedence than `and` in conventional method definition, although `=` has higher precedence than `and` in variable/constant assignment. Mixing the low-precedence `def` and the high-precedence `=` into a single syntax was the cause of the trouble, in my opinion.
Hence, we should get rid of `def`. Once we do so, we need to distinguish endless method definition from variable/constant assignment in a new way. What came to my mind was to use a single character: a sigil.
In particular, using the colon seems to make sense to me for several reasons:
Most importantly, assignment to a symbol is impossible, and it currently raises a syntax error, so it would not conflict with variable/constant assignment syntax.
Within Ruby syntax, a symbol is naturally used to represent a method name. For example, in `foo(&:bar)` constructions, users are used to passing a method name as a symbol. Also, a method definition returns a symbol representing the method name. So, making the endless method definition syntax look superficially like an "assignment to a symbol" would make sense.
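The facts this argument relies on can be checked in current Ruby. The sketch below is an illustration only: the names `SigilDemo` and `syntax_error?` are hypothetical, and `RubyVM::InstructionSequence.compile` is specific to CRuby.

```ruby
# A method definition evaluates to the method name as a symbol.
module SigilDemo
  DEFINED = (def self.double(x) = x * 2)
end
p SigilDemo::DEFINED

# Symbols already stand in for method names when passed with `&`.
p [1, 2, 3].map(&:succ)

# Hypothetical helper: true when the source fails to parse.
# The source is compiled only, never executed.
def syntax_error?(source)
  RubyVM::InstructionSequence.compile(source)
  false
rescue SyntaxError
  true
end

# Assignment to a symbol is rejected by the parser today, so the proposed
# `:foo = method_body` form would not clash with existing valid code.
p syntax_error?(":foo = 1")
```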
Issue #17815 has been updated by paddor (Patrik Wenger).
This can be closed. I've implemented a Ruby build plugin on v1 and v2.
----------------------------------------
Misc #17815: Snapcraft Ruby plugin
https://bugs.ruby-lang.org/issues/17815#change-105943
* Author: paddor (Patrik Wenger)
* Status: Open
* Priority: Normal
----------------------------------------
I'm working on a Ruby build plugin for the Snapcraft v2 plugin API, since the v1 Ruby plugin does not work on the Ubuntu `core20` image. The v2 API only allows influencing the build step, which means there's no nice way for the plugin to set environment variables that apply at runtime. The problem is that the final paths to Ruby's stdlib are unknown at compile time.
I've got it to work by renaming the `ruby` executable to `ruby.bare` and creating a wrapper script `ruby` that sets the `$RUBYLIB` and `$GEM_PATH` env variables based on `$SNAP` before `exec`ing `ruby.bare`. I don't like that it requires a wrapper script though, since it would have to awkwardly determine the arch-specific stdlib directory to support any platform. The current wrapper script is this:
```sh
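#!/bin/sh
# Prepend the snap's stdlib and gem directories, then run the real interpreter.
export RUBYLIB="$SNAP/lib/ruby/snap:$SNAP/lib/ruby/snap/x86_64-linux:$RUBYLIB"
export GEM_PATH="$SNAP/lib/ruby/gems/snap:$GEM_PATH"
# Quote the directory so that paths containing spaces survive.
exec "$(dirname "$0")/ruby.bare" "$@"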
```
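One way to avoid hard-coding `x86_64-linux` might be to compute the value with the freshly built Ruby during the build step and write it into the wrapper. The following is a rough sketch under that assumption; the helper name `snap_rubylib`, the example snap root, and the idea of running it at build time are all hypothetical. Only `RbConfig::CONFIG["arch"]` and the directory layout from the script above are taken as given.

```ruby
require "rbconfig"

# Hypothetical helper: builds the RUBYLIB value that the wrapper exports.
# `arch` defaults to the architecture string of the running Ruby
# (for example "x86_64-linux").
def snap_rubylib(snap_root, arch: RbConfig::CONFIG["arch"], existing: ENV["RUBYLIB"])
  base = File.join(snap_root, "lib", "ruby", "snap")
  # Drop a missing or empty pre-existing RUBYLIB to avoid a trailing colon.
  [base, File.join(base, arch), existing].compact.reject(&:empty?).join(":")
end

puts snap_rubylib("/snap/example/current")
```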
What possible solutions are there? I could maybe fix the shebangs of Ruby executables like `gem` and `bundle`, but not `ruby` itself since it's a binary. Also, I'm not sure if env variable expansion even works in a shebang.
Can the configure script's `--with-search-path` be made to support env variable expansion during runtime? That still wouldn't solve the issue about the arch-specific directory.
Python apparently determines its load paths relative to the executable being run. Does Ruby support anything like that?
Any other ideas?
Current state: https://github.com/paddor/snapcraft-ruby-plugin-v2
Issue #19375 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19375: File objects are currently shareable
https://bugs.ruby-lang.org/issues/19375
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I don't know the internals of file.c but I don't think files are thread-safe.
Issue #20093 has been reported by tagomoris (Satoshi Tagomori).
----------------------------------------
Feature #20093: Syntax or keyword to reopen existing classes/modules, never to define new classes/modules
https://bugs.ruby-lang.org/issues/20093
* Author: tagomoris (Satoshi Tagomori)
* Status: Open
* Priority: Normal
----------------------------------------
`class A` and `module B` will reopen existing class A or module B to add/re-define methods if A/B exists. Otherwise, these will define the new class/module A/B.
But, in my opinion, `class A` code written to patch an existing class doesn't work as expected when `A` is not defined beforehand. It expects other code to define `A` before it is called.
For example:
```ruby
# string_exclude.rb
class String
  def exclude?(string)
    !include?(string)
  end
end
```
This code expects that there is the `String` class, and it has the `include?` method. This code doesn't work if the file is loaded in the way below:
```ruby
load('string_exclude.rb', true)
```
This code doesn't raise errors and will define an almost empty class (with only a method `exclude?` that raises NameError). This is surprising for any user.
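The effect can be observed with a short script. This is a minimal sketch; the helper name `load_wrapped` is hypothetical and a temp file stands in for `string_exclude.rb`.

```ruby
require "tempfile"

# Hypothetical helper: writes the source to a temp file and loads it
# wrapped, i.e. evaluated under an anonymous module.
def load_wrapped(source)
  Tempfile.create(["string_exclude", ".rb"]) do |f|
    f.write(source)
    f.flush
    load(f.path, true)
  end
end

load_wrapped(<<~RUBY)
  class String
    def exclude?(string)
      !include?(string)
    end
  end
RUBY

# The real String was not reopened, so the method is missing there.
p "abc".respond_to?(:exclude?)
```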
So, I want to propose a new syntax to reopen the existing class/module or raise errors if the specified class/module is not defined.
```ruby
class extension String
  def exclude?(string)
    !include?(string)
  end
end # adds #exclude? to String class
class extension Stroooong
  def exclude?(string)
    !include?(string)
  end
end # will raise NameError (or something else)
```
Some additional things:
* `class extension String` (and `module extension String`) causes a compile error (SyntaxError) on Ruby 3.3. So we have space to add a keyword between class/module and the class/module name.
* I don't have a strong opinion about the keyword name `extension`. An alternative idea is `reopen`.
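For comparison, the "reopen or raise" behaviour can be approximated in current Ruby with `class_eval`, because the constant has to resolve before the block runs. This sketch shows an existing idiom and is not part of the proposal.

```ruby
# Reopens String only if the constant already resolves.
String.class_eval do
  def exclude?(string)
    !include?(string)
  end
end
p "ruby".exclude?("z")

# A misspelled name fails at the constant lookup instead of silently
# defining a new class.
begin
  Stroooong.class_eval do
    def exclude?(string)
      !include?(string)
    end
  end
rescue NameError => e
  puts "NameError: #{e.message}"
end
```

Unlike the proposed syntax, the block form does not open a new constant lookup scope, so constants referenced inside it resolve from the surrounding code.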