Issue #20153 has been reported by k0kubun (Takashi Kokubun).
----------------------------------------
Bug #20153: Backport 7f9c174102 to fix --yjit-stats with RubyVM::YJIT.enable
https://bugs.ruby-lang.org/issues/20153
* Author: k0kubun (Takashi Kokubun)
* Status: Open
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
----------------------------------------
Ruby 3.3.0 ignored --yjit-stats when `RubyVM::YJIT.enable` (no argument) is used, which was an unintended behavior.
https://github.com/ruby/ruby/pull/9415 should be backported to ruby_3_3.
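For context, the call in question looks like the following minimal sketch. The guard is an assumption added so the snippet also runs on Ruby builds without YJIT or older than 3.3; whether stats are actually collected depends on the command-line flags and on the backport above being present.

```ruby
# Enable YJIT at runtime if this Ruby build supports it.
# RubyVM::YJIT.enable (no argument) is the call named in this issue.
if defined?(RubyVM::YJIT) && RubyVM::YJIT.respond_to?(:enable)
  RubyVM::YJIT.enable
  yjit_on = RubyVM::YJIT.enabled?
else
  # YJIT is not available in this build, or the Ruby is older than 3.3.
  yjit_on = false
end

puts "YJIT enabled: #{yjit_on}"
```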
--
https://bugs.ruby-lang.org/
Issue #20214 has been reported by k0kubun (Takashi Kokubun).
----------------------------------------
Bug #20214: Backport https://github.com/ruby/ruby/pull/9711 to fix exits on Ruby 3.3's new instruction
https://bugs.ruby-lang.org/issues/20214
* Author: k0kubun (Takashi Kokubun)
* Status: Closed
* Priority: Normal
* Assignee: naruse (Yui NARUSE)
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
----------------------------------------
Ruby 3.3.0 YJIT missed the support for the instruction that was added shortly before the 3.3.0 release. It's used in Rails, and we didn't mean to exit on such method calls.
It'd be nice if we can fix the issue in Ruby 3.3.1 by backporting https://github.com/ruby/ruby/pull/9711.
--
https://bugs.ruby-lang.org/
Issue #20276 has been reported by forthoney (Seong-Heon Jung).
----------------------------------------
Feature #20276: Introduce Fiber interfaces for Ractors
https://bugs.ruby-lang.org/issues/20276
* Author: forthoney (Seong-Heon Jung)
* Status: Open
* Priority: Normal
----------------------------------------
## Motivation
I am trying to build a web server with Ractors. The lifecycle for a request in the current implementation is
1. main ractor buffers request
2. main ractor sends request to worker ractor
3. worker ractor sends response to main ractor
4. main ractor writes response
5. repeat
The main ractor utilizes the Async gem (specifically async-http) to handle connections concurrently, meaning each request is handled on a separate fiber.
The issue I am running into is that after I send a request to a worker ractor, I need to do a blocking wait until I receive a response.
While I am waiting for the response, I cannot accept any more connections.
## Solution
If the fiber scheduler had a hook for `Ractor.receive` or `Ractor#take` (both of which are blocking), the main ractor could send the message and handle other connections while the worker processes the request. When the worker produces a message, the main ractor would then take that message and write it to the socket.
--
https://bugs.ruby-lang.org/
Issue #19787 has been reported by joshuay03 (Joshua Young).
----------------------------------------
Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map!
https://bugs.ruby-lang.org/issues/19787
* Author: joshuay03 (Joshua Young)
* Status: Open
* Priority: Normal
----------------------------------------
I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`.
TL;DR: It's a drop-in replacement for `.map { ... }.uniq`, with better performance.
I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `flat_map { ... }` replaces `.map { ... }.flatten` and `filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code.
I have already got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140.
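To make the intended semantics concrete, here is a rough pure-Ruby sketch. `uniq_map_sketch` is a hypothetical helper name, not the implementation in the PR. It assumes the result should match `.map { ... }.uniq`, keeping the first occurrence of each mapped value.

```ruby
require "set"

# Hypothetical helper illustrating the proposed behaviour; the real methods
# would live on Enumerable/Array and be implemented in C.
def uniq_map_sketch(enum)
  seen = Set.new
  enum.each_with_object([]) do |item, result|
    mapped = yield(item)
    # Set#add? returns nil when the value is already present,
    # so each mapped value is appended at most once.
    result << mapped if seen.add?(mapped)
  end
end

words = %w[apple avocado banana blueberry cherry]
p uniq_map_sketch(words) { |w| w[0] } # => ["a", "b", "c"]
p words.map { |w| w[0] }.uniq         # => ["a", "b", "c"]
```

The sketch builds only one result array instead of an intermediate mapped array followed by a de-duplicated copy.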
--
https://bugs.ruby-lang.org/
Issue #20301 has been reported by AMomchilov (Alexander Momchilov).
----------------------------------------
Bug #20301: `Set#add?` does two hash look-ups
https://bugs.ruby-lang.org/issues/20301
* Author: AMomchilov (Alexander Momchilov)
* Status: Open
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
A common usage of `Set`s is to keep track of seen objects, and do something different whenever an object is seen for the first time, e.g.:
```ruby
require "set"

SEEN_VALUES = Set.new

def receive_value(value)
  if SEEN_VALUES.add?(value)
    puts "Saw #{value} for the first time."
  else
    puts "Already seen #{value}, ignoring."
  end
end

receive_value(1) # Saw 1 for the first time.
receive_value(2) # Saw 2 for the first time.
receive_value(3) # Saw 3 for the first time.
receive_value(1) # Already seen 1, ignoring.
```
Readers might reasonably assume that `add?` is only looking up into the set a single time, but it's actually doing two separate look-ups! ([source](https://github.com/ruby/ruby/blob/c976cb5/lib/set.rb#L517-L525))
```rb
class Set
  def add?(o)
    # 1. `include?(o)` looks up into `@hash`
    # 2. if the value isn't there, `add(o)` does a second look-up into `@hash`
    add(o) unless include?(o)
  end
end
```
This gets especially expensive if the values are large hash/arrays/objects, whose `#hash` is expensive to compute.
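One way to observe the extra look-up is to count calls to `#hash` on a key object. The sketch below uses a hypothetical `CountingKey` class and `hash_calls_during` helper, both invented for illustration. The exact counts depend on the Ruby version and on how `Set` is implemented there, so no expected output is shown.

```ruby
require "set"

# Hypothetical key class that counts how often its #hash method is invoked.
class CountingKey
  class << self
    attr_accessor :hash_calls
  end
  self.hash_calls = 0

  attr_reader :id

  def initialize(id)
    @id = id
  end

  def hash
    self.class.hash_calls += 1
    @id.hash
  end

  def eql?(other)
    other.is_a?(CountingKey) && other.id == id
  end
end

# Runs the block and returns how many #hash calls it triggered.
def hash_calls_during
  CountingKey.hash_calls = 0
  yield
  CountingKey.hash_calls
end

set = Set.new
key = CountingKey.new(1)

new_calls  = hash_calls_during { set.add?(key) } # key not yet in the set
seen_calls = hash_calls_during { set.add?(key) } # key already in the set

puts "#hash calls when adding a new object:  #{new_calls}"
puts "#hash calls when object was seen:      #{seen_calls}"
```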
We could optimize this if it were possible to set a value in a hash *and* retrieve the value that was already there, in a single go. I propose adding `Hash#update_value` to do exactly that. If that existed, we could re-implement `#add?` as:
```rb
class Set
  def add?(o)
    # Only requires a single look-up into `@hash`!
    self unless @hash.update_value(o, true)
  end
end
```
Here's a PR: https://github.com/ruby/ruby/pull/10093
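The intended contract of `Hash#update_value` (store the new value, return whatever was there before) can be sketched in pure Ruby. `update_value_sketch` and `SketchSet` below are hypothetical stand-ins, and the return value for a missing key is assumed to be `nil`. This version still performs two look-ups, so it shows only the API shape, not the performance benefit a C implementation would give.

```ruby
# Hypothetical stand-in for the proposed Hash#update_value.
# Returns the previous value for `key` (nil if absent) after storing `new_value`.
def update_value_sketch(hash, key, new_value)
  previous = hash.fetch(key, nil) # first look-up (a native version would avoid this)
  hash[key] = new_value           # second look-up
  previous
end

# Minimal set-like class built on the stand-in, mirroring the proposed #add?.
class SketchSet
  def initialize
    @hash = {}
  end

  def add?(o)
    self unless update_value_sketch(@hash, o, true)
  end

  def size
    @hash.size
  end
end

s = SketchSet.new
first  = s.add?(:a)
second = s.add?(:a)

p first.equal?(s) # => true
p second          # => nil
p s.size          # => 1
```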
How much of a benefit this has depends on two things:
1. How much `#hash` is called, which depends on how many new objects are added to the set.
   * If every object is new, then `#hash` is called twice on every `#add?`. This is where this improvement makes the biggest (2x!) change.
   * If every object has already been seen, then `#hash` was never being called twice before anyway, so there would be no improvement.
   * Every other case lies somewhere in between those two.
2. How slow `#hash` is to compute for the key
   * If the hash is slow to compute, this change will make a bigger improvement
   * If the hash value is fast to compute, then it won't matter as much. Even if we called it half as much, it's a minority of the total time, so it won't have much net impact.
Here is a summary of the benchmark results:
| | All objects are new | All objects are preexisting |
|---------------------------|-------:|------:|
| objects with slow `#hash` | 100.0% | ~0.0% |
| objects with fast `#hash` | 24.5% | 4.6% |
--
https://bugs.ruby-lang.org/
Issue #20225 has been reported by make_now_just (Hiroya Fujinami).
----------------------------------------
Bug #20225: Inconsistent behavior of regex matching for a regex has a null loop
https://bugs.ruby-lang.org/issues/20225
* Author: make_now_just (Hiroya Fujinami)
* Status: Open
* Priority: Normal
* Assignee: make_now_just (Hiroya Fujinami)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
Usually, in Ruby (Onigmo), when a null loop (a loop consuming no characters) occurs during regex matching, the loop is terminated. But if a loop has a capture and some complex condition is satisfied, this causes backtracking instead. This behavior produces unexpected results, for example,
```ruby
p /(?:.B.(?<a>(?:[C-Z]|.)*)+){2}/ =~ "ABCABC" # => nil
p /(?:.B.(?:(?:[C-Z]|.)*)+){2}/ =~ "ABCABC" # => 0
```
Because the first regex has a capture and the second does not, the two return different matching results. It is not very intuitive that the presence of a capture changes the matching result.
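For readers unfamiliar with the term, a simpler pattern (chosen here for illustration, not taken from the report) shows what a null loop is. The inner `a*` can match the empty string, so the outer `*` could repeat forever without consuming input unless the engine cuts the loop off.

```ruby
# The body of the outer loop, `a*`, may match zero characters. The regex
# engine detects an iteration that consumed nothing and ends the loop.
null_loop = /(?:a*)*b/

p null_loop =~ "aab" # => 0
p null_loop =~ "b"   # => 0
p null_loop =~ "xyz" # => nil
```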
The detailed condition for changing the null-loop behavior is 1) a previous capture in this loop holds the empty string, and 2) this capture's position is different from the current matching position. This condition is checked in `STACK_NULL_CHECK_MEMST` (https://github.com/ruby/ruby/blob/bbb7ab906ec64b963bd4b5d37e47b14796d64371/…).
You may not understand what this condition means. Don't worry, neither do I. This condition was introduced at least 20 years ago, and perhaps no one remembers why it was needed. (If you know, please tell me!) Even if there is a reason, I believe it cannot justify counter-intuitive behavior such as the above example.
This behavior can also make memoization buggy. Memoization relies on the fact that backtracking depends only on positions and states (byte-code offsets of a regex). However, this condition additionally refers to captures, so the memoization breaks.
My proposal is to **correct this inconsistent behavior**. Specifically, a null loop should be determined solely on the basis of whether the matching position has changed, without referring to captures.
This fix changes the behavior of regex matching, but I believe that the probability that this will actually cause backward compatibility problems is remarkably low. This is because I have never seen any mention of this puzzling behavior before.
--
https://bugs.ruby-lang.org/
Issue #20309 has been reported by hsbt (Hiroshi SHIBATA).
----------------------------------------
Feature #20309: Bundled gems for Ruby 3.5
https://bugs.ruby-lang.org/issues/20309
* Author: hsbt (Hiroshi SHIBATA)
* Status: Assigned
* Assignee: hsbt (Hiroshi SHIBATA)
----------------------------------------
I propose migrating the following default gems to bundled gems in Ruby 3.5. This means users will get warnings if they try to load them.
* ostruct
* irb
* reline
* readline (wrapper file for readline-ext and reline)
* io-console
* logger
* fiddle
* pstore
* open-uri
* yaml (wrapper file for psych)
* win32ole
I plan to migrate the following default gems too, but I need more feedback from other committers about them.
* rdoc
  * We need to change the build task, e.g. to download the rdoc gem before document generation.
  * Or we make document generation optional from Ruby 3.5.
  * We explicitly separate `make install` and `make install-doc`.
* un
  * `ruby -run` is one of the cool features of Ruby. Should we avoid uninstalling the `un` gem?
* singleton
  * This is a famous design pattern. Should we force users to add it to their Gemfile?
* forwardable
  * `reline` needs to add forwardable to its `runtime_dependency` after the migration.
* weakref
  * I'm not sure how much impact migrating it to bundled gems would have.
* fcntl
  * Should we integrate these constants into Ruby core?
I would like to migrate `ipaddr` and `uri` too, but these are used by webrick, which is the mock server for our test suite. We need to rewrite `webrick` with `TCPSocket` or extract the `ipaddr` and `uri` dependencies from `webrick`.
Other default gems depend deeply on our build process or other libraries. I will update this proposal if I can extract them from default gems.
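To check how a given library is currently shipped on your own Ruby, something like the following sketch may help. It uses RubyGems' `Gem::Specification.find_by_name` and `default_gem?`. The three gem names are an arbitrary sample from the list above, and the output depends on the Ruby version and on whether the script runs under Bundler.

```ruby
require "rubygems"

# Arbitrary sample of gems named in this proposal.
CANDIDATES = %w[ostruct logger pstore].freeze

statuses = CANDIDATES.to_h do |name|
  status =
    begin
      spec = Gem::Specification.find_by_name(name)
      spec.default_gem? ? "default gem" : "installed as a regular gem"
    rescue Gem::MissingSpecError
      # Raised when no matching gemspec is visible to RubyGems.
      "not installed"
    end
  [name, status]
end

statuses.each { |name, status| puts "#{name}: #{status}" }
```

A library reported as a default gem today would need an explicit `gem` entry in the Gemfile once it becomes a bundled gem.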
--
https://bugs.ruby-lang.org/
Issue #20150 has been reported by peterzhu2118 (Peter Zhu).
----------------------------------------
Bug #20150: Memory leak in grapheme clusters
https://bugs.ruby-lang.org/issues/20150
* Author: peterzhu2118 (Peter Zhu)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED, 3.3: REQUIRED
----------------------------------------
GitHub PR: https://github.com/ruby/ruby/pull/9414
String#grapheme_clusters and String#each_grapheme_cluster leak memory because, if the string is not UTF-8, the created regex is never freed.
For example:
```ruby
str = "hello world".encode(Encoding::UTF_32LE)
10.times do
  1_000.times do
    str.grapheme_clusters
  end

  puts `ps -o rss= -p #{$$}`
end
```
Before:
```
26000
42256
59008
75792
92528
109232
125936
142672
159392
176160
```
After:
```
9264
9504
9808
10000
10128
10224
10352
10544
10704
10896
```
--
https://bugs.ruby-lang.org/
Issue #20090 has been reported by willcosgrove (Will Cosgrove).
----------------------------------------
Bug #20090: Anonymous arguments are now syntax errors in unambiguous cases
https://bugs.ruby-lang.org/issues/20090
* Author: willcosgrove (Will Cosgrove)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
It looks like the changes that were made in #19370 may have gone further than intended. It's also possible I'm misunderstanding what decision was made. But it was my understanding that the goal was to make ambiguous cases a syntax error. The test cases added are all testing the ambiguous cases:
```rb
assert_syntax_error("def b(&) ->(&) {c(&)} end", /anonymous block parameter is also used/)
# ...
assert_syntax_error("def b(*) ->(*) {c(*)} end", /anonymous rest parameter is also used/)
assert_syntax_error("def b(a, *) ->(*) {c(1, *)} end", /anonymous rest parameter is also used/)
assert_syntax_error("def b(*) ->(a, *) {c(*)} end", /anonymous rest parameter is also used/)
# ...
assert_syntax_error("def b(**) ->(**) {c(**)} end", /anonymous keyword rest parameter is also used/)
assert_syntax_error("def b(k:, **) ->(**) {c(k: 1, **)} end", /anonymous keyword rest parameter is also used/)
assert_syntax_error("def b(**) ->(k:, **) {c(**)} end", /anonymous keyword rest parameter is also used/)
```
However it is now also producing syntax errors in all of these cases:
```rb
def b(&) -> { c(&) } end
def b(*) -> { c(*) } end
def b(a, *) -> { c(1, *) } end
def b(*) ->(a) { c(a, *) } end
def b(**) -> { c(**) } end
def b(k:, **) -> { c(k: 1, **) } end
def b(**) ->(k:) { c(k:, **) } end
```
Again, it's possible I misunderstood the scope of the previous change. But it would be sad to lose the unambiguous case, as I've used that pattern quite a bit in my own projects.
This is my first time opening an issue here, so I apologize in advance if I've done anything non-standard.
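A quick way to check how a particular Ruby treats the unambiguous cases is to compile them without running them. This sketch assumes CRuby, since it relies on `RubyVM::InstructionSequence.compile`. The `parses?` helper is a name invented here, and the result varies by Ruby version, so no expected output is shown.

```ruby
# A few of the unambiguous snippets from this report.
SNIPPETS = [
  "def b(&) -> { c(&) } end",
  "def b(*) -> { c(*) } end",
  "def b(**) -> { c(**) } end",
].freeze

# Returns true if the running Ruby accepts the source, false on SyntaxError.
# The code is only compiled, never executed.
def parses?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

results = SNIPPETS.to_h { |src| [src, parses?(src)] }
results.each do |src, ok|
  puts "#{ok ? 'accepted   ' : 'SyntaxError'}  #{src}"
end
```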
--
https://bugs.ruby-lang.org/
Issue #20198 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #20198: Threaded DNS resolver does not propagate errno to the calling thread
https://bugs.ruby-lang.org/issues/20198
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Assignee: kjtsanaktsidis (KJ Tsanaktsidis)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
If we get a return value of `EAI_SYSTEM` from `getaddrinfo`, we transform that into an appropriate `Errno::` exception on the Ruby side. However, because we now run the actual call to `getaddrinfo` in a thread, we lose that `errno` value (because `errno` is thread-local). So, what we actually raise in case of `EAI_SYSTEM` is just the last error which happened on the calling thread - e.g. this `ECHILD` which presumably got set in the bowels of pthreads somewhere:
```
1)
Socket::IPSocket#getaddress raises an error on unknown hostnames ERROR
Expected SocketError
but got: Errno::ECHILD (No child processes - getaddrinfo)
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:22:in `getaddress'
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:22:in `block (3 levels) in <top (required)>'
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:21:in `block (2 levels) in <top (required)>'
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:4:in `<top (required)>'
```
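The general remedy for this class of bug is to capture the error code on the worker thread and hand it back explicitly, rather than reading thread-local state from the caller. The Ruby sketch below is only an analogy of that pattern. The real resolver is C code, and `LookupResult`, `failing_lookup` and `lookup_in_thread` are hypothetical names that simulate a failing lookup.

```ruby
# Result object carrying either the addresses or the error code saved
# on the worker thread.
LookupResult = Struct.new(:addresses, :saved_errno)

# Hypothetical stand-in for a blocking lookup that fails with a system error.
def failing_lookup
  raise Errno::ECONNREFUSED, "getaddrinfo"
end

def lookup_in_thread
  worker = Thread.new do
    begin
      LookupResult.new(failing_lookup, nil)
    rescue SystemCallError => e
      # Save the error code here, on the thread where the failure happened.
      LookupResult.new(nil, e.errno)
    end
  end

  result = worker.value
  # Re-raise on the calling thread using the saved code, not whatever
  # error state the calling thread happens to hold.
  raise SystemCallError.new("getaddrinfo", result.saved_errno) if result.saved_errno

  result.addresses
end

captured = nil
begin
  lookup_in_thread
rescue SystemCallError => e
  captured = e
  puts "#{e.class}: #{e.message}"
end
```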
--
https://bugs.ruby-lang.org/