February 2024 - ruby-core - ml.ruby-lang.org

[ruby-core:116037] [Ruby master Bug#20153] Backport 7f9c174102 to fix --yjit-stats with RubyVM::YJIT.enable

by k0kubun (Takashi Kokubun)

Issue #20153 has been reported by k0kubun (Takashi Kokubun).

----------------------------------------
Bug #20153: Backport 7f9c174102 to fix --yjit-stats with RubyVM::YJIT.enable
https://bugs.ruby-lang.org/issues/20153

* Author: k0kubun (Takashi Kokubun)
* Status: Open
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
----------------------------------------
Ruby 3.3.0 ignored `--yjit-stats` when `RubyVM::YJIT.enable` (no argument) is used, which was unintended behavior.

https://github.com/ruby/ruby/pull/9415 should be backported to ruby_3_3.

--
https://bugs.ruby-lang.org/

[ruby-core:116454] [Ruby master Bug#20214] Backport https://github.com/ruby/ruby/pull/9711 to fix exits on Ruby 3.3's new instruction

by k0kubun (Takashi Kokubun)

Issue #20214 has been reported by k0kubun (Takashi Kokubun).

----------------------------------------
Bug #20214: Backport https://github.com/ruby/ruby/pull/9711 to fix exits on Ruby 3.3's new instruction
https://bugs.ruby-lang.org/issues/20214

* Author: k0kubun (Takashi Kokubun)
* Status: Closed
* Priority: Normal
* Assignee: naruse (Yui NARUSE)
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
----------------------------------------
YJIT in Ruby 3.3.0 lacks support for an instruction that was added shortly before the 3.3.0 release. The instruction is used in Rails, and we didn't mean for YJIT to exit on such method calls.

It would be nice to fix the issue in Ruby 3.3.1 by backporting https://github.com/ruby/ruby/pull/9711.

[ruby-core:116827] [Ruby master Feature#20276] Introduce Fiber interfaces for Ractors

by forthoney (Seong-Heon Jung)

Issue #20276 has been reported by forthoney (Seong-Heon Jung).

----------------------------------------
Feature #20276: Introduce Fiber interfaces for Ractors
https://bugs.ruby-lang.org/issues/20276

* Author: forthoney (Seong-Heon Jung)
* Status: Open
* Priority: Normal
----------------------------------------
## Motivation

I am trying to build a web server with Ractors. The lifecycle for a request in the current implementation is:

1. main ractor buffers request
2. main ractor sends request to worker ractor
3. worker ractor sends response to main ractor
4. main ractor writes response
5. repeat

The main ractor utilizes the Async gem (specifically async-http) to handle connections concurrently, meaning each request is handled on a separate fiber. The issue I am running into is that after I send a request to a worker ractor, I need to do a blocking wait until I receive a response. While I am waiting for the response, I cannot take any more connections.

## Solution

If the fiber scheduler had a hook for `Ractor.receive` or `Ractor#take` (both of which are blocking), the main ractor could send the message and handle other connections while the worker processes the request. When the worker produces a message, the main ractor will then take the request and write it to the socket.
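The blocking wait described in the Motivation can be reproduced with a minimal sketch. It assumes a Ruby 3.x build with Ractor support (still experimental, so a warning is printed) and uses only `Ractor.new`, `Ractor#send`, and `Ractor.receive`. The worker body and the request string are hypothetical stand-ins, not code from the issue.

```ruby
# Hypothetical worker: receives one request and replies to the main ractor.
worker = Ractor.new(Ractor.current) do |main|
  request = Ractor.receive           # blocks the worker until a request arrives
  main.send("handled: #{request}")   # step 3: send the response back to main
end

worker.send("GET /")        # step 2: main hands the request to the worker
response = Ractor.receive   # blocking wait: main cannot accept connections here
puts response
```

The `Ractor.receive` call on the main ractor is the point where, per the proposal, a fiber scheduler hook could let other fibers run instead of blocking.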

[ruby-core:114309] [Ruby master Feature#19787] Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map!

by joshuay03 (Joshua Young)

Issue #19787 has been reported by joshuay03 (Joshua Young).

----------------------------------------
Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map!
https://bugs.ruby-lang.org/issues/19787

* Author: joshuay03 (Joshua Young)
* Status: Open
* Priority: Normal
----------------------------------------
I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`.

TL;DR: It's a drop-in replacement for `.map { ... }.uniq`, with better performance.

I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short-form method for doing that, similar to how `flat_map { ... }` replaces `.map { ... }.flatten` and `filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code.

I have already got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140.
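To illustrate the intended semantics, here is a minimal pure-Ruby sketch of a `uniq_map`-style helper. It is a hypothetical standalone method written for this example, not the implementation from the draft PR, and it makes no claim about the performance of the proposed version.

```ruby
require "set"

# Hypothetical helper: maps each element and keeps only the first occurrence
# of each mapped value, without building the intermediate mapped array.
def uniq_map(enum)
  seen = Set.new
  enum.each_with_object([]) do |item, result|
    mapped = yield(item)
    result << mapped if seen.add?(mapped) # add? returns nil for values already seen
  end
end

words = %w[apple avocado banana blueberry cherry]
p uniq_map(words) { |w| w[0] }
p words.map { |w| w[0] }.uniq
```

Both calls print the same array, which is the equivalence the proposal relies on.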

[ruby-core:116941] [Ruby master Bug#20301] `Set#add?` does two hash look-ups

by AMomchilov (Alexander Momchilov)

Issue #20301 has been reported by AMomchilov (Alexander Momchilov).

----------------------------------------
Bug #20301: `Set#add?` does two hash look-ups
https://bugs.ruby-lang.org/issues/20301

* Author: AMomchilov (Alexander Momchilov)
* Status: Open
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
A common usage of `Set`s is to keep track of seen objects, and do something different whenever an object is seen for the first time, e.g.:

```ruby
require "set"

SEEN_VALUES = Set.new

def receive_value(value)
  if SEEN_VALUES.add?(value)
    puts "Saw #{value} for the first time."
  else
    puts "Already seen #{value}, ignoring."
  end
end

receive_value(1) # Saw 1 for the first time.
receive_value(2) # Saw 2 for the first time.
receive_value(3) # Saw 3 for the first time.
receive_value(1) # Already seen 1, ignoring.
```

Readers might reasonably assume that `add?` is only looking up into the set a single time, but it's actually doing two separate look-ups! ([source](https://github.com/ruby/ruby/blob/c976cb5/lib/set.rb#L517-L525))

```rb
class Set
  def add?(o)
    # 1. `include?(o)` looks up into `@hash`
    # 2. if the value isn't there, `add(o)` does a second look-up into `@hash`
    add(o) unless include?(o)
  end
end
```

This gets especially expensive if the values are large hashes/arrays/objects, whose `#hash` is expensive to compute.

We could optimize this if it were possible to set a value in a hash *and* retrieve the value that was already there, in a single go. I propose adding `Hash#update_value` to do exactly that. If that existed, we could re-implement `#add?` as:

```rb
class Set
  def add?(o)
    # Only requires a single look-up into `@hash`!
    self unless @hash.update_value(o, true)
  end
end
```

Here's a PR: https://github.com/ruby/ruby/pull/10093

How much of a benefit this has depends on two things:

1. How much `#hash` is called, which depends on how many new objects are added to the set.
   * If every object is new, then `#hash` is called twice on every `#add?`. This is where this improvement makes the biggest (2x!) change.
   * If every object has already been seen, then `#hash` was never being called twice before anyway, so there would be no improvement.
   * Every other case lies somewhere in between those two.
2. How slow `#hash` is to compute for the key.
   * If the hash is slow to compute, this change will make a bigger improvement.
   * If the hash value is fast to compute, then it won't matter as much. Even if we called it half as much, it's a minority of the total time, so it won't have much net impact.

Here is a summary of the benchmark results:

|                           | All objects are new | All objects are preexisting |
|---------------------------|--------------------:|----------------------------:|
| objects with slow `#hash` |              100.0% |                       ~0.0% |
| objects with fast `#hash` |               24.5% |                        4.6% |
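One way to observe the look-up count on a given Ruby build is to count `#hash` calls on a custom key. The sketch below is a hypothetical measurement aid (the `CountingKey` class is invented for this example), and the numbers it prints depend on the Ruby version and `Set` implementation in use, so no particular count is claimed here.

```ruby
require "set"

# Hypothetical key class that counts how often Ruby asks for its hash value.
class CountingKey
  @hash_calls = 0
  class << self
    attr_accessor :hash_calls
  end

  attr_reader :id

  def initialize(id)
    @id = id
  end

  def hash
    self.class.hash_calls += 1
    id.hash
  end

  def eql?(other)
    other.is_a?(CountingKey) && id == other.id
  end
  alias_method :==, :eql?
end

set = Set.new
first = set.add?(CountingKey.new(1))   # new element: add? returns the set
calls_for_new = CountingKey.hash_calls

CountingKey.hash_calls = 0
second = set.add?(CountingKey.new(1))  # equal element already present: add? returns nil
calls_for_existing = CountingKey.hash_calls

puts "#hash calls for a new object: #{calls_for_new}"
puts "#hash calls for an existing object: #{calls_for_existing}"
```

On an implementation where `add?` is `include?` followed by `add`, as in the `lib/set.rb` source linked in the issue, the first number would be expected to be higher than the second.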

[ruby-core:116491] [Ruby master Bug#20225] Inconsistent behavior of regex matching for a regex has a null loop

by make_now_just (Hiroya Fujinami)

Issue #20225 has been reported by make_now_just (Hiroya Fujinami).

----------------------------------------
Bug #20225: Inconsistent behavior of regex matching for a regex has a null loop
https://bugs.ruby-lang.org/issues/20225

* Author: make_now_just (Hiroya Fujinami)
* Status: Open
* Priority: Normal
* Assignee: make_now_just (Hiroya Fujinami)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
Usually, in Ruby (Onigmo), when a null loop (a loop consuming no characters) occurs during regex matching, the loop is terminated. But if the loop has a capture and some complex condition is satisfied, this causes backtracking instead. This behavior produces unexpected results, for example:

```ruby
p /(?:.B.(?<a>(?:[C-Z]|.)*)+){2}/ =~ "ABCABC" # => nil
p /(?:.B.(?:(?:[C-Z]|.)*)+){2}/ =~ "ABCABC" # => 0
```

Because the first regex has a capture and the second does not, different matching results are returned. It is not very intuitive that the presence of a capture changes the matching result.

The detailed condition for changing the null-loop behavior is 1) a previous capture in this loop holds the empty string, and 2) this capture's position is different from the current matching position. This condition is checked in `STACK_NULL_CHECK_MEMST` (https://github.com/ruby/ruby/blob/bbb7ab906ec64b963bd4b5d37e47b14796d64371/…).

Perhaps you cannot understand what this condition means. Don't worry, I cannot understand it either. This condition was introduced at least 20 years ago, and no one may remember why it was necessary. (If you know, please tell me!) Even if there is a reason, I believe there is no reasonable justification for allowing counter-intuitive behavior such as the above example.

This behavior can also cause memoization to be buggy. Memoization relies on the fact that backtracking only depends on positions and states (byte-code offsets of a regex). However, this condition additionally refers to captures, so the memoization is broken.

My proposal is to **correct this inconsistent behavior**. Specifically, a null loop should be determined solely on the basis of whether the matching position has changed, without referring to captures.

This fix changes the behavior of regex matching, but I believe that the probability that this will actually cause backward compatibility problems is remarkably low, because I have never seen any mention of this puzzling behavior before.

[ruby-core:116983] [Ruby master Feature#20309] Bundled gems for Ruby 3.5

by hsbt (Hiroshi SHIBATA)

Issue #20309 has been reported by hsbt (Hiroshi SHIBATA).

----------------------------------------
Feature #20309: Bundled gems for Ruby 3.5
https://bugs.ruby-lang.org/issues/20309

* Author: hsbt (Hiroshi SHIBATA)
* Status: Assigned
* Assignee: hsbt (Hiroshi SHIBATA)
----------------------------------------
I propose migrating the following default gems to bundled gems in Ruby 3.5. This means users will get warnings if they try to load them.

* ostruct
* irb
* reline
* readline (wrapper file for readline-ext and reline)
* io-console
* logger
* fiddle
* pstore
* open-uri
* yaml (wrapper file for psych)
* win32ole

I also plan to migrate the following default gems, but I need more feedback from other committers about them.

* rdoc
  * We need to change the build task, for example to download the rdoc gem before document generation.
  * Or we make document generation optional from Ruby 3.5.
  * We explicitly separate `make install` and `make install-doc`.
* un
  * `ruby -run` is one of the cool features of Ruby. Should we avoid uninstalling the `un` gem?
* singleton
  * This is a famous design pattern. Should we force users to add it to their Gemfile?
* forwardable
  * `reline` needs to add forwardable to its `runtime_dependency` after migration.
* weakref
  * I'm not sure how much impact migrating it to a bundled gem would have.
* fcntl
  * Should we integrate these constants into ruby core?

I would like to migrate `ipaddr` and `uri` too, but these are used by webrick, which is the mock server for our test suite. We need to rewrite `webrick` with `TCPSocket` or extract the `ipaddr` and `uri` dependencies from `webrick`.

Other default gems depend deeply on our build process or other libraries. I will update this proposal if I can extract them from default gems.

[ruby-core:116016] [Ruby master Bug#20150] Memory leak in grapheme clusters

by peterzhu2118 (Peter Zhu)

Issue #20150 has been reported by peterzhu2118 (Peter Zhu).

----------------------------------------
Bug #20150: Memory leak in grapheme clusters
https://bugs.ruby-lang.org/issues/20150

* Author: peterzhu2118 (Peter Zhu)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED, 3.3: REQUIRED
----------------------------------------
GitHub PR: https://github.com/ruby/ruby/pull/9414

`String#grapheme_clusters` and `String#each_grapheme_cluster` leak memory because, if the string is not UTF-8, the created regex is not freed.

For example:

```ruby
str = "hello world".encode(Encoding::UTF_32LE)

10.times do
  1_000.times do
    str.grapheme_clusters
  end

  # Print this process's resident set size (RSS) as reported by ps.
  puts `ps -o rss= -p #{$$}`
end
```

Before:

```
26000
42256
59008
75792
92528
109232
125936
142672
159392
176160
```

After:

```
9264
9504
9808
10000
10128
10224
10352
10544
10704
10896
```

[ruby-core:115912] [Ruby master Bug#20090] Anonymous arguments are now syntax errors in unambiguous cases

by willcosgrove (Will Cosgrove)

Issue #20090 has been reported by willcosgrove (Will Cosgrove).

----------------------------------------
Bug #20090: Anonymous arguments are now syntax errors in unambiguous cases
https://bugs.ruby-lang.org/issues/20090

* Author: willcosgrove (Will Cosgrove)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
It looks like the changes that were made in #19370 may have gone further than intended. It's also possible I'm misunderstanding what decision was made. But it was my understanding that the goal was to make ambiguous cases a syntax error.

The test cases added are all testing the ambiguous cases:

```rb
assert_syntax_error("def b(&) ->(&) {c(&)} end", /anonymous block parameter is also used/)
# ...
assert_syntax_error("def b(*) ->(*) {c(*)} end", /anonymous rest parameter is also used/)
assert_syntax_error("def b(a, *) ->(*) {c(1, *)} end", /anonymous rest parameter is also used/)
assert_syntax_error("def b(*) ->(a, *) {c(*)} end", /anonymous rest parameter is also used/)
# ...
assert_syntax_error("def b(**) ->(**) {c(**)} end", /anonymous keyword rest parameter is also used/)
assert_syntax_error("def b(k:, **) ->(**) {c(k: 1, **)} end", /anonymous keyword rest parameter is also used/)
assert_syntax_error("def b(**) ->(k:, **) {c(**)} end", /anonymous keyword rest parameter is also used/)
```

However, it is now also producing syntax errors in all of these cases:

```rb
def b(&)
  -> { c(&) }
end

def b(*)
  -> { c(*) }
end

def b(a, *)
  -> { c(1, *) }
end

def b(*)
  ->(a) { c(a, *) }
end

def b(**)
  -> { c(**) }
end

def b(k:, **)
  -> { c(k: 1, **) }
end

def b(**)
  ->(k:) { c(k:, **) }
end
```

Again, it's possible I misunderstood the scope of the previous change. But it would be sad to lose the unambiguous case, as I've used that pattern quite a bit in my own projects.

This is my first time opening an issue here, so I apologize in advance if I've done anything non-standard.
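For readers unfamiliar with the feature under discussion, the following minimal sketch shows anonymous argument forwarding outside of any block or lambda, which is not the case the report is about. It assumes Ruby 3.2 or later, and the method names `target` and `forward` are hypothetical.

```ruby
# Hypothetical receiver that reports what it was given.
def target(*args, **opts, &blk)
  [args, opts, blk ? blk.call : nil]
end

# Anonymous rest, keyword rest, and block parameters are forwarded as-is.
def forward(*, **, &)
  target(*, **, &)
end

result = forward(1, 2, k: 3) { :from_block }
puts result == [[1, 2], { k: 3 }, :from_block]
```

The cases in the report differ only in that the forwarding call sits inside a lambda that does not declare its own anonymous parameter.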

[ruby-core:116356] [Ruby master Bug#20198] Threaded DNS resolver does not propagate errno to the calling thread

by kjtsanaktsidis (KJ Tsanaktsidis)

Issue #20198 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

----------------------------------------
Bug #20198: Threaded DNS resolver does not propagate errno to the calling thread
https://bugs.ruby-lang.org/issues/20198

* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Assignee: kjtsanaktsidis (KJ Tsanaktsidis)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
If we get a return value of `EAI_SYSTEM` from `getaddrinfo`, we transform that into an appropriate `Errno::` exception on the Ruby side.

However, because we now run the actual call to `getaddrinfo` in a thread, we lose that `errno` value (because `errno` is thread-local). So, what we actually raise in case of `EAI_SYSTEM` is just the last error which happened on the calling thread, e.g. this `ECHILD`, which presumably got set in the bowels of pthreads somewhere:

```
1)
Socket::IPSocket#getaddress raises an error on unknown hostnames ERROR
Expected SocketError but got: Errno::ECHILD (No child processes - getaddrinfo)
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:22:in `getaddress'
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:22:in `block (3 levels) in <top (required)>'
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:21:in `block (2 levels) in <top (required)>'
/home/runner/work/ruby/ruby/src/spec/ruby/library/socket/ipsocket/getaddress_spec.rb:4:in `<top (required)>'
```
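The failure mode can be illustrated with a Ruby-level analogy. This is only a sketch: a Ruby thread variable named `:last_error` stands in for C's thread-local `errno`, the symbols are made-up placeholders, and none of this is the actual resolver code or its fix.

```ruby
# Analogy only: a thread variable plays the role of thread-local errno.

# Broken pattern: the worker records its error thread-locally, then the
# caller reads its own thread-local slot and gets a stale, unrelated value.
def lookup_losing_error
  Thread.new { Thread.current.thread_variable_set(:last_error, :worker_error) }.join
  Thread.current.thread_variable_get(:last_error)
end

# Working pattern: capture the value on the worker thread and hand it back
# explicitly as the thread's result.
def lookup_propagating_error
  worker = Thread.new do
    Thread.current.thread_variable_set(:last_error, :worker_error)
    Thread.current.thread_variable_get(:last_error)
  end
  worker.value
end

Thread.current.thread_variable_set(:last_error, :stale_caller_error)
p lookup_losing_error       # the caller's stale value, not the worker's
p lookup_propagating_error  # the worker's value
```

The second method reflects the general remedy for thread-local state: read it on the thread that produced it and pass it across explicitly.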
