Issue #19408 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19408: Object no longer frozen after moved from a ractor
https://bugs.ruby-lang.org/issues/19408
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I think frozen objects should still be frozen after a move.
```ruby
r = Ractor.new do
obj = receive
p obj.frozen? # should be true but is false
p obj
end
obj = [Object.new].freeze
r.send(obj, move: true)
r.take
```
--
https://bugs.ruby-lang.org/
Issue #20080 has been reported by stuyam (Stuart Yamartino).
----------------------------------------
Feature #20080: Implement #begin_and_end method on Range
https://bugs.ruby-lang.org/issues/20080
* Author: stuyam (Stuart Yamartino)
* Status: Open
* Priority: Normal
----------------------------------------
Followup Reference: #20027
This feature request is to implement a method called `#begin_and_end` on `Range` that returns a two-element array of the range's `begin` and `end` values:
```ruby
(1..300).begin_and_end #=> [1, 300]
first, last = (300..1).begin_and_end
first #=> 300
last #=> 1
```
I believe this would be a great addition to `Range`, as ranges are often used as a single object that holds a pair of endpoints, and this method would make retrieving those endpoints easier.
This would allow easier deconstruction into start and end values using array deconstruction as well as a simpler way to serialize to a more primitive object such as an array for database storage.
This implementation was suggested by @mame in my initial feature suggestion regarding range deconstruction: https://bugs.ruby-lang.org/issues/20027
This would work similarly to `#minmax`, which also returns a two-element array; the difference is that `#minmax` doesn't work with reverse ranges, as @Dan0042 pointed out in the link above:
```ruby
(1..42).minmax #=> [1, 42]
(42..1).minmax #=> [nil, nil]
```
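For illustration, the proposed semantics can be sketched as a plain-Ruby monkey patch (the real feature would of course be implemented in core; this is only a sketch of the intended behavior):

```ruby
# Sketch of the proposed semantics as a monkey patch (illustration only).
class Range
  def begin_and_end
    [self.begin, self.end] # works for reverse ranges, unlike #minmax
  end
end

(1..300).begin_and_end #=> [1, 300]
first, last = (300..1).begin_and_end
first #=> 300
last  #=> 1
```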
--
https://bugs.ruby-lang.org/
Issue #19905 has been reported by hi(a)joaofernandes.me (Joao Fernandes).
----------------------------------------
Feature #19905: Introduce `Queue#peek`
https://bugs.ruby-lang.org/issues/19905
* Author: hi(a)joaofernandes.me (Joao Fernandes)
* Status: Open
* Priority: Normal
----------------------------------------
This ticket proposes introducing a `Queue#peek` method, similar to what we can find in other object-oriented languages such as Java and C#. The method is similar to `Queue#pop`, but it does not modify the queue, nor does it require a lock.
```
q = Queue.new([1,2,3])
=> #<Thread::Queue:0x00000001065d7148>
q.peek
=> 1
q.peek
=> 1
```
I have felt the need for this while debugging, but I think it can also be of practical use for presentation. I believe the only drawback is that newcomers could misuse it in multi-threaded code without taking into account that this method is not thread-safe.
I also volunteer myself to implement this method.
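The intended semantics can be sketched in pure Ruby (the real method would be implemented in C on `Thread::Queue`; the class name here is hypothetical and for illustration only):

```ruby
# Pure-Ruby sketch of the proposed semantics. The real implementation
# would live in core; this class name is hypothetical.
class PeekableQueue
  def initialize(items = [])
    @items = items.dup
    @mutex = Mutex.new
  end

  def push(obj)
    @mutex.synchronize { @items << obj }
  end

  def pop
    @mutex.synchronize { @items.shift }
  end

  # Non-destructive: returns the head element without removing it.
  def peek
    @mutex.synchronize { @items.first }
  end
end

q = PeekableQueue.new([1, 2, 3])
q.peek #=> 1
q.peek #=> 1
q.pop  #=> 1
q.peek #=> 2
```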
--
https://bugs.ruby-lang.org/
Issue #19991 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #19991: rb_register_postponed_job async-signal-unsafety causes crash in GC
https://bugs.ruby-lang.org/issues/19991
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Our production application experienced an interpreter crash today that I’m fairly sure is the result of a bug in the `rb_register_postponed_job` infrastructure.
## Diagnosis
I’ve attached a more complete version of the crash dump, but I think the key information is the following backtrace:
```
<internal:gc>:35: [BUG] Segmentation fault at 0x00000000000000c8
....
/usr/lib/libruby.so.3.1(rb_vm_bugreport+0x60c) [0xffffaaf57a6c] vm_dump.c:759
/usr/lib/libruby.so.3.1(rb_bug_for_fatal_signal+0xd8) [0xffffaad6d120] error.c:821
/usr/lib/libruby.so.3.1(sigsegv+0x58) [0xffffaaeb4350] signal.c:964
/usr/lib/libruby.so.3.1(sigill) (null):0
linux-vdso.so.1(__kernel_rt_sigreturn+0x0) [0xffffab20f78c]
/usr/lib/libruby.so.3.1(rbimpl_atomic_exchange+0x0) [0xffffaad94018] gc.c:4081
/usr/lib/libruby.so.3.1(gc_finalize_deferred) gc.c:4081
/usr/lib/libruby.so.3.1(rb_postponed_job_flush+0x218) [0xffffaaf5dd60] vm_trace.c:1728
/usr/lib/libruby.so.3.1(rb_threadptr_execute_interrupts+0x328) [0xffffaaef7520] thread.c:2444
/usr/lib/libruby.so.3.1(vm_exec_core+0x4e54) [0xffffaaf41c6c] vm.inc:588
/usr/lib/libruby.so.3.1(rb_vm_exec+0x144) [0xffffaaf427cc] vm.c:2211
```
I disassembled `gc_finalize_deferred` in GDB (different execution, so the addresses don’t line up - this is NOT from a core dump):
```
(gdb) disassemble gc_finalize_deferred
Dump of assembler code for function gc_finalize_deferred:
0x0000fffff7cc3ff8 <+0>: stp x29, x30, [sp, #-48]!
0x0000fffff7cc3ffc <+4>: mov w1, #0x1 // #1
0x0000fffff7cc4000 <+8>: mov x29, sp
0x0000fffff7cc4004 <+12>: stp x19, x20, [sp, #16]
0x0000fffff7cc4008 <+16>: mov x19, x0
0x0000fffff7cc400c <+20>: add x20, x0, #0xc8
0x0000fffff7cc4010 <+24>: ldaxr w0, [x20]
.... continues ....
```
Based on the line numbers from the debuginfo, the faulting instruction is likely to be `gc_finalize_deferred+24` (which is the inlined `rbimpl_atomic_exchange`). That would attempt a load from address 0xc8 if x0 was zero - i.e. if `gc_finalize_deferred` was called with a NULL objspace.
The enqueuing of `gc_finalize_deferred` as a postponed job only happens in one place (in `gc_sweep_page`, [here](https://github.com/ruby/ruby/blob/5cff4c5aa375787924e2df5c0b981dd922b…)), and if `objspace` was NULL there then the crash would have had to have already happened in `gc_sweep_page`. Thus, I think that `gc_finalize_deferred` was _enqueued_ into the postponed job list with a non-NULL argument, but the argument was corrupted whilst it was in `vm->postponed_job_buffer`.
I had a look at the postponed job code, which is of course very tricky because it needs to be async-signal-safe. More specifically:
* It needs to work if run from a thread not holding the GVL
* It needs to work if run from a thread, whilst another thread is actually executing `rb_postponed_job_flush`
* It needs to work if run from a signal caught in a thread that is currently executing `rb_postponed_job_flush` (this rules out trivial mutex-based solutions)
We use the Datadog continuous profiler in our application (CC: @ivoanjo ;), which calls `rb_postponed_job_register` to capture profiling samples. Thus, I think our application is likely to hit all of those scenarios semi-regularly.
My best guess at a plausible sequence of events generating this particular crash is:
1. `rb_postponed_job_flush` was running on thread T1.
2. There is a queued call to `gc_finalize_deferred` sitting in `vm->postponed_job_buffer[vm->postponed_job_index-1]`.
3. T1 executed the `ATOMIC_CAS` at vm_trace.c:1800, decrementing `vm->postponed_job_index` (which now equals `index - 1`) and determining that the job at index `index` should be run.
4. Thread T2 received a signal, and the Datadog continuous profiler called `rb_postponed_job_register_one(0, sample_from_postponed_job, NULL)`
5. T2 executed the `ATOMIC_CAS` at vm_trace.c:1679, re-incrementing `vm->postponed_job_index`. It’s now equal to `index` from T1 again.
6. T2 then executes the sets on `pjob->func` and `pjob->data` at vm_trace.c:1687. It sets `->func` to `sample_from_postponed_job` (from ddtrace), and `->data` to 0.
7. T1 then goes to call `(*pjob->func)(pjob->data)` at vm_trace.c:1802
8. Since there is no memory barrier between 6 & 7, T1 is allowed to see the effects of the set on `pjob->data` and not see that of `pjob->func`.
9. T1 thus calls `gc_finalize_deferred` (which it was meant to do) with an argument of 0 (which it was not).
## Solution
Managing a thread-safe list of too-big-to-be-atomic objects (like `rb_postponed_job_t`) is really tricky. I think it might be best for this code to use a combination of masking signals (to prevent manipulation of the postponed job list occurring during `rb_postponed_job_flush`) and a semaphore to protect the list (since semaphores are required to be async-signal-safe on POSIX systems). I've implemented this in a PR here: https://github.com/ruby/ruby/pull/8856
It seems _slightly_ slower to do it this way - semaphores require system calls even in the uncontended case, which is what makes them async-signal-safe but also more expensive than pthread mutexes, which avoid syscalls in the uncontended path on most systems. I ran my branch through yjit-bench:
With my patch:
```
interp: ruby 3.3.0dev (2023-11-07T08:14:09Z ktsanaktsidis/post.. 342f30f566) [arm64-darwin22]
yjit: ruby 3.3.0dev (2023-11-07T08:14:09Z ktsanaktsidis/post.. 342f30f566) +YJIT [arm64-darwin22]
-------------- ----------- ---------- --------- ---------- ------------ -----------
bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit
activerecord 31.2 3.4 17.0 5.7 1.29 1.83
chunky-png 543.5 0.5 367.0 0.7 1.40 1.48
erubi-rails 1044.2 0.6 564.7 1.3 1.69 1.85
hexapdf 1517.6 3.1 917.6 1.2 1.46 1.65
liquid-c 37.1 1.3 28.9 1.4 0.89 1.29
liquid-compile 39.0 1.4 29.9 1.6 0.76 1.30
liquid-render 89.9 1.8 39.6 1.4 1.37 2.27
lobsters 598.2 1.7 435.4 5.2 0.63 1.37
mail 79.8 3.1 52.5 1.0 0.79 1.52
psych-load 1441.5 1.7 885.4 0.5 1.60 1.63
railsbench 1010.8 1.0 609.3 1.3 1.24 1.66
ruby-lsp 40.9 3.4 29.2 30.0 0.66 1.40
sequel 39.8 1.8 33.0 2.4 1.18 1.21
-------------- ----------- ---------- --------- ---------- ------------ -----------
```
Without the patch:
```
interp: ruby 3.3.0dev (2023-11-07T07:56:43Z master 5a2779d40f) [arm64-darwin22]
yjit: ruby 3.3.0dev (2023-11-07T07:56:43Z master 5a2779d40f) +YJIT [arm64-darwin22]
-------------- ----------- ---------- --------- ---------- ------------ -----------
bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit
activerecord 31.3 3.3 16.7 5.5 1.36 1.88
chunky-png 521.6 0.6 348.8 0.7 1.40 1.50
erubi-rails 1038.9 0.9 566.3 1.2 1.70 1.83
hexapdf 1501.9 1.1 951.7 3.9 1.42 1.58
liquid-c 36.7 1.2 29.3 1.7 0.86 1.25
liquid-compile 38.8 1.1 29.7 3.7 0.73 1.31
liquid-render 92.2 0.9 38.3 1.0 1.47 2.40
lobsters 582.5 2.0 429.8 5.6 0.59 1.36
mail 77.9 1.3 54.8 0.9 0.76 1.42
psych-load 1419.1 0.7 887.7 0.5 1.60 1.60
railsbench 1017.8 1.1 609.9 1.2 1.24 1.67
ruby-lsp 41.0 2.2 28.8 28.8 0.64 1.43
sequel 36.0 1.5 30.4 1.8 1.11 1.18
-------------- ----------- ---------- --------- ---------- ------------ -----------
```
Maybe this is within the noise floor, but I thought I should bring it up.
--
https://bugs.ruby-lang.org/
Issue #19970 has been reported by larsin (Lars Ingjer).
----------------------------------------
Bug #19970: Eval leaks callcache and callinfo objects on arm32 (linux)
https://bugs.ruby-lang.org/issues/19970
* Author: larsin (Lars Ingjer)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [armv7l-linux-eabihf]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The following script demonstrates a memory leak on arm 32 (linux):
``` ruby
def gcdiff(n)
GC.start
if @last_gc_stat
puts "GC.stat #{n} diff old_objects: #{GC.stat(:old_objects) - @last_gc_stat}"
end
@last_gc_stat = GC.stat(:old_objects)
end
def foo
end
10.times do |i|
10_000.times do
eval 'foo'
end
gcdiff(i)
puts "Number of live objects: #{GC.stat(:heap_live_slots)}"
puts "Memory usage: #{`ps -o rss= -p #{$$}`}"
puts
end
```
Output:
```
Number of live objects: 41303
Memory usage: 11900
GC.stat 1 diff old_objects: 20037
Number of live objects: 61317
Memory usage: 13604
GC.stat 2 diff old_objects: 20001
Number of live objects: 81317
Memory usage: 14880
GC.stat 3 diff old_objects: 20000
Number of live objects: 101317
Memory usage: 16596
GC.stat 4 diff old_objects: 20000
Number of live objects: 121317
Memory usage: 17248
GC.stat 5 diff old_objects: 20000
Number of live objects: 141317
Memory usage: 18760
GC.stat 6 diff old_objects: 20000
Number of live objects: 161317
Memory usage: 19540
GC.stat 7 diff old_objects: 20000
Number of live objects: 181317
Memory usage: 21752
GC.stat 8 diff old_objects: 20000
Number of live objects: 201317
Memory usage: 21828
GC.stat 9 diff old_objects: 20000
Number of live objects: 221317
Memory usage: 24896
```
`ObjectSpace.count_imemo_objects` shows that `imemo_callcache` and `imemo_callinfo` objects are leaking.
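For reference, the imemo counts can be inspected like this (requires the `objspace` extension; the exact keys and counts depend on platform and Ruby version):

```ruby
require "objspace"

def foo; end

before = ObjectSpace.count_imemo_objects
10_000.times { eval "foo" }
GC.start
after = ObjectSpace.count_imemo_objects

# On affected arm32 builds these diffs keep growing across iterations;
# on unaffected platforms they stay near zero.
p after[:imemo_callcache].to_i - before[:imemo_callcache].to_i
p after[:imemo_callinfo].to_i - before[:imemo_callinfo].to_i
```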
The issue does not occur on arm64 mac or x86_64 linux with the same ruby version.
The issue has also been reproduced with the latest 3.2.2 snapshot (2023-09-30).
--
https://bugs.ruby-lang.org/
Issue #19542 has been reported by hanazuki (Kasumi Hanazuki).
----------------------------------------
Bug #19542: Operations on zero-sized IO::Buffer are raising
https://bugs.ruby-lang.org/issues/19542
* Author: hanazuki (Kasumi Hanazuki)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I found that IO::Buffer of zero length is not cloneable.
```
% ruby -v
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
% ruby -e 'p IO::Buffer.for("").dup'
-e:1:in `initialize_copy': The buffer is not allocated! (IO::Buffer::AllocationError)
from -e:1:in `initialize_dup'
from -e:1:in `dup'
from -e:1:in `<main>'
% ruby -e 'p IO::Buffer.new(0).dup'
-e:1: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
-e:1:in `initialize_copy': The buffer is not allocated! (IO::Buffer::AllocationError)
from -e:1:in `initialize_dup'
from -e:1:in `dup'
from -e:1:in `<main>'
```
It seems `IO::Buffer.new(0)` allocates no memory for the buffer on object creation and thus prohibits reading from or writing to it. So the `#dup` method, which copies zero bytes into a new IO::Buffer, raises the exception.
Empty buffers, however, often appear in corner cases of usual operations (encrypting an empty string, encoding an empty list of items into binary, etc.), and it would be easier if such cases could be handled consistently.
Other operations on NULL IO::Buffers are also useful but currently raise:
```
IO::Buffer.new(0) <=> IO::Buffer.new(1)
IO::Buffer.new(0).each(:U8).to_a
IO::Buffer.new(0).get_values([], 0)
IO::Buffer.new(0).set_values([], 0, [])
```
I'm not sure whether this is a bug or by design, but at the very least I don't want cloning and comparison to raise.
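As a stopgap, code that needs `dup` semantics can special-case the zero-length buffer itself, since constructing an empty buffer works even though operating on it raises (a hedged workaround sketch, not a proposed fix; `dup_buffer` is a hypothetical helper name):

```ruby
# Workaround sketch: avoid #dup on zero-length buffers, which raises
# IO::Buffer::AllocationError; constructing a fresh empty buffer is fine.
def dup_buffer(buffer)
  return IO::Buffer.new(0) if buffer.size.zero?
  buffer.dup
end

dup_buffer(IO::Buffer.for("")).size #=> 0
```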
--
https://bugs.ruby-lang.org/
Issue #19461 has been reported by ioquatix (Samuel Williams).
----------------------------------------
Bug #19461: Time.local performance tanks in forked process (on macOS only?)
https://bugs.ruby-lang.org/issues/19461
* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The following program demonstrates a performance regression in forked child processes when invoking `Time.local`:
```ruby
require 'benchmark'
require 'time'
def sir_local_alot
result = Benchmark.measure do
10_000.times do
tm = ::Time.local(2023)
end
end
$stderr.puts result
end
sir_local_alot
pid = fork do
sir_local_alot
end
Process.wait(pid)
```
On Linux the performance is similar before and after the fork, but on macOS the forked child's performance is over 100x worse on my M1 laptop.
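One mitigation often suggested for post-fork time-zone slowness on macOS (an assumption here, not verified against this exact report) is to pin the `TZ` environment variable before forking, so the C library doesn't re-resolve the system time zone on every call in the child:

```ruby
# Hedged mitigation sketch: pinning TZ before forking. Whether this
# restores performance in this specific case is an assumption.
ENV["TZ"] ||= "UTC"
pid = fork do
  10_000.times { Time.local(2023) }
end
Process.wait(pid)
```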
--
https://bugs.ruby-lang.org/
Issue #20104 has been reported by jeremyevans0 (Jeremy Evans).
----------------------------------------
Bug #20104: Regexp#match returns nil but allocates T_MATCH objects
https://bugs.ruby-lang.org/issues/20104
* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.4.0dev (2023-12-30T03:14:38Z master 8e32c01742) [x86_64-openbsd7.4]
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
----------------------------------------
Between Ruby 3.2 and 3.3, behavior changed so that Regexp#match will allocate a T_MATCH object even when there is no match. Example code:
```ruby
h = {}
GC.start
GC.disable
ObjectSpace.count_objects(h)
matches = h[:T_MATCH] || 0
md = /\A[A-Z]+\Z/.match('1')
ObjectSpace.count_objects(h)
new_matches = h[:T_MATCH] || 0
puts "/\\A[A-Z]+\\Z/.match('1') => #{md.inspect} generates #{new_matches - matches} T_MATCH objects"
```
Result with Ruby 1.9-3.2:
```
/\A[A-Z]+\Z/.match('1') => nil generates 0 T_MATCH objects
```
Results with Ruby 3.3.0 and current master branch:
```
/\A[A-Z]+\Z/.match('1') => nil generates 1 T_MATCH objects
```
This results in a measurable performance decrease for both Sinatra and Roda web applications, as reported at: https://old.reddit.com/r/ruby/comments/18sxtv9/ruby_330_performance_ups_and…
Thanks to GitHub users kiskoza and tagliala for producing a minimal example showing this issue: https://github.com/caxlsx/caxlsx/issues/336
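As a side note for affected applications: where only a boolean result is needed, `Regexp#match?` never allocates a `MatchData` and so avoids the regression entirely:

```ruby
# Regexp#match? returns true/false without allocating a MatchData object.
/\A[A-Z]+\Z/.match?("1")   #=> false
/\A[A-Z]+\Z/.match?("ABC") #=> true
```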
--
https://bugs.ruby-lang.org/
Issue #19999 has been reported by jaruga (Jun Aruga).
----------------------------------------
Bug #19999: Backport: .travis.yml and fixed commits
https://bugs.ruby-lang.org/issues/19999
* Author: jaruga (Jun Aruga)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: REQUIRED
----------------------------------------
This is a backport suggestion to make Travis CI stable on the Ruby stable branches such as ruby_3_2.
Travis CI for the Ruby master branch became stable recently with the arm64, ppc64le, s390x and arm32 cases. All the CI cases are running without the `allow_failures` option. And more importantly, I simplified the `.travis.yml` to make it easy to maintain without sacrificing performance. So, I think it may be a good time to backport the Travis CI configuration file `.travis.yml`, together with some commits that fix related issues.
https://app.travis-ci.com/github/ruby/ruby/builds/267166336
I can see ruby_3_2 and ruby_3_1 branches on the Travis CI page below. There are no ruby_3_0 and ruby_27 branches there. I am not sure how the branches are used.
https://app.travis-ci.com/github/ruby/ruby/branches
But it's beneficial to fix Travis CI at least for the ruby_3_2 branch, to save Travis infra resources.
Looking at the ruby_3_2 log, the ppc64le job already times out after running for the maximum of 50 minutes.
https://app.travis-ci.com/github/ruby/ruby/builds/267156055
https://app.travis-ci.com/github/ruby/ruby/jobs/613043898#L2325
```
[1/2] TestFiberQueue#test_pop_with_timeout_and_value = 0.00 s
[2/2] TestFiberQueue#test_pop_with_timeout====[ 540 seconds still running ]====
====[ 1080 seconds still running ]====
====[ 1620 seconds still running ]====
====[ 2160 seconds still running ]====
```
First, you can make the tests fail rather than hang by backporting the commit below.
https://github.com/ruby/ruby/commit/3eaae72855b23158e2148566bb8a7667bfb395cb
As this hang happened with the combination of `optflags=-O1` on ppc64le, I think you can avoid the issue by porting the `.travis.yml` from the `master` branch, where `optflags=-O1` is not used on ppc64le.
--
https://bugs.ruby-lang.org/
Issue #19758 has been reported by MyCo (Maik Menz).
----------------------------------------
Misc #19758: Statically link ext/json
https://bugs.ruby-lang.org/issues/19758
* Author: MyCo (Maik Menz)
* Status: Open
* Priority: Normal
----------------------------------------
Hi,
I'm building Ruby as both a dynamic and a static library with MSVC for a project. Everything appears to work fine, but now I'm trying to use the json extension, and it only works with the dynamically linked version.
In the statically linked version it says json/pure is missing, but on closer inspection the reason it says that is that it can't find json/ext/parser (and probably also json/ext/generator) in the first place.
I can see that both parser & generator produced static libs in the build directory, but they aren't linked into the Ruby lib.
With my limited knowledge of the build process, my first attempt was to add both of those libs to `LOCAL_LIBS`, and now they appear in the linking step.
But this still doesn't change anything: those two libs are still not found in the statically linked Ruby build.
What am I missing? What do I have to do to get them linked into the static lib?
Regards
Maik
--
https://bugs.ruby-lang.org/