Issue #19408 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19408: Object no longer frozen after moved from a ractor
https://bugs.ruby-lang.org/issues/19408
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I think frozen objects should still be frozen after a move.
```ruby
r = Ractor.new do
obj = receive
p obj.frozen? # should be true but is false
p obj
end
obj = [Object.new].freeze
r.send(obj, move: true)
r.take
```
--
https://bugs.ruby-lang.org/
Issue #20080 has been reported by stuyam (Stuart Yamartino).
----------------------------------------
Feature #20080: Implement #begin_and_end method on Range
https://bugs.ruby-lang.org/issues/20080
* Author: stuyam (Stuart Yamartino)
* Status: Open
* Priority: Normal
----------------------------------------
Followup Reference: #20027
This feature request is to implement a method called `#begin_and_end` on `Range` that returns a two-element array of the range's `begin` and `end` values:
```ruby
(1..300).begin_and_end #=> [1, 300]
first, last = (300..1).begin_and_end
first #=> 300
last #=> 1
```
I believe this would be a great addition to `Range`, as ranges are often used as a single object that holds a pair of endpoints, and this method would make retrieving those endpoints easier.
This would allow easier deconstruction into start and end values using array deconstruction as well as a simpler way to serialize to a more primitive object such as an array for database storage.
This implementation was suggested by @mame in my initial feature suggestion regarding range deconstruction: https://bugs.ruby-lang.org/issues/20027
This would work similarly to `#minmax`, which also returns a two-element array; the difference is that `#minmax` doesn't work with reverse ranges, as @Dan0042 pointed out in the link above:
```ruby
(1..42).minmax #=> [1, 42]
(42..1).minmax #=> [nil, nil]
```
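For illustration, the proposed semantics can be sketched as a plain-Ruby monkey patch (the real feature would of course be implemented in core; this is only a sketch of the intended behavior):

```ruby
# Sketch of the proposed semantics as a monkey patch (illustration only).
class Range
  def begin_and_end
    [self.begin, self.end] # works for reverse ranges, unlike #minmax
  end
end

(1..300).begin_and_end #=> [1, 300]
first, last = (300..1).begin_and_end
first #=> 300
last  #=> 1
```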
--
https://bugs.ruby-lang.org/
Issue #19905 has been reported by hi(a)joaofernandes.me (Joao Fernandes).
----------------------------------------
Feature #19905: Introduce `Queue#peek`
https://bugs.ruby-lang.org/issues/19905
* Author: hi(a)joaofernandes.me (Joao Fernandes)
* Status: Open
* Priority: Normal
----------------------------------------
This ticket proposes introducing a `Queue#peek` method, similar to what we can find in other object-oriented languages such as Java and C#. The method is similar to `Queue#pop`, but it does not modify the queue, nor does it require a lock.
```
q = Queue.new([1,2,3])
=> #<Thread::Queue:0x00000001065d7148>
q.peek
=> 1
q.peek
=> 1
```
I have felt the need for this while debugging, but I think it can also be of practical use for presentation. I believe the only drawback is that newcomers could misuse it in multi-threaded code without taking into account that this method is not thread-safe.
I also volunteer myself to implement this method.
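The intended semantics can be sketched in pure Ruby (the real method would be implemented in C on `Thread::Queue`; the class name here is hypothetical and for illustration only):

```ruby
# Pure-Ruby sketch of the proposed semantics. The real implementation
# would live in core; this class name is hypothetical.
class PeekableQueue
  def initialize(items = [])
    @items = items.dup
    @mutex = Mutex.new
  end

  def push(obj)
    @mutex.synchronize { @items << obj }
  end

  def pop
    @mutex.synchronize { @items.shift }
  end

  # Non-destructive: returns the head element without removing it.
  def peek
    @mutex.synchronize { @items.first }
  end
end

q = PeekableQueue.new([1, 2, 3])
q.peek #=> 1
q.peek #=> 1
q.pop  #=> 1
q.peek #=> 2
```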
--
https://bugs.ruby-lang.org/
Issue #19991 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #19991: rb_register_postponed_job async-signal-unsafety causes crash in GC
https://bugs.ruby-lang.org/issues/19991
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Our production application experienced an interpreter crash today that I’m fairly sure is the result of a bug in the `rb_register_postponed_job` infrastructure.
## Diagnosis
I’ve attached a more complete version of the crash dump, but I think the key information is the following backtrace:
```
<internal:gc>:35: [BUG] Segmentation fault at 0x00000000000000c8
....
/usr/lib/libruby.so.3.1(rb_vm_bugreport+0x60c) [0xffffaaf57a6c] vm_dump.c:759
/usr/lib/libruby.so.3.1(rb_bug_for_fatal_signal+0xd8) [0xffffaad6d120] error.c:821
/usr/lib/libruby.so.3.1(sigsegv+0x58) [0xffffaaeb4350] signal.c:964
/usr/lib/libruby.so.3.1(sigill) (null):0
linux-vdso.so.1(__kernel_rt_sigreturn+0x0) [0xffffab20f78c]
/usr/lib/libruby.so.3.1(rbimpl_atomic_exchange+0x0) [0xffffaad94018] gc.c:4081
/usr/lib/libruby.so.3.1(gc_finalize_deferred) gc.c:4081
/usr/lib/libruby.so.3.1(rb_postponed_job_flush+0x218) [0xffffaaf5dd60] vm_trace.c:1728
/usr/lib/libruby.so.3.1(rb_threadptr_execute_interrupts+0x328) [0xffffaaef7520] thread.c:2444
/usr/lib/libruby.so.3.1(vm_exec_core+0x4e54) [0xffffaaf41c6c] vm.inc:588
/usr/lib/libruby.so.3.1(rb_vm_exec+0x144) [0xffffaaf427cc] vm.c:2211
```
I disassembled `gc_finalize_deferred` in GDB (different execution, so the addresses don’t line up - this is NOT from a core dump):
```
(gdb) disassemble gc_finalize_deferred
Dump of assembler code for function gc_finalize_deferred:
0x0000fffff7cc3ff8 <+0>: stp x29, x30, [sp, #-48]!
0x0000fffff7cc3ffc <+4>: mov w1, #0x1 // #1
0x0000fffff7cc4000 <+8>: mov x29, sp
0x0000fffff7cc4004 <+12>: stp x19, x20, [sp, #16]
0x0000fffff7cc4008 <+16>: mov x19, x0
0x0000fffff7cc400c <+20>: add x20, x0, #0xc8
0x0000fffff7cc4010 <+24>: ldaxr w0, [x20]
.... continues ....
```
Based on the line numbers from the debuginfo, the faulting instruction is likely to be `gc_finalize_deferred+24` (which is the inlined `rbimpl_atomic_exchange`). That would attempt a load from address 0xc8 if x0 was zero - i.e. if `gc_finalize_deferred` was called with a NULL objspace.
The enqueuing of `gc_finalize_deferred` as a postponed job only happens in one place (in `gc_sweep_page`, [here](https://github.com/ruby/ruby/blob/5cff4c5aa375787924e2df5c0b981dd922b…)), and if `objspace` was NULL there then the crash would have had to have already happened in `gc_sweep_page`. Thus, I think that `gc_finalize_deferred` was _enqueued_ into the postponed job list with a non-NULL argument, but the argument was corrupted whilst it was in `vm->postponed_job_buffer`.
I had a look at the postponed job code, which is of course very tricky because it needs to be async-signal-safe. More specifically:
* It needs to work if run from a thread not holding the GVL
* It needs to work if run from a thread, whilst another thread is actually executing `rb_postponed_job_flush`
* It needs to work if run from a signal caught in a thread that is currently executing `rb_postponed_job_flush` (this rules out trivial mutex-based solutions)
We use the Datadog continuous profiler in our application (CC: @ivoanjo ;), which calls `rb_postponed_job_register` to capture profiling samples. Thus, I think our application is likely to hit all of those scenarios semi-regularly.
My best guess at a plausible sequence of events generating this particular crash is:
1. `rb_postponed_job_flush` was running on thread T1.
2. There is a queued call to `gc_finalize_deferred` sitting in `vm->postponed_job_buffer[vm->postponed_job_index-1]`.
3. T1 executed the `ATOMIC_CAS` at vm_trace.c:1800, decrementing `vm->postponed_job_index` (which now equals `index - 1`) and determining that the job at index `index` should be run.
4. Thread T2 received a signal, and the Datadog continuous profiler called `rb_postponed_job_register_one(0, sample_from_postponed_job, NULL)`
5. T2 executed the `ATOMIC_CAS` at vm_trace.c:1679, re-incrementing `vm->postponed_job_index`. It’s now equal to `index` from T1 again.
6. T2 then executes the sets on `pjob->func` and `pjob->data` at vm_trace.c:1687. It sets `->func` to `sample_from_postponed_job` (from ddtrace), and `->data` to 0.
7. T1 then goes to call `(*pjob->func)(pjob->data)` at vm_trace.c:1802
8. Since there is no memory barrier between 6 & 7, T1 is allowed to see the effects of the set on `pjob->data` and not see that of `pjob->func`.
9. T1 thus calls `gc_finalize_deferred` (which it was meant to do) with an argument of 0 (which it was not).
## Solution
Managing a thread-safe list of too-big-to-be-atomic objects (like `rb_postponed_job_t`) is really tricky. I think it might be best for this code to use a combination of masking signals (to prevent manipulation of the postponed job list occurring during `rb_postponed_job_flush`) and a semaphore to protect the list (since semaphores are required to be async-signal-safe on POSIX systems). I've implemented this in a PR here: https://github.com/ruby/ruby/pull/8856
It seems _slightly_ slower to do it this way - semaphores require system calls even in the uncontended case, which is what makes them async-signal-safe but also more expensive than pthread mutexes, which avoid syscalls in the uncontended path on most systems. I ran my branch through yjit-bench:
With my patch:
```
interp: ruby 3.3.0dev (2023-11-07T08:14:09Z ktsanaktsidis/post.. 342f30f566) [arm64-darwin22]
yjit: ruby 3.3.0dev (2023-11-07T08:14:09Z ktsanaktsidis/post.. 342f30f566) +YJIT [arm64-darwin22]
-------------- ----------- ---------- --------- ---------- ------------ -----------
bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit
activerecord 31.2 3.4 17.0 5.7 1.29 1.83
chunky-png 543.5 0.5 367.0 0.7 1.40 1.48
erubi-rails 1044.2 0.6 564.7 1.3 1.69 1.85
hexapdf 1517.6 3.1 917.6 1.2 1.46 1.65
liquid-c 37.1 1.3 28.9 1.4 0.89 1.29
liquid-compile 39.0 1.4 29.9 1.6 0.76 1.30
liquid-render 89.9 1.8 39.6 1.4 1.37 2.27
lobsters 598.2 1.7 435.4 5.2 0.63 1.37
mail 79.8 3.1 52.5 1.0 0.79 1.52
psych-load 1441.5 1.7 885.4 0.5 1.60 1.63
railsbench 1010.8 1.0 609.3 1.3 1.24 1.66
ruby-lsp 40.9 3.4 29.2 30.0 0.66 1.40
sequel 39.8 1.8 33.0 2.4 1.18 1.21
-------------- ----------- ---------- --------- ---------- ------------ -----------
```
Without the patch:
```
interp: ruby 3.3.0dev (2023-11-07T07:56:43Z master 5a2779d40f) [arm64-darwin22]
yjit: ruby 3.3.0dev (2023-11-07T07:56:43Z master 5a2779d40f) +YJIT [arm64-darwin22]
-------------- ----------- ---------- --------- ---------- ------------ -----------
bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit
activerecord 31.3 3.3 16.7 5.5 1.36 1.88
chunky-png 521.6 0.6 348.8 0.7 1.40 1.50
erubi-rails 1038.9 0.9 566.3 1.2 1.70 1.83
hexapdf 1501.9 1.1 951.7 3.9 1.42 1.58
liquid-c 36.7 1.2 29.3 1.7 0.86 1.25
liquid-compile 38.8 1.1 29.7 3.7 0.73 1.31
liquid-render 92.2 0.9 38.3 1.0 1.47 2.40
lobsters 582.5 2.0 429.8 5.6 0.59 1.36
mail 77.9 1.3 54.8 0.9 0.76 1.42
psych-load 1419.1 0.7 887.7 0.5 1.60 1.60
railsbench 1017.8 1.1 609.9 1.2 1.24 1.67
ruby-lsp 41.0 2.2 28.8 28.8 0.64 1.43
sequel 36.0 1.5 30.4 1.8 1.11 1.18
-------------- ----------- ---------- --------- ---------- ------------ -----------
```
Maybe this is within the noise floor, but I thought I should bring it up.
--
https://bugs.ruby-lang.org/
Issue #19970 has been reported by larsin (Lars Ingjer).
----------------------------------------
Bug #19970: Eval leaks callcache and callinfo objects on arm32 (linux)
https://bugs.ruby-lang.org/issues/19970
* Author: larsin (Lars Ingjer)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [armv7l-linux-eabihf]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The following script demonstrates a memory leak on arm 32 (linux):
``` ruby
def gcdiff(n)
GC.start
if @last_gc_stat
puts "GC.stat #{n} diff old_objects: #{GC.stat(:old_objects) - @last_gc_stat}"
end
@last_gc_stat = GC.stat(:old_objects)
end
def foo
end
10.times do |i|
10_000.times do
eval 'foo'
end
gcdiff(i)
puts "Number of live objects: #{GC.stat(:heap_live_slots)}"
puts "Memory usage: #{`ps -o rss= -p #{$$}`}"
puts
end
```
Output:
```
Number of live objects: 41303
Memory usage: 11900
GC.stat 1 diff old_objects: 20037
Number of live objects: 61317
Memory usage: 13604
GC.stat 2 diff old_objects: 20001
Number of live objects: 81317
Memory usage: 14880
GC.stat 3 diff old_objects: 20000
Number of live objects: 101317
Memory usage: 16596
GC.stat 4 diff old_objects: 20000
Number of live objects: 121317
Memory usage: 17248
GC.stat 5 diff old_objects: 20000
Number of live objects: 141317
Memory usage: 18760
GC.stat 6 diff old_objects: 20000
Number of live objects: 161317
Memory usage: 19540
GC.stat 7 diff old_objects: 20000
Number of live objects: 181317
Memory usage: 21752
GC.stat 8 diff old_objects: 20000
Number of live objects: 201317
Memory usage: 21828
GC.stat 9 diff old_objects: 20000
Number of live objects: 221317
Memory usage: 24896
```
`ObjectSpace.count_imemo_objects` shows that `imemo_callcache` and `imemo_callinfo` objects are leaking.
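For reference, the imemo counts can be inspected like this (requires the `objspace` extension; the exact keys and counts depend on platform and Ruby version):

```ruby
require "objspace"

def foo; end

before = ObjectSpace.count_imemo_objects
10_000.times { eval "foo" }
GC.start
after = ObjectSpace.count_imemo_objects

# On affected arm32 builds these diffs keep growing across iterations;
# on unaffected platforms they stay near zero.
p after[:imemo_callcache].to_i - before[:imemo_callcache].to_i
p after[:imemo_callinfo].to_i - before[:imemo_callinfo].to_i
```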
The issue does not occur on arm64 mac or x86_64 linux with the same ruby version.
The issue has also been reproduced with the latest 3.2.2 snapshot (2023-09-30).
--
https://bugs.ruby-lang.org/
Issue #19542 has been reported by hanazuki (Kasumi Hanazuki).
----------------------------------------
Bug #19542: Operations on zero-sized IO::Buffer are raising
https://bugs.ruby-lang.org/issues/19542
* Author: hanazuki (Kasumi Hanazuki)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I found that IO::Buffer of zero length is not cloneable.
```
% ruby -v
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
% ruby -e 'p IO::Buffer.for("").dup'
-e:1:in `initialize_copy': The buffer is not allocated! (IO::Buffer::AllocationError)
from -e:1:in `initialize_dup'
from -e:1:in `dup'
from -e:1:in `<main>'
% ruby -e 'p IO::Buffer.new(0).dup'
-e:1: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
-e:1:in `initialize_copy': The buffer is not allocated! (IO::Buffer::AllocationError)
from -e:1:in `initialize_dup'
from -e:1:in `dup'
from -e:1:in `<main>'
```
It seems `IO::Buffer.new(0)` allocates no memory for the buffer on object creation and thus prohibits reading from or writing to it. So the `#dup` method, which copies zero bytes into a new IO::Buffer, raises the exception.
Empty buffers, however, often appear in corner cases of usual operations (encrypting an empty string, encoding an empty list of items into binary, etc.), and it would be easier if such cases could be handled consistently.
Other operations on NULL IO::Buffers are also useful but currently raise:
```
IO::Buffer.new(0) <=> IO::Buffer.new(1)
IO::Buffer.new(0).each(:U8).to_a
IO::Buffer.new(0).get_values([], 0)
IO::Buffer.new(0).set_values([], 0, [])
```
I'm not sure whether this is a bug or by design, but at the very least I don't want cloning and comparison to raise.
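As a stopgap, code that needs `dup` semantics can special-case the zero-length buffer itself, since constructing an empty buffer works even though operating on it raises (a hedged workaround sketch, not a proposed fix; `dup_buffer` is a hypothetical helper name):

```ruby
# Workaround sketch: avoid #dup on zero-length buffers, which raises
# IO::Buffer::AllocationError; constructing a fresh empty buffer is fine.
def dup_buffer(buffer)
  return IO::Buffer.new(0) if buffer.size.zero?
  buffer.dup
end

dup_buffer(IO::Buffer.for("")).size #=> 0
```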
--
https://bugs.ruby-lang.org/
Issue #19461 has been reported by ioquatix (Samuel Williams).
----------------------------------------
Bug #19461: Time.local performance tanks in forked process (on macOS only?)
https://bugs.ruby-lang.org/issues/19461
* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The following program demonstrates a performance regression in forked child processes when invoking `Time.local`:
```ruby
require 'benchmark'
require 'time'
def sir_local_alot
result = Benchmark.measure do
10_000.times do
tm = ::Time.local(2023)
end
end
$stderr.puts result
end
sir_local_alot
pid = fork do
sir_local_alot
end
Process.wait(pid)
```
On Linux the performance is similar before and after the fork, but on macOS the forked child's performance is over 100x worse on my M1 laptop.
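One mitigation often suggested for post-fork time-zone slowness on macOS (an assumption here, not verified against this exact report) is to pin the `TZ` environment variable before forking, so the C library doesn't re-resolve the system time zone on every call in the child:

```ruby
# Hedged mitigation sketch: pinning TZ before forking. Whether this
# restores performance in this specific case is an assumption.
ENV["TZ"] ||= "UTC"
pid = fork do
  10_000.times { Time.local(2023) }
end
Process.wait(pid)
```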
--
https://bugs.ruby-lang.org/
Issue #20104 has been reported by jeremyevans0 (Jeremy Evans).
----------------------------------------
Bug #20104: Regexp#match returns nil but allocates T_MATCH objects
https://bugs.ruby-lang.org/issues/20104
* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.4.0dev (2023-12-30T03:14:38Z master 8e32c01742) [x86_64-openbsd7.4]
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
----------------------------------------
Between Ruby 3.2 and 3.3, behavior changed so that Regexp#match will allocate a T_MATCH object even when there is no match. Example code:
```ruby
h = {}
GC.start
GC.disable
ObjectSpace.count_objects(h)
matches = h[:T_MATCH] || 0
md = /\A[A-Z]+\Z/.match('1')
ObjectSpace.count_objects(h)
new_matches = h[:T_MATCH] || 0
puts "/\\A[A-Z]+\\Z/.match('1') => #{md.inspect} generates #{new_matches - matches} T_MATCH objects"
```
Result with Ruby 1.9-3.2:
```
/\A[A-Z]+\Z/.match('1') => nil generates 0 T_MATCH objects
```
Results with Ruby 3.3.0 and current master branch:
```
/\A[A-Z]+\Z/.match('1') => nil generates 1 T_MATCH objects
```
This results in a measurable performance decrease for both Sinatra and Roda web applications, as reported at: https://old.reddit.com/r/ruby/comments/18sxtv9/ruby_330_performance_ups_and…
Thanks to GitHub users kiskoza and tagliala for producing a minimal example showing this issue: https://github.com/caxlsx/caxlsx/issues/336
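As a side note for affected applications: where only a boolean result is needed, `Regexp#match?` never allocates a `MatchData` and so avoids the regression entirely:

```ruby
# Regexp#match? returns true/false without allocating a MatchData object.
/\A[A-Z]+\Z/.match?("1")   #=> false
/\A[A-Z]+\Z/.match?("ABC") #=> true
```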
--
https://bugs.ruby-lang.org/
Issue #19999 has been reported by jaruga (Jun Aruga).
----------------------------------------
Bug #19999: Backport: .travis.yml and fixed commits
https://bugs.ruby-lang.org/issues/19999
* Author: jaruga (Jun Aruga)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: REQUIRED
----------------------------------------
This is a backport suggestion to make Travis CI stable on the Ruby stable branches such as ruby_3_2.
Travis CI for the Ruby master branch became stable recently with the arm64, ppc64le, s390x and arm32 cases. All the CI cases are running without the `allow_failures` option. And more importantly, I simplified the `.travis.yml` to make it easy to maintain without sacrificing performance. So, I think it may be a good time to backport the Travis CI configuration file `.travis.yml`, together with some commits that fix related issues.
https://app.travis-ci.com/github/ruby/ruby/builds/267166336
I can see ruby_3_2 and ruby_3_1 branches on the Travis CI page below. There are no ruby_3_0 and ruby_27 branches there. I am not sure how the branches are used.
https://app.travis-ci.com/github/ruby/ruby/branches
But it's beneficial to fix Travis CI at least for the ruby_3_2 branch, to save Travis infra resources.
Looking at the ruby_3_2 log, the ppc64le job already times out after running for the maximum of 50 minutes.
https://app.travis-ci.com/github/ruby/ruby/builds/267156055
https://app.travis-ci.com/github/ruby/ruby/jobs/613043898#L2325
```
[1/2] TestFiberQueue#test_pop_with_timeout_and_value = 0.00 s
[2/2] TestFiberQueue#test_pop_with_timeout====[ 540 seconds still running ]====
====[ 1080 seconds still running ]====
====[ 1620 seconds still running ]====
====[ 2160 seconds still running ]====
```
First, you can make the tests fail rather than hang by backporting the commit below.
https://github.com/ruby/ruby/commit/3eaae72855b23158e2148566bb8a7667bfb395cb
As this hang happened with the combination of `optflags=-O1` on ppc64le, I think you can avoid the issue by porting the `.travis.yml` from the `master` branch, where `optflags=-O1` is not used on ppc64le.
--
https://bugs.ruby-lang.org/
Issue #19758 has been reported by MyCo (Maik Menz).
----------------------------------------
Misc #19758: Statically link ext/json
https://bugs.ruby-lang.org/issues/19758
* Author: MyCo (Maik Menz)
* Status: Open
* Priority: Normal
----------------------------------------
Hi,
I'm building Ruby as both a dynamic and a static library with MSVC for a project. Everything appears to work fine, but now I'm trying to use the json extension, and it only works with the dynamically linked version.
In the statically linked version it says json/pure is missing, but on closer inspection the reason it says that is that it can't find json/ext/parser (and probably also json/ext/generator) in the first place.
I can see that both parser & generator produced static libs in the build directory, but they aren't linked into the Ruby lib.
With my limited knowledge of the build process, my first attempt was to add both of those libs to `LOCAL_LIBS`, and now they appear in the linking step.
But this still doesn't change anything: those two libs are still not found in the statically linked Ruby build.
What am I missing? What do I have to do to get them linked into the static lib?
Regards
Maik
--
https://bugs.ruby-lang.org/