Issue #19231 has been reported by andrykonchin (Andrew Konchin).
----------------------------------------
Bug #19231: Integer#step and Float::INFINITY - inconsistent behaviour when called with and without a block
https://bugs.ruby-lang.org/issues/19231
* Author: andrykonchin (Andrew Konchin)
* Status: Open
* Priority: Normal
* ruby -v: 3.1.2
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
The initial issue was reported here https://github.com/oracle/truffleruby/issues/2797.
`0.step(Float::INFINITY, 10)` returns:
- `Integers` when called with a block
- `Floats` when called without a block
I would expect `Floats` to be returned in both cases.
Examples:
```ruby
0.step(100.0, 10).take(1).map(&:class)
# => [Float]
```
```ruby
0.step(Float::INFINITY, 10) { |offset| p offset.class; break }
# Integer
```
When the `to` argument is a finite `Float`, calling with a block yields `Floats` as well:
```ruby
0.step(100.0, 10) { |offset| p offset.class; break }
# Float
```
Wondering whether it's intentional behaviour.
I've found a related issue https://bugs.ruby-lang.org/issues/15518.
--
https://bugs.ruby-lang.org/
Issue #19742 has been reported by ioquatix (Samuel Williams).
----------------------------------------
Bug #19742: Introduce `Module#anonymous?`
https://bugs.ruby-lang.org/issues/19742
* Author: ioquatix (Samuel Williams)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
As a follow-on from <https://bugs.ruby-lang.org/issues/19521>, I'd like to propose we introduce `Module#anonymous?`.
In some situations, like logging/formatting, serialisation/deserialization, debugging or meta-programming, we might like to know if a class is a proper constant or not.
However, this brings about some other issues which might need to be discussed.
After assigning a constant, then removing it, the internal state of Ruby still believes that the class name is permanent, even though it's no longer true.
e.g.
```
m = Module.new
m.anonymous? # true
M = m
m.anonymous? # false
Object.send(:remove_const, :M)
M # uninitialized constant M (NameError)
m.anonymous? # false
```
Because RCLASS data structure is not updated after the constant is removed, internally the state still has a "permanent class name".
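The sticky name can be observed from plain Ruby today. The following is a minimal sketch, assuming that "anonymous" simply means `Module#name` returns `nil`; the helper `anonymous_module?` and the constant `ExampleModule` are hypothetical names used only for illustration and are not part of the proposed PR.

```ruby
# Hypothetical helper: treats a module as anonymous when it has no name.
# This is an approximation, not the implementation proposed in the PR.
def anonymous_module?(mod)
  mod.name.nil?
end

m = Module.new
before_assignment = anonymous_module?(m) # no constant has ever pointed at m

Object.const_set(:ExampleModule, m)      # same effect as `ExampleModule = m`
while_assigned = anonymous_module?(m)

Object.send(:remove_const, :ExampleModule)
after_removal = anonymous_module?(m)     # the cached name outlives the constant

p before_assignment, while_assigned, after_removal, m.name
```

The last two values show the behaviour under discussion: the module still reports a name even though no constant refers to it any more.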
I want to use this proposal to discuss this issue and whether there is anything we should do about such behaviour (or even if it's desirable).
Proposed PR: https://github.com/ruby/ruby/pull/7966
cc @fxn
--
https://bugs.ruby-lang.org/
Issue #19973 has been reported by tenderlovemaking (Aaron Patterson).
----------------------------------------
Bug #19973: Duplicate keyword argument names don't always warn
https://bugs.ruby-lang.org/issues/19973
* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0dev (2023-10-24T19:38:50Z cleanup 3525a9bd22) [arm64-darwin23]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Calling a method with duplicate keyword arguments doesn't warn when it could:
```ruby
def bar a:, b:
  a + b
end
# Warning
bar(a: 1, b: 3, a: 2)
z = { b: 123 }
# No warning
bar(a: 1, **z, a: 2)
```
The first call to `bar` gives a warning about duplicate keyword args, but the second call doesn't. I think both cases should emit a warning.
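A warning matters here because the duplicate is not harmless: the later value silently replaces the earlier one. A small sketch of the splat case, reusing the method from the report (the variable name `result` is only for illustration):

```ruby
def bar(a:, b:)
  a + b
end

z = { b: 123 }

# `a:` appears twice, once before and once after the splat.
# The later value (2) is the one the method receives.
result = bar(a: 1, **z, a: 2)
p result
```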
--
https://bugs.ruby-lang.org/
Issue #19967 has been reported by nobu (Nobuyoshi Nakada).
----------------------------------------
Bug #19967: Already installed libruby.dylib is used for test on macOS
https://bugs.ruby-lang.org/issues/19967
* Author: nobu (Nobuyoshi Nakada)
* Status: Assigned
* Priority: Normal
* Assignee: nobu (Nobuyoshi Nakada)
* Backport: 3.0: REQUIRED, 3.1: REQUIRED, 3.2: REQUIRED
----------------------------------------
On macOS, in the case of `--enable-shared` and `--disable-load-relative`, an already installed libruby.dylib is used during tests if it exists.
This is because since [Bug #14992], `DYLD_FALLBACK_LIBRARY_PATH` is used instead of `DYLD_LIBRARY_PATH`.
The latter environment variable is used preferentially, whereas the former is used as a fallback, as the name implies.
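To check which variable a given build relies on, something like the following may help. It is a sketch that assumes the build exposes the `LIBPATHENV` key in `RbConfig::CONFIG`; if the key is absent, the lookup simply returns `nil`.

```ruby
require "rbconfig"

# Name of the library search path variable recorded at configure time
# (assumption: stored under "LIBPATHENV"; nil if this build does not set it).
libpath_env = RbConfig::CONFIG["LIBPATHENV"]

# Current value of that variable in this process, if any.
current_value = libpath_env.nil? || libpath_env.empty? ? nil : ENV[libpath_env]

puts "library path variable: #{libpath_env.inspect}"
puts "current value:         #{current_value.inspect}"
```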
--
https://bugs.ruby-lang.org/
Issue #19365 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19365: Ractors can access non-shareable values through enumerators
https://bugs.ruby-lang.org/issues/19365
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I don't think enumerators should be able to be passed to `Ractor.new`:
```ruby
obj = Object.new # unshareable value
p obj
Ractor.new([obj].each) {|f| p f.first }.take
```
--
https://bugs.ruby-lang.org/
Issue #19395 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19395: Process forking within non-main Ractor creates child stuck in busy loop
https://bugs.ruby-lang.org/issues/19395
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
```ruby
def test_fork_in_ractor
  r2 = Ractor.new do
    pid = fork do
      exit Ractor.count
    end
    pid
  end
  pid = r2.take
  puts "Process #{Process.pid} waiting for #{pid}"
  _pid, status = Process.waitpid2(pid) # stuck forever
  if status.exitstatus != 1
    raise "status is #{status.exitstatus}"
  end
end
test_fork_in_ractor()
```
$ top # shows CPU usage is high for child process
--
https://bugs.ruby-lang.org/
Issue #19408 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #19408: Object no longer frozen after moved from a ractor
https://bugs.ruby-lang.org/issues/19408
* Author: luke-gru (Luke Gruber)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I think frozen objects should still be frozen after a move.
```ruby
r = Ractor.new do
  obj = receive
  p obj.frozen? # should be true but is false
  p obj
end
obj = [Object.new].freeze
r.send(obj, move: true)
r.take
```
--
https://bugs.ruby-lang.org/
Issue #19905 has been reported by hi(a)joaofernandes.me (Joao Fernandes).
----------------------------------------
Feature #19905: Introduce `Queue#peek`
https://bugs.ruby-lang.org/issues/19905
* Author: hi(a)joaofernandes.me (Joao Fernandes)
* Status: Open
* Priority: Normal
----------------------------------------
This ticket proposes the introduction of the `Queue#peek` method, similar to what we can find in other object oriented languages such as Java and C#. This method is similar to `Queue#pop`, but does not change the data, nor does it require a lock.
```
q = Queue.new([1,2,3])
=> #<Thread::Queue:0x00000001065d7148>
q.peek
=> 1
q.peek
=> 1
```
I have felt the need of this for debugging, but I think that it can also be of practical use for presentation. I believe that the only drawback could be that newcomers could misuse it in multi-threaded work without taking into account that this method is not thread safe.
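For illustration, the semantics of `peek` can be sketched in pure Ruby. Note that this is not the proposed implementation: `PeekableQueue` is a hypothetical class, it wraps a plain `Array` rather than `Thread::Queue`, its `pop` is non-blocking, and it takes a `Mutex` in `peek`, unlike the lock-free method described above.

```ruby
# Hypothetical wrapper that shows what `peek` would return.
class PeekableQueue
  def initialize(items = [])
    @items = items.dup
    @lock = Mutex.new
  end

  def push(item)
    @lock.synchronize { @items.push(item) }
    self
  end

  # Non-blocking: returns nil when the queue is empty.
  def pop
    @lock.synchronize { @items.shift }
  end

  # Returns the element that `pop` would return next, without removing it.
  def peek
    @lock.synchronize { @items.first }
  end

  def size
    @lock.synchronize { @items.size }
  end
end

q = PeekableQueue.new([1, 2, 3])
p q.peek # the head element
p q.peek # unchanged, because peek does not remove anything
p q.size
```

Even with the lock, a value returned by `peek` may already have been popped by another thread by the time the caller uses it, which is the misuse risk mentioned above.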
I also volunteer myself to implement this method.
--
https://bugs.ruby-lang.org/
Issue #19991 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #19991: rb_register_postponed_job async-signal-unsafety causes crash in GC
https://bugs.ruby-lang.org/issues/19991
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Our production application experienced an interpreter crash today that I’m fairly sure is the result of a bug in the `rb_register_postponed_job` infrastructure.
## Diagnosis
I’ve attached a more complete version of the crash dump, but the key information I think is the following backtrace:
```
<internal:gc>:35: [BUG] Segmentation fault at 0x00000000000000c8
....
/usr/lib/libruby.so.3.1(rb_vm_bugreport+0x60c) [0xffffaaf57a6c] vm_dump.c:759
/usr/lib/libruby.so.3.1(rb_bug_for_fatal_signal+0xd8) [0xffffaad6d120] error.c:821
/usr/lib/libruby.so.3.1(sigsegv+0x58) [0xffffaaeb4350] signal.c:964
/usr/lib/libruby.so.3.1(sigill) (null):0
linux-vdso.so.1(__kernel_rt_sigreturn+0x0) [0xffffab20f78c]
/usr/lib/libruby.so.3.1(rbimpl_atomic_exchange+0x0) [0xffffaad94018] gc.c:4081
/usr/lib/libruby.so.3.1(gc_finalize_deferred) gc.c:4081
/usr/lib/libruby.so.3.1(rb_postponed_job_flush+0x218) [0xffffaaf5dd60] vm_trace.c:1728
/usr/lib/libruby.so.3.1(rb_threadptr_execute_interrupts+0x328) [0xffffaaef7520] thread.c:2444
/usr/lib/libruby.so.3.1(vm_exec_core+0x4e54) [0xffffaaf41c6c] vm.inc:588
/usr/lib/libruby.so.3.1(rb_vm_exec+0x144) [0xffffaaf427cc] vm.c:2211
```
I disassembled `gc_finalize_deferred` in GDB (different execution, so the addresses don’t line up - this is NOT from a core dump):
```
(gdb) disassemble gc_finalize_deferred
Dump of assembler code for function gc_finalize_deferred:
0x0000fffff7cc3ff8 <+0>: stp x29, x30, [sp, #-48]!
0x0000fffff7cc3ffc <+4>: mov w1, #0x1 // #1
0x0000fffff7cc4000 <+8>: mov x29, sp
0x0000fffff7cc4004 <+12>: stp x19, x20, [sp, #16]
0x0000fffff7cc4008 <+16>: mov x19, x0
0x0000fffff7cc400c <+20>: add x20, x0, #0xc8
0x0000fffff7cc4010 <+24>: ldaxr w0, [x20]
.... continues ....
```
Based on the line numbers from the debuginfo, the faulting instruction is likely to be `gc_finalize_deferred+24` (which is the inlined `rbimpl_atomic_exchange`). That would attempt a load of 0xc8 if x0 was zero - i.e. if `gc_finalize_deferred` was called with a NULL objspace.
The enqueuing of `gc_finalize_deferred` with postponed job only happens in one place (in `gc_sweep_page`, [here](https://github.com/ruby/ruby/blob/5cff4c5aa375787924e2df5c0b981dd922b…)), and if `objspace` was null there then the crash would have had to have already happened in `gc_sweep_page`. Thus, I think that `gc_finalize_deferred` was _enqueued_ into the postponed job list with a not-NULL argument, but the argument was corrupted whilst it was in `vm->postponed_job_buffer`.
I had a look at the postponed job code, which is of course very tricky because it needs to be async-signal-safe. More specifically:
* It needs to work if run from a thread not holding the GVL
* It needs to work if run from a thread, whilst another thread is actually executing `rb_postponed_job_flush`
* It needs to work if run from a signal caught in a thread that is currently executing `rb_postponed_job_flush` (this rules out trivial mutex-based solutions)
We use the Datadog continuous profiler in our application (CC: @ivoanjo ;), which calls `rb_postponed_job_register` to capture profiling samples. Thus, I think our application is likely to hit all of those scenarios semi-regularly.
My best guess at a plausible sequence of events, to generate this particular crash, is that:
1. `rb_postponed_job_flush` was running on thread T1.
2. There is a queued call to `gc_finalize_deferred` sitting in `vm->postponed_job_buffer[vm->postponed_job_index-1]`.
3. T1 executed the `ATOMIC_CAS` at vm_trace.c:1800, decrementing `vm->postponed_job_index` (which now equals `index - 1`) and determining that a job at index `index` should be run.
4. Thread T2 received a signal, and the Datadog continuous profiler called `rb_postponed_job_register_one(0, sample_from_postponed_job, NULL)`
5. T2 executed the `ATOMIC_CAS` at vm_trace.c:1679, re-incrementing `vm->postponed_job_index`. It’s now equal to `index` from T1 again.
6. T2 then executes the sets on `pjob->func` and `pjob->data` at vm_trace.c:1687. It sets `->func` to `sample_from_postponed_job` (from ddtrace), and `->data` to 0.
7. T1 then goes to call `(*pjob->func)(pjob->data)` at vm_trace.c:1802
8. Since there is no memory barrier between 6 & 7, T1 is allowed to see the effects of the set on `pjob->data` and not see that of `pjob->func`.
9. T1 thus calls `gc_finalize_deferred` (which it was meant to do) with an argument of 0 (which it was not).
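The interleaving above can be replayed deterministically in a single-threaded model. This is only a sketch of the suspected sequence, not the code in vm_trace.c: the `Job` struct, the lambdas, and the symbols they return are hypothetical stand-ins, and the "reordering" is simulated by choosing the order in which T1 reads the slot.

```ruby
# Model of one slot in the postponed job buffer.
Job = Struct.new(:func, :data)

# Stand-ins for the real C functions; a nil argument models a NULL objspace.
gc_finalize_deferred = ->(objspace) { objspace.nil? ? :crash_null_objspace : :finalized }
sample_from_postponed_job = ->(_arg) { :sampled }

# Step 2: a queued call to gc_finalize_deferred with a valid argument.
buffer = [Job.new(gc_finalize_deferred, :objspace)]
index = 1

# Step 3: T1 decrements the index and decides to run the job in that slot.
index -= 1
slot = buffer[index]

# Steps 4-5: T2 re-increments the index, so it claims the very same slot.
index += 1

# Steps 6-8: T2 writes func and data, but without a barrier T1 may observe
# the write to data while still seeing the old func.
func_seen_by_t1 = slot.func           # T1 still sees gc_finalize_deferred
slot.data = nil                       # T2's write to data is already visible
result = func_seen_by_t1.call(slot.data)
slot.func = sample_from_postponed_job # T2's write to func lands too late

# Step 9: the old function ran with the new (NULL) argument.
p result
```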
## Solution
Managing a thread-safe list of too-big-to-be-atomic objects (like `rb_postponed_job_t`) is really tricky. I think it might be best for this code if we use a combination of masking signals (to prevent manipulation of the postponed job list occurring during `rb_postponed_job_flush`) and using a semaphore to protect the list (since semaphores are required to be async-signal-safe on POSIX systems). I've implemented this in a PR here: https://github.com/ruby/ruby/pull/8856
It seems _slightly_ slower to do it this way - semaphores require system calls in the uncontended case, which is why they're async-signal-safe but also makes them more expensive than pthread mutexes, which don't, on most systems. I ran my branch through yjit bench:
With my patch:
```
interp: ruby 3.3.0dev (2023-11-07T08:14:09Z ktsanaktsidis/post.. 342f30f566) [arm64-darwin22]
yjit: ruby 3.3.0dev (2023-11-07T08:14:09Z ktsanaktsidis/post.. 342f30f566) +YJIT [arm64-darwin22]
-------------- ----------- ---------- --------- ---------- ------------ -----------
bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit
activerecord 31.2 3.4 17.0 5.7 1.29 1.83
chunky-png 543.5 0.5 367.0 0.7 1.40 1.48
erubi-rails 1044.2 0.6 564.7 1.3 1.69 1.85
hexapdf 1517.6 3.1 917.6 1.2 1.46 1.65
liquid-c 37.1 1.3 28.9 1.4 0.89 1.29
liquid-compile 39.0 1.4 29.9 1.6 0.76 1.30
liquid-render 89.9 1.8 39.6 1.4 1.37 2.27
lobsters 598.2 1.7 435.4 5.2 0.63 1.37
mail 79.8 3.1 52.5 1.0 0.79 1.52
psych-load 1441.5 1.7 885.4 0.5 1.60 1.63
railsbench 1010.8 1.0 609.3 1.3 1.24 1.66
ruby-lsp 40.9 3.4 29.2 30.0 0.66 1.40
sequel 39.8 1.8 33.0 2.4 1.18 1.21
-------------- ----------- ---------- --------- ---------- ------------ -----------
```
Without the patch:
```
interp: ruby 3.3.0dev (2023-11-07T07:56:43Z master 5a2779d40f) [arm64-darwin22]
yjit: ruby 3.3.0dev (2023-11-07T07:56:43Z master 5a2779d40f) +YJIT [arm64-darwin22]
-------------- ----------- ---------- --------- ---------- ------------ -----------
bench interp (ms) stddev (%) yjit (ms) stddev (%) yjit 1st itr interp/yjit
activerecord 31.3 3.3 16.7 5.5 1.36 1.88
chunky-png 521.6 0.6 348.8 0.7 1.40 1.50
erubi-rails 1038.9 0.9 566.3 1.2 1.70 1.83
hexapdf 1501.9 1.1 951.7 3.9 1.42 1.58
liquid-c 36.7 1.2 29.3 1.7 0.86 1.25
liquid-compile 38.8 1.1 29.7 3.7 0.73 1.31
liquid-render 92.2 0.9 38.3 1.0 1.47 2.40
lobsters 582.5 2.0 429.8 5.6 0.59 1.36
mail 77.9 1.3 54.8 0.9 0.76 1.42
psych-load 1419.1 0.7 887.7 0.5 1.60 1.60
railsbench 1017.8 1.1 609.9 1.2 1.24 1.67
ruby-lsp 41.0 2.2 28.8 28.8 0.64 1.43
sequel 36.0 1.5 30.4 1.8 1.11 1.18
-------------- ----------- ---------- --------- ---------- ------------ -----------
```
Maybe this is within the noise floor, but I thought I should bring it up.
--
https://bugs.ruby-lang.org/
Issue #19970 has been reported by larsin (Lars Ingjer).
----------------------------------------
Bug #19970: Eval leaks callcache and callinfo objects on arm32 (linux)
https://bugs.ruby-lang.org/issues/19970
* Author: larsin (Lars Ingjer)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [armv7l-linux-eabihf]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The following script demonstrates a memory leak on arm32 (linux):
``` ruby
def gcdiff(n)
  GC.start
  if @last_gc_stat
    puts "GC.stat #{n} diff old_objects: #{GC.stat(:old_objects) - @last_gc_stat}"
  end
  @last_gc_stat = GC.stat(:old_objects)
end
def foo
end
10.times do |i|
  10_000.times do
    eval 'foo'
  end
  gcdiff(i)
  puts "Number of live objects: #{GC.stat(:heap_live_slots)}"
  puts "Memory usage: #{`ps -o rss= -p #{$$}`}"
  puts
end
```
Output:
```
Number of live objects: 41303
Memory usage: 11900
GC.stat 1 diff old_objects: 20037
Number of live objects: 61317
Memory usage: 13604
GC.stat 2 diff old_objects: 20001
Number of live objects: 81317
Memory usage: 14880
GC.stat 3 diff old_objects: 20000
Number of live objects: 101317
Memory usage: 16596
GC.stat 4 diff old_objects: 20000
Number of live objects: 121317
Memory usage: 17248
GC.stat 5 diff old_objects: 20000
Number of live objects: 141317
Memory usage: 18760
GC.stat 6 diff old_objects: 20000
Number of live objects: 161317
Memory usage: 19540
GC.stat 7 diff old_objects: 20000
Number of live objects: 181317
Memory usage: 21752
GC.stat 8 diff old_objects: 20000
Number of live objects: 201317
Memory usage: 21828
GC.stat 9 diff old_objects: 20000
Number of live objects: 221317
Memory usage: 24896
```
`ObjectSpace.count_imemo_objects` shows that `imemo_callcache` and `imemo_callinfo` are leaking.
The issue does not occur on arm64 mac or x86_64 linux with the same ruby version.
The issue has also been reproduced with the latest 3.2.2 snapshot (2023-09-30).
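A measurement along these lines can be sketched as follows. The helper `imemo_counts` is a hypothetical name, and the sketch assumes the counts are reported under the `:imemo_callcache` and `:imemo_callinfo` keys; on an unaffected platform the deltas would be expected to stay small, but no specific numbers are claimed here.

```ruby
require "objspace"

def foo
end

# Hypothetical helper: counts of the two imemo types named in the report,
# taken right after a full GC so that only surviving objects are counted.
def imemo_counts
  GC.start
  counts = ObjectSpace.count_imemo_objects
  {
    callcache: counts.fetch(:imemo_callcache, 0),
    callinfo: counts.fetch(:imemo_callinfo, 0)
  }
end

before = imemo_counts
10_000.times { eval "foo" }
after = imemo_counts

# Growth per imemo type across one batch of evals.
delta = after.to_h { |type, count| [type, count - before[type]] }
puts "imemo growth after 10,000 evals: #{delta}"
```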
--
https://bugs.ruby-lang.org/