Issue #20252 has been reported by koic (Koichi ITO).
----------------------------------------
Bug #20252: Incompatibility with the `-h` option in optparse on Ruby 3.4.0dev
https://bugs.ruby-lang.org/issues/20252
* Author: koic (Koichi ITO)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.4.0dev (2024-02-09T12:28:26Z master 08b77dd682) [x86_64-darwin23]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
An incompatibility has arisen when using optparse with Ruby 3.4.0dev. Below are the steps to reproduce:
```ruby
# example.rb
require 'optparse'
OptionParser.new do |opts|
  opts.on('--[no-]foo')
end.parse!
```
## Expected (Ruby 3.3 or lower)
It is represented as `--[no-]foo`.
```console
$ ruby -v
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
$ ruby example.rb -h
Usage: example [options]
--[no-]foo
```
## Actual (Ruby 3.4)
It is represented as `--foo, --no-foo`.
```console
$ ruby -v
ruby 3.4.0dev (2024-02-09T12:28:26Z master 08b77dd682) [x86_64-darwin23]
$ ruby example.rb -h
Usage: example [options]
--foo, --no-foo
```
This change is likely due to https://github.com/ruby/optparse/pull/60.
I have a question: is the change in how `-h` renders this option in Ruby 3.4.0dev intentional, or should the representation used up to Ruby 3.3 be maintained?
This incompatibility was encountered during RuboCop's CI.
https://github.com/rubocop/rubocop/actions/runs/7845618444/job/21410458812?…
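For test suites that assert on `-h` output, one possible stopgap is to accept either rendering. This is only a minimal sketch, assuming the two forms shown above are the only ones in play; the variable names are illustrative and not part of optparse.

```ruby
require 'optparse'

# Build the same parser as example.rb, but keep a reference to it
# instead of parsing ARGV.
parser = OptionParser.new do |opts|
  opts.banner = 'Usage: example [options]'
  opts.on('--[no-]foo')
end

# OptionParser#help returns the text that `-h` would print.
help_text = parser.help

# Accept either the Ruby 3.3 rendering or the 3.4.0dev rendering.
legacy_form = /--\[no-\]foo/
split_form  = /--foo, --no-foo/
documents_foo = help_text.match?(Regexp.union(legacy_form, split_form))

puts help_text
puts "documents --foo/--no-foo: #{documents_foo}"
```

Matching on a union of both patterns keeps the check independent of which optparse version is installed, at the cost of no longer pinning the exact wording.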
--
https://bugs.ruby-lang.org/
Issue #20222 has been reported by kddnewton (Kevin Newton).
----------------------------------------
Misc #20222: Dedup-ing clarification
https://bugs.ruby-lang.org/issues/20222
* Author: kddnewton (Kevin Newton)
* Status: Open
* Priority: Normal
----------------------------------------
``` ruby
source = %q{"foo".freeze.equal?("foo".freeze)}
RubyVM::InstructionSequence.compile(source).eval # => true
RubyVM::InstructionSequence.compile_option = false
RubyVM::InstructionSequence.compile(source).eval # => false
```
`"foo".freeze` uses `opt_str_freeze` when optimizations are turned on, which also deduplicates. This means this code has different behavior depending on if optimizations are turned on or off.
To be clear, I'm not saying whether or not this is a problem. I'm asking if this is desired behavior?
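To compare the two modes without changing the global `compile_option`, the options can also be passed per compilation. The following is a rough sketch for CRuby only; the `results` hash and its keys are illustrative, and the printed values are expected to vary with the Ruby version and compile options.

```ruby
source = %q{"foo".freeze.equal?("foo".freeze)}

results = {}
if defined?(RubyVM::InstructionSequence)
  # Default options: specialized instructions are enabled.
  optimized = RubyVM::InstructionSequence.compile(source)
  # Passing `false` as the options argument disables the optimizations
  # for this one compilation only.
  unoptimized = RubyVM::InstructionSequence.compile(source, nil, nil, 1, false)

  { optimized: optimized, unoptimized: unoptimized }.each do |label, iseq|
    results[label] = {
      # Does the bytecode contain the specialized instruction?
      uses_opt_str_freeze: iseq.disasm.include?('opt_str_freeze'),
      # Result of evaluating the expression under these options.
      same_object: iseq.eval
    }
  end
end

results.each { |label, info| puts "#{label}: #{info}" }
```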
Issue #20244 has been reported by nobu (Nobuyoshi Nakada).
----------------------------------------
Feature #20244: Show the conflicting another chdir block
https://bugs.ruby-lang.org/issues/20244
* Author: nobu (Nobuyoshi Nakada)
* Status: Open
* Priority: Normal
----------------------------------------
`Dir.chdir` warns when it is called inside another `chdir` block.
```sh-session
$ ruby -e 'Dir.chdir {' -e 'Dir.chdir("/")' -e '}'
-e:2: warning: conflicting chdir during another chdir block
```
If two `chdir`s are far apart, it can be difficult to find conflicting blocks.
To help with debugging, I propose improving the warning message so that it also shows the conflicting block.
```sh-session
$ ./ruby -e 'Dir.chdir {' -e 'Dir.chdir("/")' -e '}'
-e:2: warning: conflicting chdir during another chdir block
-e:1: warning: here
```
https://github.com/ruby/ruby/pull/9870
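The idea can be approximated in plain Ruby. This is a rough sketch, assuming a hypothetical wrapper module `TrackedDir` (not part of Ruby and not what the linked pull request implements) that remembers where each block-form `chdir` was entered.

```ruby
require 'tmpdir'

# Hypothetical helper, for illustration only.
module TrackedDir
  STACK_KEY = :tracked_chdir_blocks

  def self.chdir(path, &block)
    stack = (Thread.current[STACK_KEY] ||= [])

    unless block
      # Non-block form: report the innermost enclosing block, if any.
      outer = stack.last
      warn "#{outer.path}:#{outer.lineno}: note: enclosing chdir block entered here" if outer
      return Dir.chdir(path)
    end

    # Block form: remember the caller's location for the block's duration.
    stack.push(caller_locations(1, 1).first)
    begin
      Dir.chdir(path, &block)
    ensure
      stack.pop
    end
  end
end

TrackedDir.chdir(Dir.tmpdir) do
  TrackedDir.chdir('/') # prints the note, then Ruby's own conflict warning
end
```

The wrapper only sees calls that go through `TrackedDir`, which is why doing this inside `Dir.chdir` itself, as proposed, would be more reliable.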
Issue #20243 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #20243: M:N threading VM_ASSERT failure in rb_current_execution_context with clang 17 (on Linux)
https://bugs.ruby-lang.org/issues/20243
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When building with Clang 17 and `-DVM_CHECK_MODE=1` (with the following configure)
```
optflags="-ggdb3 -fno-omit-frame-pointer -fno-optimize-sibling-calls -O3" cflags="-DVM_CHECK_MODE=1" CC=clang ../configure --prefix=/home/kj/ruby/installed --enable-yjit=dev --disable-install-doc
```
And then running the following script (taken from `bootstraptest/test_ractor.rb`) with the built `./miniruby`:
```ruby
counts = []
counts << Ractor.count
ractors = (1..3).map { Ractor.new { Ractor.receive } }
counts << Ractor.count
ractors[0].send('End 0').take
sleep 0.1 until ractors[0].inspect =~ /terminated/
counts << Ractor.count
ractors[1].send('End 1').take
sleep 0.1 until ractors[1].inspect =~ /terminated/
counts << Ractor.count
ractors[2].send('End 2').take
sleep 0.1 until ractors[2].inspect =~ /terminated/
counts << Ractor.count
counts.inspect
```
I get the following crash:
```
Assertion Failed: ../vm_core.h:1957:rb_current_execution_context:ec == rb_current_ec_noinline()
ruby 3.4.0dev (2024-02-07T07:52:06Z ktsanaktsidis/igno.. 5cc6d944c2) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0003 p:0003 s:0010 e:000009 METHOD <internal:ractor>:431
c:0002 p:0004 s:0006 e:000005 BLOCK ractor_crash.rb:3 [FINISH]
c:0001 p:---- s:0003 e:000002 DUMMY [FINISH]
-- Ruby level backtrace information ----------------------------------------
ractor_crash.rb:3:in `block (2 levels) in <main>'
<internal:ractor>:431:in `receive'
-- Threading information ---------------------------------------------------
Total ractor count: 2
Ruby thread count for this ractor: 1
-- C level backtrace information -------------------------------------------
/home/kj/ruby/build/miniruby(rb_print_backtrace+0x14) [0x55faa97a4ebd] ../vm_dump.c:820
/home/kj/ruby/build/miniruby(rb_vm_bugreport) ../vm_dump.c:1151
/home/kj/ruby/build/miniruby(rb_assert_failure+0x81) [0x55faa94d2719] ../error.c:1131
./miniruby(thread_sched_wait_running_turn+0x2e9) [0x55faa9726f59]
/home/kj/ruby/build/miniruby(rb_ractor_sched_sleep+0x10b) [0x55faa972687b] ../thread_pthread.c:1348
/home/kj/ruby/build/miniruby(ractor_check_ints+0x0) [0x55faa968b328] ../ractor.c:683
/home/kj/ruby/build/miniruby(ractor_sleep_with_cleanup) ../ractor.c:684
/home/kj/ruby/build/miniruby(ractor_sleep+0x15) [0x55faa968adf4] ../ractor.c:701
/home/kj/ruby/build/miniruby(ractor_wait_receive) ../ractor.c:748
/home/kj/ruby/build/miniruby(ractor_receive+0x1f) [0x55faa968768e] ../ractor.c:762
/home/kj/ruby/build/miniruby(builtin_inline_class_431) ../ractor.rb:432
/home/kj/ruby/build/miniruby(builtin_invoker0+0x6) [0x55faa978fc66] ../vm_insnhelper.c:6746
/home/kj/ruby/build/miniruby(invoke_bf+0x39) [0x55faa979816e] ../vm_insnhelper.c:6886
/home/kj/ruby/build/miniruby(vm_invoke_builtin_delegate) ../vm_insnhelper.c:6909
/home/kj/ruby/build/miniruby(rb_vm_check_ints+0x0) [0x55faa9771fac] ../insns.def:1533
/home/kj/ruby/build/miniruby(vm_pop_frame) ../vm_insnhelper.c:419
/home/kj/ruby/build/miniruby(vm_exec_core) ../insns.def:1537
/home/kj/ruby/build/miniruby(vm_exec_loop+0x0) [0x55faa9767f02] ../vm.c:2489
/home/kj/ruby/build/miniruby(rb_vm_exec) ../vm.c:2492
/home/kj/ruby/build/miniruby(invoke_block+0x6f) [0x55faa9781a58] ../vm.c:1512
/home/kj/ruby/build/miniruby(invoke_iseq_block_from_c) ../vm.c:1582
/home/kj/ruby/build/miniruby(invoke_block_from_c_proc) ../vm.c:1680
/home/kj/ruby/build/miniruby(vm_invoke_proc) ../vm.c:1710
/home/kj/ruby/build/miniruby(rb_vm_invoke_proc_with_self+0x5a) [0x55faa9781eaa] ../vm.c:1745
/home/kj/ruby/build/miniruby(thread_do_start_proc+0x199) [0x55faa9739e19] ../thread.c:574
/home/kj/ruby/build/miniruby(thread_do_start+0x6c) [0x55faa973933f] ../thread.c:618
/home/kj/ruby/build/miniruby(thread_start_func_2) ../thread.c:668
/home/kj/ruby/build/miniruby(rb_native_mutex_lock+0x0) [0x55faa973a141] ../thread_pthread.c:2234
/home/kj/ruby/build/miniruby(thread_sched_lock_) ../thread_pthread.c:387
/home/kj/ruby/build/miniruby(call_thread_start_func_2) ../thread_pthread_mn.c:436
/home/kj/ruby/build/miniruby(co_start) ../thread_pthread_mn.c:434
```
The failing assertion is this one in vm_core.h: https://github.com/ruby/ruby/blob/42c36269403baac67b0d5dc1d6d6e31168cf6a1f/…. It actually has a very helpful comment.
```
/* On the shared objects, `__tls_get_addr()` is used to access the TLS
* and the address of the `ruby_current_ec` can be stored on a function
* frame. However, this address can be mis-used after native thread
* migration of a coroutine.
* 1) Get `ptr =&ruby_current_ec` op NT1 and store it on the frame.
* 2) Context switch and resume it on the NT2.
* 3) `ptr` is used on NT2 but it accesses to the TLS on NT1.
* This assertion checks such misusage.
*
* To avoid accidents, `GET_EC()` should be called once on the frame.
* Note that inlining can produce the problem.
*/
VM_ASSERT(ec == rb_current_ec_noinline());
```
What seems to be happening is exactly that. This is a disassembly of the relevant bits of `thread_sched_wait_running_turn`:
```
........
# This is the only bits of the entire function which access the TLS base register %fs.
# It seems to have spilled the value of ruby_current_ec into %r13.
0x000055603d2e1cf8 <+136>: mov $0xffffffffffffff90,%rax
0x000055603d2e1cff <+143>: mov %fs:0x0,%r12
0x000055603d2e1d08 <+152>: add %rax,%r12
0x000055603d2e1d0b <+155>: mov %fs:(%rax),%r13
........
# There's a call to coroutine_transfer, so after this point we're returned to on a
# different thread
0x000055603d2e1e90 <+544>: call 0x55603d7fce84 <coroutine_transfer>
# But nothing ever loads the address of ruby_current_ec from %fs again (i didn't trace
# exactly the data flow from %r13 at 0x000055603d2e1d0b to here, but i assume it spilled
# somewhere and now got loaded back into %r15 here). In any case, that means %r15 here
# contains the value of ruby_current_ec from the _old_ thread, not the current one.
0x000055603d2e1e95 <+549>: mov %rbx,0x28(%r14)
0x000055603d2e1e99 <+553>: mov (%r12),%r15
0x000055603d2e1e9d <+557>: call 0x55603d33a010 <rb_current_ec_noinline>
0x000055603d2e1ea2 <+562>: cmp %rax,%r15
=> 0x000055603d2e1ea5 <+565>: jne 0x55603d2e1f3a <thread_sched_wait_running_turn+714>
........
# assertion failure code path.
0x000055603d2e1f3a <+714>: lea 0x542c0c(%rip),%rdi # 0x55603d824b4d
0x000055603d2e1f41 <+721>: lea 0x542c12(%rip),%rdx # 0x55603d824b5a
0x000055603d2e1f48 <+728>: lea 0x542c28(%rip),%rcx # 0x55603d824b77
0x000055603d2e1f4f <+735>: mov $0x7a5,%esi
0x000055603d2e1f54 <+740>: call 0x55603d08d698 <rb_assert_failure>
```
If we look at the register values at `0x000055603d2e1ea2`:
```
(rr) print/x $rax
$2 = 0x55603e159ad0
(rr) print/x $r15
$3 = 0x0
```
So the value in `%rax`, which came from `rb_current_ec_noinline`, is correctly the value of `ruby_current_ec` for this thread, and `%r15` contains a stale value from a previous thread.
-
Now, what can we _do_ about this, is a different question :/ There's a really good stackoverflow answer about it here: https://stackoverflow.com/questions/75592038/how-to-disable-clang-expressio…, but to summarise
* longstanding GCC and Clang bugs for this exist and have been marked as WONTFIX (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26461, https://github.com/llvm/llvm-project/issues/19551)
* It's even worse than this EC problem - things like `errno` also might be incorrectly persisted across coroutine switches (so e.g. an inlined C library function could in theory set `errno` in another thread, for example)
* C++ actually has coroutines now, so this _must_ work for those. Clang at least has fixed some TLS problems in their C++ coroutine implementation (https://github.com/llvm/llvm-project/issues/47179)
Other than reimplementing all of our coroutine stuff on top of C++ coroutines, I'm not sure what else we can do. AFAICT there's no way to tell the compiler that we clobbered the `%fs` register because that's just not a thing in its model (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66631, but i assume clang is similar).
Thoughts? For now I think my workaround is to disable M:N at build time when building with ASAN (or turn optimizations down). At least this isn't a problem with `Fiber` because we never move them across threads (probably for this reason in part).
Issue #9113 has been updated by hackeron (Roman Gaufman).
We've been using jemalloc since 2018 and it cut our memory use to 1/5; it also stopped us needing to restart our app every 3-4 days as the memory slowly leaked. It is staggering to me that this isn't the default. Until we discovered jemalloc, we were actively looking to stop using Ruby because of its memory bloat and leaks.
----------------------------------------
Feature #9113: Ship Ruby for Linux with jemalloc out-of-the-box
https://bugs.ruby-lang.org/issues/9113#change-106627
* Author: sam.saffron (Sam Saffron)
* Status: Closed
* Priority: Normal
----------------------------------------
libc's malloc is a problem: it fragments badly, meaning forks share less memory, and it is slow compared to tcmalloc or jemalloc.
Both jemalloc and tcmalloc are heavily battle-tested and stable.
2 years ago redis picked up the jemalloc dependency, see: http://oldblog.antirez.com/post/everything-about-redis-24.html
To quote antirez:
``
But an allocator is a serious thing. Since we introduced the specially encoded data types Redis started suffering from fragmentation. We tried different things to fix the problem, but basically the Linux default allocator in glibc sucks really, really hard.
``
---
I recently benchmarked Discourse with tcmalloc, jemalloc and the default allocator, and noticed 2 very important things:
median request time is reduced by up to 10% (under both)
PSS (proportional set size) is reduced by 10% under jemalloc and 8% under tcmalloc.
We can always use LD_PRELOAD to yank these in, but my concern is that standard distributions are using a far from optimal memory allocator. It would be awesome if the build, out-of-the-box, just checked if it was on Linux (eg: https://github.com/antirez/redis/blob/unstable/src/Makefile#L30-L34 ) and then used jemalloc instead.
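To see which situation a given interpreter is in, a quick check from Ruby itself can help. This is a sketch under two assumptions: that a build linked against jemalloc records the library in `RbConfig::CONFIG['MAINLIBS']` or `RbConfig::CONFIG['LIBS']`, and that preloading is done through the `LD_PRELOAD` environment variable.

```ruby
require 'rbconfig'

# Libraries the interpreter was linked against at build time
# (assumption: a jemalloc build lists the library here).
linked_libs = [RbConfig::CONFIG['MAINLIBS'], RbConfig::CONFIG['LIBS']].compact.join(' ')
built_with_jemalloc = linked_libs.include?('jemalloc')

# Libraries injected at process start via LD_PRELOAD (Linux).
preloaded_jemalloc = ENV.fetch('LD_PRELOAD', '').include?('jemalloc')

puts "built with jemalloc: #{built_with_jemalloc}"
puts "jemalloc preloaded:  #{preloaded_jemalloc}"
```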
---Files--------------------------------
0001-configure.in-add-with-jemalloc-option.patch (1.29 KB)
Issue #20240 has been reported by jmarrec (Julien Marrec).
----------------------------------------
Misc #20240: Unable to build ruby 3.1.0 on macOS when shared due to dylibs (libgmp) not found when running miniruby
https://bugs.ruby-lang.org/issues/20240
* Author: jmarrec (Julien Marrec)
* Status: Open
* Priority: Normal
----------------------------------------
I am trying to develop a conan (the C/C++ package manager) recipe for Ruby. The recipe would allow downstream users to 1) get a runnable ruby executable, and 2) link to ruby, or embed it in a C/C++ program if built statically, in an easy way.
Currently there is an existing ruby 3.1.0 recipe that I'm trying to adapt, so I have to support this version.
First off, let me say that I can successfully build with 3.3.0, so I know something has changed for the better since 3.1.0. I'm just at a loss when figuring out what I need to backport to make 3.1.0 work.
The original issue is that miniruby appears to look for some dylibs and not find them, even if I define `LD_LIBRARY_PATH`, `DYLD_LIBRARY_PATH` or `DYLD_FALLBACK_LIBRARY_PATH` (any combination of these three) in my env.
``` shell
dsymutil exe/ruby; { test -z '' || codesign -s '' -f exe/ruby; }
./miniruby \
-e 'prog, dest, inst = ARGV; dest += "/ruby"' \
-e 'exit unless prog==inst' \
-e 'unless prog=="ruby"' \
-e ' begin File.unlink(dest); rescue Errno::ENOENT; end' \
-e ' File.symlink(prog, dest)' \
-e 'end' \
ruby exe ruby
dyld[59344]: Library not loaded: @rpath/libgmp.10.dylib
Referenced from: <356E0011-6223-321A-9179-D55618D248D0> /Users/julien/.conan2/p/b/ruby9cafa28a7060d/b/build-release/miniruby
Reason: no LC_RPATH's found
make: *** [exe/ruby] Abort trap: 6
make: *** Deleting file `exe/ruby'
```
It seems that something is unsetting the variables, because this, for example, works fine:
```shell
DYLD_LIBRARY_PATH=/Users/julien/.conan2/p/b/zlib1f8e7d96319f0/p/lib:/Users/julien/.conan2/p/b/opense854e464e8ff6/p/lib:/Users/julien/.conan2/p/b/libyae2f0aa15c9e92/p/lib:/Users/julien/.conan2/p/b/libff05fe9d5b96f79/p/lib:/Users/julien/.conan2/p/b/readl0d0041a63fa03/p/lib:/Users/julien/.conan2/p/b/termc22b5bb1515971/p/lib:/Users/julien/.conan2/p/b/gmp676fa41eaa3d6/p/lib: /Users/julien/.conan2/p/b/ruby9cafa28a7060d/b/build-release/miniruby -e "puts 'Hello, world'"
```
My configure call is like this:
```shell
./configure --enable-shared --disable-static --prefix=/ '--bindir=${prefix}/bin' '--sbindir=${prefix}/bin' '--libdir=${prefix}/lib' '--includedir=${prefix}/include' '--oldincludedir=${prefix}/include' --disable-install-doc --enable-load-relative --with-zlib-dir=/Users/julien/.conan2/p/b/zlib1f8e7d96319f0/p --with-openssl-dir=/Users/julien/.conan2/p/b/opense854e464e8ff6/p --with-libffi-dir=/Users/julien/.conan2/p/b/libff05fe9d5b96f79/p --with-libyaml-dir=/Users/julien/.conan2/p/b/libyae2f0aa15c9e92/p --with-readline-dir=/Users/julien/.conan2/p/b/readl0d0041a63fa03/p --with-gmp-dir=/Users/julien/.conan2/p/b/gmp676fa41eaa3d6/p --with-opt-dir=/Users/julien/.conan2/p/b/opense854e464e8ff6/p:/Users/julien/.conan2/p/b/libff05fe9d5b96f79/p:/Users/julien/.conan2/p/b/libyae2f0aa15c9e92/p:/Users/julien/.conan2/p/b/readl0d0041a63fa03/p:/Users/julien/.conan2/p/b/gmp676fa41eaa3d6/p --disable-jit-support
```
I have tried to backport https://github.com/ruby/ruby/pull/6296/files and https://github.com/ruby/ruby/commit/48644e71096c70132be9dfdcbfb414ec2e68d18b and https://github.com/ruby/ruby/pull/8730 amongst other things, but I can't make it work. (I even tried a more brute-force approach, patching a lot of files by diffing 3.3.0 with 3.1.0, but please note I don't know what I'm doing... I can get to the install step, but then I get some errors about Psych / libyaml and undefined Gem::Install:Zlib).
I would **greatly** appreciate it if someone could spare some time to help me wrap this up (I've been trying to make the recipe work for so long that I'm about to give up...)
Issue #20241 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).
----------------------------------------
Bug #20241: Makefile rule for BUILTIN_ENCOBJS
https://bugs.ruby-lang.org/issues/20241
* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When I build Ruby out-of-source (e.g. in a `build/` subdirectory) by running `mkdir build && cd build && ../configure <args> && make`, the build fails because GCC tries to write `build/enc/ascii.o` but the `build/enc` subdirectory does not exist. If I run `make --debug` from this point, this is the chain of rules causing this to happen:
```
kj@kj-thinkpad build % make --debug V=1
GNU Make 4.4.1
Built for x86_64-redhat-linux-gnu
Copyright (C) 1988-2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
Updating makefiles....
Updating goal targets....
File 'all' does not exist.
File 'main' does not exist.
File 'exts' does not exist.
File 'build-ext' does not exist.
File 'exts.mk' does not exist.
File 'ext/configure-ext.mk' does not exist.
File 'miniruby' does not exist.
File 'enc/ascii.o' does not exist.
Must remake target 'enc/ascii.o'.
gcc -DVM_CHECK_MODE=1 -O3 -fno-fast-math -ggdb3 -Wall -Wextra -Wdeprecated-declarations -Wdiv-by-zero -Wduplicated-cond -Wimplicit-function-declaration -Wimplicit-int -Wpointer-arith -Wwrite-strings -Wold-style-definition -Wimplicit-fallthrough=0 -Wmissing-noreturn -Wno-cast-function-type -Wno-constant-logical-operand -Wno-long-long -Wno-missing-field-initializers -Wno-overlength-strings -Wno-packed-bitfield-compat -Wno-parentheses-equality -Wno-self-assign -Wno-tautological-compare -Wno-unused-parameter -Wno-unused-value -Wsuggest-attribute=format -Wsuggest-attribute=noreturn -Wunused-variable -Wmisleading-indentation -Wundef -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fno-strict-overflow -fvisibility=hidden -fexcess-precision=standard -DRUBY_EXPORT -fPIE -I. -I.ext/include/x86_64-linux -I../include -I.. -I../prism -I../enc/unicode/15.0.0 -o enc/ascii.o -c ../enc/ascii.c
Assembler messages:
Fatal error: can't create enc/ascii.o: No such file or directory
make: *** [Makefile:448: enc/ascii.o] Error 1
```
I don't know when exactly this started. It could simply be that, by random chance, my machine used to run other rules first which created `enc/`, and now doesn't, since make rules are not a total order.
I'm not sure exactly what the right way to fix this is. I think in a "standard GNU autotools project", the right way for these subdirs to be created is for each of them to have their own Makefile.in. That way, ./configure would create the subdirectory in the build dir and template out the Makefile.in into e.g. `build/enc/Makefile`. However, we don't use recursive make, so we don't call `AC_CONFIG_FILES([enc/Makefile])` or anything like that. (There is actually an `enc/Makefile.in`, but it's templated by a Ruby script, not by autoconf).
Has anybody else noticed this?
Issue #18009 has been updated by mjrzasa (Maciek Rząsa).
One more case:
```
[26] pry(main)> ("a".."z").to_a.join.scan(/[\W]/iu)
=> ["st"]
```
----------------------------------------
Bug #18009: Regexps \w and \W with /i option and /u option produce inconsistent results under nested negation and intersection
https://bugs.ruby-lang.org/issues/18009#change-106621
* Author: jirkamarsik (Jirka Marsik)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-linux]
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
This is a follow up to [issue 4044](https://bugs.ruby-lang.org/issues/4044). Its fix (https://github.com/k-takata/Onigmo/issues/4) handled the cases that were reported in the original issue, but there are other cases, which were omitted and now produce inconsistent results.
If the `\w` character set is used inside a nested negated character class, it will not be picked up by the part of the character class analyzer that's responsible for limiting the case-folding of certain character sets (like `\w` and `\W`) across the ASCII boundary. We then end up with the situation where `/[^\w]/iu` and `/[[^\w]]/iu` match different sets of characters.
```
irb(main):001:0> ("a".."z").to_a.join.scan(/\W/iu)
=> []
irb(main):002:0> ("a".."z").to_a.join.scan(/[^\w]/iu)
=> []
irb(main):003:0> ("a".."z").to_a.join.scan(/[[^\w]]/iu)
=> ["k", "s"]
```
This can also be demonstrated using the inverted matcher:
```
irb(main):004:0> ("a".."z").to_a.join.scan(/\w/iu).length
=> 26
irb(main):005:0> ("a".."z").to_a.join.scan(/[^[^\w]]/iu).length
=> 24
```
A similar issue also arises when using character class intersection. The idea behind the pattern compiler's analysis is that characters are allowed to case-fold across the ASCII boundary only if they are included in the character class by some other means than just being included in `\w` (or in one of several other character sets which have special treatment). Therefore, in the below, `/[\w]/iu` will not match the Kelvin sign `\u212a`, because that would mean crossing the ASCII boundary from `k` to `\u212a`. However, `/[kx]/iu` will match the Kelvin sign, because the `k` was not contributed by `\w` and therefore is not subject to the ASCII boundary restriction (we have to use `/[kx]/iu` instead of `/[k]/iu` in our examples, or else the pattern analyzer would replace `[k]` with `k` and follow a different code path).
```
irb(main):006:0> /[\w]/iu.match("\u212a")
=> nil
irb(main):007:0> /[kx]/iu.match("\u212a")
=> #<MatchData "K">
```
The problem then is when we perform an intersection of these two character sets. Since `[kx]` is a subset of `\w`, we would expect their intersection to behave the same as `[kx]`, but that is not the case.
```
irb(main):008:0> /[\w&&kx]/i.match("\u212a")
=> nil
```
The underlying issue in these cases is the manner in which the `ascCc` character set is computed during the parsing of character classes. The `ascCc` character set should contain all characters of the character class except those which were contributed by `\w` and similar character sets. This is done in a way that these character sets are essentially ignored in the calculation of `ascCc`, which works well for set union and top-most negation (which is handled explicitly), but it doesn't handle nested set negation and set intersection.
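The equivalences argued for above can be collected into one small consistency check. This is a sketch only; the labels and the `results` hash are illustrative. No expected output is given, since it depends on whether the interpreter in use still has the inconsistency.

```ruby
alphabet = ('a'..'z').to_a.join
kelvin = "\u212a" # KELVIN SIGN, which case-folds to "k"

# Pairs of patterns that should match the same characters.
scan_pairs = {
  '\W vs [[^\w]]'  => [/\W/iu, /[[^\w]]/iu],
  '\w vs [^[^\w]]' => [/\w/iu, /[^[^\w]]/iu]
}

results = {}
scan_pairs.each do |label, (plain, nested)|
  # Consistent when both patterns pick out the same characters.
  results[label] = alphabet.scan(plain) == alphabet.scan(nested)
end

# [kx] is a subset of \w, so intersecting with \w should change nothing.
results['[kx] vs [\w&&kx]'] =
  /[kx]/iu.match?(kelvin) == /[\w&&kx]/iu.match?(kelvin)

results.each do |label, consistent|
  puts "#{label}: #{consistent ? 'consistent' : 'INCONSISTENT'}"
end
```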
Issue #20235 has been reported by Dan0042 (Daniel DeLorme).
----------------------------------------
Feature #20235: Deprecate CHAR syntax
https://bugs.ruby-lang.org/issues/20235
* Author: Dan0042 (Daniel DeLorme)
* Status: Open
* Priority: Normal
----------------------------------------
I propose deprecating the `?c` syntax. It served a purpose in ruby <= 1.8, but no longer.
The reason I'm proposing this is because today I ran into this error:
```ruby
p $stdin.closed?=>true # comparison of String with true failed (ArgumentError)
```
I was completely mystified, and had to resort to Ripper to figure out what was going on:
```
p *Ripper.lex("p $stdin.closed?=>true")
[[1, 0], :on_ident, "p", CMDARG]
[[1, 1], :on_sp, " ", CMDARG]
[[1, 2], :on_gvar, "$stdin", END]
[[1, 8], :on_period, ".", DOT]
[[1, 9], :on_ident, "closed", ARG]
[[1, 15], :on_CHAR, "?=", END] #OOOOHH!!!!!
[[1, 17], :on_op, ">", BEG]
[[1, 18], :on_kw, "true", END]
```
We don't have to commit to a removal schedule right now, but I think it would at least be good to print a deprecation message if $VERBOSE.
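To gauge how much existing code would be affected, the same Ripper approach can be turned into a small scanner. This is a minimal sketch; `char_literals` is a hypothetical helper name, not an existing API.

```ruby
require 'ripper'

# Return [[line, column], token] for every CHAR literal (`?c` syntax)
# that the lexer finds in the given source string.
def char_literals(source)
  Ripper.lex(source)
        .select { |token| token[1] == :on_CHAR }
        .map { |token| [token[0], token[2]] }
end

# The confusing case from above: `?=` is lexed as a CHAR literal.
p char_literals('p $stdin.closed?=>true')

# With spaces around `=>`, `closed?` is lexed as a method name instead.
p char_literals('p($stdin.closed? => true)')
```

Running this over a codebase (one file at a time, via `File.read`) would list the locations that a deprecation warning would touch.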
Issue #20239 has been reported by martinsp (Martins Polakovs).
----------------------------------------
Bug #20239: Segmentation fault when using Regex on a large String
https://bugs.ruby-lang.org/issues/20239
* Author: martinsp (Martins Polakovs)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [aarch64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
Since v3.2.0, Ruby crashes on the following script with `[BUG] Segmentation fault at ...`:
``` ruby
require "rbconfig/sizeof"
("\u{0101}" + "a" * RbConfig::LIMITS["INT_MAX"] + "b").match(/b/)
```
Crash can be reproduced on the following ruby versions:
- ruby 3.2.0 (2022-12-25 revision a528908271) [aarch64-linux]
- ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [aarch64-linux]
- ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [aarch64-linux]
ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [aarch64-linux] works as expected.
It seems that the call to `enclen` inside `str_lower_case_match` returns a negative offset in this case: https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/v3…