[ruby-core:115888] [Ruby master Bug#20083] Regexp#match? behaving inconsistently with Ruby 3.3.0

Issue #20083 has been reported by jussikos (Jussi Koljonen).

----------------------------------------
Bug #20083: Regexp#match? behaving inconsistently with Ruby 3.3.0
https://bugs.ruby-lang.org/issues/20083

* Author: jussikos (Jussi Koljonen)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin23]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
From irb, when calling String#match?

```
pattern = /([\s]*ABC)$/i
p "1ABC".match?(pattern) # => true
p "12ABC".match?(pattern) # => true
p "123ABC".match?(pattern) # => true
p "1231ABC".match?(pattern) # => true
p "12312ABC".match?(pattern) # => false
p "123123ABC".match?(pattern) # => false
p "1231231ABC".match?(pattern) # => true
p "12312312ABC".match?(pattern) # => true
p "123123123ABC".match?(pattern) # => false
p "1231231231ABC".match?(pattern) # => false
p "12312312312ABC".match?(pattern) # => true
p "123123123123ABC".match?(pattern) # => true
p "1231231231231ABC".match?(pattern) # => false
```

With `ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-darwin22]` and earlier versions (2.7.8 to 3.2.2) return value is always `true`

--
https://bugs.ruby-lang.org/
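The report above can be turned into a small regression check. The following is only a sketch: the pattern and the thirteen inputs are taken from the report, while `INPUTS`, `expected_match?` and `mismatches` are hypothetical helper names introduced here. It compares `String#match?` against a regex-free expectation, so it reports which inputs (if any) are affected on the Ruby it runs on.

```ruby
# Regression sketch for Bug #20083. PATTERN and the inputs come from the report;
# INPUTS, expected_match? and mismatches are hypothetical helper names.
PATTERN = /([\s]*ABC)$/i

# Rebuild the thirteen reported inputs: "1ABC", "12ABC", ..., "1231231231231ABC".
INPUTS = (1..13).map { |n| ("123" * 5)[0, n] + "ABC" }.freeze

# Regex-free expectation. For newline-free strings, PATTERN matches exactly
# when the string ends with "abc" in any letter case, because \s* may match
# the empty string.
def expected_match?(str)
  str.downcase.end_with?("abc")
end

# Inputs whose match? result differs from the regex-free expectation.
def mismatches(pattern, inputs)
  inputs.reject { |s| s.match?(pattern) == expected_match?(s) }
end

bad = mismatches(PATTERN, INPUTS)
if bad.empty?
  puts "ok on Ruby #{RUBY_VERSION}: all #{INPUTS.size} inputs match"
else
  puts "affected on Ruby #{RUBY_VERSION}: #{bad.inspect}"
end
```

The expectation deliberately avoids the regex engine, so the check does not rely on the code path under suspicion.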

Issue #20083 has been updated by make_now_just (Hiroya Fujinami).

I created a PR for this bug (see https://github.com/ruby/ruby/pull/9367). Thank you for reporting!

The bug is caused by a combination of a regex optimization and a bug in the handling of atomic groups.

First, since `\s` and the following `A` (internally treated as `[aA]` under the `i` flag) are mutually disjoint, `\s*ABC` is optimized to `(?>\s*)ABC`. Next, the match cache optimization for atomic groups is buggy in this case, so the matching results become wrong.

When the `i` flag is not given, another optimization is applied and `\s*ABC` is optimized to `(?:(?!A)\s)*ABC`, so the bug does not occur.

----------------------------------------
Bug #20083: String#match? behaving inconsistently with Ruby 3.3.0
https://bugs.ruby-lang.org/issues/20083#change-105874

* Author: jussikos (Jussi Koljonen)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin23]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
From irb, when calling String#match?

```
pattern = /([\s]*ABC)$/i # or /(\s*ABC)/i
p "1ABC".match?(pattern) # => true
p "12ABC".match?(pattern) # => true
p "123ABC".match?(pattern) # => true
p "1231ABC".match?(pattern) # => true
p "12312ABC".match?(pattern) # => false
p "123123ABC".match?(pattern) # => false
p "1231231ABC".match?(pattern) # => true
p "12312312ABC".match?(pattern) # => true
p "123123123ABC".match?(pattern) # => false
p "1231231231ABC".match?(pattern) # => false
p "12312312312ABC".match?(pattern) # => true
p "123123123123ABC".match?(pattern) # => true
p "1231231231231ABC".match?(pattern) # => false
```

With `ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-darwin22]` and earlier versions (2.7.8 to 3.2.2) return value is always `true`

Update: the problem seems to be somehow related to the `/i` option, as all the above examples work correctly with `/([\s]*ABC)$/`
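The two rewrites named in Hiroya Fujinami's explanation can be written out by hand to see that each is meant to preserve the meaning of `\s*ABC`. This is a rough sketch only: the engine performs these rewrites on its internal representation, and the hand-written patterns here are assumed to be equivalent counterparts rather than the exact internal form. `PAIRS`, `SAMPLES` and `agree?` are hypothetical names.

```ruby
# Hand-written counterparts of the rewrites described in the message. The engine
# applies them internally; PAIRS, SAMPLES and agree? are hypothetical names.
PAIRS = {
  "with /i (atomic group)"          => [/\s*ABC/i, /(?>\s*)ABC/i],
  "without /i (negative lookahead)" => [/\s*ABC/,  /(?:(?!A)\s)*ABC/],
}.freeze

# A few short inputs: matching, non-matching, and mixed case.
SAMPLES = ["ABC", "abc", " ABC", " \t abc", "12 ABC", "AB", "  AB C", ""].freeze

# True when both patterns give the same match? result on every sample.
def agree?(a, b, samples)
  samples.all? { |s| s.match?(a) == s.match?(b) }
end

PAIRS.each do |label, (original, rewritten)|
  puts "#{label}: #{agree?(original, rewritten, SAMPLES)}"
end
```

Both rewrites are valid because `\s` can never consume the `A` that follows it, so giving up backtracking into `\s*` loses no possible match. The wrong results in the report come from the match cache handling of the atomic group, not from the rewrite itself.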

Issue #20083 has been updated by naruse (Yui NARUSE).

Backport changed from 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED to 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONE

ruby_3_3 5f3dfa1c273c6fb9eae65ceca633b46f7e30f686 merged revision(s) d8702ddbfbe8cc7fc601a9a4d19842ef9c2b76c1.

----------------------------------------
Bug #20083: String#match? behaving inconsistently with Ruby 3.3.0
https://bugs.ruby-lang.org/issues/20083#change-106513

* Author: jussikos (Jussi Koljonen)
* Status: Closed
* Priority: Normal
* ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin23]
* Backport: 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONE
----------------------------------------