Issue #21870 has been updated by jneen (Jeanine Adkisson).

If there are no objections, I'll submit a patch with strategy (a) next week. It's straightforward to implement and stays as close as possible to the current behaviour while fixing the issue.

----------------------------------------
Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
https://bugs.ruby-lang.org/issues/21870#change-116587

* Author: jneen (Jeanine Adkisson)
* Status: Open
* ruby -v: 4.0.1
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
```ruby
$VERBOSE = true

# warning: character class has duplicated range: /[\p{Word}\p{S}]/
regex = /[\p{Word}\p{S}]/
```

As far as I can tell, this is a perfectly valid and non-redundant set of Unicode properties, but I am still being spammed with warnings. Using `/(?:\p{Word}|\p{S})/` is a workaround of sorts, but it is slower (see benchmarks below) and also less clear.

The two classes do overlap somewhat, but I think the deeper issue is that there is no convenient way to express this without falling back to raw Unicode ranges.

For a similar example, consider `/[\p{Word}\p{Cf}]/`, whose two classes overlap precisely on ZWJ and ZWNJ. Even with this very small overlap, Ruby issues a warning, even though neither class can be removed without changing the meaning of the regexp.

The regexp is valid and, as far as I can tell, has no practical issues: Onigmo seems to be capable of intersecting overlapping codepoint ranges. This warning was introduced back in 2009 with #1831 to help surface mistakes like `/[:lower:]/` written instead of `/[[:lower:]]/`, but even then the reporter suggested warning only if the class both begins and ends with `:`.

Is it appropriate to warn here? Is this a job better left to a static linter like RuboCop, which didn't exist at the time #1831 was opened? Or would it perhaps be better to warn only in the very specific case that #1831 was opened to address?

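For anyone who wants to check that the character-class form and the alternation workaround accept the same characters, here is a minimal sketch. The sample characters and the helper names (`quietly`, `compare`, `time_matches`) are illustrative assumptions and not part of the report. The patterns are built with `Regexp.new` while `$VERBOSE` is `nil`, so the script should not trigger the warning itself. The timing helper only prints wall-clock numbers; it makes no claim about which form is faster, since that depends on the Ruby build and the input.

```ruby
# Illustrative sample characters (an assumption, not taken from the report).
SAMPLES = ["a", "_", "7", "+", "$", " ", "\u200C", "\u200D"].freeze

# Run a block with warnings silenced, restoring $VERBOSE afterwards.
def quietly
  old_verbose = $VERBOSE
  $VERBOSE = nil
  yield
ensure
  $VERBOSE = old_verbose
end

# The two spellings discussed above, anchored to match exactly one character.
CLASS_FORM = quietly { Regexp.new('\A[\p{Word}\p{S}]\z') }
ALT_FORM   = quietly { Regexp.new('\A(?:\p{Word}|\p{S})\z') }

# Record, per sample, whether each spelling matches.
def compare(samples, class_form, alt_form)
  samples.map do |ch|
    { char: ch, class_form: class_form.match?(ch), alt_form: alt_form.match?(ch) }
  end
end

# Wall-clock seconds for `iterations` passes over the samples.
def time_matches(regexp, samples, iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { samples.each { |ch| regexp.match?(ch) } }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

RESULTS = compare(SAMPLES, CLASS_FORM, ALT_FORM)
RESULTS.each do |r|
  puts format("U+%04X  class=%-5s  alt=%-5s", r[:char].ord, r[:class_form], r[:alt_form])
end

puts format("class form:       %.4fs", time_matches(CLASS_FORM, SAMPLES, 20_000))
puts format("alternation form: %.4fs", time_matches(ALT_FORM, SAMPLES, 20_000))
```

The two columns should agree on every sample, because a bracket expression is the union of its members. A realistic benchmark would use longer inputs and more iterations than this sketch does.
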
-- https://bugs.ruby-lang.org/