[ruby-core:114911] [Ruby master Feature#19904] Deprecate or warn on multiple regular expression encodings

Issue #19904 has been reported by tenderlovemaking (Aaron Patterson).

----------------------------------------
Feature #19904: Deprecate or warn on multiple regular expression encodings
https://bugs.ruby-lang.org/issues/19904

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------
It seems like you can pass multiple encoding flags to regular expression literals, but I think this should be a warning or possibly syntax error.

For example:

```ruby
x = /foo/nu
p x.encoding
```

`n` says the RE should be ASCII-8BIT, and `u` says it should be UTF-8. The last flag wins, so in this case the regular expression gets UTF-8 encoding.

However, I think it should be a warning or even a syntax error if you specify multiple encoding options on a regular expression. It seems like a mistake if programmers specify multiple.

Thanks!

--
https://bugs.ruby-lang.org/
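As a point of comparison for the report above, here is a minimal sketch that inspects each encoding flag on its own, using only literals that carry a single flag. The constant name `SINGLE_FLAG_REGEXPS` is made up for this illustration; `Regexp#encoding`, `Regexp#fixed_encoding?`, `Regexp#options`, and `Regexp::NOENCODING` are existing core APIs. The sketch only prints what the running interpreter reports and does not assume a particular result.

```ruby
# One literal per encoding flag, so no literal combines several of them.
# SINGLE_FLAG_REGEXPS is a hypothetical name used only in this sketch.
SINGLE_FLAG_REGEXPS = {
  "n" => /foo/n,
  "e" => /foo/e,
  "s" => /foo/s,
  "u" => /foo/u,
}.freeze

SINGLE_FLAG_REGEXPS.each do |flag, re|
  # NOENCODING is the option bit associated with the `n` flag.
  no_encoding = (re.options & Regexp::NOENCODING) != 0
  puts "/foo/#{flag}: encoding=#{re.encoding}, " \
       "fixed=#{re.fixed_encoding?}, noencoding=#{no_encoding}"
end
```

Comparing this output with `p /foo/nu.encoding` on the same interpreter shows which single flag the combined literal ended up behaving like.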

Issue #19904 has been updated by mame (Yusuke Endoh).

I think it's a good idea, but am curious as to what (if any) led you to want to prohibit this. Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?

----------------------------------------
Feature #19904: Deprecate or warn on multiple regular expression encodings
https://bugs.ruby-lang.org/issues/19904#change-104782

Issue #19904 has been updated by tenderlovemaking (Aaron Patterson).

mame (Yusuke Endoh) wrote in #note-1:
> I think it's a good idea, but am curious as to what (if any) led you to want to prohibit this. Did you get in trouble because of this? Or did you just notice it (while implementing Prism or something)?

No, it didn't cause any trouble. @eileencodes and I just noticed this while implementing regular expression support with Prism.

----------------------------------------
Feature #19904: Deprecate or warn on multiple regular expression encodings
https://bugs.ruby-lang.org/issues/19904#change-104783

Issue #19904 has been updated by nobu (Nobuyoshi Nakada).

```diff
diff --git a/parse.y b/parse.y
index 3b513d3ade8..278e7eff21b 100644
--- a/parse.y
+++ b/parse.y
@@ -8032,6 +8032,9 @@ regx_options(struct parser_params *p)
         else if (rb_char_to_option_kcode(c, &opt, &kc)) {
             if (kc >= 0) {
                 if (kc != rb_ascii8bit_encindex()) kcode = c;
+                if (kopt) {
+                    rb_warn0("multiple encoding options, ignored preceding");
+                }
                 kopt = opt;
             }
             else {
```

----------------------------------------
Feature #19904: Deprecate or warn on multiple regular expression encodings
https://bugs.ruby-lang.org/issues/19904#change-104785
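The idea behind the patch above can be modeled in plain Ruby: walk the flag characters, remember the most recent encoding flag, and record a warning each time an earlier one is overridden. This is only a rough sketch, not the parser's code; `ENCODING_FLAGS` and `scan_encoding_flags` are hypothetical names, and the real `regx_options` tracks option bits rather than characters.

```ruby
# Hypothetical model of the flag scan. The names here are illustrative
# and do not exist in parse.y.
ENCODING_FLAGS = %w[n e s u].freeze

# Returns [selected_flag, warnings] for a string of regexp flags.
def scan_encoding_flags(flags)
  selected = nil
  warnings = []
  flags.each_char do |c|
    next unless ENCODING_FLAGS.include?(c) # skip non-encoding flags such as i, m, x
    # Mirrors the added `if (kopt)` check: an encoding flag was already seen.
    warnings << "multiple encoding options, ignored preceding" if selected
    selected = c # the last encoding flag wins
  end
  [selected, warnings]
end

selected, warnings = scan_encoding_flags("nu")
puts "selected=#{selected.inspect} warnings=#{warnings.size}"
```

In this model a flag string with at most one encoding flag yields no warnings, so existing code that uses a single flag would be unaffected.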

Issue #19904 has been updated by nobu (Nobuyoshi Nakada).

https://github.com/nobu/ruby/tree/multiple-regexp-encodings

----------------------------------------
Feature #19904: Deprecate or warn on multiple regular expression encodings
https://bugs.ruby-lang.org/issues/19904#change-104787

Issue #19904 has been updated by nobu (Nobuyoshi Nakada).

lol

```ruby
it "selects last of multiple encoding specifiers" do
  /foo/ensuensuens.should == /foo/s
end
```

----------------------------------------
Feature #19904: Deprecate or warn on multiple regular expression encodings
https://bugs.ruby-lang.org/issues/19904#change-104788

participants (3)
* mame (Yusuke Endoh)
* nobu (Nobuyoshi Nakada)
* tenderlovemaking (Aaron Patterson)