
Issue #19694 has been updated by Eregon (Benoit Daloze).

janosch-x (Janosch Müller) wrote in #note-7:

```ruby
regexp = Regexp.with_timeout(2.0) { /foo/ }
regexp.timeout # => 2.0
```

That approach, like the one proposed in the description, doesn't really work if literal Regexps are created at parse time, before execution.
This is the case on CRuby:

```
$ ruby --disable-gems -e 'pp ObjectSpace.count_objects; O=Object.new; R=/a/' | grep REGEX
 :T_REGEXP=>3,
$ ruby --disable-gems -e 'pp ObjectSpace.count_objects; O=Object.new' | grep REGEX
 :T_REGEXP=>2,
```

and it is the case on TruffleRuby as well.

It could also be confusing for `[2.0, 4.0].each { |t| Regexp.with_timeout(t) { /foo/ } }` (it would set the timeout either to 2.0 or to the global timeout, never to 4.0).

Furthermore, even if that worked, it would break Regexp interning: the timeout set in one place would affect another literal Regexp with the same pattern.

Basically, I think there is no way besides `Regexp.new(pattern, timeout: t)` if you want a custom timeout for a Regexp.
Literal Regexps are created too early to set anything.
And adding state (`timeout=`) to Regexp feels wrong, since most instances are already immutable, and they might all become immutable.

I would suggest closing this: `Regexp.new("a", timeout: 2.0)` already works, and I think there is no alternative that works well for setting the timeout per Regexp.

----------------------------------------
Feature #19694: Add Regexp#timeout= setter
https://bugs.ruby-lang.org/issues/19694#change-103347

* Author: aharpole (Aaron Harpole)
* Status: Open
* Priority: Normal
----------------------------------------
# Abstract

In addition to allowing for a Regexp timeout to be set on individual instances by setting a `timeout` argument in `Regexp.new`, I'm proposing that we also allow setting the timeout on Regexp objects with a `#timeout=` setter.

# Background

To be able to roll out a global Regexp timeout for a large application, there are inevitably some individual regexes for which a different timeout is appropriate.

While the `timeout` keyword argument was added to `Regexp.new`, this isn't always a viable option.
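
For reference, the current behaviour can be sketched roughly as follows. This is a minimal illustration assuming Ruby 3.2 or later (where the `timeout:` keyword exists); the constant names `PER_PATTERN` and `LITERAL` are hypothetical and used only for this example.

```ruby
# Minimal sketch, assuming Ruby 3.2+ (where the `timeout:` keyword exists).
# PER_PATTERN and LITERAL are hypothetical names used only for illustration.

# A per-instance timeout can currently only be given at construction time.
PER_PATTERN = Regexp.new("ab*", timeout: 2.0)

# A literal has no per-instance timeout; matching falls back to the global
# Regexp.timeout, if one is set.
LITERAL = /ab*/

puts PER_PATTERN.timeout.inspect # per-instance timeout in seconds
puts LITERAL.timeout.inspect     # nil: no per-instance timeout configured
puts LITERAL.frozen?             # literals are frozen, so they cannot be mutated in place
```
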

In the case of regex literal syntax (`/ab*/` or `%r{ab*}`, for instance), it's not possible to set a timeout at all right now without converting to `Regexp.new`, which may be awkward depending on the contents of the regex.

It also is desirable from time to time to be able to set a timeout for a regex object after it's been initialized.

Finally, because we offer a `Regexp#timeout` getter, for consistency it would be nice to also offer a setter.

The introduction of a `Regexp#timeout=` setter was mentioned as a possible way to set individual timeouts in https://bugs.ruby-lang.org/issues/19104#Specification.

# Proposal

I propose that we add the method `Regexp#timeout=`. It works the same way the `timeout` argument works in `Regexp.new`, taking either a float or nil.

This makes it relatively easy to add timeouts to specific regex literals (regex literals are frozen by default so you do have to `dup` them first):

```
emoji_filter_pattern = %r{
  (?<!#{Regexp.quote(ZERO_WIDTH_JOINER)})
  #{EmojiFilter.unicodes_pattern}
  (?!#{Regexp.union(EmojiFilter::MODIFIER_CHAR_MAP.keys.map { |k| Regexp.quote k })})
}x.dup
emoji_filter_pattern.timeout = 1.0
emoji_filter_pattern.freeze
```

# Implementation

This setter has been implemented in https://github.com/ruby/ruby/pull/7847.

# Evaluation

It's just a setter, so pretty straightforward in terms of implementation and use.

# Discussion

It's worth considering other options for overriding `Regexp.timeout`. I'd love to see something like the following for overriding regexp timeouts as well:

```
Regexp.timeout = 1.0

Regexp.with_timeout(5.0) do
  evaluate_slower_regexes
end
```

It's possible to implement something like `Regexp.with_timeout` but it's not thread-safe by default since it would involve overwriting `Regexp.timeout`.
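
To make the thread-safety concern concrete, here is a rough sketch of such a naive helper. It assumes Ruby 3.2 or later; `RegexpTimeoutHelper.with_timeout` is a hypothetical name, not an existing or proposed Ruby API, and the variable names are only for illustration.

```ruby
# Minimal sketch, assuming Ruby 3.2+. RegexpTimeoutHelper.with_timeout is a
# hypothetical helper, not part of Ruby.
module RegexpTimeoutHelper
  # Temporarily overrides the process-wide Regexp.timeout for the block.
  # Not thread-safe: other threads observe the override while the block runs.
  def self.with_timeout(seconds)
    previous = Regexp.timeout
    Regexp.timeout = seconds
    yield
  ensure
    # Restore the previous global value even if the block raises.
    Regexp.timeout = previous
  end
end

Regexp.timeout = 1.0

observed_inside = nil
observed_by_other_thread = nil

RegexpTimeoutHelper.with_timeout(5.0) do
  observed_inside = Regexp.timeout
  # Another thread sees the override too, because the setting is global.
  observed_by_other_thread = Thread.new { Regexp.timeout }.value
end

observed_after = Regexp.timeout
```

Because the override lives in global state, any thread that happens to match a regex while the block is running picks up the temporary value, which is the problem described above.
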

# Summary

Regexp instances have a getter for timeout. Adding a corresponding setter adds consistency and will make it easier for developers to adopt a global `Regexp.timeout` by making it simpler to adjust timeouts on a regex-by-regex basis.

It's a minor change, but the added consistency and flexibility help us optimize for developer happiness.

-- 
https://bugs.ruby-lang.org/