
Issue #19617 has been updated by nevans (Nicholas Evans). Eregon (Benoit Daloze) wrote in #note-2:
Maybe an easier/better/more portable way to do that would be to have a flag to check that all Regexp are linear when they are executed, e.g., by the test suite? Like https://speakerdeck.com/eregon/just-in-time-compiling-ruby-regexps-on-truffl... Maybe it could be a performance warning (or even its own warning category) and then you could easily tweak the behavior, e.g. raise an exception if there is such a warning. ... It could also be a flag that can be set directly by Ruby code, not necessarily or only a command-line flag.
Yes, I like that a lot. Perhaps something like `Regexp.warn_nonlinear = true` or `Regexp.on_nonlinear = ->re { warn "nonlinear" }`. And I think it's especially useful to warn or raise when the regexp is created, not only when it's executed. But that would be a different issue from this one. When discussing this particular use case with others, several people suggested new regexp options to either enforce linear runtime when the regexp is created, or to allow non-linear regexps without warnings or exceptions (e.g. for simple scripts or when the inputs are safely controlled), e.g. `/enforced linear/l` and `/allow non-linear/L`. But that would _also_ be a different issue from this one. ;) Eregon (Benoit Daloze) wrote in #note-4:
one concern I have with https://github.com/ruby/net-imap/blob/92db350b24c388d2a2104f36cac9caa49a1044... is it could only work CRuby due to using RubyVM::InstructionSequence. That seems problematic for a security thing that really isn't CRuby-specific.
It also doesn't work with CRuby 3.0 or 3.1, and it isn't inspecting the iseq for procs that are stored as constants, etc. It can't catch everything, but it's a useful first step: it already found three existing non-linear regexps, and prevented a PR that would have added a fourth. But improvements to that test are also a different issue from this one. I assume TruffleRuby has a comparable mechanism to find regexps within the methods that define them, and PRs are welcome. But TruffleRuby already has a built in approach for warning on non-linear regexps are executed, so it's ahead on this. We should probable enable that TruffleRuby config flag in CI... but that's a different issue from this one. ;) byroot (Jean Boussier) wrote in #note-3:
Or iterate over all regexps through `ObjectSpace.each_object(Regexp)`? But I suppose it doesn't allow to only look at a specific namespace.
Yes, that is one of the reasons I didn't want to use that approach in the tests. Related but more significant for my use-case: ObjectSpace doesn't tell me what created the regexp or is holding a reference to any regexps I find. But it's still a useful technique that I used to manually gauge how thorough the tests were. Another approach would be to use TracePoint while loading and/or testing the library. I think that should allow limiting by namespace... but it's still a different ticket from this one. ---------------------------------------- Feature #19617: Add Method#binding and UnboundMethod#binding, similar to Proc#binding https://bugs.ruby-lang.org/issues/19617#change-102930 * Author: nevans (Nicholas Evans) * Status: Open * Priority: Normal ---------------------------------------- When a method is defined dynamically with `define_method`, it would be useful to obtain access to the defining block's binding and the local variables it has captured, based on the defining block's binding. For methods defined using the `def` keyword, the binding's local variables might be empty or might be all of the names in the method's locals table, with all values set to `nil`. For UnboundMethod, it is unclear (to me) what the appropriate receiver for the binding would be, so perhaps `unbound.binding.receiver` should raise an exception. Alternatively (or additionally), something like `Method#defining_proc` and `UnboundMethod#defining_proc` might be added and return `nil` for `def` definitions and the proc for `define_method` definitions. This would be a useful tool when debugging from the console. As another example, it might be used to scan a code base for dynamically generated regexps which are only reachable via the enclosed local variables and test that they are all linear time (see https://github.com/ruby/net-imap/blob/92db350b24c388d2a2104f36cac9caa49a1044...). -- https://bugs.ruby-lang.org/