[ruby-core:124438] [Ruby Feature#6012] Proc#source_location also return the column
Issue #6012 has been updated by Eregon (Benoit Daloze). mame (Yusuke Endoh) wrote in #note-48:
A "use case" is the ultimate goal that the examples aim to achieve via `Method#source`.
The above snippets show examples of source transformation, instrumentation of code by adding extra code in the method body, displaying the source code of a method to the user, and other use cases which would need more time to find out. In any case I don't think it's our place to judge whether each usage is "valid" or "correct" or not, it's definitely useful functionality since it's used so much. For example, some gems might transform the source code of a method as a string, you might argue it's less safe but it's also much simpler than parsing + unparsing and it obviously works well enough for them.
By the way, I tried the `method_source` gem, and I noticed that it includes heredoc content that lies outside the method definition block.
I think that's just a side effect of `method_source` currently adding lines until it's valid Ruby source code. In practice I think there are very few heredocs like this for methods, so it probably does not matter for 99.99% cases. Note that the current `method_source` approach is much worse than using the new `source_location`, because it's incorrect for e.g. ```ruby def method end; any code ``` e.g. ```ruby def foo end; def bar end ``` where it would return the entire code for the source of `foo`. This is much more common than heredocs spanning across `end`. For example this is common in tests: ```ruby def some_test some assertions end unless /mswin|mingw/ =~ RUBY_PLATFORM ``` The gem currently return the entire snippet, but with `source_location` it would be correct and stop at `end`
My point is that the "better" return value depends entirely on the final use case.
Yes, that's fair but that edge case is very rare. For the final use case of getting an AST, `Prism.node_for(Method)` or `Method#ast` is more direct and works even for that edge case. For the final use case of displaying the method source code it seems nicer to include the heredoc, and OTOH that is somewhat complicated and unintuitive to do from the AST.
Do you have a counter-example where this wouldn't hold?
I don't have one at present. However, such a case could arise through future extensions. For example, @ko1 might add a new TracePoint type in the future, and depending on the type, it might be necessary to distinguish between `StatementsNode` and `CallNode`.
#21795 is specifically about call/return TracePoint, so that's not an issue. I don't think anyone would ever want to get a `StatementsNode` from some `#ast` method, that's "invisible" in user code (IOW, it doesn't "consume" characters in the source to exist). So I think for every node that "consumes" characters in the source to exist there is no problem, it's uniquely identified by start/end line/column.
More importantly, CRuby can implement `Prism.node_for` using `node_id` without the risk of future ambiguity. There is no reason to implement such a heuristic hack that tries to select overlapping nodes based on Ruby object type.
IMO using `node_id` is a hack, and my implementation of `Prism.node_for` in https://github.com/ruby/prism/pull/3808 is AFAIK fully correct and portable (does not depend on CRuby internals, can easily be supported by all Ruby implementations through the extended `source_location` from this issue). There is no "heuristic hack", it's the fully correct and deterministic decision in this context. Nobody expects the `StatementsNode` instead of the `CallNode` for `Prism.node_for(Proc)`. That implementation also works even with `--parser=parse.y`, while it wouldn't if it were using `node_id`. This is a good illustration that start/end line/column are more universal and well-defined, unlike `node_id` which is dependent on the parser and order-of-iteration. I do agree with you that for final use case of getting a Prism AST, #21795 would be more direct and safer (e.g. can check hash of the file). And there it's totally fine to use `node_id` internally on CRuby, as it remains an implementation detail not exposed to the user. So I'd be happy if you help me get that issue approved. But there are other use cases for `source_location` to include end line and start/end column as shown in my previous comment, and the current implementation meets them well, except for very rare edge cases where as you say, there is no correct or wrong way. ---------------------------------------- Feature #6012: Proc#source_location also return the column https://bugs.ruby-lang.org/issues/6012#change-115987 * Author: rogerdpack (Roger Pack) * Status: Closed * Assignee: nobu (Nobuyoshi Nakada) ---------------------------------------- As originally suggested in http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/42418 Suggestion/feature request: have #source_location also return the beginning column where it was defined. ["test.rb", 8, 33] Thanks! -roger- -- https://bugs.ruby-lang.org/
participants (1)
-
Eregon (Benoit Daloze)