Issue #21795 has been updated by mame (Yusuke Endoh).

I anticipated that we would consider this eventually, but incorporating it into the core presents significant challenges. Here are two major feasibility issues. (Based on chats with @ko1, @tompng, and @yui-knk, though these are my personal views.)

## The Implementation Approach

CRuby currently discards source code and ASTs after ISeq generation. The proposed `#ast` method would have to re-read and re-parse the source, which causes two problems:

1. If the file is modified after loading, `#ast` may return the wrong node.
2. It does not work for `eval` strings.

`error_highlight` accepts this fragility because it only displays "hints". But I don't think that is acceptable for a built-in method. At the very least, we must avoid returning an incorrect node, and we must make it clear when failures occur.

I propose two approaches:

1. Keep loaded source in memory (e.g., `RubyVM.keep_script_lines = true` by default). This supports `eval` but increases memory usage.
2. Validate a source hash. Store a hash in the ISeq and check it to ensure the file hasn't changed.

## The Parser Switching Problem

What is the node definition returned by `#ast`?

As noted in #21618, the built-in Prism is not exposed as a Ruby API. If `Gemfile.lock` specifies an older version of the prism gem, even `require "prism"` won't provide the expected definition.

IMO, it would be good to have a node definition that does not depend on the prism gem (maybe `Ruby::Node`?). I am not sure how much effort this would need. We would also need to consider what should live where in the ruby/prism and ruby/ruby repositories for development.

We also need to decide whether `#ast` should return `RubyVM::AST::Node` when `--parser=parse.y` is specified.
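
For illustration only, the source-hash validation idea (approach 2 under "The Implementation Approach") can be sketched in plain Ruby. This is not CRuby code: `LoadedSource`, `StaleSourceError`, `record_load`, and `source_for_reparse` are hypothetical names, SHA-256 is an arbitrary choice of hash, and a real implementation would presumably store the digest inside the ISeq rather than in a Ruby object.

```ruby
require "digest"

# Hypothetical stand-in for what an ISeq would remember about its source:
# the file path and a digest of the file contents at load time.
LoadedSource = Struct.new(:path, :digest)

# Hypothetical error raised when the file no longer matches what was loaded.
class StaleSourceError < StandardError; end

# Record the digest at "load" time (assumption: SHA-256 over the raw bytes).
def record_load(path)
  LoadedSource.new(path, Digest::SHA256.file(path).hexdigest)
end

# Re-read the file for re-parsing, but refuse to return it when the
# contents differ from what was originally loaded.
def source_for_reparse(record)
  source = File.binread(record.path)
  unless Digest::SHA256.hexdigest(source) == record.digest
    raise StaleSourceError, "#{record.path} changed since it was loaded"
  end
  source
end
```

With this shape, a caller would re-parse only the string returned by `source_for_reparse` and treat `StaleSourceError` as "no AST available", which fails explicitly instead of returning an incorrect node. It does not cover `eval` strings, since there is no file to re-read.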

----------------------------------------
Feature #21795: Methods for retrieving ASTs
https://bugs.ruby-lang.org/issues/21795#change-115824

* Author: kddnewton (Kevin Newton)
* Status: Open
----------------------------------------
I would like to propose a handful of methods for retrieving ASTs from various objects that correspond to locations in code. This includes:

* Proc#ast
* Method#ast
* UnboundMethod#ast
* Thread::Backtrace::Location#ast
* TracePoint#ast (on call/return events)

The purpose of this is to make tooling easier to write and maintain. Specifically, this could be used in irb, power_assert, error_highlight, and various other tools, both in core and outside it, that make use of source code.

There have been many previous discussions of retrieving node_id, source_location, source, etc. All of these use cases are covered by returning the AST for some entity. In this case node_id becomes an implementation detail, invisible to the user. Source location can be derived from the information on the AST itself. Similarly, source can be derived from the AST.

Internally, I do not think we have to store any more information than we already do (since we have node_id for the first four of these, it becomes rather trivial). For TracePoint we can have a larger discussion about it, but I think it should not be too much work.

In terms of implementation, the only caveat I would add is that if the ISEQ were compiled through the old parser/compiler, this should return `nil`, as the node ids do not match up and we do not want to further propagate the RubyVM::AST API.

The reason I am opening this ticket with 5 different methods requested in it is to get approval for the direction first; then I can open individual tickets or just PRs for each method.

I believe this feature would ease the maintenance burden of many core libraries, and unify otherwise disparate efforts to achieve the same thing.

--
https://bugs.ruby-lang.org/