[ruby-core:124321] [Ruby Feature#21795] Methods for retrieving ASTs

19 Dec 2025

      Issue #21795 has been updated by mame (Yusuke Endoh).

I anticipated that we would consider this eventually, but incorporating it into the core presents significant challenges.

Here are two major issues regarding feasibility.

(Based on chats with @ko1, @tompng, and @yui-knk, though these are my personal views.)

## The Implementation Approach

CRuby currently discards source code and ASTs after ISeq generation. The proposed `#ast` method would have to re-read and re-parse the source, which causes two problems:

1. If the file is modified after loading, `#ast` may return the wrong node.
2. It does not work for `eval` strings.
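
The first problem can be reproduced without touching CRuby internals. The following is only a minimal simulation: `RecordedLocation` and `reread_line` are hypothetical names standing in for "a location remembered at load time" and "re-read the source later". They are not part of any real API.

```ruby
require "tempfile"

# Hypothetical stand-in for what is remembered at load time:
# a file path and a line number.
RecordedLocation = Struct.new(:path, :lineno)

# Re-read the file and return whatever text is now at the recorded line.
def reread_line(location)
  File.readlines(location.path, chomp: true)[location.lineno - 1]
end

def stale_lookup_demo
  Tempfile.create(["sample", ".rb"]) do |file|
    file.write("def greet\n  :hello\nend\n")
    file.flush

    # "Load time": remember that `greet` starts on line 1.
    location = RecordedLocation.new(file.path, 1)
    before = reread_line(location)

    # The file is edited after loading: a comment is inserted at the top.
    original = File.read(file.path)
    File.write(file.path, "# frozen_string_literal: true\n" + original)

    after = reread_line(location)
    [before, after]
  end
end

before, after = stale_lookup_demo
puts "before edit: #{before.inspect}"
puts "after edit:  #{after.inspect}"
```

After the edit, the same recorded location points at the inserted comment rather than at `def greet`, which is the kind of silent mismatch a built-in method would need to rule out.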

`error_highlight` accepts this fragility because it only displays "hints". But I don't think that is acceptable for a built-in method. At the very least, we must avoid returning an incorrect node and make clear when failures occur.

I propose two approaches:

1. Keep loaded source in memory (e.g., `RubyVM.keep_script_lines = true` by default). This supports `eval` but increases memory usage.
2. Validate a source hash. Store a hash of the source in the ISeq and check it before re-parsing to ensure the file hasn't changed.
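
As a rough illustration of the second approach, here is a sketch in plain Ruby rather than inside the VM. `LoadedSource`, `record_source`, `verified_source`, and `SourceChangedError` are hypothetical names, and SHA-256 is an arbitrary choice of hash. The actual ISeq layout and hash function would be up to the implementation.

```ruby
require "digest"
require "tempfile"

# Hypothetical record of what would be stored alongside an ISeq at load time.
LoadedSource = Struct.new(:path, :digest)

# Hypothetical error raised instead of returning a possibly wrong node.
class SourceChangedError < StandardError; end

# "Load time": remember the path and a digest of the file contents.
def record_source(path)
  LoadedSource.new(path, Digest::SHA256.file(path).hexdigest)
end

# Return the current source only if it still matches the recorded digest.
def verified_source(loaded)
  source = File.binread(loaded.path)
  unless Digest::SHA256.hexdigest(source) == loaded.digest
    raise SourceChangedError, "#{loaded.path} changed after it was loaded"
  end
  source
end

def hash_validation_demo
  Tempfile.create(["sample", ".rb"]) do |file|
    file.write("def greet = :hello\n")
    file.flush
    loaded = record_source(file.path)

    # Unmodified file: the source is returned and could be re-parsed safely.
    unchanged_ok = verified_source(loaded) == "def greet = :hello\n"

    # Modified file: the mismatch is reported instead of being ignored.
    File.write(file.path, "def greet = :goodbye\n")
    changed_detected =
      begin
        verified_source(loaded)
        false
      rescue SourceChangedError
        true
      end

    [unchanged_ok, changed_detected]
  end
end

p hash_validation_demo
```

This only detects that the file changed; unlike keeping the source in memory, it cannot recover the original text, and it does nothing for `eval` strings.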

## The Parser Switching Problem

What is the node definition returned by `#ast`?

As noted in #21618, built-in Prism is not exposed as a Ruby API. If `Gemfile.lock` specifies an older version of the prism gem, even `require "prism"` won't provide the expected definition.

IMO, it would be good to have a node definition that does not depend on the prism gem (maybe `Ruby::Node`?). I am not sure how much effort is needed for this. We would also need to consider which parts should live in the ruby/prism repository and which in ruby/ruby for development.

We also need to decide if `#ast` should return `RubyVM::AST::Node` when `--parser=parse.y` is specified.

----------------------------------------
Feature #21795: Methods for retrieving ASTs
https://bugs.ruby-lang.org/issues/21795#change-115824

* Author: kddnewton (Kevin Newton)
* Status: Open
----------------------------------------
I would like to propose a handful of methods for retrieving ASTs from various objects that correspond to locations in code. This includes:

* Proc#ast
* Method#ast
* UnboundMethod#ast
* Thread::Backtrace::Location#ast
* TracePoint#ast (on call/return events)

The purpose of this is to make tooling easier to write and maintain. Specifically, this could be used in irb, power_assert, error_highlight, and various other tools, both in core and outside it, that make use of source code.
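
To give a feel for how a tool might call the proposed methods, here is a small sketch. It assumes nothing beyond the method name `#ast` from the list above; `ast_for` is a hypothetical helper, and the `respond_to?` guard is there because no released Ruby is known to define these methods.

```ruby
# Returns the AST for an object if the running Ruby implements the proposed
# `#ast` method, or nil otherwise. `ast_for` is a hypothetical helper name.
def ast_for(callable)
  return nil unless callable.respond_to?(:ast)

  callable.ast
end

square = proc { |x| x * x }
node = ast_for(square)
puts(node.nil? ? "Proc#ast is not available on this Ruby" : node.inspect)
```

The same helper would apply unchanged to `Method`, `UnboundMethod`, and `Thread::Backtrace::Location` objects if the proposal is accepted in this shape.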

There have been many previous discussions of retrieving node_id, source_location, source, etc. All of these use cases are covered by returning the AST for some entity. In this case node_id becomes an implementation detail, invisible to the user. Source location can be derived from the information on the AST itself. Similarly, source can be derived from the AST.

Internally, I do not think we have to store any more information than we already do (since we have node_id for the first four of these, it becomes rather trivial). For TracePoint we can have a larger discussion about it, but I think it should not be too much work. In terms of implementation, the only caveat I would add is that if the ISEQ was compiled through the old parser/compiler, this should return `nil`, as the node ids do not match up and we do not want to further propagate the RubyVM::AST API.

The reason I am opening up this ticket with 5 different methods requested in it is to get approval first for the direction, then I can open individual tickets or just PRs for each method. I believe this feature would ease the maintenance burden of many core libraries, and unify otherwise disparate efforts to achieve the same thing.

-- 
https://bugs.ruby-lang.org/