[ruby-core:115491] [Ruby master Feature#20024] SyntaxError subclasses

List overview All Threads
Download

newer

older

[ruby-core:116163] [Ruby master...

[ruby-core:116085] [Ruby master...

kddnewton (Kevin Newton)

27 Nov 2023 27 Nov '23

8:16 p.m.

Issue #20024 has been reported by kddnewton (Kevin Newton). ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

Show replies by date

Eregon (Benoit Daloze)

28 Nov 28 Nov

10:28 a.m.

New subject: [ruby-core:115503] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by Eregon (Benoit Daloze). There are dozens of different error messages for SyntaxError so that would mean dozens of subclasses. That seems a lot to me. Also it would be equally hard to name all these subclasses. And we could not rename them easily either (more easily than changing messages but still).

...

Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing.

I think we are testing what is necessary. As you see IRB does not allow for case changes there because it does not expect it. Many other gems expect a very specific exception message (not just for SyntaxError). So I think it's clear Prism needs to have the exact same messages whenever possible, when used as a Ruby parser for running Ruby code. For tooling I think having another more verbose/clearer/improved set of messages would make sense (as suggested by Andrii on the Prism issue tracker). I.e. the gem could provide improved messages (maybe as an option?). ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105432 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

kddnewton (Kevin Newton)

1:35 p.m.

New subject: [ruby-core:115506] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by kddnewton (Kevin Newton). I understand the need for compatibility with regard to Prism, but this ticket isn't about Prism specifically. I'm proposing this as a solution to the problem in general, devoid of the Prism context. Regardless of other parsers, there exists a problem that tools are relying on error messages for behavior. We can fix that by introducing subclasses. Of course it won't fix the immediate situation for anyone, but going forward it will be a much better situation. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105436 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

byroot (Jean Boussier)

1:44 p.m.

New subject: [ruby-core:115507] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by byroot (Jean Boussier). Rather than sub classes, which have the annoying issue of having to stick around even if for some reason the case no longer exist, could we simply expose `SyntaxError#issue` (the name isn't very inspired) that would return a symbol? e.g. `:void_value`, `:number_parameter_used`, etc etc. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105437 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

Eregon (Benoit Daloze)

3:19 p.m.

New subject: [ruby-core:115508] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by Eregon (Benoit Daloze). A Symbol per type of SyntaxError seems easier to maintain and wouldn't create so many classes/constants/etc which would be messy in docs. I think that's a much easier improvement to do than adding a bunch of subclasses. It's also better memory-wise as classes are not small. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105438 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

Dan0042 (Daniel DeLorme)

3:27 p.m.

New subject: [ruby-core:115509] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by Dan0042 (Daniel DeLorme). byroot (Jean Boussier) wrote in #note-3:

...

Rather than sub classes, which have the annoying issue of having to stick around even if for some reason the case no longer exist, could we simply expose `SyntaxError#issue` (the name isn't very inspired) that would return a symbol?

That sounds good to me. It's much less cumbersome than dozens of subclasses, and has much better forward-compatibility. If this is introduced in ruby 3.3 but a "foo" syntax error is added to ruby 3.4, then `err.is_a?(FooSyntaxError)` would raise an error in ruby 3.3 but `err.message_code == :foo` would not. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105439 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

kddnewton (Kevin Newton)

5:11 p.m.

New subject: [ruby-core:115510] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by kddnewton (Kevin Newton). I think I am okay with that compromise. I don't like that you won't be able to rescue specific errors, but I understand the concern about forward compatibility. My other desire for subclasses is that different syntax errors have different information on them. For example, `_1 = 1` is a syntax error that has the additional information that you're assigning to `_1`. Most libraries right now parse that out of the message, but if we had separate objects then we could define the shapes of the errors to have that information as a field. For example, that would be the `name` or `identifier` field. This could probably be addressed with something like a `metadata` hash or something else, if we're just adding fields to SyntaxError. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105440 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

zverok (Victor Shepelev)

5:17 p.m.

New subject: [ruby-core:115511] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by zverok (Victor Shepelev).

...

I don't like that you won't be able to rescue specific errors, but I understand the concern about forward compatibility.

An aside note on this: in general, `rescue` clause is already looking very alike pattern-matching, and if we continue this approach (with adding causes/details to exceptions), we might once want to be able to... ```ruby rescue SyntaxError(issue: :void_value) => ex ``` Just a thought (but might have many interesting consequences). ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105441 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

jeremyevans0 (Jeremy Evans)

5:23 p.m.

New subject: [ruby-core:115513] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by jeremyevans0 (Jeremy Evans). zverok (Victor Shepelev) wrote in #note-7:

...

That's already valid syntax, which calls the `SyntaxError` method with an `issue` keyword argument (the method should return a class or module to work correctly). ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105442 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

kddnewton (Kevin Newton)

5:25 p.m.

New subject: [ruby-core:115515] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by kddnewton (Kevin Newton). @zverok That's an interesting proposal but let's move that discussion to another issue. I don't want this to become a syntax discussion because that could take us in a totally different direction. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105444 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

zverok (Victor Shepelev)

5:31 p.m.

New subject: [ruby-core:115516] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by zverok (Victor Shepelev). @jeremyevans0

...

That's already valid syntax

I am aware; I just have a feeling that repurposing it with PM might still be possible (the chance that a lot of code uses this syntax with `UpcasedMethods` to produce exception classes dynamically is relatively low—though, I admit, not non-existent). Anyway, I think @kddnewton is correct: here is not the best place, and the idea is very rough, I am sorry for stealing the discussion, just got a sudden idea. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105445 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

Dan0042 (Daniel DeLorme)

10:41 p.m.

New subject: [ruby-core:115527] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by Dan0042 (Daniel DeLorme). This idea of subclasses/identifier is coming up in the context of Prism and SyntaxError, but I think it applies to any kind of Exception, so I would like to see this #issue added to Exception rather than just SyntaxError. For pretty much any type of Exception I see things like this: ```ruby rescue SomeError => err if err.message =~ /some pattern/ #ignore the error, or do something else else raise end ``` Ideally this `SomeError` would have a subclass that allows to catch the exact error subtype, but in practice that doesn't alway happen, and is often out of our control. it would be really nice to be able to compare against a Symbol rather than a string message. Especially for a message that might be expensive to generate (#18285). --- (PS: @zverok I think your idea is *very much* worth exploring) ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105457 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

yui-knk (Kaneko Yuichiro)

30 Nov 30 Nov

3:55 a.m.

New subject: [ruby-core:115533] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by yui-knk (Kaneko Yuichiro). # SyntaxError includes multiple errors `SyntaxError` includes multiple errors like below, in this case two errors are included into one `SyntaxError`. Therefore it needs to consider how to handle such cases. ```ruby begin eval <<~CODE def m retry CODE rescue SyntaxError => e puts e.message end # (eval at test.rb:2):2: Invalid retry without rescue # retry # ^~~~~ # (eval at test.rb:2):3: syntax error, unexpected end-of-input, expecting `end' or dummy end ``` We need to avoid losing information to provide rich information to `SyntaxError` users. Therefore these are not options for this problem: * Merge multiple errors into one `SyntaxError` subclass. Because this is misleading if these errors are different types of errors. * Use only a single error and ignore following erorrs. Because information is lost. # irb use case and error tolerance parser Regarding the [irb use case](https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba…, it categorizes error , which is recoverable by adding tokens to the end, as `recoverable_error` and error, which is recoverable by deleting tokens, as `unrecoverable_error`, so that irb can determine to require more input or not. When irb was created, error tolerance parser didn't exist. Therefore irb needs to take care of how to recover the input using `SyntaxError#message`. However it's parser responsibility to recover errors. irb can categorize syntax error if error tolerance parser provides information how the parser recover errors. If only single error and recovery requires only token insert operations for the last of input, it's `recoverable_error`. By the way, does the parser always raise `SyntaxError` for invalid input? For this use case, it's better to not raise `SyntaxError` but users ask to parser to have syntax errors or not after parsing. # Proposal Considering these points, my proposal is defining parser interface. * Parser can run without SyntaxError mode * Parser provides a method to get syntax errors * Each syntax error includes * message: "syntax error, unexpected end-of-input, expecting `end' or dummy end" * location: (1,0)-(1,1), first/last & line/column * operations for recovery: * insert / delete * location of recovery ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105468 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

ioquatix (Samuel Williams)

8:50 a.m.

New subject: [ruby-core:115544] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by ioquatix (Samuel Williams). I'm interested in this topic and proposed a while ago to add `SyntaxError#diagnostics` which is a more detailed information relating to parse errors and how to rectify them. I'd be interested in discussing this in more detail. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105479 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

kddnewton (Kevin Newton)

4 Dec 4 Dec

5:23 p.m.

New subject: [ruby-core:115586] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by kddnewton (Kevin Newton). @yui-knk your proposal matches how prism works today. Prism never raises a syntax error, it returns a list of errors that have (exactly as you have said) a type, a message, and a location. I agree that a single raised error is not great because it loses information. I like the idea that @ioquatix has proposed of `SyntaxError#diagnostics`. With such a proposal, we could attach all syntax errors in the file to a single raised error, and you could query them appropriately. In this case we would need to design the shape of a `SyntaxError::Diagnostic` class. I would imagine it would have the same fields that @yui-knk is proposing, something like: ```ruby class SyntaxError::Location attr_reader :start_line, :end_line # 1-indexed attr_reader :start_column, :end_column # in bytes end class SyntaxError::Diagnostic attr_reader :location # SyntaxError::Location attr_reader :message # String attr_reader :type # Symbol end class SyntaxError attr_reader :diagnostics # Array[SyntaxError::Diagnostic] end ``` Would that be amenable? ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105519 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

yui-knk (Kaneko Yuichiro)

7 Dec 7 Dec

2:58 a.m.

New subject: [ruby-core:115615] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by yui-knk (Kaneko Yuichiro). For the [irb use cases](https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59b…, irb wants to know additional input may be able to recover the current invalid input. For example `end; def m` is not recoverable with additional input because `end;` can not be recovered by adding some tokens. Then irb raises "syntax error" for the input. On the other hand `def m` is recoverable with additional input. Then irb requires next input. Parser needs to provide information how the parser recover errors for removing dependency from irb to syntax error messages. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-105546 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

palkan (Vladimir Dementyev)

10 Jan 10 Jan

10:46 p.m.

New subject: [ruby-core:116162] [Ruby master Feature#20024] SyntaxError subclasses

Issue #20024 has been updated by palkan (Vladimir Dementyev). kddnewton (Kevin Newton) wrote in #note-14:

...

In this case we would need to design the shape of a `SyntaxError::Diagnostic` class. I would imagine it would have the same fields that @yui-knk is proposing, something like: ...

I think, we can get some inspiration from Parser: [diagnostic](https://github.com/ruby-next/parser/blob/master/lib/parser/diag… ([messages](https://github.com/ruby-next/parser/blob/master/lib/parser/messag… — a lot of them, btw). For example, it also includes a _level_ information (error, fatal, warn); something similar to the "recoverable" property proposed above, but more granular. ---------------------------------------- Feature #20024: SyntaxError subclasses https://bugs.ruby-lang.org/issues/20024#change-106168 * Author: kddnewton (Kevin Newton) * Status: Open * Priority: Normal ---------------------------------------- There are many places around the Ruby ecosystem that handle syntax errors in different ways. Some provide highlighting, others provide recovery of some form, still more provide LSP metadata. In order to provide more rich information, most of them switch on the message of the error being returned, as in: https://github.com/ruby/irb/blob/f86d9dbe2fc05ed62332069a27f4aacc59ba9634/l… Within ruby/spec, specific error messages are required for these kinds of messages in order to support this implicit interface that syntax errors have a hidden type, which is only expressed through their message. For example: https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… https://github.com/ruby/spec/blob/c3206f644325c026fc5b700f0ea75ce9bd2e9d02/… It's not clear from these specs or from the parser itself which error messages are permanent/guaranteed versus which are changeable. Either way, relying on the error message itself as opposed to the type of the error is brittle at best. I would like to suggest instead we implement subclasses on `SyntaxError` that would allow tools that depend on specific syntax errors to rescue those subclasses instead of parsing the message. In addition to alleviating the need to parse error messages with regex, this would also allow for the possibility that the error messages could change in the future without breaking external tooling. Allowing these to change would allow them to be potentially enhanced or changed by other tools - for example by providing recovery information or translating them. This is particularly important for Prism since we are getting down to individual spec failures and some of the failures are related to the fact that we have messages like `"Numbered parameter is already used in outer scope"` where the spec requires `/numbered parameter is already used in/`. Even this case-sensitivity is causing failures, which seems like we're testing the wrong thing. -- https://bugs.ruby-lang.org/

130

days inactive

174

days old

ruby-core@ml.ruby-lang.org

Manage subscription

16 comments

9 participants

tags (0)

participants (9)

byroot (Jean Boussier)
Dan0042 (Daniel DeLorme)
Eregon (Benoit Daloze)
ioquatix (Samuel Williams)
jeremyevans0 (Jeremy Evans)
kddnewton (Kevin Newton)
palkan (Vladimir Dementyev)
yui-knk (Kaneko Yuichiro)
zverok (Victor Shepelev)