
Issue #20669 has been updated by olleolleolle (Olle Jonsson). In the meantime, Dalli merged a change that sounds a lot like Eregon's comment. https://github.com/petergoldstein/dalli/pull/1011 My immediate needs are met, and I will close this feature request. ---------------------------------------- Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors https://bugs.ruby-lang.org/issues/20669#change-110046 * Author: olleolleolle (Olle Jonsson) * Status: Open ---------------------------------------- Make it possible to differentiate general non-marshal errors from failures to decode Marshal data using a specific exception class. ## Background There are a variety of error conditions that can cause Marshal to fail, including both corrupt data, and between versions of Ruby/Marshal binary formats, e.g. ```
Marshal.load("foobar") <internal:marshal>:34:in `load': incompatible marshal file format (can't be read) (TypeError)
Marshal.load(Marshal.dump(Object.new).slice(0, 10)) <internal:marshal>:34:in `load': marshal data too short (ArgumentError)
MyThing = Struct.new(:name, :age) => MyThing Marshal.dump(MyThing.new("Alice", 20)) => "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
(in a separate session)
Marshal.load "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19" <internal:marshal>:34:in `load': undefined class/module MyThing (ArgumentError)
When data is corrupt, or incompatible, Marshal may raise either `ArgumentError` or `TypeError` with various messages. Not all errors are from `marshal.c`.
Generally, `TypeError` and `ArgumentError` are used to indicate some semantic problem with the program, rather than a runtime issue with the data. For example, `ArgumentError` may be used to indicate less than the required number of arguments passed to a method, and `TypeError` may indicate than the wrong data type was given as an argument.
Dalli is a library for accessing memcached servers. By default it uses Marshal to store serialised Ruby objects. However, when updating major versions of Ruby, or application code, `Marshal.load` may start raising `TypeError` or `ArgumentError`. This class of error needs to be handled differently from "normal" `TypeError` and `ArgumentError`. As such, the only way to differentiate right now is to pattern match the exception message, [which is done here](https://github.com/petergoldstein/dalli/blob/1d4cbfc78e6470ad57a883801a783bf30890c400/lib/dalli/protocol/value_serializer.rb#L36-L72):
```ruby
# TODO: Some of these error messages need to be validated. It's not obvious
# that all of them are actually generated by the invoked code
# in current systems
# rubocop:disable Layout/LineLength
TYPE_ERR_REGEXP = %r{needs to have method `_load'|exception class/object expected|instance of IO needed|incompatible marshal file format}.freeze
ARGUMENT_ERR_REGEXP = /undefined class|marshal data too short/.freeze
NAME_ERR_STR = 'uninitialized constant'
# rubocop:enable Layout/LineLength
def retrieve(value, bitflags)
serialized = (bitflags & FLAG_SERIALIZED) != 0
serialized ? serializer.load(value) : value
rescue TypeError => e
filter_type_error(e)
rescue ArgumentError => e
filter_argument_error(e)
rescue NameError => e
filter_name_error(e)
end
def filter_type_error(err)
raise err unless TYPE_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_argument_error(err)
raise err unless ARGUMENT_ERR_REGEXP.match?(err.message)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
def filter_name_error(err)
raise err unless err.message.include?(NAME_ERR_STR)
raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
end
Unfortunately, there have been several incidents where the complexity of the current behaviour has lead to outages in production. ## Proposal Ideally, the above code could use a specific (one or more) error class for detecting failed `Marshal.load`. ```ruby def retrieve(value, bitflags) serialized = (bitflags & FLAG_SERIALIZED) != 0 serialized ? serializer.load(value) : value rescue Marshal::LoadError => error raise UnmarshalError, "Unable to unmarshal value!" end ``` One difficulty is retaining backwards compatibility. We may be able to create a sub-class of `class LoadError < ArgumentError` but it fundamentally seems wrong to me (should really be a `RuntimeError`). ## See also - [Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008) - https://github.com/search?q=Marshal.load+ArgumentError&type=code - Git code search showing some variations of the problem (`rescue ArgumentError` and similar). -- https://bugs.ruby-lang.org/