[ruby-core:125342] [Ruby Bug#22013] Array#| deduplication via eql? breaks when total element count exceeds ~16
Issue #22013 has been reported by andreyruby (Andrey Glushkov).

----------------------------------------
Bug #22013: Array#| deduplication via eql? breaks when total element count exceeds ~16
https://bugs.ruby-lang.org/issues/22013

* Author: andreyruby (Andrey Glushkov)
* Status: Open
* ruby -v: ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [arm64-darwin24]
* Backport: 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
## Problem description

The documentation for `Array#|` states:

> Returns the union of self and other_array; duplicates are removed; order is preserved; items are compared using `eql?`

However, when the total number of elements across both arrays exceeds ~16, deduplication via `eql?` stops working and duplicates are included in the result.

## Reproducible script

```ruby
class Item
  attr_reader :id, :pos

  def initialize(id, pos:)
    @id = id
    @pos = pos
  end

  def inspect
    "item-#{id}-#{pos}"
  end

  def eql?(other)
    id == other.id
  end
end

items1 = [Item.new(1, pos: 1)]
items2 = [Item.new(1, pos: 2)] # same id, so eql? should treat it as a duplicate

puts (items1 | items2).inspect          # 2 total elements
puts (items1 * 8 | items2 * 8).inspect  # 16 total elements
puts (items1 * 15 | items2 * 1).inspect # 16 total elements
puts (items1 * 9 | items2 * 8).inspect  # 17 total elements
```

## Actual result

```
[item-1-1]
[item-1-1]
[item-1-1]
[item-1-1, item-1-2]
```

## Expected result

All four lines should return `[item-1-1]`, since `eql?` returns `true` for both items and the documentation makes no mention of any size-dependent behavior. The last line incorrectly returns two elements. The only change is the total element count crossing ~16; the objects and `eql?` implementation are identical.

--
https://bugs.ruby-lang.org/
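
The size dependence in this report is what one would expect if `Array#|` switched strategies at a threshold: a linear scan that calls only `eql?` for small inputs, and a Hash-based pass (which consults `hash` before `eql?`) for larger ones. The first reply in this thread mentions such a fast path. Below is a rough pure-Ruby model of that idea, not the actual C implementation; `SMALL_LIMIT`, `model_union`, and `BrokenItem` are hypothetical names, and the value 16 is taken from the reporter's observation rather than from the Ruby source.

```ruby
# Hypothetical model of a size-dependent union; NOT the real C implementation.
SMALL_LIMIT = 16 # assumption: threshold inferred from the report

def model_union(a, b)
  combined = a + b
  if combined.size <= SMALL_LIMIT
    # Small inputs: linear scan, only eql? is consulted.
    combined.each_with_object([]) do |e, result|
      result << e unless result.any? { |seen| e.eql?(seen) }
    end
  else
    # Larger inputs: Hash keys are matched by #hash first, then eql?.
    seen = {}
    combined.each { |e| seen[e] = true unless seen.key?(e) }
    seen.keys
  end
end

# Same shape as the reporter's Item: eql? is overridden, hash is not.
class BrokenItem
  attr_reader :id, :pos

  def initialize(id, pos:)
    @id = id
    @pos = pos
  end

  def eql?(other)
    id == other.id
  end
end

a = [BrokenItem.new(1, pos: 1)]
b = [BrokenItem.new(1, pos: 2)]

puts model_union(a * 8, b * 8).size # 16 elements, linear scan
puts model_union(a * 9, b * 8).size # 17 elements, Hash-based pass
```

In this model the two calls differ only in which branch runs, which mirrors the jump from one result element to two in the report.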
Issue #22013 has been updated by byroot (Jean Boussier).

You are missing the corresponding `hash` method:

```ruby
def hash
  [self.class, id].hash
end
```

It is somewhat implied by the mention of `#eql?` (hash-based equality), but the documentation could of course be more explicit.

There is also, perhaps, the question of whether the fast path that doesn't use a Hash for smaller arrays should call `#hash` too, so as to behave more consistently.

----------------------------------------
https://bugs.ruby-lang.org/issues/22013#change-117096
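
Combining that suggestion with the original reproduction gives a quick way to check it. The sketch below reuses the reporter's `Item` class and adds the `hash` method from the reply; the `results` variable is introduced only for illustration. With `hash` and `eql?` agreeing, the two items should be treated as duplicates regardless of array size.

```ruby
class Item
  attr_reader :id, :pos

  def initialize(id, pos:)
    @id = id
    @pos = pos
  end

  def inspect
    "item-#{id}-#{pos}"
  end

  # Equality is by id only, as in the original report.
  def eql?(other)
    id == other.id
  end

  # Added per the reply: objects that are eql? must share a hash value.
  def hash
    [self.class, id].hash
  end
end

items1 = [Item.new(1, pos: 1)]
items2 = [Item.new(1, pos: 2)]

results = [
  items1 | items2,          # 2 total elements
  items1 * 8 | items2 * 8,  # 16 total elements
  items1 * 15 | items2 * 1, # 16 total elements
  items1 * 9 | items2 * 8   # 17 total elements
]

results.each { |r| puts r.inspect }
```

Each line printed should now be `[item-1-1]`, including the 17-element case that previously returned two elements.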
Issue #22013 has been updated by nobu (Nobuyoshi Nakada).

Tags set to doc

Is this a documentation issue?

----------------------------------------
https://bugs.ruby-lang.org/issues/22013#change-117101
Issue #22013 has been updated by andreyruby (Andrey Glushkov).

byroot (Jean Boussier) wrote in #note-1:
> You are missing the corresponding `hash` method

It would help to add a note similar to the one in the `Object#eql?` docs (https://docs.ruby-lang.org/en/master/Object.html#method-i-eql-3F):

> For any pair of objects where `eql?` returns true, the `hash` value of both objects must be equal. So any subclass that overrides `eql?` should also override `hash` appropriately.

Someone implementing `eql?` on a domain object for use with `Array#|` (and maybe `Array#uniq`) will land on the docs for those methods and may never think to check `Object#eql?`.

----------------------------------------
https://bugs.ruby-lang.org/issues/22013#change-117103
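
The contract quoted from the `Object#eql?` docs can be checked mechanically in a test suite. The helper below is a hypothetical sketch (the name `eql_hash_consistent?` and the two sample classes are not from the thread); it only verifies that a given pair of objects that are `eql?` also report equal `hash` values.

```ruby
# Hypothetical helper: true when the pair honours the eql?/hash contract.
# If a.eql?(b) is false the contract imposes nothing, so it passes trivially.
def eql_hash_consistent?(a, b)
  !a.eql?(b) || a.hash == b.hash
end

# Overrides eql? only, like the class in the report.
class EqlOnly
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def eql?(other)
    other.is_a?(self.class) && id == other.id
  end
end

# Overrides both eql? and hash, as the Object#eql? docs require.
class EqlAndHash < EqlOnly
  def hash
    [self.class, id].hash
  end
end

puts eql_hash_consistent?(EqlOnly.new(1), EqlOnly.new(1))
puts eql_hash_consistent?(EqlAndHash.new(1), EqlAndHash.new(1))
```

A check along these lines could sit next to the unit tests of any class that overrides `eql?`, so the missing `hash` is caught before the class is used with hash-based methods.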
Issue #22013 has been updated by byroot (Jean Boussier).

> It would help to add a note similar to Object#eql? docs

Yes, that is what I said.

https://github.com/ruby/ruby/pull/16786

----------------------------------------
https://bugs.ruby-lang.org/issues/22013#change-117113
participants (3)

* andreyruby (Andrey Glushkov)
* byroot (Jean Boussier)
* nobu (Nobuyoshi Nakada)