
15 Mar 2024, 10:30 p.m.
Issue #19787 has been updated by joshuay03 (Joshua Young).
austin (Austin Ziegler) wrote in #note-5:
> Wouldn’t it make more sense, then, to do `uniq { … }.map { … }`? Yes, there’s a *small* bit of extra code, but it means that you’re in most cases going to be performing *less work* than either `.map { … }.uniq` or `uniq_map { … }`, because you’re reducing to unique instances before mapping them.
That would be very confusing code to read imo, given that the body of both blocks would need to be the same. E.g.
```
# your suggestion
[1, 1, 2, 3, 4].uniq { |i| i % 2 }.map { |i| i % 2 } # => [1, 0]
# vs the more common approach
[1, 1, 2, 3, 4].map { |i| i % 2 }.uniq # => [1, 0]
# vs my proposal
[1, 1, 2, 3, 4].uniq_map { |i| i % 2 } # => [1, 0]
```
And sure you can abstract it into a proc but I don't see why one would pick that syntax in the first place.
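For reference, the proc version mentioned above might look roughly like this. It is only a sketch, and the name `parity` is made up for illustration:

```ruby
# Share one callable between both calls so the block body is written once.
# `parity` is a hypothetical name chosen for this example.
parity = ->(i) { i % 2 }

# Reduce to elements with distinct parity first, then map them.
result = [1, 1, 2, 3, 4].uniq(&parity).map(&parity)
```

It removes the duplication, but the reader still has to notice that the same callable is passed twice to understand the intent.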
> If your computations are complex enough that they should be done before `#uniq`, I would bypass the need for `#uniq` altogether and use `#reduce`.
Fair, but the beauty of `#uniq` is that it's declarative. With `#reduce` you'd need to infer the intent, and therefore the result, of the body, like with your examples below. That is why I'm suggesting an alternative which is still easy to comprehend and also more performant by design.
> The `#map` block being complex is even more of an argument for reversing the flow or using `#reduce`.
> Both of these could be shorter if you use brace blocks (which should be preferred when **using** the output of a method like `#map`).
I should point out that that was a very contrived example. Obviously not all map operations are this simple (especially in the context of something like Rails). There are also varying style preferences to keep in mind (multiline block delimiters, ternaries and numbered block params).
> From a reduced operations perspective, `#reduce` is going to be faster than most anything else, and there are multiple options for the operation, but `Set` is likely to be your best case:
> Uniquely mapped in one pass, although I think still less efficient than `uniq {}.map {}` because you have to map the items before determining that they are unique (via `Set#add`). One could implement this without `Set`, but that would require a bit more work:
> These are both *slightly* less efficient than your C code because I don’t believe that there’s a way to preallocate a Hash size from Ruby (there have been proposals, but I don’t believe any have been accepted).
These are great examples, but I stand by my comment above that the mental overhead is not worth it in most cases unless your goal is simply performance. The primary reason for my request is succinctness, with performance coming second.
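To make the trade-off concrete, a single-pass `#reduce` with a `Set` could look something like the following. This is my own minimal sketch of the idea described in the quote, not the quoted examples themselves, and the variable names are illustrative:

```ruby
require "set"

values = [1, 1, 2, 3, 4]

# Map each element and collect the results in a Set in one pass.
# Set#add returns the set itself, so it can serve as the accumulator.
unique_mapped = values.reduce(Set.new) { |seen, i| seen.add(i % 2) }.to_a
```

It gives the same result as `.map { |i| i % 2 }.uniq` here, but you have to read the block body to work out that the intent is "map, then keep unique values".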
> When trying to add a functional shorthand, it is common to compare it against other functional languages to see if it or a close synonym is commonly used because many people working with functional operations find it useful to have such a shorthand. So, not unusual.
That makes sense to me, thanks.
> Consensus-building mostly.
>
> I don’t think that `#uniq_map` is a good addition because it is *only* sugar over `.map {}.uniq` and cannot sugar over `.map {}.uniq {}`, and I think that — with the exception of the intermediate hash-size preallocation — `#reduce` or `.uniq {}.map {}` will be as or more efficient than `#uniq_map`. I don’t have benchmarks, though.
I appreciate your response.
----------------------------------------
Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map!
https://bugs.ruby-lang.org/issues/19787#change-107287
* Author: joshuay03 (Joshua Young)
* Status: Open
----------------------------------------
I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`.
TL;DR: It's a drop-in replacement for `.map { ... }.uniq`, with (hopefully) better performance.
I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short-form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code.
I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/10269.
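To illustrate the intended semantics, here is a rough pure-Ruby approximation. It is only a sketch under my own assumptions: `uniq_map_sketch` is a hypothetical standalone helper, not the implementation in the PR, and it makes no claim about matching the PR's performance.

```ruby
# Hypothetical stand-in for the proposed method: maps each element and
# keeps the first occurrence of every distinct mapped value, in one pass.
def uniq_map_sketch(enumerable)
  seen = {}
  enumerable.each do |element|
    mapped = yield(element)
    # Hash keys compare via #hash and #eql?, the same way #uniq does.
    seen[mapped] = mapped unless seen.key?(mapped)
  end
  # Hashes preserve insertion order, so results come out in first-seen order.
  seen.values
end

parities = uniq_map_sketch([1, 1, 2, 3, 4]) { |i| i % 2 }
```

The result should match `[1, 1, 2, 3, 4].map { |i| i % 2 }.uniq`, without building the intermediate mapped array.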
--
https://bugs.ruby-lang.org/