[ruby-core:114309] [Ruby master Feature#19787] Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map!

Issue #19787 has been reported by joshuay03 (Joshua Young). ---------------------------------------- Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map! https://bugs.ruby-lang.org/issues/19787 * Author: joshuay03 (Joshua Young) * Status: Open * Priority: Normal ---------------------------------------- I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`. TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with better performance. I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `flat_map { ... }` replaces `.map { ... }.flatten` and `filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code. I have already got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140. -- https://bugs.ruby-lang.org/

Issue #19787 has been updated by mame (Yusuke Endoh). It does not make sense to me to provide `.foo_map { ... }` for all patterns like `.map { ... }.foo`. `flat_map` was introduced for some reasons: https://blade.ruby-lang.org/ruby-core/26287 Is `.map { ... }.uniq` such a very frequent idiom? `.uniq_map { ... }` is not as concise as `.map { ... }.uniq`. Scala doesn't seem to provide `uniqMap`. I think the only reason for introducing `uniq_map` is to avoid creating an intermediate array. However, there is a `.lazy` for exactly that purpose: `.lazy.map { ... }.uniq.to_a`. Considering the above, I think the motivation is too weak to provide `uniq_map`. ---------------------------------------- Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map! https://bugs.ruby-lang.org/issues/19787#change-104015 * Author: joshuay03 (Joshua Young) * Status: Open * Priority: Normal ---------------------------------------- I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`. TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with (hopefully) better performance. I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code. I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140. -- https://bugs.ruby-lang.org/

Issue #19787 has been updated by joshuay03 (Joshua Young).
Is `.map { ... }.uniq` such a very frequent idiom?
I work on a Rails codebase and it's most commonly used to iterate through foreign keys / associated records and get the unique values (quite often in tests). That's the only real frequent use case I've come across, but I would say that outside of that, it's not an uncommon chaining pattern either.
`.uniq_map { ... }` is not as concise as `.map { ... }.uniq`.
I'm a bit mixed on this point. I see where you're coming from, and I think the same argument can be made about `#flat_map` and `#filter_map`. But it follows a similar naming pattern where the alternative method is in the name, so I personally feel like it's it's equally as concise? I would also like to point out that with `#uniq_map`, you don't need to read all the way to `.uniq` before inferring the output. This might help when the body of the `#map` is quite complex, but you could argue that this is a code quality / style problem... ``` some_array.map do |item| if some_condition some_method(item) else some_other_method(item) end end.uniq # vs some_array.uniq_map do |item| if some_condition some_method(item) else some_other_method(item) end end ```
Scala doesn't seem to provide `uniqMap`.
Sorry, this is the first Ruby issue I've created or being involved with, so I'm not sure why this was pointed out. Is this a usual consideration for new features?
Considering the above, I think the motivation is too weak to provide `uniq_map`.
Your points are very valid, and I appreciate the response. What is the usual process for deciding on whether or not to accept a feature? ---------------------------------------- Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map! https://bugs.ruby-lang.org/issues/19787#change-104016 * Author: joshuay03 (Joshua Young) * Status: Open * Priority: Normal ---------------------------------------- I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`. TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with (hopefully) better performance. I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code. I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140. -- https://bugs.ruby-lang.org/

Issue #19787 has been updated by austin (Austin Ziegler). joshuay03 (Joshua Young) wrote in #note-4:
Is `.map { ... }.uniq` such a very frequent idiom?
I work on a Rails codebase and it's most commonly used to iterate through foreign keys / associated records and get the unique values (quite often in tests). That's the only real frequent use case I've come across, but I would say that outside of that, it's not an uncommon chaining pattern either.
Wouldn’t it make more sense, then, to do `uniq { … }.map { … }`? Yes, there’s a *small* bit of extra code, but it means that you’re in most cases going to be performing *less work* than either `.map { … }.uniq` or `uniq_map { … }`, because you’re reducing to unique instances before mapping them. If your computations are complex enough that they should be done before `#uniq`, I would bypass the need for `#uniq` altogether and use `#reduce`.
`.uniq_map { ... }` is not as concise as `.map { ... }.uniq`.
I'm a bit mixed on this point. I see where you're coming from, and I think the same argument can be made about `#flat_map` and `#filter_map`. But it follows a similar naming pattern where the alternative chained methods are in the name, so I personally feel like it's it's equally as concise?
I would also like to point out that with `#uniq_map`, you don't need to read all the way to `.uniq` before inferring the output. This might help when the body of the `#map` is quite complex, but you could argue that this is a code quality / style problem...
The `#map` block being complex is even more of an argument for reversing the flow or using `#reduce`.
``` some_array.map do |item| if some_condition some_method(item) else some_other_method(item) end end.uniq
# vs
some_array.uniq_map do |item| if some_condition some_method(item) else some_other_method(item) end end ```
Both of these could be shorter if you use brace blocks (which should be preferred when **using** the output of a method like `#map`). ```ruby some_array.map { some_condition ? some_method(_1) : some_other_method(_1) }.uniq ``` From a reduced operations perspective, `#reduce` is going to be faster than most anything else, and there are multiple options for the operation, but `Set` is likely to be your best case: ```ruby some_array.reduce(Set.new) { _2.add(some_condition ? some_method(_2) : some_other_method(_2)) }.to_a ``` Uniquely mapped in one pass, although I think still less efficient than `uniq {}.map {}` because you have to map the items before determining that they are unique (via `Set#add`). One could implement this without `Set`, but that would require a bit more work: ```ruby some_array.reduce({}) { v = some_condition ? some_method(_2) : some_other_method(_2) _1[v] = true unless _1.key?(v) _1 }.keys These are both *slightly* less efficient than your C code because I don’t believe that there’s a way to preallocate a Hash size from Ruby (there have been proposals, but I don’t believe any have been accepted).
Scala doesn't seem to provide `uniqMap`.
Sorry, this is the first Ruby issue I've created or being involved with, so I'm not sure why this was pointed out. Is this a usual consideration for new features?
When trying to add a functional shorthand, it is common to compare it against other functional languages to see if it or a close synonym is commonly used because many people working with functional operations find it useful to have such a shorthand. So, not unusual.
Considering the above, I think the motivation is too weak to provide `uniq_map`.
Your points are very valid, and I appreciate the response. What is the usual process for deciding on whether or not to accept a feature?
Consensus-building mostly. I don’t think that `#uniq_map` is a good addition because it is *only* sugar over `.map {}.uniq` and cannot sugar over `.map {}.uniq {}`, and I think that — with the exception of the intermediate hash-size preallocation — `#reduce` or `.uniq {}.map {}` will be as or more efficient than `#uniq_map`. I don’t have benchmarks, though. ---------------------------------------- Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map! https://bugs.ruby-lang.org/issues/19787#change-104020 * Author: joshuay03 (Joshua Young) * Status: Open * Priority: Normal ---------------------------------------- I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`. TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with (hopefully) better performance. I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code. I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140. -- https://bugs.ruby-lang.org/

Issue #19787 has been updated by rubyFeedback (robert heiler). joshuay03 wrote:
What is the usual process for deciding on whether or not to accept a feature?
Ultimately you only have to convince matz. :) However had, matz may also request additional information and/or use case and "usefulness" of suggestions. Once features are added it is difficult to remove them due to backwards compatibility. I am not really invested in the proposal here, so I will not comment much at all. The way how I use ruby I use .map {} a lot, and .uniq sometimes, but I don't think I really had major use cases for combining the above into one method call. Note that I also don't use .flat_map either - I kind of prefer to stay with one-word methods when possible. They seem to make more sense to my brain. (I understand a rationale for e. g. library authors where efficiency may be more important, but personally I use ruby as kind of "syntax sugar" over C, the operating system and everyday tasks - ruby is really like the ultimate glue language the way how I use it. But that's just a side comment, I completely understand different people using ruby differently; just for my own use cases I don't seem to need .uniq_map or .flat_map. By the way, I also find it harder to remember the method names for two-word methods, e. g. .map_uniq or .map_flat; that's also one reason I stick with oldschool method chaining. Perhaps I am getting old ... .lazy is a bit different in that it also defers using something at "when it is needed", along with the functional use cases it has, which I think is different to both .flat_map and .uniq_map.) ---------------------------------------- Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map! https://bugs.ruby-lang.org/issues/19787#change-104021 * Author: joshuay03 (Joshua Young) * Status: Open * Priority: Normal ---------------------------------------- I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`. TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with (hopefully) better performance. I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code. I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/8140. -- https://bugs.ruby-lang.org/

Issue #19787 has been updated by joshuay03 (Joshua Young). austin (Austin Ziegler) wrote in #note-5:
Wouldn’t it make more sense, then, to do `uniq { … }.map { … }`? Yes, there’s a *small* bit of extra code, but it means that you’re in most cases going to be performing *less work* than either `.map { … }.uniq` or `uniq_map { … }`, because you’re reducing to unique instances before mapping them.
That would be very confusing code to read imo given that the body of both blocks would need to be the same. E.g. ``` # your suggestion
[1, 1, 2, 3, 4].uniq { |i| i % 2 }.map { |i| i % 2 } => [1, 0]
# vs the more common approach
[1, 1, 2, 3, 4].map { |i| i % 2 }.uniq => [1, 0]
# vs my proposal
[1, 1, 2, 3, 4].uniq_map { |i| i % 2 } => [1, 0]
And sure you can abstract it into a proc but I don't see why one would pick that syntax in the first place.
> If your computations are complex enough that they should be done before `#uniq`, I would bypass the need for `#uniq` altogether and use `#reduce`.
Fair, but the beauty of `#uniq` is that it's declarative. With `#reduce` you'd need to infer the intent and therefore result of the body, like with your examples below. Which is why I'm suggesting an alternative which is still easy to comprehend and also more performant by design.
> The `#map` block being complex is even more of an argument for reversing the flow or using `#reduce`.
> Both of these could be shorter if you use brace blocks (which should be preferred when **using** the output of a method like `#map`).
I should point out that that was a very contrived example. Obviously not all map operations are this simple (especially in the context of something like Rails). There's also the varying style preferences to keep in mind (multiline block delimiters, ternaries and numbered block params).
> From a reduced operations perspective, `#reduce` is going to be faster than most anything else, and there are multiple options for the operation, but `Set` is likely to be your best case:
> Uniquely mapped in one pass, although I think still less efficient than `uniq {}.map {}` because you have to map the items before determining that they are unique (via `Set#add`). One could implement this without `Set`, but that would require a bit more work:
> These are both *slightly* less efficient than your C code because I don’t believe that there’s a way to preallocate a Hash size from Ruby (there have been proposals, but I don’t believe any have been accepted).
These are great examples, but I stand by my comment above that the mental overhead is not worth it in most cases unless your goal is simply performance. The primary reason for my request is succinctness, with performance coming second.
> When trying to add a functional shorthand, it is common to compare it against other functional languages to see if it or a close synonym is commonly used because many people working with functional operations find it useful to have such a shorthand. So, not unusual.
That makes sense to me, thanks.
> Consensus-building mostly.
>
> I don’t think that `#uniq_map` is a good addition because it is *only* sugar over `.map {}.uniq` and cannot sugar over `.map {}.uniq {}`, and I think that — with the exception of the intermediate hash-size preallocation — `#reduce` or `.uniq {}.map {}` will be as or more efficient than `#uniq_map`. I don’t have benchmarks, though.
I appreciate your response.
----------------------------------------
Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map!
https://bugs.ruby-lang.org/issues/19787#change-107287
* Author: joshuay03 (Joshua Young)
* Status: Open
----------------------------------------
I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`.
TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with (hopefully) better performance.
I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code.
I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/10269.
--
https://bugs.ruby-lang.org/

Issue #19787 has been updated by joshuay03 (Joshua Young). rubyFeedback (robert heiler) wrote in #note-6:
Ultimately you only have to convince matz. :)
However had, matz may also request additional information and/or use case and "usefulness" of suggestions. Once features are added it is difficult to remove them due to backwards compatibility.
Thanks, that's a good point re: backwards compatibility.
I am not really invested in the proposal here, so I will not comment much at all. The way how I use ruby I use .map {} a lot, and .uniq sometimes, but I don't think I really had major use cases for combining the above into one method call. Note that I also don't use .flat_map either - I kind of prefer to stay with one-word methods when possible. They seem to make more sense to my brain. (I understand a rationale for e. g. library authors where efficiency may be more important, but personally I use ruby as kind of "syntax sugar" over C, the operating system and everyday tasks - ruby is really like the ultimate glue language the way how I use it. But that's just a side comment, I completely understand different people using ruby differently; just for my own use cases I don't seem to need .uniq_map or .flat_map. By the way, I also find it harder to remember the method names for two-word methods, e. g. .map_uniq or .map_flat; that's also one reason I stick with oldschool method chaining. Perhaps I am getting old ... .lazy is a bit different in that it also defers using something at "when it is needed", along with the functional use cases it has, which I think is different to both .flat_map and .uniq_map.)
I appreciate the feedback. ---------------------------------------- Feature #19787: Add Enumerable#uniq_map, Enumerable::Lazy#uniq_map, Array#uniq_map and Array#uniq_map! https://bugs.ruby-lang.org/issues/19787#change-107288 * Author: joshuay03 (Joshua Young) * Status: Open ---------------------------------------- I would like to propose a collection of new methods, `Enumerable#uniq_map`, `Enumerable::Lazy#uniq_map`, `Array#uniq_map` and `Array#uniq_map!`. TL;DR: It's a drop in replacement for `.map { ... }.uniq`, with (hopefully) better performance. I've quite often had to map over an array and get its unique elements. It occurred to me when doing so recently that Ruby doesn't have a short form method for doing that, similar to how `.flat_map { ... }` replaces `.map { ... }.flatten` and `.filter_map { ... }` replaces `.map { ... }.compact` (with minor differences). I think these new methods could be beneficial both in terms of better performance and writing more succinct code. I've got a draft PR up with some initial benchmarks in the description: https://github.com/ruby/ruby/pull/10269. -- https://bugs.ruby-lang.org/
participants (4)
-
austin (Austin Ziegler)
-
joshuay03 (Joshua Young)
-
mame (Yusuke Endoh)
-
rubyFeedback (robert heiler)