
Issue #21518 has been updated by mrkn (Kenta Murata).

Hi. I'm the creator of the enumerable-statistics gem and the original proposer of `Array#sum` and `Enumerable#sum`.

In general, adding only `mean` (I prefer `mean` over `average`, see below) and `median` won't cover real-world statistical needs. When a sample mean is required, variance or standard deviation usually follows; where a sample median is used, quantiles or percentiles typically follow. Truly “median-only” scenarios are rare in my experience.

If these are added to core, we should set a high bar: numerically stable, one-pass algorithms with a C implementation for performance; and for median/percentile computations, avoid a full sort in favor of selection algorithms such as quickselect.

The enumerable-statistics gem already provides simple one-pass combined methods such as `mean_variance` and `mean_stdev`. `median` and `percentile` for Enumerable remain to be implemented.

On naming: I strongly prefer `mean` over `average` for consistency with other programming languages and libraries (cf. #18057 note-8). Across Python/NumPy/Pandas, R, Julia, MATLAB, and so on, `mean` is the standard term and API name. Aligning with that convention keeps Ruby familiar to users who work across stacks (acknowledging that a few general-purpose APIs, e.g., LINQ, use `average`).

----------------------------------------
Feature #21518: Statistical helpers to `Enumerable`
https://bugs.ruby-lang.org/issues/21518#change-114363

* Author: Amitleshed (Amit Leshed)
* Status: Open
----------------------------------------
**Summary**

I'd like to add two statistical helpers to `Enumerable`:

- `Enumerable#average` (arithmetic mean)
- `Enumerable#median`

Both are small, well-defined operations that many Rubyists re-implement in apps and gems. Providing them in core avoids repeated, ad-hoc code and aligns with `Enumerable#sum`, which Ruby already ships.

**Motivation**

- These are among the most common “roll-your-own” helpers for arrays/ranges of numbers.
- They are conceptually simple and universally useful beyond web/Rails.
- Similar to `sum`, they’re primitives for quick data analysis, ETL scripts, CLI tooling, etc.
- Including them encourages consistent semantics (what to do with empty sets, mixed numerics, etc.).

## Proposed API & Semantics

```ruby
Enumerable#average -> Float or nil
Enumerable#median  -> Numeric or nil
```

```ruby
[1, 2, 3, 4].average  # => 2.5
(1..4).average        # => 2.5
[].average            # => nil

[1, 3, 2].median      # => 2
[1, 2, 3, 10].median  # => 2.5
(1..6).median         # => 3.5
[].median             # => nil
```

Ruby implementation

```ruby
module Enumerable
  # Arithmetic mean as a Float, or nil for an empty collection.
  def average
    count = 0
    total = 0.0

    each do |x|
      raise TypeError, "non-numeric value for average" unless x.is_a?(Numeric)
      total += x
      count += 1
    end

    count.zero? ? nil : total / count
  end

  # Middle element (odd size) or mean of the two middle elements (even size).
  def median
    arr = to_a
    return nil if arr.empty?

    arr.each { |x| raise TypeError, "non-numeric value for median" unless x.is_a?(Numeric) }

    # Sort a copy: Array#to_a returns self, so sort! would reorder the receiver.
    arr = arr.sort
    mid = arr.length / 2

    arr.length.odd? ? arr[mid] : (arr[mid - 1] + arr[mid]) / 2.0
  end
end
```

**Upon approval I'm more than willing to implement spec and code in C.**

-- 
https://bugs.ruby-lang.org/
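
As an illustration of the two techniques named in the comment above (a numerically stable one-pass mean/variance and a selection-based median), here is a minimal Ruby sketch. It uses Welford's update for the mean and sample variance and a quickselect with a random pivot for the median. The module name `StatsSketch` and its method names are hypothetical and chosen only for this example; this is not the enumerable-statistics implementation and not a proposed core API, and it skips the type checks from the proposal.

```ruby
# Hypothetical helper module for illustration only (not a real library API).
module StatsSketch
  module_function

  # Welford's one-pass update. Returns [mean, sample_variance];
  # the variance is nil when fewer than two values are seen.
  def mean_variance(enum)
    n = 0
    mean = 0.0
    m2 = 0.0 # running sum of squared deviations from the current mean
    enum.each do |x|
      n += 1
      delta = x - mean
      mean += delta / n
      m2 += delta * (x - mean)
    end
    return [nil, nil] if n.zero?

    [mean, n > 1 ? m2 / (n - 1) : nil]
  end

  # Quickselect: k-th smallest value (0-based) without sorting everything.
  # Works on a copy, so the caller's collection is left untouched.
  def select_kth(values, k)
    arr = values.to_a.dup
    lo = 0
    hi = arr.length - 1
    loop do
      return arr[lo] if lo == hi

      pivot = arr[rand(lo..hi)]
      i = lo
      j = hi
      # Hoare-style partition around the pivot value.
      while i <= j
        i += 1 while arr[i] < pivot
        j -= 1 while arr[j] > pivot
        next if i > j

        arr[i], arr[j] = arr[j], arr[i]
        i += 1
        j -= 1
      end
      if k <= j
        hi = j
      elsif k >= i
        lo = i
      else
        return arr[k] # k sits between the two partitions, so it equals the pivot
      end
    end
  end

  # Median via selection; for an even count it averages the two middle values.
  def median(values)
    arr = values.to_a
    return nil if arr.empty?

    mid = arr.length / 2
    upper = select_kth(arr, mid)
    return upper if arr.length.odd?

    (select_kth(arr, mid - 1) + upper) / 2.0
  end
end

StatsSketch.mean_variance([1, 2, 3, 4]).first # => 2.5
StatsSketch.median([1, 3, 2])                 # => 2
StatsSketch.median([1, 2, 3, 10])             # => 2.5
StatsSketch.median([])                        # => nil
```

For an even-sized input this sketch runs the selection twice for simplicity; a C implementation along the lines discussed above would likely select both middle elements in a single pass.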