temporal_mean¶
- temporal_mean(arr, skip_na=True)
Compute the mean along the leading time axis of a 1D–4D time‑first array.
- Parameters:
arr (numpy.ndarray) – Time-first array (1D–4D). Shapes: (T,), (T, F), (T, Y, X), (T, B, Y, X).
skip_na (bool, default True) – If True, NaNs are excluded per pixel/band; all-NaN series produce NaN. If False, any NaN in a series propagates NaN to the output position.
- Returns:
Mean with time axis removed; float64 dtype. Scalar for 1D input.
- Return type:
numpy.ndarray or float
Overview¶
temporal_mean computes the arithmetic mean along the leading (time) axis of an array.
It supports 1D, 2D, 3D, and 4D inputs with shapes interpreted as:
- 1D: (time,) → result is a scalar (float64)
- 2D: (time, feature) → result shape (feature,)
- 3D: (time, y, x) → result shape (y, x)
- 4D: (time, band, y, x) → result shape (band, y, x)
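The shape mapping above can be checked with a small sketch. It uses `np.nanmean(..., axis=0)` as a stand-in reducer so that it runs without `eo_processor` installed; the helper name `reference_mean` is hypothetical and is not part of the library.

```python
import numpy as np

def reference_mean(arr):
    # Stand-in for temporal_mean(arr): NaN-skipping mean over the leading axis.
    # This is NumPy, not the eo_processor implementation.
    return np.nanmean(np.asarray(arr, dtype=np.float64), axis=0)

inputs = {
    "1D": np.ones(6),             # (time,)
    "2D": np.ones((6, 3)),        # (time, feature)
    "3D": np.ones((6, 4, 5)),     # (time, y, x)
    "4D": np.ones((6, 2, 4, 5)),  # (time, band, y, x)
}

# The time axis is dropped; every other axis is preserved.
result_shapes = {name: np.shape(reference_mean(a)) for name, a in inputs.items()}
print(result_shapes)
# {'1D': (), '2D': (3,), '3D': (4, 5), '4D': (2, 4, 5)}
```

The empty shape `()` for the 1D case corresponds to the scalar result described above.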
NaN Handling¶
The parameter skip_na controls how NaNs are treated:
- skip_na=True (default): NaNs are excluded from the mean calculation for each pixel location. If all values at that location are NaN, the output is set to NaN.
- skip_na=False: Any NaN in the series forces the output at that location to NaN.
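As a rough NumPy reference for these two modes, the sketch below reproduces the documented semantics, including the all-NaN case. The function `reference_temporal_mean` is a hypothetical stand-in written for illustration; it makes no claim about how the Rust kernel is implemented.

```python
import numpy as np

def reference_temporal_mean(arr, skip_na=True):
    """Hypothetical NumPy stand-in for the documented skip_na semantics."""
    data = np.asarray(arr, dtype=np.float64)
    if not skip_na:
        # Plain mean: any NaN in a series yields NaN at that position.
        return data.mean(axis=0)
    valid = ~np.isnan(data)
    count = valid.sum(axis=0)                       # valid samples per position
    total = np.where(valid, data, 0.0).sum(axis=0)  # NaNs contribute nothing
    # 0 / 0 gives NaN for all-NaN series; silence the expected warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count

series = np.array([[1.0, np.nan, np.nan],
                   [3.0, 4.0, np.nan],
                   [np.nan, 2.0, np.nan]])  # (time, feature)

skipped = reference_temporal_mean(series, skip_na=True)      # [2.0, 3.0, nan]
propagated = reference_temporal_mean(series, skip_na=False)  # [nan, nan, nan]
```

The third column is entirely NaN, so it stays NaN in both modes.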
Internal Behavior¶
- Inputs of any numeric dtype (int/uint/float) are coerced to float64 once in the Rust layer.
- Parallelization (Rayon) is applied for 3D/4D arrays over spatial indices for improved throughput on large grids.
- For 1D and 2D arrays, simple sequential Rust loops avoid unnecessary parallel overhead.
Parameters¶
See the generated signature above; primary arguments:
- arr (numpy.ndarray) – Input time-first array (1D–4D).
- skip_na (bool, default True) – Whether to ignore NaNs when computing the mean.
Returns¶
numpy.ndarray or float – Mean with time axis removed. Scalar for 1D input; array otherwise. Output dtype is always float64.
Edge Cases¶
- All-NaN time series at a location → output NaN (when skip_na=True).
- Series length < 1 (empty input) is not supported and will raise an error upstream.
- Single valid value among NaNs → output that value (with skip_na=True).
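Because empty input is unsupported, callers may prefer to fail early with a clear message before the array reaches the native layer. The following is a minimal sketch of such a pre-flight check; `check_time_first` is a hypothetical helper, not part of `eo_processor`, and the exact error raised by the library itself is not specified here.

```python
import numpy as np

def check_time_first(arr):
    """Hypothetical pre-flight check for a time-first array (1D-4D, T >= 1)."""
    data = np.asarray(arr)
    if not 1 <= data.ndim <= 4:
        raise ValueError(f"expected a 1D-4D array, got {data.ndim}D")
    if data.shape[0] < 1:
        raise ValueError("time axis is empty; at least one time step is required")
    return data

ok = check_time_first(np.zeros((3, 4, 5)))  # passes through unchanged

try:
    check_time_first(np.empty((0, 4, 5)))   # zero-length time axis
except ValueError as exc:
    message = str(exc)
```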
Example (1D)¶
import numpy as np
from eo_processor import temporal_mean
ts = np.array([1.0, 2.0, np.nan, 5.0])
m = temporal_mean(ts) # skip_na=True by default
print(m) # (1 + 2 + 5) / 3 = 8/3 ≈ 2.6666667
Example (3D)¶
import numpy as np
from eo_processor import temporal_mean
cube = np.random.rand(12, 256, 256) # (time, y, x)
mean_img = temporal_mean(cube) # shape (256, 256)
Example (NaN Propagation)¶
import numpy as np
from eo_processor import temporal_mean
arr = np.array([[1.0, np.nan],
[3.0, 4.0],
[np.nan, 2.0]]) # shape (time, feature)
# Column 0: [1.0, 3.0, NaN] -> mean = 2.0
# Column 1: [NaN, 4.0, 2.0] -> mean = 3.0
out_skip = temporal_mean(arr, skip_na=True) # array([2.0, 3.0])
out_prop = temporal_mean(arr, skip_na=False) # each column contains a NaN -> array([nan, nan])
Performance Notes¶
For large 3D/4D arrays, per-pixel series extraction and mean accumulation run in native Rust without Python GIL contention.
Parallel iteration threshold heuristics avoid excessive overhead on small arrays.
Performance (Representative Benchmarks)¶
Unlike some other reducers (e.g., median), temporal_mean does not outperform NumPy on every shape. For moderate array sizes, NumPy’s highly optimized C loops and contiguous memory access can match or exceed the Rust implementation, especially when the parallel thresholds deliberately avoid spawning threads to reduce overhead.
Single-run measurements (macOS ARM64, CPython 3.10, release build, time.perf_counter(), float64 data, warm cache):
| Shape (time, y, x) | Description | Rust (s) | NumPy (s) | Rust Throughput (M elems/s) | NumPy Throughput (M elems/s) | Speedup |
|---|---|---|---|---|---|---|
| (24, 1024, 1024) | Medium cube | 0.276 | 0.078 | 91.17 | 322.63 | 0.28x |
| (24, 2000, 2000) | Large cube | 0.891 | 0.474 | 107.77 | 202.53 | 0.53x |
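The derived columns follow from the shape and the two timings: throughput is the element count divided by the elapsed seconds (in millions), and speedup is the NumPy time divided by the Rust time. The sketch below recomputes them from the published figures; the helper name `throughput_m_elems` is hypothetical. Results agree with the table to within a few hundredths, presumably because the published timings are rounded to three decimals.

```python
def throughput_m_elems(shape, seconds):
    """Millions of input elements processed per second."""
    n = 1
    for dim in shape:
        n *= dim
    return n / seconds / 1e6

# (shape, Rust seconds, NumPy seconds) as published in the table
rows = [
    ((24, 1024, 1024), 0.276, 0.078),
    ((24, 2000, 2000), 0.891, 0.474),
]

for shape, rust_s, numpy_s in rows:
    speedup = numpy_s / rust_s  # values below 1 mean NumPy was faster
    print(shape,
          f"rust={throughput_m_elems(shape, rust_s):.2f}",
          f"numpy={throughput_m_elems(shape, numpy_s):.2f}",
          f"speedup={speedup:.2f}x")
```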
Interpretation:
- Current parallel heuristics favor avoiding overhead on medium-sized grids; NumPy can be faster.
- For substantially larger spatial domains or when integrating with other Rust kernels in a pipeline, total end-to-end throughput may still benefit from uniform Rust execution.
- Further tuning (e.g., adjusting parallel thresholds or adding adaptive chunking) can improve relative performance; such changes will be documented with updated benchmarks.
Reproduction Snippet:
import time
import numpy as np
from eo_processor import temporal_mean
cube = np.random.rand(24, 2000, 2000)
t0 = time.perf_counter()
rust_out = temporal_mean(cube)
rust_t = time.perf_counter() - t0
t0 = time.perf_counter()
numpy_out = np.nanmean(cube, axis=0)
numpy_t = time.perf_counter() - t0
print(f"Rust {rust_t:.3f}s vs NumPy {numpy_t:.3f}s speedup {numpy_t/rust_t:.2f}x")
assert np.allclose(rust_out, numpy_out, atol=1e-12)
Performance Claim Template:
Benchmark:
Shape: (24, 2000, 2000)
NumPy nanmean: 0.474s
Rust temporal_mean: 0.891s
Speedup: 0.53x (NumPy faster for this shape)
Methodology: single run, time.perf_counter(), float64 arrays
Validation: np.allclose(…, atol=1e-12)
Guidance:
- Benchmark on your target workload (time axis length, spatial size) before selecting an implementation.
- If most of your pipeline already uses Rust-accelerated functions, keeping the mean in Rust can simplify threading and reduce Python call overhead in aggregate.
- Consider external chunking (Dask/XArray) to exploit multi-worker parallelism; each worker invokes the Rust kernel without holding the GIL.
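To illustrate the external-chunking idea without pulling in Dask, here is a minimal sketch that splits a (time, y, x) cube into spatial tiles and reduces each tile independently. The function `tiled_temporal_reduce` and the `stand_in` reducer are hypothetical; `np.nanmean` is used so the example runs on its own. In a real pipeline the reducer would be `temporal_mean`, assuming it accepts array views (wrapping each tile in `np.ascontiguousarray` is a safe fallback if it does not).

```python
import numpy as np

def tiled_temporal_reduce(cube, reducer, tile=512):
    """Apply `reducer` to (time, y, x) tiles and stitch the 2D results."""
    _, ny, nx = cube.shape
    out = np.empty((ny, nx), dtype=np.float64)
    for y0 in range(0, ny, tile):
        for x0 in range(0, nx, tile):
            # Each tile keeps the full time axis, so per-pixel means are exact.
            block = cube[:, y0:y0 + tile, x0:x0 + tile]
            out[y0:y0 + tile, x0:x0 + tile] = reducer(block)
    return out

def stand_in(block):
    # Placeholder for temporal_mean(block); NumPy equivalent of skip_na=True.
    return np.nanmean(block, axis=0)

cube = np.random.default_rng(0).random((6, 70, 50))
tiled = tiled_temporal_reduce(cube, stand_in, tile=32)
```

Because the mean at each pixel depends only on that pixel's time series, tiling the spatial axes does not change the result; a scheduler such as Dask could hand the same tiles to separate workers.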
Numerical Stability¶
- All intermediate sums use float64.
- No special epsilon handling is required for mean (unlike normalized difference denominators).
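One practical benefit of accumulating in float64 can be shown in plain NumPy: a sum kept in a narrow integer dtype wraps around silently, while coercing to float64 first does not. This is an illustration of the general effect with made-up sample values, not a description of the Rust kernel's internals.

```python
import numpy as np

stack = np.full((4, 2), 200, dtype=np.uint8)  # (time, feature)

# Accumulating in the input dtype wraps around: 4 * 200 = 800 -> 800 % 256 = 32.
wrapped_sum = stack.sum(axis=0, dtype=np.uint8)

# Coercing to float64 first keeps the sum exact for these values.
float_mean = stack.astype(np.float64).sum(axis=0) / stack.shape[0]

print(wrapped_sum)  # [32 32]
print(float_mean)   # [200. 200.]
```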
When to Use¶
Use temporal_mean to reduce a temporal stack to a representative average surface or per-band summary, especially prior to:
- Change detection baselining
- Feature engineering for machine learning
- Seasonal or annual aggregation steps
Limitations¶
Does not provide weighted mean; for weighting logic you would need a custom wrapper.
Does not internally chunk extremely large time dimensions; consider external chunking (e.g., Dask) for memory management.
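For the weighting limitation, a custom wrapper could look roughly like the sketch below. `weighted_temporal_mean` is a hypothetical pure-NumPy helper that takes one weight per time step and mirrors the skip_na semantics described earlier; it does not call the Rust kernel.

```python
import numpy as np

def weighted_temporal_mean(arr, weights, skip_na=True):
    """Hypothetical weighted mean over the leading time axis."""
    data = np.asarray(arr, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    if w.shape != (data.shape[0],):
        raise ValueError("weights must have one entry per time step")
    # Broadcast weights along the trailing axes: (T,) -> (T, 1, ..., 1).
    w = w.reshape((-1,) + (1,) * (data.ndim - 1))
    if skip_na:
        valid = ~np.isnan(data)
        num = np.where(valid, data * w, 0.0).sum(axis=0)
        den = np.where(valid, w, 0.0).sum(axis=0)  # only weights of valid samples
    else:
        num = (data * w).sum(axis=0)               # NaN propagates through the sum
        den = np.broadcast_to(w, data.shape).sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return num / den

arr = np.array([[1.0, np.nan],
                [3.0, 4.0]])  # (time, feature)
weights = [1.0, 3.0]

w_skip = weighted_temporal_mean(arr, weights)                 # [2.5, 4.0]
w_prop = weighted_temporal_mean(arr, weights, skip_na=False)  # [2.5, nan]
```

With skip_na=True the denominator only counts weights of valid samples, so a NaN at one time step does not bias the result toward zero.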
End of temporal_mean reference.