temporal_mean

temporal_mean(arr, skip_na=True)

Compute the mean along the leading time axis of a 1D–4D time‑first array.

Parameters:
  • arr (numpy.ndarray) – Time‑first array (1D–4D). Shapes: (T,), (T, F), (T, Y, X), (T, B, Y, X).

  • skip_na (bool, default True) – If True, NaNs are excluded per pixel/band; all‑NaN series produce NaN. If False, any NaN in a series propagates NaN to the output position.

Returns:

Mean with time axis removed; float64 dtype. Scalar for 1D input.

Return type:

numpy.ndarray or float

Overview

temporal_mean computes the arithmetic mean along the leading (time) axis of an array. It supports 1D, 2D, 3D, and 4D inputs with shapes interpreted as:

  • 1D: (time,) → result is a scalar (float64)

  • 2D: (time, feature) → result shape (feature,)

  • 3D: (time, y, x) → result shape (y, x)

  • 4D: (time, band, y, x) → result shape (band, y, x)
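
The 2D and 4D layouts, which have no dedicated example section below, reduce the same way; a minimal sketch using the documented shapes:

import numpy as np
from eo_processor import temporal_mean

features = np.random.rand(10, 5)          # (time, feature)
print(temporal_mean(features).shape)      # (5,)

stack = np.random.rand(12, 4, 128, 128)   # (time, band, y, x)
print(temporal_mean(stack).shape)         # (4, 128, 128)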

NaN Handling

The parameter skip_na controls how NaNs are treated:

  • skip_na=True (default): NaNs are excluded from the mean calculation for each pixel location. If all values at that location are NaN, the output is set to NaN.

  • skip_na=False: Any NaN in the series forces the output at that location to NaN.

Internal Behavior

  • Inputs of any numeric dtype (int/uint/float) are coerced to float64 once in the Rust layer.

  • Parallelization (Rayon) is applied for 3D/4D arrays over spatial indices for improved throughput on large grids.

  • For 1D and 2D arrays, simple sequential Rust loops avoid unnecessary parallel overhead.
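
In terms of observable semantics (float64 coercion, reduction along axis 0, the NaN rules above), the result is intended to match a plain NumPy reference. Below is a minimal sketch of such a reference; the helper name temporal_mean_reference is illustrative, not part of the library:

import warnings
import numpy as np

def temporal_mean_reference(arr, skip_na=True):
    # Reference semantics only: coerce to float64, reduce along the leading (time) axis.
    data = np.asarray(arr, dtype=np.float64)
    if skip_na:
        with warnings.catch_warnings():
            # All-NaN series trigger a "Mean of empty slice" RuntimeWarning; the result is still NaN.
            warnings.simplefilter("ignore", category=RuntimeWarning)
            return np.nanmean(data, axis=0)
    return np.mean(data, axis=0)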

Parameters

See the generated signature above; primary arguments:

arr : numpy.ndarray

Input time-first array (1D–4D).

skip_na : bool, default True

Whether to ignore NaNs when computing the mean.

Returns

numpy.ndarray or float

Mean with time axis removed. Scalar for 1D input; array otherwise. Output dtype is always float64.

Edge Cases

  • All-NaN time series at a location → output NaN (when skip_na=True).

  • An empty time axis (length 0) is not supported and will raise an error upstream.

  • Single valid value among NaNs → output that value (with skip_na=True).
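
A short illustration of the first and third cases (documented behavior, shown with a 2D time-first array):

import numpy as np
from eo_processor import temporal_mean

arr = np.array([[np.nan, 7.0],
                [np.nan, np.nan],
                [np.nan, np.nan]])   # shape (time, feature)

out = temporal_mean(arr, skip_na=True)
# Column 0 is all NaN           -> nan
# Column 1 has one valid value  -> 7.0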

Example (1D)

import numpy as np
from eo_processor import temporal_mean

ts = np.array([1.0, 2.0, np.nan, 5.0])
m = temporal_mean(ts)              # skip_na=True by default
print(m)  # (1 + 2 + 5) / 3 = 8/3 ≈ 2.6666667

Example (3D)

import numpy as np
from eo_processor import temporal_mean

cube = np.random.rand(12, 256, 256)         # (time, y, x)
mean_img = temporal_mean(cube)              # shape (256, 256)

Example (NaN Propagation)

import numpy as np
from eo_processor import temporal_mean

arr = np.array([[1.0, np.nan],
                [3.0, 4.0],
                [np.nan, 2.0]])  # shape (time, feature)
# Column 0: [1.0, 3.0, NaN] -> mean = 2.0
# Column 1: [NaN, 4.0, 2.0] -> mean = 3.0
out_skip = temporal_mean(arr, skip_na=True)        # array([2.0, 3.0])
out_propagate = temporal_mean(arr, skip_na=False)  # NaNs propagate -> array([nan, nan])

Performance Notes

  • For large 3D/4D arrays, per-pixel series extraction and mean accumulation run in native Rust without Python GIL contention.

  • Parallel iteration threshold heuristics avoid excessive overhead on small arrays.
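
Assuming, as noted above, that the kernel runs without holding the GIL, tile-level thread parallelism can also be driven from Python. A minimal sketch; the tile size and worker count are illustrative, not tuned values:

import numpy as np
from concurrent.futures import ThreadPoolExecutor
from eo_processor import temporal_mean

cube = np.random.rand(24, 1024, 1024)                          # (time, y, x)
tiles = [cube[:, i:i + 256, :] for i in range(0, 1024, 256)]   # split along y

# Threads only help here because the Rust kernel releases the GIL while computing.
with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(temporal_mean, tiles))

mean_img = np.concatenate(parts, axis=0)                       # shape (1024, 1024)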

Performance (Representative Benchmarks)

Unlike some other reducers (e.g., median), temporal_mean does not outperform NumPy on every shape. For moderate array sizes, NumPy's highly optimized C loops and contiguous memory access can match or exceed the Rust implementation, especially when the parallel thresholds intentionally avoid spawning threads to reduce overhead.

Single-run measurements (macOS ARM64, CPython 3.10, release build, time.perf_counter(), float64 data, warm cache):

Temporal Mean Benchmark (Single Run)

Shape (time, y, x)   Description   Rust (s)   NumPy (s)   Rust Throughput (M elems/s)   NumPy Throughput (M elems/s)   Speedup
(24, 1024, 1024)     Medium cube   0.276      0.078       91.17                         322.63                        0.28x
(24, 2000, 2000)     Large cube    0.891      0.474       107.77                        202.53                        0.53x

Interpretation:

  • Current parallel heuristics favor avoiding overhead on medium-sized grids; NumPy can be faster there.

  • For substantially larger spatial domains, or when integrating with other Rust kernels in a pipeline, total end-to-end throughput may still benefit from uniform Rust execution.

  • Further tuning (e.g., adjusting parallel thresholds or adding adaptive chunking) can improve relative performance; such changes will be documented with updated benchmarks.

Reproduction Snippet:

.. code-block:: python

   import numpy as np
   import time

   from eo_processor import temporal_mean

   cube = np.random.rand(24, 2000, 2000)

   t0 = time.perf_counter()
   rust_out = temporal_mean(cube)
   rust_t = time.perf_counter() - t0

   t0 = time.perf_counter()
   numpy_out = np.nanmean(cube, axis=0)
   numpy_t = time.perf_counter() - t0

   print(f"Rust {rust_t:.3f}s vs NumPy {numpy_t:.3f}s speedup {numpy_t/rust_t:.2f}x")
   assert np.allclose(rust_out, numpy_out, atol=1e-12)

Performance Claim Template:

.. code-block:: text

   Benchmark:
     Shape: (24, 2000, 2000)
     NumPy nanmean: 0.474s
     Rust temporal_mean: 0.891s
     Speedup: 0.53x (NumPy faster for this shape)
   Methodology: single run, time.perf_counter(), float64 arrays
   Validation: np.allclose(…, atol=1e-12)

Guidance:

  • Benchmark on your target workload (time axis length, spatial size) before selecting an implementation.

  • If most of your pipeline already uses Rust-accelerated functions, keeping the mean in Rust can simplify threading and reduce Python call overhead in aggregate.

  • Consider external chunking (Dask/XArray) to exploit multi-worker parallelism; each worker invokes the Rust kernel without holding the GIL (see the sketch below).
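
A minimal Dask sketch of the external-chunking approach; the chunk sizes are illustrative, and each chunk must span the full time axis so every block reduction is self-contained:

import numpy as np
import dask.array as da
from eo_processor import temporal_mean

cube = da.random.random((24, 4096, 4096), chunks=(24, 1024, 1024))   # full time axis per chunk

# map_blocks applies the kernel per block; drop_axis=0 removes the time axis from each block's output.
mean_img = da.map_blocks(temporal_mean, cube, drop_axis=0, dtype=np.float64)

result = mean_img.compute()   # NumPy array of shape (4096, 4096)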

Numerical Stability

  • All intermediate sums use float64.

  • No special epsilon handling is required for mean (unlike normalized difference denominators).

When to Use

Use temporal_mean to reduce a temporal stack to a representative average surface or per-band summary, especially prior to:

  • Change detection baselining (see the sketch below)

  • Feature engineering for machine learning

  • Seasonal or annual aggregation steps
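
For example, a simple change-detection baseline can be built by subtracting the temporal mean from every time step; a minimal sketch:

import numpy as np
from eo_processor import temporal_mean

cube = np.random.rand(36, 512, 512)   # (time, y, x), e.g. three years of monthly observations
baseline = temporal_mean(cube)        # (512, 512) mean surface used as the baseline
anomalies = cube - baseline           # broadcasts over time: per-step deviation from the baseline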

Limitations

  • Does not provide a weighted mean; for weighting logic you would need a custom wrapper (see the sketch after this list).

  • Does not internally chunk extremely large time dimensions; consider external chunking (e.g., Dask) for memory management.
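
A possible NaN-aware weighted-mean wrapper in plain NumPy; the helper weighted_temporal_mean is illustrative and not part of the library API:

import numpy as np

def weighted_temporal_mean(arr, weights):
    # Illustrative helper: weighted mean along the leading time axis, skipping NaNs.
    data = np.asarray(arr, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64).reshape((-1,) + (1,) * (data.ndim - 1))
    valid = ~np.isnan(data)
    weighted_sum = np.nansum(data * w, axis=0)   # NaNs contribute zero
    weight_total = np.sum(w * valid, axis=0)     # total weight of valid samples only
    with np.errstate(invalid="ignore", divide="ignore"):
        return weighted_sum / weight_total       # all-NaN (zero-weight) series -> nan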

End of temporal_mean reference.