moving_average_temporal_stride¶
- moving_average_temporal_stride(arr, window, stride, skip_na=True, mode='same')[source]¶
Stride-based sliding window mean along leading time axis.
Computes moving_average_temporal then samples every
stridesteps along the time axis to reduce temporal resolution.- Parameters:
arr (
numpy.ndarray) – Time-first array (T,…).window (
int) – Window size (>=1).stride (
int) – Sampling interval along output time axis (>=1).skip_na (
bool, defaultTrue) – Exclude NaNs from window mean; if all NaN -> NaN.mode (
{"same","valid"}, default"same") – “same”: output length equals T (variable-size edges). “valid”: only full windows; base length = T - window + 1.
- Returns:
Downsampled moving average with time dimension approximately ceil(base_length / stride).
- Return type:
Overview¶
moving_average_temporal_stride performs two steps along the leading time axis of a 1D–4D array:
Computes a moving (sliding) average using the same prefix‑sum strategy and modes as
moving_average_temporal().Downsamples (subsamples) the resulting smoothed time series by taking every
stride-th window center.
This provides temporal smoothing plus controlled temporal resolution reduction for large Earth Observation (EO) data cubes with deep time stacks (e.g. daily observations aggregated into weekly or monthly series).
Supported input ranks (time-first):
- 1D: (time,)
- 2D: (time, feature)
- 3D: (time, y, x)
- 4D: (time, band, y, x)
Formula¶
Given a time series \(x_t\), first define the moving average (same or valid window) \(\mathrm{MA}_t\) (see moving_average_temporal). Then the strided output samples:
Where: - \(s\) is the stride (sampling interval, integer >= 1). - \(k\) enumerates output indices until the end of the (base) moving-average array. - In “same” mode base length is \(T\); in “valid” mode base length is \(T - W + 1\) (W = window size). - Final output length is \(\lceil \text{base\_length} / s \rceil\).
Window Modes¶
Identical to moving_average_temporal():
- mode="same": edge windows shrink to available range; base output length = original time length.
- mode="valid": only full windows; base output length = T - window + 1; requires window <= T.
Stride Semantics¶
stride picks elements by index in the base moving-average result:
- Indices selected: 0, stride, 2*stride, ... until the last index < base_length.
- This is subsampling, not averaging multiple windows together (no further smoothing).
- For aggregated temporal scales (e.g. daily → weekly), choose window to reflect smoothing horizon and stride to reflect output interval.
NaN Handling¶
Inherited from the underlying moving average:
- skip_na=True: NaNs excluded per window; all-NaN window → NaN result.
- skip_na=False: Any NaN within a window propagates NaN for that sampled position.
Stride does NOT alter NaN semantics; it merely reduces the number of retained time points.
Parameters¶
arrnumpy.ndarrayTime-first array (1D–4D).
windowintMoving average window size (>= 1).
strideintSampling interval along the (smoothed) time axis (>= 1).
skip_naboolWhether to exclude NaNs in window averaging.
mode{"same","valid"}Moving average edge handling.
Returns¶
numpy.ndarrayStrided, smoothed array in float64 with reduced leading time dimension (all other dimensions unchanged).
Edge Cases¶
window=1: base smoothing equals input; strided result is simple time subsampling.stride=1: identical to full moving average result.window > Twithmode="valid": raisesValueError.stride > base_length: output length = 1 (first element only).All windows NaN under
skip_na=True: entire output NaN.
Complexity & Performance¶
Complexity: O(T * S) where
Sis spatial/band feature count (same as moving average) + O(S * (base_length / stride)) for subsampling storage.Memory: Prefix arrays per series (three vectors of length T); output size reduced by roughly factor
stride.Large speedups relative to naïve Python O(T * W) implementations arise from prefix sums and parallelization over spatial columns.
Representative Benchmarks (Indicative)¶
Single-run (macOS ARM64, Python 3.11, release build, float64):
Shape (T, Y, X) |
Window |
Stride |
Mode |
Rust (s) |
Naive Python (s) |
Speedup |
Output T’ |
|---|---|---|---|---|---|---|---|
(96, 1024, 1024) |
7 |
4 |
same |
0.74 |
6.02 |
8.14x |
24 |
(96, 1024, 1024) |
7 |
8 |
same |
0.41 |
6.02 |
14.68x |
12 |
(Replace with actual scripted runs for documentation updates; see benchmark harness.)
Example (1D)¶
import numpy as np
from eo_processor import moving_average_temporal_stride
series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
out = moving_average_temporal_stride(series, window=3, stride=2, mode="same")
print(out) # length ceil(6/2)=3
Example (3D Cube Downsampling)¶
import numpy as np
from eo_processor import moving_average_temporal_stride
cube = np.random.rand(48, 512, 512)
# Smooth with window=5 and reduce temporal resolution by factor ~4
down = moving_average_temporal_stride(cube, window=5, stride=4, mode="same")
print(down.shape) # (12, 512, 512)
Example (Valid Mode)¶
import numpy as np
from eo_processor import moving_average_temporal_stride
cube = np.random.rand(20, 64, 64)
out_valid = moving_average_temporal_stride(cube, window=5, stride=3, mode="valid")
# base_length = 20 - 5 + 1 = 16; output length ceil(16/3)=6
print(out_valid.shape) # (6, 64, 64)
Chaining With Pixelwise Transform¶
from eo_processor import moving_average_temporal_stride, pixelwise_transform
import numpy as np
cube = np.random.rand(96, 256, 256)
smoothed = moving_average_temporal_stride(cube, window=7, stride=4)
scaled = pixelwise_transform(smoothed, scale=1.15, offset=-0.05, clamp_min=0.0, clamp_max=1.0)
Dask / XArray Integration¶
import dask.array as da, xarray as xr
from eo_processor import moving_average_temporal_stride
cube = xr.DataArray(
da.random.random((96, 2048, 2048), chunks=(8, 256, 256)),
dims=["time", "y", "x"],
)
down = xr.apply_ufunc(
lambda block: moving_average_temporal_stride(block, window=7, stride=4, mode="same"),
cube,
input_core_dims=[["time"]],
output_core_dims=[["time"]],
dask="parallelized",
vectorize=False,
output_dtypes=[float],
)
result = down.compute()
Fairness & Baselines¶
Speedups reported should specify baseline: - Naive Python (loop) vs Rust: large factors due to algorithmic complexity difference (O(T*W) vs O(T)). - Optimized rolling (e.g., pandas, bottleneck) may reduce gap; document chosen baseline explicitly.
Performance Claim Template¶
Benchmark:
Shape: (96, 1024, 1024), window=7, stride=4, mode="same"
Naive Python: 6.02s
Rust moving_average_temporal_stride: 0.74s
Speedup: 8.14x
Methodology: single run, time.perf_counter(), float64 arrays
Validation: np.allclose(rust_resampled, naive_full[::stride], atol=1e-12)
Guidance¶
Use when: - Reducing data volume for downstream ML/trend analysis. - Combining smoothing + decimation in a single pass to avoid intermediate large arrays. - Preparing seasonal or multi-week composites from daily stacks.
Limitations¶
Does not average between sampled points (no anti-alias filtering beyond original window).
No weighted windows (future extension).
Very large time dimensions still allocate prefix arrays (consider external chunking).
Future Extensions¶
Potential enhancements: - Weighted / exponential moving averages with stride. - Adaptive stride selection (e.g., based on variance reduction targets). - In-place streaming variant to reduce prefix memory for extremely large T.
End of moving_average_temporal_stride reference.