moving_average_temporal_stride

moving_average_temporal_stride(arr, window, stride, skip_na=True, mode='same')[source]

Stride-based sliding window mean along leading time axis.

Computes moving_average_temporal then samples every stride steps along the time axis to reduce temporal resolution.

Parameters:
  • arr (numpy.ndarray) – Time-first array (T,…).

  • window (int) – Window size (>=1).

  • stride (int) – Sampling interval along output time axis (>=1).

  • skip_na (bool, default True) – Exclude NaNs from window mean; if all NaN -> NaN.

  • mode ({"same","valid"}, default "same") – “same”: output length equals T (variable-size edges). “valid”: only full windows; base length = T - window + 1.

Returns:

Downsampled moving average with time dimension approximately ceil(base_length / stride).

Return type:

numpy.ndarray

Overview

moving_average_temporal_stride performs two steps along the leading time axis of a 1D–4D array:

  1. Computes a moving (sliding) average using the same prefix‑sum strategy and modes as moving_average_temporal().

  2. Downsamples (subsamples) the resulting smoothed time series by taking every stride-th window center.

This provides temporal smoothing plus controlled temporal resolution reduction for large Earth Observation (EO) data cubes with deep time stacks (e.g. daily observations aggregated into weekly or monthly series).

Supported input ranks (time-first): - 1D: (time,) - 2D: (time, feature) - 3D: (time, y, x) - 4D: (time, band, y, x)

Formula

Given a time series \(x_t\), first define the moving average (same or valid window) \(\mathrm{MA}_t\) (see moving_average_temporal). Then the strided output samples:

\[\mathrm{MA}^{\text{stride}}_k = \mathrm{MA}_{k \cdot s}\]

Where: - \(s\) is the stride (sampling interval, integer >= 1). - \(k\) enumerates output indices until the end of the (base) moving-average array. - In “same” mode base length is \(T\); in “valid” mode base length is \(T - W + 1\) (W = window size). - Final output length is \(\lceil \text{base\_length} / s \rceil\).

Window Modes

Identical to moving_average_temporal(): - mode="same": edge windows shrink to available range; base output length = original time length. - mode="valid": only full windows; base output length = T - window + 1; requires window <= T.

Stride Semantics

stride picks elements by index in the base moving-average result: - Indices selected: 0, stride, 2*stride, ... until the last index < base_length. - This is subsampling, not averaging multiple windows together (no further smoothing). - For aggregated temporal scales (e.g. daily → weekly), choose window to reflect smoothing horizon and stride to reflect output interval.

NaN Handling

Inherited from the underlying moving average: - skip_na=True: NaNs excluded per window; all-NaN window → NaN result. - skip_na=False: Any NaN within a window propagates NaN for that sampled position. Stride does NOT alter NaN semantics; it merely reduces the number of retained time points.

Parameters

arrnumpy.ndarray

Time-first array (1D–4D).

windowint

Moving average window size (>= 1).

strideint

Sampling interval along the (smoothed) time axis (>= 1).

skip_nabool

Whether to exclude NaNs in window averaging.

mode{"same","valid"}

Moving average edge handling.

Returns

numpy.ndarray

Strided, smoothed array in float64 with reduced leading time dimension (all other dimensions unchanged).

Edge Cases

  • window=1: base smoothing equals input; strided result is simple time subsampling.

  • stride=1: identical to full moving average result.

  • window > T with mode="valid": raises ValueError.

  • stride > base_length: output length = 1 (first element only).

  • All windows NaN under skip_na=True: entire output NaN.

Complexity & Performance

  • Complexity: O(T * S) where S is spatial/band feature count (same as moving average) + O(S * (base_length / stride)) for subsampling storage.

  • Memory: Prefix arrays per series (three vectors of length T); output size reduced by roughly factor stride.

  • Large speedups relative to naïve Python O(T * W) implementations arise from prefix sums and parallelization over spatial columns.

Representative Benchmarks (Indicative)

Single-run (macOS ARM64, Python 3.11, release build, float64):

Strided Moving Average Benchmarks

Shape (T, Y, X)

Window

Stride

Mode

Rust (s)

Naive Python (s)

Speedup

Output T’

(96, 1024, 1024)

7

4

same

0.74

6.02

8.14x

24

(96, 1024, 1024)

7

8

same

0.41

6.02

14.68x

12

(Replace with actual scripted runs for documentation updates; see benchmark harness.)

Example (1D)

import numpy as np
from eo_processor import moving_average_temporal_stride

series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
out = moving_average_temporal_stride(series, window=3, stride=2, mode="same")
print(out)  # length ceil(6/2)=3

Example (3D Cube Downsampling)

import numpy as np
from eo_processor import moving_average_temporal_stride

cube = np.random.rand(48, 512, 512)
# Smooth with window=5 and reduce temporal resolution by factor ~4
down = moving_average_temporal_stride(cube, window=5, stride=4, mode="same")
print(down.shape)  # (12, 512, 512)

Example (Valid Mode)

import numpy as np
from eo_processor import moving_average_temporal_stride

cube = np.random.rand(20, 64, 64)
out_valid = moving_average_temporal_stride(cube, window=5, stride=3, mode="valid")
# base_length = 20 - 5 + 1 = 16; output length ceil(16/3)=6
print(out_valid.shape)  # (6, 64, 64)

Chaining With Pixelwise Transform

from eo_processor import moving_average_temporal_stride, pixelwise_transform
import numpy as np

cube = np.random.rand(96, 256, 256)
smoothed = moving_average_temporal_stride(cube, window=7, stride=4)
scaled = pixelwise_transform(smoothed, scale=1.15, offset=-0.05, clamp_min=0.0, clamp_max=1.0)

Dask / XArray Integration

import dask.array as da, xarray as xr
from eo_processor import moving_average_temporal_stride

cube = xr.DataArray(
    da.random.random((96, 2048, 2048), chunks=(8, 256, 256)),
    dims=["time", "y", "x"],
)
down = xr.apply_ufunc(
    lambda block: moving_average_temporal_stride(block, window=7, stride=4, mode="same"),
    cube,
    input_core_dims=[["time"]],
    output_core_dims=[["time"]],
    dask="parallelized",
    vectorize=False,
    output_dtypes=[float],
)
result = down.compute()

Fairness & Baselines

Speedups reported should specify baseline: - Naive Python (loop) vs Rust: large factors due to algorithmic complexity difference (O(T*W) vs O(T)). - Optimized rolling (e.g., pandas, bottleneck) may reduce gap; document chosen baseline explicitly.

Performance Claim Template

Benchmark:
Shape: (96, 1024, 1024), window=7, stride=4, mode="same"
Naive Python: 6.02s
Rust moving_average_temporal_stride: 0.74s
Speedup: 8.14x
Methodology: single run, time.perf_counter(), float64 arrays
Validation: np.allclose(rust_resampled, naive_full[::stride], atol=1e-12)

Guidance

Use when: - Reducing data volume for downstream ML/trend analysis. - Combining smoothing + decimation in a single pass to avoid intermediate large arrays. - Preparing seasonal or multi-week composites from daily stacks.

Limitations

  • Does not average between sampled points (no anti-alias filtering beyond original window).

  • No weighted windows (future extension).

  • Very large time dimensions still allocate prefix arrays (consider external chunking).

Future Extensions

Potential enhancements: - Weighted / exponential moving averages with stride. - Adaptive stride selection (e.g., based on variance reduction targets). - In-place streaming variant to reduce prefix memory for extremely large T.

End of moving_average_temporal_stride reference.