mask_invalid

mask_invalid(arr, invalid_values, fill_value=None)

Mask every occurrence of the listed invalid sentinel values in an array.

Parameters:
  • arr (numpy.ndarray) – Input array.

  • invalid_values (sequence) – List of numeric codes to mask.

  • fill_value (float, optional) – Value for masked positions (default NaN).

Returns:

Masked array.

Return type:

numpy.ndarray

Overview

mask_invalid masks a list of sentinel (invalid) numeric codes in an input array (1D–4D). Masked positions are replaced by NaN (default) or a user‑provided fill_value. The function returns a float64 NumPy array (even if the input dtype was integer) to preserve NaN semantics for downstream processing.
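The behavior described above can be approximated in plain NumPy. The following is a minimal sketch under stated assumptions, not the library's actual implementation: the name mask_invalid_sketch is hypothetical, and the per-code loop is just one way to obtain the documented result (float64 output, exact equality, input left untouched).

```python
import numpy as np


def mask_invalid_sketch(arr, invalid_values, fill_value=None):
    """Hypothetical NumPy-only stand-in mirroring the documented semantics."""
    # astype() returns a new float64 array, so the caller's input is not modified.
    out = np.asarray(arr).astype(np.float64)
    fill = np.nan if fill_value is None else float(fill_value)

    # Build one boolean mask by testing each code with exact equality.
    mask = np.zeros(out.shape, dtype=bool)
    for code in invalid_values:
        mask |= out == code

    out[mask] = fill
    return out


arr = np.array([0, 1, -9999, 2], dtype=np.int32)
out = mask_invalid_sketch(arr, [0, -9999])
print(out.dtype, out)
```

Casting before comparing keeps the sketch short; for integer inputs within the exactly representable float64 range this gives the same matches as comparing in the original dtype.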

Typical Use Cases

  • Removing sensor error flags (e.g., 0 or -9999).

  • Normalizing heterogeneous invalid code conventions before temporal statistics.

  • Preparing arrays for compositing by stripping placeholder values.

Parameters

  • arr : numpy.ndarray Input array with one to four dimensions (1D, 2D, 3D, or 4D), for example (time, …). Any numeric dtype.

  • invalid_values : Sequence[float] Iterable of numeric codes to treat as invalid (exact equality).

  • fill_value : float, optional Replacement for invalid positions. Defaults to NaN when omitted.

Returns

numpy.ndarray (float64) Array matching input shape with invalid codes replaced by NaN or the specified fill_value.

Behavior Notes

  • Equality test is exact; supply the precise codes used in your dataset.

  • Output is always float64 (internal coercion) to guarantee NaN support.

  • No in-place modification of the original input array occurs.
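Two of these notes are easy to trip over in practice, so a short illustration may help. It uses only standard NumPy and says nothing about how mask_invalid is implemented internally: it shows why exact equality can miss a float that is merely close to a code, and why an integer array has to be converted to float before it can hold NaN.

```python
import numpy as np

# Exact equality: 0.1 + 0.2 is not bit-identical to 0.3 in floating point.
values = np.array([0.1 + 0.2, 0.3, -9999.0])
exact = values == 0.3            # only the literal 0.3 matches
near = np.isclose(values, 0.3)   # tolerance-based comparison matches both

# Integer dtypes cannot represent NaN, so a float copy is needed first.
ints = np.array([0, 7], dtype=np.int16)
as_float = ints.astype(np.float64)  # new array; `ints` is unchanged
as_float[0] = np.nan

print(exact, near, as_float)
```

If your data stores sentinels as computed floats rather than literal codes, a tolerance-based pre-processing step of your own may be needed before calling a function that matches exactly.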

Examples

Basic masking (default NaN):

import numpy as np
from eo_processor import mask_invalid

arr = np.array([0, 1, -9999, 2], dtype=np.int32)
out = mask_invalid(arr, invalid_values=[0, -9999])
# out -> [nan, 1., nan, 2.]

Using a custom fill value:

out = mask_invalid(arr, invalid_values=[0], fill_value=-1.0)
# out -> [-1., 1., -9999., 2.]  (only 0 masked)

3D array (time, y, x):

cube = np.array([
    [[0, 5], [6, -9999]],
    [[1, 0], [3, 4]],
])
cleaned = mask_invalid(cube, invalid_values=[0, -9999])
# All 0 and -9999 entries become NaN; shape preserved.

Edge Cases

  • If invalid_values is empty, the function returns a float64 copy of the input.

  • Very large lists of invalid codes are handled by iterating per element; consider pre-filtering if performance becomes critical.

  • Supplying values not present in the array yields a simple float64 copy.
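As one possible form of the pre-filtering mentioned above, duplicate codes can be removed before the list is passed on. This is a sketch with a made-up code list; whether it helps measurably depends on how many duplicates your list contains.

```python
import numpy as np

# Hypothetical code list assembled from several sensors, with repeats.
raw_codes = [0, -9999, 0, 65535, -9999, 0]

# np.unique sorts and de-duplicates; tolist() yields plain Python ints.
codes = np.unique(raw_codes).tolist()
print(codes)
```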

Integration Tips

  • Chain with replace_nans(arr, value) if you need a consistent sentinel after masking rather than NaN.

  • Apply before temporal aggregations (temporal_mean, median, etc.) so that, with skip_na=True, masked positions are excluded from the statistics.

  • Combine with mask_vals when you need both equality masking and NaN normalization logic (e.g., replacing NaNs with a sentinel after masking).
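To illustrate why masking before aggregation matters, the sketch below uses plain NumPy as a stand-in: np.isin and np.where play the role of the masking step, and np.nanmean plays the role of a NaN-skipping temporal mean. The signatures of the library's own aggregation functions are not assumed here.

```python
import numpy as np

# (time, y, x) cube with 0 and -9999 used as invalid codes.
cube = np.array([
    [[0, 5], [6, -9999]],
    [[1, 0], [3, 4]],
], dtype=np.float64)

# Stand-in for the masking step: invalid codes become NaN.
masked = np.where(np.isin(cube, [0, -9999]), np.nan, cube)

# NaN-aware mean over the time axis ignores the masked positions.
mean_over_time = np.nanmean(masked, axis=0)
print(mean_over_time)
# [[1.  5. ]
#  [4.5 4. ]]
```

Without the masking step, the -9999 entry alone would pull the corresponding pixel's mean to roughly -4997.5.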

Performance

For typical EO arrays (millions of pixels), masking runs in a single pass and allocates one output array. There is no parallel overhead for small arrays; performance scales primarily with total element count.

Version & Stability

  • Numeric coercion and masking semantics are stable across patch releases.

  • Adding new masking functions will trigger a minor version bump per repository governance.
