mask_invalid

mask_invalid(arr, invalid_values, fill_value=None)

Mask a list of common invalid sentinel values.

Parameters:

arr (numpy.ndarray) – Input array.
invalid_values (sequence) – List of numeric codes to mask.
fill_value (float, optional) – Value for masked positions (default NaN).

Returns:

Masked array.

Return type:

numpy.ndarray

Overview

mask_invalid masks a list of sentinel (invalid) numeric codes in an input array (1D–4D). Masked positions are replaced by NaN (default) or a user‑provided fill_value. The function returns a float64 NumPy array (even if the input dtype was integer) to preserve NaN semantics for downstream processing.
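
The behavior described above can be approximated in plain NumPy. This is a minimal sketch for illustration, not the library's implementation: the helper name mask_invalid_sketch is hypothetical, and np.isin is an assumed stand-in for the library's equality test.

```python
import numpy as np

def mask_invalid_sketch(arr, invalid_values, fill_value=None):
    """Hypothetical NumPy-only approximation of the documented behavior."""
    # Coerce to float64 so NaN is representable; astype returns a copy,
    # so the caller's array is left untouched.
    out = np.asarray(arr).astype(np.float64)
    # Fall back to NaN when no fill_value is supplied.
    fill = np.nan if fill_value is None else float(fill_value)
    # Exact-equality membership test against the listed codes.
    codes = np.asarray(list(invalid_values), dtype=np.float64)
    out[np.isin(out, codes)] = fill
    return out

arr = np.array([0, 1, -9999, 2], dtype=np.int32)
out = mask_invalid_sketch(arr, [0, -9999])
# out -> [nan, 1., nan, 2.] as float64; arr is still int32 and unchanged
```

The float64 coercion is what allows an integer input to carry NaN in the result.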

Typical Use Cases

Removing sensor error flags (e.g., 0 or -9999).
Normalizing heterogeneous invalid code conventions before temporal statistics.
Preparing arrays for compositing by stripping placeholder values.

Parameters

arr : numpy.ndarray Input array of shape (time, …), accepting 1D, 2D, 3D, or 4D layouts. Any numeric dtype.
invalid_values : Sequence[float] Iterable of numeric codes to treat as invalid (exact equality).
fill_value : float, optional Replacement for invalid positions. Defaults to NaN when omitted.

Returns

numpy.ndarray (float64) Array matching input shape with invalid codes replaced by NaN or the specified fill_value.

Behavior Notes

Equality test is exact; supply the precise codes used in your dataset.
Output is always float64 (internal coercion) to guarantee NaN support.
The input array is never modified in place; a new array is returned.
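
To make the exact-equality note concrete, the sketch below uses NumPy's own comparisons. It assumes the library's test behaves like np.isin, which is not confirmed by this page. np.isclose is shown only for contrast and is not part of mask_invalid.

```python
import numpy as np

# The first element is computed, so it differs from the literal 0.3
# in the last binary digit.
arr = np.array([0.1 + 0.2, 0.3, 1.0])

# Exact comparison: only the literal 0.3 matches.
exact = np.isin(arr, [0.3])
# exact -> [False, True, False]

# Tolerant comparison, for contrast only.
close = np.isclose(arr, 0.3)
# close -> [True, True, False]
```

Integer sentinels such as 0 or -9999 are not affected by this; the distinction matters mainly for floating-point codes produced by arithmetic or unit conversion.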

Examples

Basic masking (default NaN):

import numpy as np
from eo_processor import mask_invalid

arr = np.array([0, 1, -9999, 2], dtype=np.int32)
out = mask_invalid(arr, invalid_values=[0, -9999])
# out -> [nan, 1., nan, 2.]

Using a custom fill value:

out = mask_invalid(arr, invalid_values=[0], fill_value=-1.0)
# out -> [-1., 1., -9999., 2.]  (only 0 masked)

3D array (time, y, x):

cube = np.array([
    [[0, 5], [6, -9999]],
    [[1, 0], [3, 4]],
])
cleaned = mask_invalid(cube, invalid_values=[0, -9999])
# All 0 and -9999 entries become NaN; shape preserved.

Edge Cases

If invalid_values is empty, the function returns a float64 copy of the input.
Very large lists of invalid codes are handled by iterating per element; consider pre-filtering if performance becomes critical.
Supplying values not present in the array yields a simple float64 copy.
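
The empty-list and no-match cases can be checked with a NumPy stand-in. The helper approx_mask below is hypothetical and only approximates the documented masking step; it is not the library call.

```python
import numpy as np

arr = np.array([3, 7, 11], dtype=np.int16)

def approx_mask(a, codes):
    """Hypothetical stand-in: float64 copy with matching codes set to NaN."""
    out = a.astype(np.float64)
    out[np.isin(a, list(codes))] = np.nan
    return out

# Empty list of codes: nothing matches, so the result is a float64 copy.
empty = approx_mask(arr, [])

# Code that never occurs in the data: same outcome.
absent = approx_mask(arr, [-9999])
```

In both cases the values are unchanged and only the dtype differs from the input.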

Integration Tips

Chain with replace_nans(arr, value) if you need a consistent sentinel after masking rather than NaN.
Apply before temporal aggregations (temporal_mean, median, etc.) so that, with skip_na=True, invalid values are excluded from the statistics.
Combine with mask_vals when you need both equality masking and NaN normalization logic (e.g., replacing NaNs with a sentinel after masking).
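
The chaining pattern in these tips can be sketched with NumPy equivalents. The signatures of temporal_mean, replace_nans, and mask_vals are not shown on this page, so the example deliberately avoids calling them: np.where with np.isin stands in for mask_invalid, np.nanmean for a NaN-skipping temporal mean, and np.nan_to_num for replacing NaNs with a sentinel. All three substitutions are assumptions made for illustration.

```python
import numpy as np

# (time, y, x) cube with 0 and -9999 as invalid codes.
cube = np.array([
    [[0, 5], [6, -9999]],
    [[1, 0], [3, 4]],
])

# Stand-in for mask_invalid: float64 copy with invalid codes set to NaN.
masked = np.where(np.isin(cube, [0, -9999]), np.nan, cube.astype(np.float64))

# Stand-in for a NaN-skipping temporal mean over the time axis.
mean_over_time = np.nanmean(masked, axis=0)
# mean_over_time -> [[1., 5.], [4.5, 4.]]

# Stand-in for replace_nans: one consistent sentinel instead of NaN.
with_sentinel = np.nan_to_num(masked, nan=-1.0)
```

Without the masking step, a plain mean over time would average the -9999 placeholder into the result for the affected pixel.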

Performance

For typical EO arrays (millions of pixels), masking runs in a single pass and allocates one output array. There is no parallel overhead for small arrays; performance scales primarily with total element count.

Version & Stability

Numeric coercion and masking semantics are stable across patch releases.
Adding new masking functions will trigger a minor version bump per repository governance.
