mask_invalid
============

.. currentmodule:: eo_processor

.. autofunction:: mask_invalid

Overview
--------

`mask_invalid` masks a list of sentinel (invalid) numeric codes in an input array (1D–4D).
Masked positions are replaced by `NaN` (default) or a user-provided `fill_value`.
The function returns a `float64` NumPy array (even if the input dtype was integer) to
preserve NaN semantics for downstream processing.

Typical Use Cases
-----------------

- Removing sensor error flags (e.g., 0 or -9999).
- Normalizing heterogeneous invalid code conventions before temporal statistics.
- Preparing arrays for compositing by stripping placeholder values.

Parameters
----------

- ``arr`` : numpy.ndarray
  Input array of shape (time, ...), accepting 1D, 2D, 3D, or 4D layouts. Any numeric dtype.
- ``invalid_values`` : Sequence[float]
  Iterable of numeric codes to treat as invalid (exact equality).
- ``fill_value`` : float, optional
  Replacement for invalid positions. Defaults to ``NaN`` when omitted.

Returns
-------

numpy.ndarray (float64)
  Array matching input shape with invalid codes replaced by NaN or the specified ``fill_value``.

Behavior Notes
--------------

- Equality test is exact; supply the precise codes used in your dataset.
- Output is always ``float64`` (internal coercion) to guarantee NaN support.
- No in-place modification of the original input array occurs.

Examples
--------

Basic masking (default NaN):

::

    import numpy as np
    from eo_processor import mask_invalid

    arr = np.array([0, 1, -9999, 2], dtype=np.int32)
    out = mask_invalid(arr, invalid_values=[0, -9999])
    # out -> [nan, 1., nan, 2.]

Using a custom fill value:

::

    out = mask_invalid(arr, invalid_values=[0], fill_value=-1.0)
    # out -> [-1., 1., -9999., 2.]  (only 0 masked)

3D array (time, y, x):

::

    cube = np.array([
        [[0, 5], [6, -9999]],
        [[1, 0], [3, 4]],
    ])
    cleaned = mask_invalid(cube, invalid_values=[0, -9999])
    # All 0 and -9999 entries become NaN; shape preserved.

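
The semantics described above (exact equality, ``float64`` output, no in-place modification) can be approximated in plain NumPy. The sketch below is not the library's implementation: ``mask_invalid_reference`` is a hypothetical helper that does not exist in ``eo_processor``. It assumes that comparing after coercion to ``float64`` matches the documented exact-equality test. It may be handy for cross-checking results on small arrays.

```python
import numpy as np


def mask_invalid_reference(arr, invalid_values, fill_value=np.nan):
    """Hypothetical NumPy-only approximation of the documented semantics."""
    # astype() copies by default, so the caller's array is never modified.
    out = np.asarray(arr).astype(np.float64)
    # Assumption: codes are compared after coercion to float64.
    codes = np.asarray(list(invalid_values), dtype=np.float64)
    # Exact equality against every code; an empty code list masks nothing.
    out[np.isin(out, codes)] = fill_value
    return out


arr = np.array([0, 1, -9999, 2], dtype=np.int32)

# Default fill: both codes become NaN.
masked = mask_invalid_reference(arr, invalid_values=[0, -9999])

# Custom fill: only 0 is replaced, -9999 is left untouched.
filled = mask_invalid_reference(arr, invalid_values=[0], fill_value=-1.0)

# Empty code list: a float64 copy of the input.
unchanged = mask_invalid_reference(arr, invalid_values=[])
```

Because the helper works on a ``float64`` copy, ``arr`` keeps its ``int32`` dtype and original values after all three calls.
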

Edge Cases
----------

- If ``invalid_values`` is empty, the function returns a float64 copy of the input.
- Very large lists of invalid codes are handled by iterating per element; consider
  pre-filtering if performance becomes critical.
- Supplying values not present in the array yields a simple float64 copy.

Integration Tips
----------------

- Chain with `replace_nans(arr, value)` if you need a consistent sentinel after masking
  rather than NaN.
- Use prior to temporal aggregations (`temporal_mean`, `median`, etc.) to prevent invalid
  values from influencing statistics when `skip_na=True`.
- Combine with `mask_vals` when you need both equality masking and NaN normalization logic
  (e.g., replacing NaNs to a sentinel after masking).

Related Functions
-----------------

- ``mask_vals``: Masks exact codes with optional additional NaN normalization step.
- ``mask_out_range`` / ``mask_in_range``: Range-based masking.
- ``replace_nans``: Post-processing for NaN replacement.
- ``mask_scl``: Specialized Sentinel-2 SCL masking.

Performance
-----------

For typical EO arrays (millions of pixels), masking runs in a single pass and allocates one
output array. There is no parallel overhead for small arrays; performance scales primarily
with total element count.

Version & Stability
-------------------

- Numeric coercion and masking semantics are stable across patch releases.
- Adding new masking functions will trigger a minor version bump per repository governance.

End of `mask_invalid` reference.