eo-processor Documentation

High-performance Rust (PyO3) UDFs for Earth Observation (EO) processing with Python bindings. Fast spectral indices, temporal statistics, masking utilities, and spatial distance functions.

This site consolidates:

  • User guides (README, QUICKSTART)

  • High-level architecture overview (Rust UDF acceleration model)

  • Full Python API reference (autogenerated)

  • Benchmarking guidance and integration patterns (XArray / Dask)

Project Goals

eo-processor focuses on numerically intensive EO primitives that benefit from:

  • True multi-core parallelism (no Python GIL contention)

  • Deterministic float64 execution paths (stable numerical behavior)

  • Dimensional dispatch (1D–4D) without repeated Python layer overhead

  • Minimal intermediate allocations for multi-step computations

Performance Philosophy

Rust kernels use a hybrid strategy:

  1. Input Coercion: Any numeric NumPy dtype (int/uint/float) is converted once to float64.

  2. Shape Validation: Dispatch occurs early, with specialized routines for 1D / 2D / 3D / 4D inputs.

  3. Parallel Thresholding: Rayon parallelism activates only when problem sizes justify it (e.g. large temporal stacks or O(N*M) pairwise distance matrices).

  4. Memory Efficiency: Avoid proliferation of temporary arrays that pure NumPy broadcasting often introduces in chained expressions.

  5. Safety: No unsafe blocks; array indexing is bounds-checked, so an out-of-range access panics instead of corrupting memory.
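Steps 1 and 2 can be mirrored in plain NumPy to show the intended order of operations. This is a minimal sketch, not the library's implementation: the helper name `coerce_and_dispatch` and the returned kernel labels are hypothetical, and the real dispatch happens on the Rust side.

```python
import numpy as np


def coerce_and_dispatch(arr):
    """Hypothetical helper: coerce once to float64, then select a routine by ndim."""
    a = np.asarray(arr)
    # Accept only int/uint/float dtypes, as described in step 1.
    if not (np.issubdtype(a.dtype, np.integer) or np.issubdtype(a.dtype, np.floating)):
        raise TypeError(f"expected an int/uint/float array, got dtype {a.dtype}")
    # Single conversion; copy=False avoids a copy when the input is already float64.
    a = a.astype(np.float64, copy=False)
    # Early shape validation, as described in step 2.
    if a.ndim not in (1, 2, 3, 4):
        raise ValueError(f"expected a 1D to 4D array, got {a.ndim}D")
    # Illustrative label standing in for a specialized routine.
    return a, f"kernel_{a.ndim}d"
```

Validating up front means every later stage can assume a float64 array of known rank, which is what lets the per-dimension routines skip repeated checks.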

Compared with pure Python/NumPy:

  • Multi-step index formulas (e.g., SAVI, EVI) run in tight Rust loops instead of composing multiple temporary arrays.

  • Temporal statistics (mean, median, std) operate per pixel with optional NaN filtering, parallelized over spatial indices.

  • Pairwise distances avoid intermediate broadcast expansions that inflate memory footprints.

  • Change detection (ΔNDVI, ΔNBR) computes both epochs within a single pass-friendly layout.
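To make the temporary-array point concrete, here is a small NumPy-only comparison using the standard SAVI formula, (NIR - Red) / (NIR + Red + L) * (1 + L). The function names are illustrative and the buffer-reusing variant only approximates what a fused loop achieves; it is not how eo-processor is implemented.

```python
import numpy as np


def savi_naive(nir, red, L=0.5):
    # Every arithmetic operator below allocates a full-size temporary array.
    return (nir - red) / (nir + red + L) * (1.0 + L)


def savi_reuse(nir, red, L=0.5):
    # Same formula, but only two buffers are allocated and then updated in place.
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    num = np.subtract(nir, red)    # buffer 1: nir - red
    den = np.add(nir, red)         # buffer 2: nir + red
    den += L                       # in place: nir + red + L
    np.divide(num, den, out=num)   # in place: ratio stored in buffer 1
    num *= 1.0 + L                 # in place: scale by (1 + L)
    return num
```

A compiled per-pixel loop goes one step further by writing each output element directly, so no intermediate arrays are needed at all.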

Rust UDF Architecture (High-Level)

Why Rust + PyO3?

  • GIL Release: Rust kernels release Python’s global interpreter lock (GIL) while they run, so CPU cores can be fully utilized.

  • Borrow Checker Guarantees: No manual reference counting, no segmentation faults typical of lower-level C extensions.

  • Composability: Functions act like ufuncs and slot neatly into xarray.apply_ufunc(..., dask="parallelized").

  • Reliability: Float64 coercion + explicit epsilon guards stabilize denominator-sensitive indices (e.g., normalized difference variants).
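The epsilon guard mentioned above can be illustrated with a NumPy reference. This is a sketch under stated assumptions: the threshold `EPS` and the choice to return 0.0 for near-zero denominators are hypothetical and may differ from what the Rust kernels actually do.

```python
import numpy as np

EPS = 1e-10  # hypothetical threshold, chosen only for illustration


def normalized_difference(a, b, eps=EPS):
    """(a - b) / (a + b), with near-zero denominators mapped to 0.0 (assumed fallback)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = a + b
    out = np.zeros_like(denom)
    # Divide only where the denominator is safely away from zero;
    # elements outside the mask keep the 0.0 already stored in `out`.
    safe = np.abs(denom) > eps
    np.divide(a - b, denom, out=out, where=safe)
    return out
```

Without the guard, a pixel where both bands are zero would produce NaN (0/0) and a runtime warning in the pure-NumPy expression.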

Key modules:

  • indices: Spectral and change-detection indices (NDVI, NDWI, EVI, SAVI, NBR, NDMI, NBR2, GCI, ΔNDVI, ΔNBR)

  • temporal: Time-axis aggregations (mean, std) with NaN skipping and dimensional flexibility

  • spatial: Median compositing + pairwise distance functions (Euclidean, Manhattan, Chebyshev, Minkowski)

  • masking: Value, range, invalid code, and Sentinel‑2 SCL masking utilities
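For reference, the four distance metrics named in the spatial module can be written in pure NumPy as below. The function name, the `metric` strings, and the default `p` are assumptions made for this sketch and do not describe the eo-processor API. Note the (N, M, D) intermediate created by broadcasting, which is the memory cost a loop-based kernel can avoid.

```python
import numpy as np


def pairwise_distances(a, b, metric="euclidean", p=3.0):
    """Reference pairwise distances between rows of a (N, D) and b (M, D)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Broadcasting materializes an (N, M, D) array of absolute differences.
    diff = np.abs(a[:, None, :] - b[None, :, :])
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "manhattan":
        return diff.sum(axis=-1)
    if metric == "chebyshev":
        return diff.max(axis=-1)
    if metric == "minkowski":
        return (diff ** p).sum(axis=-1) ** (1.0 / p)
    raise ValueError(f"unknown metric: {metric!r}")
```

A reference like this is mainly useful for checking the output of a faster implementation on small inputs.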

API Reference

Below is the autosummary listing of public Python functions. Each entry links to detailed parameter/return docs.

Integration Example (XArray + Dask)

import xarray as xr
import dask.array as da
from eo_processor import ndvi

nir = da.random.random((5000, 5000), chunks=(500, 500))
red = da.random.random((5000, 5000), chunks=(500, 500))

nir_xr = xr.DataArray(nir, dims=["y", "x"])
red_xr = xr.DataArray(red, dims=["y", "x"])

ndvi_xr = xr.apply_ufunc(
    ndvi,
    nir_xr,
    red_xr,
    dask="parallelized",
    output_dtypes=[float],
)
result = ndvi_xr.compute()

Benchmark Guidance

Representative large-array benchmark template:

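import time

import numpy as np
from eo_processor import ndvi

nir = np.random.rand(5000, 5000)
red = np.random.rand(5000, 5000)

# Time the Rust kernel; perf_counter is a monotonic, high-resolution clock.
t0 = time.perf_counter()
rust_out = ndvi(nir, red)
rust_t = time.perf_counter() - t0

# Time the equivalent pure-NumPy expression.
t0 = time.perf_counter()
numpy_out = (nir - red) / (nir + red)
numpy_t = time.perf_counter() - t0

print(f"Rust {rust_t:.3f}s vs NumPy {numpy_t:.3f}s (speedup {numpy_t / rust_t:.2f}x)")
# Both paths should agree to within floating-point tolerance.
assert np.allclose(rust_out, numpy_out, atol=1e-12)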
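A single timed run can be skewed by cache warm-up or background load. One common remedy is to repeat the call and keep the fastest time. The helper below is a sketch with hypothetical names (`best_of`, `numpy_ndvi`); it times only the NumPy baseline so that it runs without eo_processor installed, and the same helper could wrap the Rust function.

```python
import time

import numpy as np


def best_of(fn, *args, repeats=5):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - t0)
    return min(timings)


def numpy_ndvi(nir, red):
    # Pure-NumPy baseline for comparison.
    return (nir - red) / (nir + red)


rng = np.random.default_rng(42)
nir = rng.random((1000, 1000))
red = rng.random((1000, 1000))

numpy_t = best_of(numpy_ndvi, nir, red)
print(f"NumPy best of 5: {numpy_t:.4f}s")
```

Taking the minimum rather than the mean reports the run least disturbed by other activity on the machine.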

Contributing

For adding new functions:

  1. Implement Rust function (no unsafe)

  2. Register in src/lib.rs via wrap_pyfunction!

  3. Export in python/eo_processor/__init__.py and add stub in __init__.pyi

  4. Add tests (cover edge cases, NaNs, shape mismatches)

  5. Update README / docs (formula, usage)

  6. Run pre-commit checklist (formatting, clippy, pytest, coverage)

  7. Bump version (minor) if public API extended
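As an illustration of step 4, the tests below cover a known value, NaN propagation, and a shape mismatch. This is a sketch only: `ndvi_under_test` is a NumPy stand-in so the example runs on its own, and the assumption that a shape mismatch raises `ValueError` should be checked against the real binding before copying this pattern.

```python
import numpy as np


def ndvi_under_test(nir, red):
    """Stand-in for the function being added; replace with the real import."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    if nir.shape != red.shape:
        raise ValueError(f"shape mismatch: {nir.shape} vs {red.shape}")
    return (nir - red) / (nir + red)


def test_known_value():
    # (0.8 - 0.2) / (0.8 + 0.2) = 0.6
    assert np.allclose(ndvi_under_test([0.8], [0.2]), [0.6])


def test_nan_propagates():
    out = ndvi_under_test([np.nan, 0.8], [0.2, 0.2])
    assert np.isnan(out[0]) and not np.isnan(out[1])


def test_shape_mismatch_raises():
    try:
        ndvi_under_test(np.ones((2, 2)), np.ones(3))
    except ValueError:
        return
    raise AssertionError("expected ValueError for mismatched shapes")
```

The functions follow the `test_*` naming convention, so pytest would collect them automatically when placed in a test module.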

See full contribution guide linked above.

License

MIT License. See repository for full text.
