eo-processor Documentation¶

High-performance Rust (PyO3) UDFs for Earth Observation (EO) processing with Python bindings. Fast spectral indices, temporal statistics, masking utilities, and spatial distance functions.

This site consolidates:
- User guides (README, QUICKSTART)
- High-level architecture overview (Rust UDF acceleration model)
- Full Python API reference (autogenerated)
- Benchmarking guidance and integration patterns (XArray / Dask)

Contents¶

Reference

Contributing

Project Goals¶

eo-processor focuses on numerically intensive EO primitives that benefit from:
- True multi-core parallelism (no Python GIL contention)
- Deterministic float64 execution paths (stable numerical behavior)
- Dimensional dispatch (1D–4D) without repeated Python layer overhead
- Minimal intermediate allocations for multi-step computations

Performance Philosophy¶

Rust kernels use a hybrid strategy:

Input Coercion: Any numeric NumPy dtype (int/uint/float) is converted once to float64.
Shape Validation: Dispatch occurs early—specialized routines for 1D / 2D / 3D / 4D.
Parallel Thresholding: Rayon parallelism activates only when problem sizes justify it (e.g. large temporal stacks or O(N*M) pairwise distance matrices).
Memory Efficiency: Avoid proliferation of temporary arrays that pure NumPy broadcasting often introduces in chained expressions.
Safety: No unsafe blocks, so every array access is either proven in range by the compiler or bounds-checked at run time.
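The first three steps can be pictured with a small NumPy sketch. It is an illustration only, not the Rust implementation: the names `prepare_input` and `plan` and the `PARALLEL_MIN_ELEMENTS` cutoff are hypothetical, and the real threshold is not stated on this page.

```python
import numpy as np

# Hypothetical cutoff; the value used by the Rust kernels is not documented here.
PARALLEL_MIN_ELEMENTS = 1_000_000


def prepare_input(array):
    """Coerce to float64 once and reject unsupported dimensionality."""
    arr = np.asarray(array, dtype=np.float64)
    if not 1 <= arr.ndim <= 4:
        raise ValueError(f"expected a 1D-4D array, got {arr.ndim}D")
    return arr


def plan(array):
    """Return (ndim, use_parallel), mimicking early dispatch on shape and size."""
    arr = prepare_input(array)
    return arr.ndim, arr.size >= PARALLEL_MIN_ELEMENTS


print(plan(np.zeros((4, 4), dtype=np.uint16)))           # small 2D tile
print(plan(np.zeros((4, 500, 500), dtype=np.float32)))   # larger 3D stack
```

Deciding dimensionality and parallelism once, up front, is what keeps the per-element work free of repeated type and shape checks.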

Compared with pure Python/NumPy:
- Multi-step index formulas (e.g., SAVI, EVI) run in tight Rust loops instead of composing multiple temporary arrays.
- Temporal statistics (mean, median, std) operate per pixel with optional NaN filtering, parallelized over spatial indices.
- Pairwise distances avoid intermediate broadcast expansions that inflate memory footprints.
- Change detection (ΔNDVI, ΔNBR) computes both epochs within a single pass-friendly layout.
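To make the temporary-array point concrete, the sketch below writes SAVI, (NIR - Red) / (NIR + Red + L) * (1 + L), in two ways in plain NumPy. The function names and the default soil factor of 0.5 are choices made for this illustration. NumPy cannot express a true single-pass loop, so the second version only approximates one by reusing two buffers.

```python
import numpy as np


def savi_chained(nir, red, soil_factor=0.5):
    """Plain expression: most operators allocate a new full-size array."""
    return (nir - red) / (nir + red + soil_factor) * (1.0 + soil_factor)


def savi_two_buffers(nir, red, soil_factor=0.5):
    """Same formula with in-place updates; expects floating-point inputs."""
    num = np.subtract(nir, red)      # buffer 1: numerator
    den = np.add(nir, red)           # buffer 2: denominator
    den += soil_factor               # in place, no new array
    np.divide(num, den, out=num)     # result overwrites the numerator buffer
    num *= 1.0 + soil_factor
    return num


rng = np.random.default_rng(0)
nir = rng.random((512, 512))
red = rng.random((512, 512))
assert np.allclose(savi_chained(nir, red), savi_two_buffers(nir, red))
```

A compiled loop goes one step further and computes each output pixel directly from its two inputs, with no intermediate buffers at all.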

Rust UDF Architecture (High-Level)¶

Why Rust + PyO3?

GIL Release: Native execution frees Python’s global interpreter lock so CPU cores can be fully utilized.
Borrow Checker Guarantees: No manual reference counting, and safe Rust rules out the use-after-free and out-of-bounds memory errors behind the segmentation faults typical of lower-level C extensions.
Composability: Functions act like ufuncs and slot neatly into xarray.apply_ufunc(..., dask="parallelized").
Reliability: Float64 coercion + explicit epsilon guards stabilize denominator-sensitive indices (e.g., normalized difference variants).
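The reliability point can be sketched in NumPy as follows. This is a stand-in under stated assumptions, not the library's code: the name `normalized_difference`, the guard value `EPSILON`, and the choice to return 0.0 for near-zero denominators are all hypothetical.

```python
import numpy as np

EPSILON = 1e-10  # hypothetical guard value, not taken from the library


def normalized_difference(a, b, eps=EPSILON):
    """(a - b) / (a + b) in float64, returning 0.0 where |a + b| < eps."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = a + b
    safe = np.abs(denom) >= eps
    out = np.zeros_like(denom)
    # Divide only where the denominator is safe; other entries stay 0.0.
    np.divide(a - b, denom, out=out, where=safe)
    return out


# Unsigned integers would wrap around on 100 - 200; float64 coercion avoids that.
nir = np.array([100, 0], dtype=np.uint16)
red = np.array([200, 0], dtype=np.uint16)
print(normalized_difference(nir, red))
```

Coercing before subtracting matters for unsigned sensor data, where the raw integer difference of a darker and a brighter band would otherwise wrap to a large positive number.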

Key modules:

indices: Spectral and change-detection indices (NDVI, NDWI, EVI, SAVI, NBR, NDMI, NBR2, GCI, ΔNDVI, ΔNBR)
temporal: Time-axis aggregations (mean, std) with NaN skipping and dimensional flexibility
spatial: Median compositing + pairwise distance functions (Euclidean, Manhattan, Chebyshev, Minkowski)
masking: Value, range, invalid code, and Sentinel‑2 SCL masking utilities
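As a reference for the four distance metrics named under spatial, here is a NumPy sketch that treats them as cases of the Minkowski distance. The function name `pairwise_minkowski` and the (N, D) by (M, D) input layout are assumptions for this illustration and may not match the library's signatures.

```python
import numpy as np


def pairwise_minkowski(a, b, p=2.0):
    """Distances between rows of a (N, D) and rows of b (M, D), shape (N, M).

    p=1 gives Manhattan, p=2 Euclidean, p=inf Chebyshev.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    out = np.empty((a.shape[0], b.shape[0]))
    for i, row in enumerate(a):
        # One (M, D) temporary per row instead of a full (N, M, D) broadcast.
        diff = np.abs(b - row)
        if np.isinf(p):
            out[i] = diff.max(axis=1)
        else:
            out[i] = (diff ** p).sum(axis=1) ** (1.0 / p)
    return out


origin = [[0.0, 0.0]]
point = [[3.0, 4.0]]
for p in (1.0, 2.0, np.inf):
    print(p, pairwise_minkowski(origin, point, p=p))
```

Looping over rows keeps peak memory at O(M*D) per step, which is the same concern the broadcast-expansion remark above raises for large point sets.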

API Reference¶

Below is the autosummary listing of public Python functions. Each entry links to detailed parameter/return docs.

Integration Example (XArray + Dask)¶

import xarray as xr
import dask.array as da
from eo_processor import ndvi

nir = da.random.random((5000, 5000), chunks=(500, 500))
red = da.random.random((5000, 5000), chunks=(500, 500))

nir_xr = xr.DataArray(nir, dims=["y", "x"])
red_xr = xr.DataArray(red, dims=["y", "x"])

ndvi_xr = xr.apply_ufunc(
    ndvi,
    nir_xr,
    red_xr,
    dask="parallelized",
    output_dtypes=[float],
)
result = ndvi_xr.compute()

Benchmark Guidance¶

Representative large-array benchmark template:

import time

import numpy as np
from eo_processor import ndvi

nir = np.random.rand(5000, 5000)
red = np.random.rand(5000, 5000)

# perf_counter is a monotonic, high-resolution clock intended for timing.
t0 = time.perf_counter()
rust_out = ndvi(nir, red)
rust_t = time.perf_counter() - t0

t0 = time.perf_counter()
numpy_out = (nir - red) / (nir + red)
numpy_t = time.perf_counter() - t0

print(f"Rust {rust_t:.3f}s vs NumPy {numpy_t:.3f}s (speedup {numpy_t / rust_t:.2f}x)")
# Confirm both paths agree so the timings compare equivalent results.
assert np.allclose(rust_out, numpy_out, atol=1e-12)
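A single timed call is sensitive to cache state and thread start-up, so a repeated measurement is usually more representative. The helper below is a minimal sketch: `best_of` and `numpy_ndvi` are names introduced here, and only the NumPy baseline is timed so that the block runs without eo-processor installed.

```python
import time

import numpy as np


def best_of(fn, *args, repeats=5, warmup=1):
    """Return the fastest wall-clock time in seconds over `repeats` calls."""
    for _ in range(warmup):
        fn(*args)  # untimed call to warm caches and any thread pool
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - t0)
    return min(timings)


def numpy_ndvi(nir, red):
    """Pure NumPy baseline for the normalized difference."""
    return (nir - red) / (nir + red)


rng = np.random.default_rng(42)
nir = rng.random((1000, 1000))
red = rng.random((1000, 1000))
print(f"NumPy baseline: {best_of(numpy_ndvi, nir, red):.4f}s")
```

The Rust function can be timed the same way, for example `best_of(ndvi, nir, red)` after `from eo_processor import ndvi`. Taking the minimum rather than the mean reduces the influence of unrelated system load.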

Contributing¶

For adding new functions:
1. Implement Rust function (no unsafe)
2. Register in src/lib.rs via wrap_pyfunction!
3. Export in python/eo_processor/__init__.py and add stub in __init__.pyi
4. Add tests (cover edge cases, NaNs, shape mismatches)
5. Update README / docs (formula, usage)
6. Run pre-commit checklist (formatting, clippy, pytest, coverage)
7. Bump version (minor) if public API extended

See full contribution guide linked above.

License¶

MIT License. See repository for full text.