eo-processor Documentation
High-performance Rust (PyO3) UDFs for Earth Observation (EO) processing with Python bindings. Fast spectral indices, temporal statistics, masking utilities, and spatial distance functions.
This site consolidates:
- User guides (README, QUICKSTART)
- High-level architecture overview (Rust UDF acceleration model)
- Full Python API reference (autogenerated)
- Benchmarking guidance and integration patterns (XArray / Dask)
Contents
Getting Started
- eo-processor
- Documentation
- Overview
- Key Features
- Installation
- Quick Start
- API Summary
- Spectral & Change Detection Indices
- Masking Utilities
- Temporal Statistics & Compositing
- Trend Analysis
- Advanced Temporal & Pixelwise Processing
- Spatial Distances
- XArray / Dask Integration
- CLI Usage
- Performance
- Benchmark Harness
- Test Coverage
- Contributing
- Semantic Versioning
- Roadmap (Indicative)
- Scientific Citation
- License
- Disclaimer
- Support
- Acknowledgements
- Quick Start Guide
Reference
- Architecture & Performance Model
- Overview
- Design Objectives
- Rust Extension Boundary
- Dimensional Dispatch
- Numerical Stability & Coercion
- Parallel Execution Strategy
- Memory Efficiency
- NaN Handling
- Error Handling
- Performance Comparative Summary
- Integration with XArray / Dask
- Extensibility Guidelines
- Testing & Validation
- Representative Benchmark Template & Empirical Results
- Key Differences from Pure NumPy
- Future Optimization Opportunities
- Security & Safety Notes
- Reference of Core Modules
- Rust Acceleration Note
- Cross-References
- License
- Benchmark Report (Generated)
- Meta
- Results
- Meta
- Results
- Python Version Benchmarks
- Meta
- Results
- Meta
- Results
- Meta
- Results
- Meta
- Results
- Benchmark Fairness & Methodology
- Stress Benchmarks
- Functions
Contributing
Project Goals
eo-processor focuses on numerically intensive EO primitives that benefit from:
- True multi-core parallelism (no Python GIL contention)
- Deterministic float64 execution paths (stable numerical behavior)
- Dimensional dispatch (1D–4D) without repeated Python layer overhead
- Minimal intermediate allocations for multi-step computations
Performance Philosophy
Rust kernels use a hybrid strategy:
Input Coercion: Any numeric NumPy dtype (int/uint/float) is converted once to float64.
Shape Validation: Dispatch occurs early—specialized routines for 1D / 2D / 3D / 4D.
Parallel Thresholding: Rayon parallelism activates only when problem sizes justify it (e.g. large temporal stacks or O(N*M) pairwise distance matrices).
Memory Efficiency: Avoid proliferation of temporary arrays that pure NumPy broadcasting often introduces in chained expressions.
Safety: No unsafe blocks; all array accesses are bounds-checked.
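The coercion and shape-validation steps above can be sketched in Python. This is only an illustration of the idea, not the library's Rust code: `prepare_inputs` is a hypothetical helper, and the use of `ValueError` for rejected inputs is an assumption.

```python
import numpy as np


def prepare_inputs(a, b):
    """Coerce two band arrays to float64 once and validate them early.

    Hypothetical helper for illustration; eo-processor performs the
    equivalent steps inside its Rust kernels.
    """
    # Single conversion to float64 (no copy if the input is already float64).
    a64 = np.asarray(a, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)

    # Validate before any numerical work starts.
    if a64.shape != b64.shape:
        raise ValueError(f"shape mismatch: {a64.shape} vs {b64.shape}")
    if not 1 <= a64.ndim <= 4:
        raise ValueError(f"expected 1D-4D input, got {a64.ndim}D")

    # The dimensionality would select a specialized 1D/2D/3D/4D routine.
    return a64, b64, a64.ndim


nir = np.array([[1200, 3400], [2500, 4100]], dtype=np.uint16)
red = np.array([[800, 900], [1000, 1100]], dtype=np.uint16)
nir64, red64, ndim = prepare_inputs(nir, red)
```

Validating once at the boundary means the numerical loop itself never has to re-check dtypes or shapes.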
Compared with pure Python/NumPy:
- Multi-step index formulas (e.g., SAVI, EVI) run in tight Rust loops instead of composing multiple temporary arrays.
- Temporal statistics (mean, median, std) operate per pixel with optional NaN filtering, parallelized over spatial indices.
- Pairwise distances avoid intermediate broadcast expansions that inflate memory footprints.
- Change detection (ΔNDVI, ΔNBR) computes both epochs within a single pass-friendly layout.
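To make the pairwise-distance point concrete, the following NumPy-only sketch contrasts a broadcast formulation, which materializes an (N, M, D) intermediate, with a row-at-a-time loop whose largest temporary is (M, D). Both function names are hypothetical and neither is the library's implementation; they only illustrate where the memory goes.

```python
import numpy as np


def pairwise_euclidean_broadcast(a, b):
    # Builds an (N, M, D) difference array before reducing over D.
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff * diff).sum(axis=-1))


def pairwise_euclidean_rowwise(a, b):
    # Processes one row of `a` at a time; the temporary is only (M, D).
    out = np.empty((a.shape[0], b.shape[0]), dtype=np.float64)
    for i in range(a.shape[0]):
        diff = b - a[i]
        out[i] = np.sqrt((diff * diff).sum(axis=-1))
    return out


rng = np.random.default_rng(0)
a = rng.random((200, 3))
b = rng.random((300, 3))

# Size of the float64 intermediate created by the broadcast version.
intermediate_bytes = a.shape[0] * b.shape[0] * a.shape[1] * 8

d_broadcast = pairwise_euclidean_broadcast(a, b)
d_rowwise = pairwise_euclidean_rowwise(a, b)
```

The intermediate grows with N * M * D, so the broadcast approach becomes memory-bound long before the output matrix itself is a problem.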
Rust UDF Architecture (High-Level)
Why Rust + PyO3?
GIL Release: Native execution frees Python’s global interpreter lock so CPU cores can be fully utilized.
Borrow Checker Guarantees: No manual reference counting, no segmentation faults typical of lower-level C extensions.
Composability: Functions act like ufuncs and slot neatly into xarray.apply_ufunc(..., dask="parallelized").
Reliability: Float64 coercion + explicit epsilon guards stabilize denominator-sensitive indices (e.g., normalized difference variants).
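A minimal NumPy sketch of what float64 coercion plus an epsilon guard can look like for a normalized difference. The threshold `EPS` and the choice to return 0.0 where the denominator is near zero are assumptions made for illustration; the library's actual guard value and fill behavior may differ.

```python
import numpy as np

EPS = 1e-10  # assumed threshold, not a documented eo-processor constant


def normalized_difference(a, b, eps=EPS):
    """Compute (a - b) / (a + b) with a guarded denominator (illustrative)."""
    # Coerce first: unsigned integer subtraction would otherwise wrap around.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)

    denom = a + b
    safe = np.abs(denom) > eps

    # Divide only where the denominator is safe; other cells keep the 0.0 fill.
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=safe)
    return out


nir = np.array([1, 3, 0], dtype=np.uint16)
red = np.array([3, 1, 0], dtype=np.uint16)
nd = normalized_difference(nir, red)
```

Without the coercion, `nir - red` on `uint16` inputs would wrap for the first element instead of producing a negative value, which is one reason a single up-front conversion to float64 is useful.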
Key modules:
indices: Spectral and change-detection indices (NDVI, NDWI, EVI, SAVI, NBR, NDMI, NBR2, GCI, ΔNDVI, ΔNBR)
temporal: Time-axis aggregations (mean, std) with NaN skipping and dimensional flexibility
spatial: Median compositing + pairwise distance functions (Euclidean, Manhattan, Chebyshev, Minkowski)
masking: Value, range, invalid code, and Sentinel‑2 SCL masking utilities
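As a reference for the NaN-skipping semantics described for the temporal module, the sketch below uses plain NumPy on a small (time, y, x) stack. It does not call eo-processor; it only shows the per-pixel behavior that a NaN-skipping time-axis mean is expected to have, which can be handy when validating results.

```python
import numpy as np

# Three time steps over a 2x2 scene; NaN marks masked observations.
stack = np.array([
    [[0.2, 0.4], [np.nan, 0.8]],
    [[0.4, np.nan], [0.6, 0.8]],
    [[0.6, 0.8], [0.9, np.nan]],
])  # shape (time=3, y=2, x=2)

# NaN-skipping mean: each pixel averages only its valid observations.
mean_skipna = np.nanmean(stack, axis=0)

# Plain mean: a single NaN along the time axis poisons the pixel.
mean_plain = np.mean(stack, axis=0)
```

Here only the top-left pixel has three valid observations, so it is the only one where the two results agree.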
API Reference
Below is the autosummary listing of public Python functions. Each entry links to detailed parameter/return docs.
Integration Example (XArray + Dask)
import xarray as xr
import dask.array as da

from eo_processor import ndvi

# Synthetic chunked bands; real workloads would load these lazily.
nir = da.random.random((5000, 5000), chunks=(500, 500))
red = da.random.random((5000, 5000), chunks=(500, 500))

nir_xr = xr.DataArray(nir, dims=["y", "x"])
red_xr = xr.DataArray(red, dims=["y", "x"])

# Apply the kernel to each Dask chunk independently.
ndvi_xr = xr.apply_ufunc(
    ndvi,
    nir_xr,
    red_xr,
    dask="parallelized",
    output_dtypes=[float],
)

result = ndvi_xr.compute()  # triggers the lazy computation
Benchmark Guidance
Representative large-array benchmark template:
import time

import numpy as np

from eo_processor import ndvi

nir = np.random.rand(5000, 5000)
red = np.random.rand(5000, 5000)

# perf_counter is a monotonic, high-resolution clock suited to interval timing.
t0 = time.perf_counter()
rust_out = ndvi(nir, red)
rust_t = time.perf_counter() - t0

t0 = time.perf_counter()
numpy_out = (nir - red) / (nir + red)
numpy_t = time.perf_counter() - t0

print(f"Rust {rust_t:.3f}s vs NumPy {numpy_t:.3f}s (speedup {numpy_t / rust_t:.2f}x)")
assert np.allclose(rust_out, numpy_out, atol=1e-12)
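A single timed run is sensitive to warm-up effects and background noise. One possible refinement, sketched here with the standard library only, is to discard a warm-up call and report the best and median of several repeats. The `bench` helper is hypothetical and is not part of eo-processor or its benchmark harness.

```python
import statistics
import timeit


def bench(fn, *args, repeat=5, warmup=1):
    """Return (best, median) wall-clock seconds for fn(*args).

    Hypothetical helper: runs `warmup` untimed calls first, then times
    `repeat` single calls with timeit's default high-resolution timer.
    """
    for _ in range(warmup):
        fn(*args)
    times = timeit.repeat(lambda: fn(*args), number=1, repeat=repeat)
    return min(times), statistics.median(times)


# Stand-in workload; in practice pass the function and arrays being compared.
best, median = bench(sum, range(100_000))
print(f"best {best:.6f}s, median {median:.6f}s")
```

Under the same assumptions, `bench(ndvi, nir, red)` and `bench(lambda a, b: (a - b) / (a + b), nir, red)` would give comparable figures for the two implementations in the template above.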
Contributing
For adding new functions:
1. Implement Rust function (no unsafe)
2. Register in src/lib.rs via wrap_pyfunction!
3. Export in python/eo_processor/__init__.py and add stub in __init__.pyi
4. Add tests (cover edge cases, NaNs, shape mismatches)
5. Update README / docs (formula, usage)
6. Run pre-commit checklist (formatting, clippy, pytest, coverage)
7. Bump version (minor) if public API extended
See full contribution guide linked above.
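For step 4, one way to structure a test is to compare the new function against a NumPy reference across several dimensionalities and input dtypes. The sketch below is self-contained and uses a NumPy stand-in in place of the Rust function; `check_index_function` and `ndvi_reference` are hypothetical names, and the project's real test suite may be organized differently.

```python
import numpy as np


def ndvi_reference(nir, red):
    # Plain NumPy reference: coerce to float64, then (NIR - Red) / (NIR + Red).
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    return (nir - red) / (nir + red)


def check_index_function(fn, reference, shapes=((8,), (4, 5), (2, 4, 5))):
    """Assert that fn matches reference for 1D-3D float and integer inputs."""
    rng = np.random.default_rng(42)
    for shape in shapes:
        # Values stay well away from zero so the denominator is never tiny.
        nir = rng.uniform(0.1, 1.0, size=shape)
        red = rng.uniform(0.1, 1.0, size=shape)
        out = np.asarray(fn(nir, red))
        assert out.shape == shape
        assert np.allclose(out, reference(nir, red), atol=1e-12)

        # Integer inputs should give the same result as their float64 values.
        nir_i = (nir * 10000).astype(np.uint16)
        red_i = (red * 10000).astype(np.uint16)
        assert np.allclose(fn(nir_i, red_i), reference(nir_i, red_i), atol=1e-12)
    return True


# Stand-in call; replace the first argument with the function under test.
ok = check_index_function(ndvi_reference, ndvi_reference)
```

Edge cases named in step 4 (NaNs, shape mismatches) would need additional assertions that depend on the function's documented behavior, so they are left out of this sketch.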
License
MIT License. See repository for full text.