Ship Detection in Optical Imagery: The 2020 Cloud Constraint
Back in 2020, I defended my Master's Thesis: Ship Detection in Satellite Optical Imagery. Looking back from 2026, it captures a specific moment in time—before the transformer explosion, when we were still squeezing every ounce of performance out of U-Nets and fighting cloud budgets on single Azure instances.
The problem was straightforward: finding ships in optical satellite imagery is messy. Clouds look like ships. Waves look like ships. Islands definitely look like ships to a conv-net. SAR (Synthetic Aperture Radar) is the gold standard because it sees through clouds, but optical data offers easier interpretability—if you can solve the false positives.
The Architecture: Custom U-Net & RedisAI
Standard U-Nets were hitting a ceiling around 84-88% accuracy. To break that, I designed a Custom U-Net that went deeper. We increased the parameter count almost 10x (from ~31M to ~306M) and introduced bottlenecks in the encoding blocks to compress dimensionality before expensive 3x3 convolutions.
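For flavor, here's a minimal PyTorch sketch of what I mean by a bottlenecked encoder block. The channel widths, reduction factor, and class name are illustrative, not the exact thesis values: squeeze the channels with a 1x1 convolution, do the 3x3 work in the narrow space, then expand back out.

```python
import torch
import torch.nn as nn

class BottleneckEncoderBlock(nn.Module):
    """U-Net encoder block with a 1x1 bottleneck: compress channels
    before the expensive 3x3 convolution, then expand back out."""

    def __init__(self, in_ch: int, out_ch: int, reduction: int = 4):
        super().__init__()
        mid_ch = out_ch // reduction  # illustrative reduction factor
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size=1, bias=False),   # compress
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=3, padding=1, bias=False),  # 3x3 in the narrow space
            nn.BatchNorm2d(mid_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1, bias=False),  # expand
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(2)

    def forward(self, x: torch.Tensor):
        skip = self.block(x)          # feature map passed to the decoder
        return self.pool(skip), skip  # downsampled output + skip connection
```

The 1x1 compress/expand pair is what let the parameter count scale up without the 3x3 convolutions dominating the compute budget.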
But the really interesting part, from a systems perspective, wasn't the model. It was how we fed it.
The Constraint: Training on cloud VMs (Azure NC24) is expensive. File I/O is slow.
To saturate four Tesla K80 GPUs, I couldn't rely on standard disk reads. I implemented RedisAI as a tensor delivery layer. Instead of reading JPEGs from disk, we served pre-processed tensors directly from memory via Redis. Asynchronous key-value reads kept the hungry GPUs fed without stalling on disk I/O.
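The pattern looked roughly like this. Below is a simplified sketch using the redisai-py client; the key scheme and helper function names are mine, and in practice the producer side ran as separate preprocessing workers rather than inline:

```python
import numpy as np
import redisai as rai

con = rai.Client(host="localhost", port=6379)

# Producer (CPU-side preprocessing): decode and normalize once,
# then push image/mask tensors into Redis ahead of the training loop.
def stage_sample(idx: int, image: np.ndarray, mask: np.ndarray) -> None:
    con.tensorset(f"train:img:{idx}", image.astype(np.float32))
    con.tensorset(f"train:msk:{idx}", mask.astype(np.float32))

# Consumer (GPU-side training loop): key-value reads from RAM
# instead of JPEG decodes from disk.
def fetch_sample(idx: int):
    image = con.tensorget(f"train:img:{idx}")
    mask = con.tensorget(f"train:msk:{idx}")
    return image, mask
```

The win isn't exotic: you pay the decode/normalize cost once, and every subsequent epoch is a memory read instead of a filesystem round-trip.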
Offline Hard Example Mining (OffHEM)
We didn't just train and pray. We used OffHEM. After an initial training run, we froze the model and ran it against the training set again to find the "hard" examples—the ones with the highest loss.
These were usually the edge cases: ships in heavy cloud cover, ships docked next to complex urban infrastructure, or tiny vessels in choppy wake. We effectively told the model: "Look at these again. Learn why you failed."
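In code, OffHEM boils down to a scoring pass plus a top-k selection. A minimal sketch, assuming a binary segmentation model with per-pixel BCE loss; the batch size and function name are illustrative:

```python
import torch
from torch.utils.data import DataLoader, Subset

@torch.no_grad()
def mine_hard_examples(model, dataset, k: int, device: str = "cuda"):
    """Offline hard example mining: score every training sample with the
    frozen model and return the k samples with the highest loss."""
    model.eval()
    criterion = torch.nn.BCEWithLogitsLoss(reduction="none")
    losses = []
    loader = DataLoader(dataset, batch_size=16, shuffle=False)
    for images, masks in loader:
        logits = model(images.to(device))
        # mean per-pixel loss -> one scalar difficulty score per sample
        per_sample = criterion(logits, masks.to(device)).mean(dim=(1, 2, 3))
        losses.extend(per_sample.cpu().tolist())
    hard_idx = sorted(range(len(losses)),
                      key=lambda i: losses[i], reverse=True)[:k]
    return Subset(dataset, hard_idx)  # fine-tune on these in a second pass
```

Because the mining pass is offline, it costs one extra inference sweep over the training set, which was cheap next to a full training run.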
The Results: The Generalization Gap
On the validation set (Airbus SPOT-5 data), the custom architecture hit 92% accuracy. It worked. We beat the baselines.
But then came the reality check: the UrtheCast Theia MRC target dataset. When applied to this totally unseen distribution (different sensor, different location—Burrard Inlet right here in Vancouver), accuracy dropped to 68%.
This is the classic story of EO (Earth Observation) models. They overfit to the sensor characteristics. The model learned what a "SPOT-5 ship" looked like, not necessarily the universal concept of a ship.
Looking Back
This thesis taught me that a heavier model isn't always the answer to domain shift. But more importantly, it was my first deep dive into systems engineering for AI. Building that RedisAI pipeline was more satisfying than tweaking the learning rate.
It was also a pivot. This wasn't actually what I went to grad school to study. I wanted to build autonomous Multi-Agent Systems (MAS). I wanted swarms of agents managing forests and oceans. But in 2020, the funding (and the practical path to graduation) was in pixel segmentation.
In the next post, I'll dig into that lost research track—and why, six years later, I'm finally building it.
Publications & Resources
- MSc Thesis: Ship Detection in Satellite Optical Imagery (UVic DSpace)
- Conference Paper (AICCC 2020): Ship Detection in Satellite Optical Imagery
- Research Paper: Ship Detection Parameter Server Variant