
NASA CloudML: Machine Learning for Atmospheric Remote Sensing

Machine Learning, NASA, Remote Sensing, Python, XGBoost

Status: Paper 1 pending journal submission (awaiting NASA approval). Paper 2 pending NASA Technical Reports Server (NTRS).

Overview

Machine learning framework for cloud base height (CBH) retrieval from NASA ER-2 airborne observations, developed during my NASA Goddard Space Flight Center OSTEM internship (Summer 2025). Two first-author papers covering complementary analyses are pending NASA approval.

Paper 1: CBH Retrieval (1,426 samples, 5 flights)

Systematic comparison of atmospheric feature-based versus image-based ML for CBH retrieval.

Metric                              Value
GBDT R² (per-flight shuffled CV)    0.744
Best CNN R² (ResNet-18)             0.617
GBDT MAE                            117.4 m
CNN MAE                             150.9 m
Labeled samples                     1,426

Domain shift: Leave-one-flight-out (LOFO) CV yields R² = -15.4, far worse than simply predicting the held-out flight's mean CBH (R² = 0). Few-shot learning with 50 labeled samples recovers R² = 0.57–0.85. Conformal prediction intervals reach only 27% coverage under this shift (target: 90%); per-flight calibration recovers 86%.
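To make the negative R² concrete, here is a minimal sketch of leave-one-flight-out evaluation on synthetic data. It is not the papers' pipeline: `fit_linear` is a least-squares stand-in for the real model, and the per-flight offsets are hypothetical values chosen only so that each flight sits in a different CBH regime.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_linear(X, y):
    """Ordinary least squares with an intercept (stand-in for the real model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def lofo_r2(X, y, flight_ids):
    """Hold out each flight in turn, train on the rest, return {flight: R²}."""
    scores = {}
    for flight in np.unique(flight_ids):
        test = flight_ids == flight
        coef = fit_linear(X[~test], y[~test])
        scores[int(flight)] = r2_score(y[test], predict_linear(coef, X[test]))
    return scores

# Synthetic flights: same feature-to-CBH slope, different regime offsets (metres).
rng = np.random.default_rng(0)
offsets = {0: 600.0, 1: 900.0, 2: 1500.0}  # hypothetical values
X_parts, y_parts, f_parts = [], [], []
for flight, offset in offsets.items():
    x = rng.normal(size=(200, 1))
    X_parts.append(x)
    y_parts.append(offset + 50.0 * x[:, 0] + rng.normal(scale=20.0, size=200))
    f_parts.append(np.full(200, flight))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)
flights = np.concatenate(f_parts)

scores = lofo_r2(X, y, flights)
for flight, score in scores.items():
    print(f"held-out flight {flight}: R² = {score:.1f}")
```

Because R² compares squared error against the variance within the held-out flight, a bias of a few hundred metres on a flight whose CBH varies by only tens of metres drives the score far below zero.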

Paper 2: Domain Shift Analysis (5,500 samples, 6 flights)

Expanded dataset focused on physics-informed feature engineering and domain adaptation.
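As an illustration of what a physics-derived feature can look like, the sketch below computes two standard quantities from 2 m temperature and 2 m dewpoint (ERA5 short names `t2m` and `d2m`). Whether these particular variables or features are among the ones used in the paper is an assumption; the function names are hypothetical.

```python
import numpy as np

def lcl_height_m(t2m_k, d2m_k):
    """Espy's approximation: LCL height is about 125 m per kelvin of dewpoint depression."""
    depression = np.asarray(t2m_k, dtype=float) - np.asarray(d2m_k, dtype=float)
    return 125.0 * np.clip(depression, 0.0, None)

def saturation_vapor_pressure_hpa(t_k):
    """Bolton (1980) approximation over liquid water; input in kelvin."""
    t_c = np.asarray(t_k, dtype=float) - 273.15
    return 6.112 * np.exp(17.67 * t_c / (t_c + 243.5))

def relative_humidity(t2m_k, d2m_k):
    """Relative humidity (0 to 1) as e_s(Td) / e_s(T)."""
    return saturation_vapor_pressure_hpa(d2m_k) / saturation_vapor_pressure_hpa(t2m_k)

t2m = np.array([290.0, 295.0])  # kelvin, hypothetical samples
d2m = np.array([286.0, 285.0])
print(lcl_height_m(t2m, d2m))        # LCL estimate in metres
print(relative_humidity(t2m, d2m))   # fraction between 0 and 1
```

The lifting condensation level is a natural candidate for this task because it is a first-order physical estimate of where cloud base forms in a well-mixed boundary layer.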

Metric                              Value
LOFO R² (6-flight mean)             -5.36
Worst single-flight R²              -19.4
Few-shot recovery (50 samples)      R² = +0.35
Conformal coverage under shift      34% (target: 90%)
Within-flight calibration           90% coverage
Physics-derived features            29 (from 5 base ERA5 variables)
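The coverage collapse above can be reproduced in miniature with split conformal prediction. The sketch below is an assumption-laden toy, not the papers' code: residuals are synthetic, the 300 m bias on the target flight is hypothetical, and the interval is the simplest symmetric form (prediction ± a quantile of absolute calibration residuals).

```python
import numpy as np

def conformal_halfwidth(cal_residuals, alpha=0.1):
    """Split-conformal half-width: the ceil((n+1)(1-alpha))-th smallest |residual|."""
    scores = np.sort(np.abs(cal_residuals))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # Strictly, k > n calls for an infinite interval; clamping is a simplification.
    return scores[min(k, n) - 1]

def coverage(residuals, halfwidth):
    """Fraction of observations whose true value falls inside prediction ± halfwidth."""
    return float(np.mean(np.abs(residuals) <= halfwidth))

rng = np.random.default_rng(1)
# Residuals (truth minus prediction) on source flights: unbiased, 100 m spread.
source_cal = rng.normal(0.0, 100.0, size=1000)
# Residuals on an unseen target flight: same spread plus a hypothetical 300 m bias.
target = rng.normal(300.0, 100.0, size=2000)

# Calibrated on source flights, evaluated on the shifted flight.
q_source = conformal_halfwidth(source_cal, alpha=0.1)
cov_shift = coverage(target, q_source)

# Within-flight calibration: calibrate on part of the target flight, test on the rest.
q_target = conformal_halfwidth(target[:500], alpha=0.1)
cov_within = coverage(target[500:], q_target)

print(f"coverage under shift:      {cov_shift:.2f}")
print(f"coverage after recalibration: {cov_within:.2f}")
```

The conformal guarantee assumes calibration and test data are exchangeable. A new flight breaks that assumption, which is why recalibrating within the flight restores coverage (at the cost of wider intervals).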

Of the five adaptation methods evaluated, only few-shot learning works. Instance weighting, TrAdaBoost, MMD alignment, and feature selection all fail or degrade performance further.
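One simple form of few-shot adaptation is to keep the source model frozen and estimate a residual offset from a handful of labeled target samples. The sketch below shows that idea on synthetic data. It is an illustrative assumption, not necessarily the procedure used in the papers; `source_model` and the 400 m regime offset are hypothetical.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def source_model(x):
    """Hypothetical frozen model trained on other flights."""
    return 800.0 + 50.0 * x

rng = np.random.default_rng(2)
# Target flight: same slope, but a 400 m offset the source model never saw.
x_t = rng.normal(size=1000)
y_t = 1200.0 + 50.0 * x_t + rng.normal(scale=20.0, size=1000)

# Draw 50 labeled target samples; evaluate on the remaining 950.
idx = rng.permutation(len(x_t))
few, rest = idx[:50], idx[50:]

# Few-shot correction: mean residual of the frozen model on the 50 samples.
offset = np.mean(y_t[few] - source_model(x_t[few]))

r2_before = r2(y_t[rest], source_model(x_t[rest]))
r2_after = r2(y_t[rest], source_model(x_t[rest]) + offset)
print(f"R² before adaptation: {r2_before:.1f}")
print(f"R² after 50-sample correction: {r2_after:.2f}")
```

Unlike instance weighting or feature alignment, this uses target labels directly, which is consistent with it being the one approach that recovers a positive R².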

Why the Numbers Differ

The two papers analyze overlapping but different datasets. Paper 1 uses 1,426 samples across 5 flights (LOFO R² = -15.4). Paper 2 expands to 5,500 ocean-only boundary-layer observations across 6 flights (LOFO R² = -5.36). The mean LOFO R² is less negative in the expanded dataset because the additional flights pull the average toward zero, but the shift remains catastrophic in both cases.

Technical Approach

Data Pipeline

  • HDF5 preprocessing pipeline for NASA ER-2 observations
  • Temporal interpolation and radiometric correction
  • Integration with ERA5 reanalysis atmospheric data
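The temporal interpolation step can be sketched as follows: ERA5 fields are hourly, while aircraft observations arrive at arbitrary times, so each reanalysis variable is interpolated to the observation timestamps. The function name, the example variable, and the choice of linear interpolation are assumptions for illustration; reading the HDF5 and NetCDF files is omitted.

```python
import numpy as np

def interp_to_obs(era5_times, era5_values, obs_times):
    """Linearly interpolate an ERA5 time series to observation timestamps (seconds)."""
    era5_times = np.asarray(era5_times, dtype=float)
    era5_values = np.asarray(era5_values, dtype=float)
    obs_times = np.asarray(obs_times, dtype=float)
    if np.any(np.diff(era5_times) <= 0):
        raise ValueError("era5_times must be strictly increasing")
    if obs_times.min() < era5_times[0] or obs_times.max() > era5_times[-1]:
        # np.interp would silently clamp to the end values; fail loudly instead.
        raise ValueError("observations fall outside the ERA5 time range")
    return np.interp(obs_times, era5_times, era5_values)

# Hourly ERA5 samples (seconds since the first field) of a hypothetical variable.
era5_t = np.array([0, 3600, 7200])
blh = np.array([500.0, 600.0, 800.0])   # e.g. boundary-layer height in metres
obs_t = np.array([900, 1800, 5400])     # aircraft observation times

blh_at_obs = interp_to_obs(era5_t, blh, obs_t)
print(blh_at_obs)  # [525. 550. 700.]
```

Raising on out-of-range timestamps is a deliberate choice here, since silent extrapolation would attach stale atmospheric state to observations at the edges of a flight.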

Model Comparison

  • Feature-based: XGBoost gradient boosting with atmospheric variables
  • Image-based: CNNs (ResNet-18, EfficientNet-B0) on raw thermal IR imagery
  • Result: Atmospheric features significantly outperform raw images

Key Insight

Temporal autocorrelation (lag-1 ρ = 0.94) inflates pooled K-fold R² from 0.744 to 0.924. Per-flight shuffled CV is the honest within-regime metric.
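A short sketch of how lag-1 autocorrelation can be measured, and of how much a series this smooth rewards a predictor that has seen its temporal neighbours. The series is a synthetic AR(1) process with the coefficient set to 0.94 to match the figure above; it is not the flight data.

```python
import numpy as np

def lag1_autocorr(y):
    """Sample lag-1 autocorrelation of a 1-D series."""
    d = np.asarray(y, dtype=float)
    d = d - d.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic AR(1) series: y[t] = phi * y[t-1] + noise[t].
rng = np.random.default_rng(3)
phi, n = 0.94, 20000
noise = rng.normal(size=n)
y = np.empty(n)
y[0] = noise[0]
for t in range(1, n):
    y[t] = phi * y[t - 1] + noise[t]

rho = lag1_autocorr(y)
# A "model" that uses no features at all and just copies the previous sample.
r2_copy = r2(y[1:], y[:-1])
print(f"lag-1 autocorrelation: {rho:.2f}")
print(f"R² of copy-the-neighbour predictor: {r2_copy:.2f}")
```

For a stationary series the copy-the-neighbour score is roughly 1 - 2(1 - ρ), so at ρ = 0.94 a model can score near 0.88 without learning any physics whenever a test sample's neighbours are in the training set.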

Technology Stack

  • ML Frameworks: PyTorch, TensorFlow, scikit-learn
  • Gradient Boosting: XGBoost, LightGBM
  • Data Processing: HDF5, NetCDF, Pandas, NumPy
  • Atmospheric Data: ERA5 reanalysis

Links

  • GitHub Repository
  • Paper 1: Pending journal submission (awaiting NASA approval)
  • Paper 2: Pending NASA Technical Reports Server (NTRS)

Affiliation

NASA Goddard Space Flight Center
OSTEM Intern – Atmospheric Remote Sensing
May – August 2025