CheXpert Medical Imaging Competition

CS 156b (2025)

This project tackled automated anomaly detection in chest radiographs using more than 100,000 images from the CheXpert dataset, a large-scale labeled dataset curated for real-world medical imaging research. Our goal was to predict the presence of 9 different conditions from chest X-rays (e.g., Cardiomegaly, Lung Opacity, Pleural Effusion) using supervised learning. Each submission was a CSV containing a predicted probability for every finding in every image, and was scored by mean squared error (MSE), scaled by class-wise variance.

Our team designed, trained, and evaluated multiple deep learning models to compare how architecture depth and complexity traded off against performance:

Vanilla CNN: Built and trained a baseline convolutional model to establish a performance benchmark.
ResNet (Residual Networks): Implemented deeper architectures using skip connections (ResNet-18 and ResNet-50) to capture complex radiographic patterns.
DenseNet: Used densely connected CNNs to improve feature propagation and mitigate vanishing gradients in deeper layers.

To optimize model performance, we iteratively tuned:

Learning rate schedules (step decay, cosine annealing)
Optimizer selection (Adam vs. SGD with momentum)
Batch size, weight decay, and dropout rates to reduce overfitting
Loss functions (tested both standard BCEWithLogits and class-weighted variants to address label imbalance)
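
As a rough illustration of two items in this list, the sketch below writes out the standard step-decay and cosine-annealing formulas and a class-weighted binary cross-entropy on raw logits. It is not the project's training code: the function names and the default values (`step_size`, `gamma`, `pos_weight`) are assumptions chosen for the example, and a real pipeline would more likely use the equivalents built into a deep learning framework.

```python
import math


def step_decay_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Step decay: multiply the rate by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_annealing_lr(base_lr, epoch, total_epochs, min_lr=0.0):
    """Cosine annealing: decay smoothly from base_lr to min_lr over total_epochs."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def weighted_bce_with_logits(logit, target, pos_weight=1.0):
    """Numerically stable binary cross-entropy on a raw logit.

    pos_weight > 1 up-weights positive examples, which is one common
    way to counter label imbalance (hypothetical default of 1.0 here).
    """
    # softplus(-x) = log(1 + exp(-x)), computed without overflow
    softplus_neg = max(-logit, 0.0) + math.log1p(math.exp(-abs(logit)))
    # -log(sigmoid(x)) = softplus(-x); -log(1 - sigmoid(x)) = x + softplus(-x)
    positive_term = pos_weight * target * softplus_neg
    negative_term = (1.0 - target) * (logit + softplus_neg)
    return positive_term + negative_term
```

With `pos_weight=1.0` the loss reduces to ordinary binary cross-entropy, so the weighted variant can be compared against the unweighted baseline by changing a single argument.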

We also incorporated early stopping and model checkpointing to manage training stability and maximize generalization performance.
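
The early stopping and checkpointing logic can be sketched in a few lines. This is a minimal illustration under assumed settings, not our actual training loop: the `EarlyStopping` class, the `patience` value, and the validation losses are all made up for the example, and the checkpoint step is only recorded as an epoch number where a real loop would save model weights.

```python
class EarlyStopping:
    """Track validation loss and signal when training should stop."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum drop that counts as improvement
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch. Returns (is_best, should_stop)."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
            return True, False
        self.bad_epochs += 1
        return False, self.bad_epochs >= self.patience


# Made-up validation losses, purely for illustration.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
stopper = EarlyStopping(patience=3)
checkpoint_epochs = []
stopped_at = None
for epoch, loss in enumerate(val_losses):
    is_best, should_stop = stopper.step(epoch, loss)
    if is_best:
        checkpoint_epochs.append(epoch)  # a real loop would save weights here
    if should_stop:
        stopped_at = epoch
        break
```

In this toy run, training halts after three epochs without improvement, so the later, lower loss is never reached. That is the usual trade-off when choosing a patience value.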

In terms of data processing and augmentation, we:

Rescaled and normalized grayscale chest X-ray images to suit pretrained model input layers.
Applied data augmentation techniques such as random horizontal flips, rotations, and brightness adjustments to improve robustness.
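
The sketch below illustrates the normalization, horizontal flip, and brightness steps on a tiny grayscale image stored as nested Python lists. It is an assumption-laden toy rather than the project's pipeline: the `mean` and `std` defaults, the flip probability, and the brightness range are hypothetical values, rotation is omitted for brevity, and real code would operate on image tensors through an imaging library.

```python
import random


def normalize(image, mean=0.5, std=0.25):
    """Scale 8-bit pixels to [0, 1], then standardize with mean and std."""
    return [[(px / 255.0 - mean) / std for px in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Multiply pixel intensities by factor, clipping to the 8-bit range."""
    return [[min(255, max(0, round(px * factor))) for px in row] for row in image]


def augment(image, rng, flip_prob=0.5, max_brightness_delta=0.2):
    """Randomly flip and re-light one image; meant for training data only."""
    if rng.random() < flip_prob:
        image = horizontal_flip(image)
    factor = 1.0 + rng.uniform(-max_brightness_delta, max_brightness_delta)
    return adjust_brightness(image, factor)


# Usage: augment first (on raw pixels), then normalize for the model.
rng = random.Random(0)  # seeded so the example is reproducible
raw = [[0, 64, 128], [192, 255, 32]]
model_input = normalize(augment(raw, rng))
```

Augmentation is applied before normalization here so that brightness clipping happens in the original 0 to 255 pixel range.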

For the evaluation phase, we:

Created scripts to compute the scaled MSE for each clinical label.
Visualized label-wise prediction distributions and investigated error trends across patient subgroups.
Identified which conditions were most prone to false positives or false negatives and refined models accordingly.
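
A scaled-MSE script along these lines might look like the sketch below. The exact competition formula is not reproduced here, so this assumes one plausible reading of "scaled by class-wise variance": each label's MSE is divided by the variance of that label's ground-truth values, and the per-label scores are then averaged. The function name `scaled_mse` and the input layout (one row per image, one column per label) are likewise assumptions.

```python
def scaled_mse(y_true, y_pred):
    """Per-label MSE divided by that label's ground-truth variance.

    y_true and y_pred are lists of rows, one row per image and one
    column per label. Returns (per_label_scores, mean_score).
    """
    n_images = len(y_true)
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        truth = [row[j] for row in y_true]
        preds = [row[j] for row in y_pred]
        mse = sum((t - p) ** 2 for t, p in zip(truth, preds)) / n_images
        mean_t = sum(truth) / n_images
        variance = sum((t - mean_t) ** 2 for t in truth) / n_images
        if variance == 0:
            raise ValueError(f"label {j} has zero variance in y_true")
        scores.append(mse / variance)
    return scores, sum(scores) / n_labels
```

Under this definition, always predicting a label's mean value scores exactly 1.0 for that label, which gives a natural baseline: any model scoring below 1.0 on a label is doing better than a constant guess.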

This competition provided a hands-on opportunity to apply deep learning to a high-impact healthcare domain. We gained practical experience with multi-label classification, medical image preprocessing, model performance debugging and optimization, and team-based ML pipeline development.

Code Repo

TOOLS

Karen Zhou