CheXpert Medical Imaging Competition

CS 156b (2025)

This project tackled automated anomaly detection in chest radiographs using more than 100,000 images from CheXpert, a large-scale labeled dataset curated for medical imaging research. Our goal was to predict the presence of 9 conditions from chest X-rays (e.g., Cardiomegaly, Lung Opacity, Pleural Effusion) using supervised learning, submitting a CSV with a probability prediction for each finding on every image. Submissions were evaluated on mean squared error (MSE), scaled by class-wise variance.

Our team designed, trained, and evaluated multiple deep learning models to compare architectures and find the best trade-off between model complexity and performance:

  • Vanilla CNN: Built and trained a baseline convolutional model to establish a performance benchmark.

  • ResNet (Residual Networks): Implemented deeper architectures with skip connections (ResNet-18 and ResNet-50) to capture complex radiographic patterns.

  • DenseNet: Leveraged densely connected CNNs to improve feature propagation and mitigate vanishing gradients in deeper layers.

To optimize model performance, we iteratively tuned:

  • Learning rate schedules (step decay, cosine annealing)

  • Optimizer selection (Adam vs. SGD with momentum)

  • Batch size, weight decay, and dropout rates to reduce overfitting

  • Loss functions (compared standard BCEWithLogitsLoss against class-weighted variants to address label imbalance)
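
The schedule and loss options in the list above can be sketched in plain Python. This is an illustrative sketch, not the project's training code: the function names, the default `step_size` and `gamma`, and the per-label `pos_weights` argument are all assumptions made for the example. The weighted loss follows the usual convention of scaling only the positive term of binary cross-entropy.

```python
import math


def step_decay_lr(base_lr, epoch, step_size=10, gamma=0.1):
    """Multiply the learning rate by gamma once every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


def cosine_annealing_lr(base_lr, epoch, total_epochs, min_lr=0.0):
    """Decay from base_lr to min_lr along half a cosine period."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def weighted_bce_with_logits(logits, targets, pos_weights):
    """Mean binary cross-entropy over labels for one image.

    Each label's positive term is scaled by its entry in pos_weights
    (hypothetical values, e.g. negatives / positives per label), so rare
    findings contribute more to the loss.
    """
    total = 0.0
    for z, y, w in zip(logits, targets, pos_weights):
        log_p = -softplus(-z)      # log(sigmoid(z))
        log_not_p = -softplus(z)   # log(1 - sigmoid(z))
        total += -(w * y * log_p + (1.0 - y) * log_not_p)
    return total / len(logits)
```

With all weights set to 1.0 the loss reduces to ordinary binary cross-entropy on logits, which makes the unweighted and weighted variants easy to compare under otherwise identical settings.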

We also incorporated early stopping and model checkpointing to manage training stability and maximize generalization performance.
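
As a rough illustration of how early stopping and checkpointing fit together, the sketch below tracks the best validation loss, keeps a copy of the corresponding model state, and signals a stop after a fixed number of epochs without improvement. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the validation losses in the usage lines are hypothetical and are not taken from the project.

```python
import copy


class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_state = None
        self.bad_epochs = 0

    def step(self, val_loss, model_state):
        """Record one epoch and return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            # In-memory checkpoint of the best model seen so far.
            self.best_state = copy.deepcopy(model_state)
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Usage with made-up validation losses; the dict stands in for model weights.
stopper = EarlyStopping(patience=2)
for epoch, val_loss in enumerate([0.90, 0.70, 0.72, 0.71, 0.69]):
    if stopper.step(val_loss, {"epoch": epoch}):
        break
```

After the loop, `stopper.best_state` holds the checkpoint from the best epoch, which is the one to restore before generating predictions.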


In terms of data processing and augmentation, we:

  • Rescaled and normalized grayscale chest X-ray images to match the input format expected by pretrained models.

  • Applied data augmentation techniques such as random horizontal flips, rotations, and brightness adjustments to improve robustness.
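
A minimal NumPy sketch of the normalization and augmentation steps listed above follows. The function names, the `mean` and `std` values, and the brightness range are placeholders rather than the settings used in the project. Resizing and rotation are left out because they need an image library for interpolation.

```python
import numpy as np


def to_unit_range(image_u8):
    """Convert an 8-bit grayscale image of shape (H, W) to float32 in [0, 1]."""
    return image_u8.astype(np.float32) / 255.0


def augment(image, rng, max_brightness_shift=0.1):
    """Randomly flip left-right and shift brightness of an image in [0, 1]."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]  # horizontal flip
    shift = rng.uniform(-max_brightness_shift, max_brightness_shift)
    return np.clip(out + shift, 0.0, 1.0)


def normalize_to_3ch(image, mean=0.5, std=0.25):
    """Standardize pixels and repeat the single channel three times.

    Pretrained ImageNet-style models expect 3-channel input, so the
    grayscale image is copied into each channel, giving shape (3, H, W).
    """
    x = ((image - mean) / std).astype(np.float32)
    return np.stack([x, x, x], axis=0)


# Usage on a tiny synthetic "image"; augmentation applies to training data only.
rng = np.random.default_rng(0)
raw = np.arange(12, dtype=np.uint8).reshape(3, 4)
model_input = normalize_to_3ch(augment(to_unit_range(raw), rng))
```

Augmentation is applied before standardization here so that the brightness shift and clipping operate on pixel values in a known range.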


For the evaluation phase, we:

  • Created scripts to compute scaled MSE for each of the 9 clinical labels.

  • Visualized label-wise prediction distributions and investigated error trends across patient subgroups.

  • Identified which conditions were most prone to false positives or false negatives and refined models accordingly.
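
One plausible reading of the variance-scaled MSE is sketched below: each label's MSE is divided by the variance of that label's ground truth, and the per-label scores are then averaged. The exact competition formula is not given here, so this definition, the function names, and the dict-of-lists input layout are assumptions.

```python
from statistics import fmean, pvariance


def scaled_mse_per_label(y_true, y_pred):
    """Return {label: MSE / variance of ground truth} for each label.

    y_true and y_pred map a label name to a list of values, one per image.
    A label whose ground truth is constant has zero variance and raises
    ZeroDivisionError; such labels need separate handling.
    """
    scores = {}
    for label, truth in y_true.items():
        preds = y_pred[label]
        mse = fmean((t - p) ** 2 for t, p in zip(truth, preds))
        scores[label] = mse / pvariance(truth)
    return scores


def mean_scaled_mse(y_true, y_pred):
    """Average the per-label scaled MSE across all labels."""
    return fmean(scaled_mse_per_label(y_true, y_pred).values())
```

Under this definition, always predicting a label's mean scores exactly 1.0 for that label, so values below 1.0 indicate the model is doing better than a constant guess.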


This competition provided a hands-on opportunity to apply deep learning to a high-impact healthcare domain. We gained practical experience with multi-label classification, medical image preprocessing, model debugging and optimization, and team-based ML pipeline development.

CONTACT

CS + BEM @ Caltech

Phone:

682-237-9848

Email:

  • LinkedIn

© 2025 by Karen Zhou.
