Pneumonia Prediction in Chest X-Ray Images
This project applies the DenseNet-201 architecture to classify chest X-rays and predict the presence of pneumonia, using the public Chest X-Ray Images dataset.
Medical Context
Pneumonia is a lung infection that inflames the air sacs in one or both lungs, filling them with fluid or pus. This causes symptoms such as cough with phlegm, fever, chills, and difficulty breathing. Various microorganisms (bacteria, viruses, and fungi) can cause it.
In a chest X-ray, the key differences are:
| Healthy Lung | Lung with Pneumonia |
|---|---|
| Transparent lung fields (they appear dark due to air) | Opacities or consolidations (white/gray areas due to fluid or pus) |
| No abnormal white areas | Air bronchogram (dark lines within white areas) |
| Defined structures (heart, diaphragm, vessels) | Lobar or interstitial infiltrates |
DenseNet-201
DenseNet-201 is a deep convolutional neural network with 201 layers, belonging to the Dense Convolutional Networks (DenseNets) family, introduced in 2016 by Gao Huang and collaborators.
In a traditional convolutional network with L layers there are L connections (each layer connects only to its successor). In a DenseNet with L layers there are L(L+1)/2 direct connections: each layer receives as input the feature maps of all previous layers and delivers its own to all subsequent layers. This enables more efficient gradient flow and reduces the vanishing gradient problem.
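The connection counts above can be checked by enumerating them directly. The following is a minimal sketch and not part of the project code; the function names are illustrative. It treats the block input as node 0, so layer i receives i incoming connections.

```python
def traditional_connections(num_layers: int) -> int:
    """One connection into each layer, coming from its immediate predecessor."""
    return num_layers


def dense_connections(num_layers: int) -> int:
    """Count direct connections when every layer sees all earlier outputs.

    Node 0 is the block input; layers are numbered 1..num_layers.
    Layer i receives one connection from each of the nodes 0..i-1.
    """
    return sum(1 for layer in range(1, num_layers + 1) for _ in range(layer))


for n in (4, 6, 12):
    # The enumeration agrees with the closed form L(L+1)/2.
    assert dense_connections(n) == n * (n + 1) // 2
    print(n, traditional_connections(n), dense_connections(n))
```

In DenseNet-201 this dense pattern applies inside each dense block rather than across the whole 201-layer network, so the formula is best read per block.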
Dataset
The dataset used is Pneumonia X-Ray Images by Paulo Breviglieri (Kaggle), a more balanced version of the original Chest X-Ray Images (Pneumonia) dataset by Paul Mooney.
Image distribution:
| Split | Pneumonia | Normal | Total |
|---|---|---|---|
| Train | 3,110 | 1,082 | 4,192 |
| Validation | 773 | 267 | 1,040 |
| Test | 390 | 234 | 624 |
A significant imbalance is observed: the “Pneumonia” class has about 2.9 times as many images as the “Normal” class in the training and validation splits, and about 1.7 times as many in the test split.
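The per-split class ratio can be recomputed from the table. This is a small illustrative sketch; the dictionary layout and the `imbalance_ratio` helper are not taken from the project code.

```python
# Image counts copied from the distribution table.
counts = {
    "train": {"pneumonia": 3110, "normal": 1082},
    "validation": {"pneumonia": 773, "normal": 267},
    "test": {"pneumonia": 390, "normal": 234},
}


def imbalance_ratio(split: dict) -> float:
    """Number of pneumonia images per normal image in one split."""
    return split["pneumonia"] / split["normal"]


ratios = {name: imbalance_ratio(split) for name, split in counts.items()}
for name, ratio in ratios.items():
    total = sum(counts[name].values())
    print(f"{name}: {total} images, pneumonia/normal = {ratio:.2f}")
```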
Preprocessing
- Resized to 224×224 pixels with 3 RGB channels
- Pixel normalization (rescaling 1/255)
- Data augmentation applied to the training generator:
  - Random rotation (±30°)
  - Random zoom (0.2)
  - Horizontal and vertical shift (0.2)
  - Brightness variation (range 0.7–1.3)
  - Random horizontal flip
  - Nearest-neighbor fill
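The preprocessing steps above map closely onto the arguments of Keras's `ImageDataGenerator`. The sketch below is one plausible configuration, not the project's actual code: the constant names, the `directory` argument, and the choice to apply only rescaling to evaluation data are assumptions.

```python
IMAGE_SIZE = (224, 224)

# Augmentation settings from the list above, for the training generator.
TRAIN_AUGMENTATION = {
    "rescale": 1.0 / 255,
    "rotation_range": 30,
    "zoom_range": 0.2,
    "width_shift_range": 0.2,
    "height_shift_range": 0.2,
    "brightness_range": (0.7, 1.3),
    "horizontal_flip": True,
    "fill_mode": "nearest",
}

# Assumption: evaluation images are rescaled but not augmented.
EVAL_PREPROCESSING = {"rescale": 1.0 / 255}


def make_generators():
    """Build the training and evaluation generators (needs TensorFlow)."""
    # Imported lazily so the settings can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    return (
        ImageDataGenerator(**TRAIN_AUGMENTATION),
        ImageDataGenerator(**EVAL_PREPROCESSING),
    )


def flow(datagen, directory, batch_size=32, shuffle=True):
    """Stream 224x224 RGB batches with one-hot labels from a class-per-folder directory."""
    return datagen.flow_from_directory(
        directory,  # hypothetical path, e.g. the dataset's train folder
        target_size=IMAGE_SIZE,
        color_mode="rgb",
        class_mode="categorical",
        batch_size=batch_size,
        shuffle=shuffle,
    )
```

For evaluation, passing `shuffle=False` keeps predictions aligned with the file order, which matters when building a confusion matrix.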
Model Architecture
Transfer learning was applied using DenseNet-201 pre-trained with ImageNet weights as a feature extractor (frozen layers), with custom top layers added:
Input (224×224×3)
└── DenseNet-201 (frozen, ImageNet weights)
└── GlobalAveragePooling2D
└── Dense(1536, relu) → Dropout(0.1)
└── Dense(512, relu) → Dropout(0.1)
└── Dense(1024, relu) → Dropout(0.1)
└── Dense(512, relu) → Dropout(0.1)
└── Dense(2, softmax)
The layer configuration and dropout (0.1) were based on the results from the paper “Optimizing DenseNet for Potato Leaf Disease Classification” (arXiv 2402.03347), which showed that a dropout of 0.1 achieves the best balance between training and validation performance.
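The architecture above could be assembled with the Keras functional API roughly as follows. This is a hedged sketch rather than the project's code: the constant and function names are illustrative, and calling the backbone with `training=False` is an assumption. The `head_parameter_count` helper relies on the fact that DenseNet-201 produces 1920 features after global average pooling.

```python
HEAD_UNITS = (1536, 512, 1024, 512)  # hidden Dense layers, in order
DROPOUT_RATE = 0.1
NUM_CLASSES = 2
INPUT_SHAPE = (224, 224, 3)


def build_model(weights="imagenet"):
    """Frozen DenseNet-201 backbone plus the custom classification head."""
    # Imported lazily so the constants above are usable without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    backbone = keras.applications.DenseNet201(
        include_top=False, weights=weights, input_shape=INPUT_SHAPE
    )
    backbone.trainable = False  # used as a feature extractor only

    inputs = keras.Input(shape=INPUT_SHAPE)
    # Assumption: keep BatchNormalization layers in inference mode.
    x = backbone(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    for units in HEAD_UNITS:
        x = layers.Dense(units, activation="relu")(x)
        x = layers.Dropout(DROPOUT_RATE)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return keras.Model(inputs, outputs)


def head_parameter_count(feature_dim=1920):
    """Trainable parameters in the head (weights plus biases of each Dense layer)."""
    sizes = (feature_dim, *HEAD_UNITS, NUM_CLASSES)
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


print(head_parameter_count())
```

With the backbone frozen, only the head is trained, which amounts to about 4.8 million parameters.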
Training
| Parameter | Value |
|---|---|
| Optimizer | Adam |
| Loss function | Categorical Crossentropy |
| Metric | Accuracy |
| Max epochs | 20 |
| Early Stopping | Patience of 6 epochs, monitoring val_loss |
| Batch size | 32 (training), 8 (test) |
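The settings in the table translate into a compile-and-fit call along these lines. This is a sketch under assumptions: the function names are illustrative, Adam is left at its default learning rate because the table does not state one, and `stopping_epoch` is a simplified re-implementation of the patience rule for illustration, not the Keras callback itself.

```python
MAX_EPOCHS = 20
PATIENCE = 6


def train(model, train_data, val_data):
    """Compile and fit with the settings in the table (needs TensorFlow)."""
    from tensorflow import keras

    model.compile(
        optimizer=keras.optimizers.Adam(),  # default learning rate: an assumption
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    early_stopping = keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=PATIENCE
    )
    return model.fit(
        train_data,
        validation_data=val_data,
        epochs=MAX_EPOCHS,
        callbacks=[early_stopping],
    )


def stopping_epoch(val_losses, patience=PATIENCE):
    """Return the 1-based epoch at which the patience rule would stop training,
    or None if training runs through every epoch supplied."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None
```

For example, if the validation loss last improves at epoch 2, the rule stops training at epoch 8, after six epochs without improvement.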
Class weights were computed inversely proportional to frequency, but were not applied during training.
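A common way to compute weights that are inversely proportional to class frequency is the "balanced" heuristic, `n_samples / (n_classes * n_class_samples)`, which is also what scikit-learn's `compute_class_weight("balanced", ...)` uses. Whether the project used exactly this formula is an assumption; the sketch below applies it to the training counts, with 0 = Normal and 1 = Pneumonia.

```python
def balanced_class_weights(counts: dict) -> dict:
    """Weight each class by n_samples / (n_classes * n_class_samples)."""
    total = sum(counts.values())
    return {label: total / (len(counts) * n) for label, n in counts.items()}


# Training counts from the distribution table; 0 = Normal, 1 = Pneumonia.
class_weights = balanced_class_weights({0: 1082, 1: 3110})
print({label: round(weight, 3) for label, weight in class_weights.items()})
```

This gives roughly 1.94 for Normal and 0.67 for Pneumonia. A dictionary of this shape can be passed to Keras through the `class_weight` argument of `model.fit`.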
Results
Training curve
The model reached ~95–96% accuracy on both training and validation within the 20 epochs, with decreasing losses and a negligible gap between the two curves.
Test evaluation
| Metric | Normal Class (0) | Pneumonia Class (1) |
|---|---|---|
| Precision | 0.95 | 0.80 |
| Recall | 0.59 | 0.98 |
| F1-score | 0.73 | 0.88 |
Global test accuracy: 83.8% (loss: 0.42)
Confusion Matrix
| | Predicted Normal | Predicted Pneumonia |
|---|---|---|
| Actual Normal | 135 | 99 |
| Actual Pneumonia | 6 | 384 |
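Precision, recall, F1-score, and accuracy can all be derived from the four cells of the confusion matrix. The sketch below is illustrative and uses a hypothetical `per_class_metrics` helper. Values derived from the matrix as printed can differ slightly in the last digit from the reported figures (for instance, the matrix yields an accuracy of about 83.2%), which may reflect rounding or a different evaluation run.

```python
# Rows are actual classes, columns are predicted classes (0 = Normal, 1 = Pneumonia).
confusion = [
    [135, 99],
    [6, 384],
]


def per_class_metrics(matrix, cls):
    """Return (precision, recall, f1) for class index `cls`."""
    tp = matrix[cls][cls]
    fp = sum(row[cls] for row in matrix) - tp  # predicted as cls, actually other
    fn = sum(matrix[cls]) - tp                 # actually cls, predicted as other
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


correct = sum(confusion[i][i] for i in range(len(confusion)))
accuracy = correct / sum(map(sum, confusion))

for cls, name in enumerate(("Normal", "Pneumonia")):
    p, r, f1 = per_class_metrics(confusion, cls)
    print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print(f"accuracy={accuracy:.3f}")
```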
Interpretation
- The model correctly detects most pneumonia cases (98% recall).
- However, it has poor performance on the Normal class (only 59% recall), misclassifying 99 normal images as pneumonia.
- This reveals a bias toward the dominant class due to dataset imbalance and the non-application of the computed class weights.
Conclusions
- High recall on pneumonia: the model correctly identifies most positive cases.
- Low performance on the Normal class: a high false positive rate (99 images), likely due to class imbalance.
- Overfitting: although validation showed good performance, the test set revealed a significant drop, possibly because train and validation came from the same ImageDataGenerator, which may have led to pattern memorization.
Proposed Improvements
- Apply class_weights during training to balance the importance of both classes.
- Fine-tuning by unfreezing top layers of DenseNet-201 with a reduced learning rate.
- Targeted data augmentation for the minority class (Normal) to improve generalization.
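For the fine-tuning proposal, one possible approach is to make only the last few backbone layers trainable and then recompile with a small learning rate. The helper below is a hypothetical sketch; the name `unfreeze_top_layers` and the number of layers to unfreeze are illustrative choices, not values from the project.

```python
def unfreeze_top_layers(backbone, num_layers):
    """Make only the last `num_layers` layers of `backbone` trainable.

    Works on any object exposing a `layers` list whose items have a
    `trainable` attribute, such as a Keras model.
    """
    backbone.trainable = True
    cutoff = len(backbone.layers) - num_layers
    for index, layer in enumerate(backbone.layers):
        layer.trainable = index >= cutoff
    return backbone
```

After unfreezing, a Keras model needs to be compiled again for the change to take effect, for example with `Adam(learning_rate=1e-5)`; that value is only an illustrative starting point. It is also common practice to keep BatchNormalization layers frozen during fine-tuning.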
Versions
The project was developed with the following libraries and versions:
| Package | Version |
|---|---|
| Python | 3.12.3 |
| TensorFlow | 2.18.0 |
| Keras | 3.7.0 |
| scikit-learn | 1.5.1 |
| matplotlib | 3.9.2 |
| numpy | 1.26.4 |
| pandas | 2.2.2 |
| seaborn | 0.13.2 |
References
- Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely Connected Convolutional Networks. arXiv:1608.06993.
- Breviglieri, P. Pneumonia X-Ray Images. Kaggle.
- Mooney, P. Chest X-Ray Images (Pneumonia). Kaggle.
- Optimizing DenseNet for Potato Leaf Disease Classification. arXiv:2402.03347. (Reference paper for the adapted architecture.)