This project provides a complete pipeline to train a machine learning model capable of classifying DICOM images. The process uses a pre-trained ResNet50 model to extract visual features from the images and a Support Vector Classifier (SVC) to perform the final classification.
The workflow is divided into four main phases, each corresponding to a Python script.
```
.
├── input_dicom/                # <-- Place your original, unsorted DICOM files here (can have subfolders)
│   ├── folder1/
│   │   └── original_image1
│   └── original_image2
├── datos_etiquetados/          # (Generated by you after Phase 1)
│   ├── FDT/                    # <-- Place "FDT" class images here
│   ├── MTF/                    # <-- Place "MTF" class images here
│   └── TOR/                    # <-- Place "TOR" class images here
├── dataset_final/              # (Generated by Phase 2)
│   ├── dataset_X.npy
│   └── dataset_y.npy
├── modelo_final/               # (Generated by Phase 3)
│   ├── clasificador_svc.joblib
│   ├── escalador.joblib
│   └── matriz_de_confusion.png
├── fase1_descompresion.py      # Script for Phase 1
├── fase2_generar_vectores.py   # Script for Phase 2
├── fase3_entrenamiento_svc.py  # Script for Phase 3
├── fase4_inferencia.py         # Script for Phase 4 (inference)
├── config.py                   # Simplified configuration file
├── utils.py                    # Utility functions
└── pyproject.toml              # Project dependencies
```
This project requires Python 3.13 or higher. The required libraries are listed in the `pyproject.toml` file. You can install them with a package manager such as pip or uv.

```shell
# Example installation with pip
pip install numpy torch torchvision pillow pydicom scikit-learn seaborn joblib python-gdcm pylibjpeg pylibjpeg-libjpeg
```

Follow these phases in order to go from raw DICOM files to a fully functional classifier.
Phase 1 reads all valid DICOM files from the `input_dicom` directory (including subdirectories), decompresses them, and renames them in chronological order based on their acquisition time.
➡️ Action:
- Place all your original DICOM files (with or without extension) inside the `input_dicom` folder. You can organize them in subfolders if you wish.
- Run the script from your terminal:

  ```shell
  python fase1_descompresion.py
  ```
➡️ Result:
- A new folder named `f1_descomprimidos` will be created, containing the standardized DICOM files. The files will be named chronologically (e.g., `Img1_...`, `Img2_...`).
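The chronological renaming can be sketched in plain Python. In the real script the timestamps would presumably come from each file's DICOM `AcquisitionTime` tag via pydicom; here a hypothetical `rename_chronologically` helper works on pre-parsed `(filename, acquisition_time)` pairs:

```python
from datetime import datetime

def rename_chronologically(files):
    """Given (filename, acquisition_time) pairs, return a mapping from each
    original name to its chronological name (Img1_..., Img2_..., ...)."""
    ordered = sorted(files, key=lambda item: item[1])  # oldest first
    return {name: f"Img{i}_{name}" for i, (name, _) in enumerate(ordered, start=1)}

# Hypothetical acquisition times parsed from the DICOM AcquisitionTime tag
files = [
    ("scan_b", datetime(2024, 5, 1, 10, 30)),
    ("scan_a", datetime(2024, 5, 1, 9, 15)),
    ("scan_c", datetime(2024, 5, 1, 11, 0)),
]
print(rename_chronologically(files))
# {'scan_a': 'Img1_scan_a', 'scan_b': 'Img2_scan_b', 'scan_c': 'Img3_scan_c'}
```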
Phase 1.5 is a manual labeling step where you provide the ground truth for the model.
➡️ Action:
- Create a folder named `datos_etiquetados`.
- Inside `datos_etiquetados`, create subfolders for each of your classes (e.g., `FDT`, `MTF`, `TOR`).
- Move the files from `f1_descomprimidos` into their corresponding class folder inside `datos_etiquetados`.
Phase 2 processes the labeled images, applies data augmentation, and uses a pre-trained ResNet50 model to extract a feature vector for each image. At the end, it consolidates all data into two master files for training.
The script uses a differentiated data augmentation strategy:
- Aggressive augmentation for classes with objects (`MTF`, `TOR`), including full rotations and large translations, to teach the model position/rotation invariance.
- Gentle augmentation for the background class (`FDT`), focusing on brightness and contrast changes to preserve the background structure.
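The two regimes can be sketched with plain NumPy (the real script likely builds on torchvision transforms; the function names and shift ranges here are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)

def aggressive_augment(img):
    """For object classes (MTF, TOR): random rotation plus a large
    translation, teaching position/rotation invariance."""
    img = np.rot90(img, k=rng.integers(0, 4))        # random 0/90/180/270 rotation
    dy, dx = rng.integers(-8, 9, size=2)             # large random shift
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

def gentle_augment(img):
    """For the background class (FDT): brightness/contrast jitter only,
    preserving the spatial structure of the background."""
    gain = rng.uniform(0.9, 1.1)     # contrast
    bias = rng.uniform(-0.05, 0.05)  # brightness
    return np.clip(img * gain + bias, 0.0, 1.0)

img = rng.random((32, 32))  # stand-in for a normalized grayscale image
print(aggressive_augment(img).shape, gentle_augment(img).shape)
```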
➡️ Action:
- Ensure you have completed Phase 1.5 and your labeled images are in the `datos_etiquetados` folder.
- Run the script:

  ```shell
  python fase2_generar_vectores.py
  ```
➡️ Result:
- A new folder named `dataset_final` will be created, containing:
  - `dataset_X.npy`: an array with all the feature vectors.
  - `dataset_y.npy`: an array with the corresponding label for each vector.
- An `output_procesado_aumentado` folder is also created for verification purposes.
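The consolidation step can be sketched as follows. Random vectors stand in for the ResNet50 features (2048 is the size of ResNet50's pooled penultimate layer, which is the usual choice for feature extraction; the real script's dimensions may differ):

```python
import os
import tempfile
import numpy as np

# Stand-in for ResNet50 feature extraction: one vector per (augmented) image
features = [np.random.rand(2048).astype(np.float32) for _ in range(6)]
labels = ["FDT", "FDT", "MTF", "MTF", "TOR", "TOR"]

out_dir = tempfile.mkdtemp()
X = np.stack(features)  # shape: (n_samples, 2048)
y = np.array(labels)    # shape: (n_samples,)
np.save(os.path.join(out_dir, "dataset_X.npy"), X)
np.save(os.path.join(out_dir, "dataset_y.npy"), y)

# Quick sanity check before training
X2 = np.load(os.path.join(out_dir, "dataset_X.npy"))
y2 = np.load(os.path.join(out_dir, "dataset_y.npy"))
print(X2.shape, y2.shape)  # (6, 2048) (6,)
```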
Phase 3 takes the consolidated dataset from Phase 2, splits it into training and test sets, and trains a Support Vector Classifier (SVC) model.
➡️ Action:
- Ensure the `dataset_final` folder with its `.npy` files exists.
- Run the training script:

  ```shell
  python fase3_entrenamiento_svc.py
  ```
➡️ Result:
- The script will print a Classification Report and Global Accuracy to the console, showing the model's performance on the test data.
- A new folder `modelo_final` will be created, containing:
  - `clasificador_svc.joblib`: the trained and saved model.
  - `escalador.joblib`: the saved data scaler, crucial for inference.
  - `matriz_de_confusion.png`: a visual plot of the model's performance.
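The training step can be sketched as follows. Synthetic, well-separated data stands in for the real feature vectors, and the hyperparameters are scikit-learn defaults rather than those of `fase3_entrenamiento_svc.py`:

```python
import os
import tempfile
import numpy as np
import joblib
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Synthetic stand-in for dataset_X.npy / dataset_y.npy: three tight clusters
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(30, 16)) for c in (0.0, 1.0, 2.0)])
y = np.repeat(["FDT", "MTF", "TOR"], 30)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)

scaler = StandardScaler().fit(X_tr)          # fit the scaler on training data only
clf = SVC(probability=True, random_state=0)  # probability=True enables confidence scores
clf.fit(scaler.transform(X_tr), y_tr)

acc = accuracy_score(y_te, clf.predict(scaler.transform(X_te)))
print(f"Global accuracy: {acc:.2%}")

# Persist both artifacts: the scaler is required again at inference time
out = tempfile.mkdtemp()
joblib.dump(clf, os.path.join(out, "clasificador_svc.joblib"))
joblib.dump(scaler, os.path.join(out, "escalador.joblib"))
```

Saving the fitted scaler alongside the classifier matters: at inference time, new feature vectors must be transformed with exactly the same statistics the model was trained on.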
Phase 4, the final script, uses the trained model to classify a new, unseen DICOM image.
➡️ Action:
- Place a new DICOM image you want to classify in the project's root directory (or any other location).
- Open the `fase4_inferencia.py` script and modify the `RUTA_IMAGEN_NUEVA` variable to point to your new image file.
- Run the inference script:

  ```shell
  python fase4_inferencia.py
  ```
➡️ Result:
- The script will print the final prediction to the console, showing the classified label and the confidence score for each class. For example:

  ```
  ✅ La imagen ha sido clasificada como: 'FDT'
  📊 Confianza por clase:
     - FDT: 98.14%
     - MTF: 1.55%
     - TOR: 0.31%
  ```
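The inference path can be sketched as below. A tiny stand-in model is trained and saved inline so the example is self-contained; in the real script the artifacts live in `modelo_final/` and the input vector comes from ResNet50 feature extraction:

```python
import os
import tempfile
import numpy as np
import joblib
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# --- Setup: train and save a tiny stand-in for Phase 3's real artifacts ---
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.1, size=(20, 8)) for c in (0.0, 1.0, 2.0)])
y = np.repeat(["FDT", "MTF", "TOR"], 20)
scaler = StandardScaler().fit(X)
clf = SVC(probability=True, random_state=0).fit(scaler.transform(X), y)

model_dir = tempfile.mkdtemp()
joblib.dump(clf, os.path.join(model_dir, "clasificador_svc.joblib"))
joblib.dump(scaler, os.path.join(model_dir, "escalador.joblib"))

# --- Inference: load artifacts, scale the new vector, predict with confidence ---
clf = joblib.load(os.path.join(model_dir, "clasificador_svc.joblib"))
scaler = joblib.load(os.path.join(model_dir, "escalador.joblib"))

new_vector = rng.normal(0.0, 0.1, size=(1, 8))  # stand-in for a ResNet50 feature vector
probs = clf.predict_proba(scaler.transform(new_vector))[0]
pred = clf.classes_[np.argmax(probs)]
print(f"Classified as: {pred!r}")
for label, p in zip(clf.classes_, probs):
    print(f"  - {label}: {p:.2%}")
```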