Skin Lesion Classification Task

Introduction

In this assignment, I was tasked with utilizing the PyTorch libraries to train and evaluate a set of deep learning models. The assignment revolved around two distinct approaches:

Transfer learning approach: In this approach, I leveraged pre-trained models as a starting point. Transfer learning involves utilizing the knowledge gained from training on a large dataset (typically ImageNet) and applying it to a different but related task, such as classifying skin lesions. By building upon the pre-trained models, I aimed to benefit from their learned features and adapt them to improve the performance of my skin lesion classifier.

Building a classifier model from scratch: In addition to the transfer learning approach, I was required to implement my own classifier model from scratch. This involved designing and training a deep learning model specifically tailored for classifying skin lesions. I had the freedom to experiment with different architectures, layers, and optimization techniques to create an effective and accurate classifier.

Both approaches provided valuable insights into the field of deep learning and allowed me to explore different strategies for developing a skin lesion classifier.

Step 1: Transfer Learning Approach

In this step, I utilized the ResNet50 model, which has been pretrained on the ImageNet dataset. You can access the pretrained model weights here. For this task, I trained three different models using the following fine-tuning strategies:

Model-1: With Frozen Layers: In this approach, I kept the pretrained layers of the ResNet50 model fixed and only trained the classifier layer. By freezing the other layers, I leveraged the learned features of the pretrained model while focusing on adapting the classifier to the specific task of skin lesion classification.

Model-2: With Partially Open Layers: In this model, I selectively opened and fine-tuned the last half of the ResNet50 model while also training the classifier layer. By unfreezing and fine-tuning these layers, I allowed the model to adjust and improve its performance based on the target task.

Model-3: With Reduced Architecture: In this approach, I added a classifier layer at the end of the first half of the ResNet50 model. By removing the second half of the pretrained model, I effectively reduced the model to half its original size. In this case, the first half of this modified model remained frozen, while the second half was fine-tuned during training. This allowed the model to focus on learning representations specific to skin lesion classification within the constrained architecture.

By applying these three strategies, I explored different ways to utilize the ResNet50 model for skin lesion classification and evaluated the effectiveness of these approaches in achieving accurate and robust results.

Step 2: Building an Image Classification Model From Scratch

In this step, I created an image classification model from scratch without utilizing pre-trained weights. I designed a convolutional neural network (CNN) architecture specifically tailored for the task of skin lesion classification. I used the PyTorch libraries for image preprocessing tasks such as standardizing images and applying data augmentation techniques.

For image preprocessing, I took advantage of PyTorch’s functions to perform tasks like image normalization, resizing, and data augmentation techniques such as random cropping, flipping, and rotation. These preprocessing steps were crucial for enhancing the model’s ability to learn discriminative features from skin lesion images.
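
A minimal sketch of such a training-time pipeline, combining the augmentation steps named above with the normalization used elsewhere in this notebook (the exact crop and rotation parameters here are my own illustrative choices, not values taken from the trained models):

import torchvision.transforms as transforms

# Augmentations belong on the training split only; validation and test images
# should keep the deterministic resize + normalize pipeline.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),      # random cropping
    transforms.RandomHorizontalFlip(),      # random flipping
    transforms.RandomRotation(degrees=15),  # random rotation
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])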

When designing my CNN architecture, I experimented with various configurations of convolutional layers, pooling layers, and fully connected layers. I adjusted the number of layers, filter sizes, and the number of neurons in the fully connected layers to optimize the model’s performance. By building the model from scratch, I had the opportunity to explore and implement innovative architectural choices and evaluate their impact on the task of skin lesion classification.

Visualization and Evaluation of Models

After completing the model training, I visually analyzed and evaluated the performance of the trained models. These observations were included in my report. The following components were expected:

Visualizations:

Loss Graph: Displayed the training and validation loss curves for each model. These graphs provided insights into the training progress and how close the models were to convergence over the epochs.

Confusion Matrix: Presented a confusion matrix for each model, illustrating the classification performance across different skin lesion categories. This visual assessment helped in understanding the model’s ability to correctly predict and distinguish between different classes.

Sample Images: I included a selection of test set samples. For each model, I showcased five correctly classified test samples and five incorrectly classified test samples. This visual representation provided an intuitive understanding of the model’s strengths and weaknesses.

Evaluation:

Comparison of Results: I compared and analyzed the performance of the transfer learning models and the model trained from scratch. To assess the effectiveness of these models in multi-category classification, I used metrics such as accuracy, precision, recall, and F1-score. These metrics were summarized and compared in a table, providing a concise overview and allowing for easy comparison and analysis.

By examining the loss curves, confusion matrices, and sample images, I gained insights into the learning patterns, class separability, and error tendencies of the models. Additionally, the evaluation of accuracy metrics and the comparison table provided a quantitative assessment of the models’ classification capabilities. This comprehensive evaluation helped me analyze the strengths and weaknesses of different approaches and compare their effectiveness in skin lesion classification.
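
The comparison table itself can be generated directly from the collected test-set predictions. A minimal sketch, assuming per-model prediction arrays preds_1, preds_2, preds_3 (hypothetical names) and the shared test_labels array gathered in the evaluation cells below:

import pandas as pd
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rows = []
for name, preds in [("Model-1", preds_1), ("Model-2", preds_2), ("Model-3", preds_3)]:
    rows.append({
        "model": name,
        "accuracy": accuracy_score(test_labels, preds),
        "precision": precision_score(test_labels, preds),  # malignant as positive class
        "recall": recall_score(test_labels, preds),
        "f1": f1_score(test_labels, preds),
    })
print(pd.DataFrame(rows).to_string(index=False))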

Report

  • Custom Design CNN Model: In this report, I provided a detailed description of the convolutional neural network (CNN) model I built from scratch. I explained each layer in the model, its configurations, and the specific design choices I made. Additionally, I included a block diagram to visually illustrate the data flow and the connections between different layers within the model. This diagram provided a clearer understanding of the overall architecture.

While designing the CNN, I experimented with various configurations of convolutional layers, pooling layers, and fully connected layers. By adjusting parameters like the number of layers, filter sizes, and the number of neurons in fully connected layers, I aimed to optimize the model’s performance. Image preprocessing steps such as data augmentation and normalization also played a key role in improving the model’s effectiveness.

  • Test Results & Visualizations: In this section, I presented the test results for both the transfer learning models and the model I built from scratch. This section included:

Loss Graphs: I displayed graphs showing the changes in training and validation losses over time, which helped evaluate the progress and convergence of each model during training.

Confusion Matrices: I presented confusion matrices for each model to analyze their ability to correctly classify different categories of skin lesions.

Sample Images: I included examples from the test set that were both correctly and incorrectly classified, which provided visual insights into the models’ strengths and weaknesses.

Performance Comparison Table: I created a table comparing the accuracy, precision, recall, and F1-scores of all the models, enabling a clear comparison of their results.

  • Conclusion and Future Work: I summarized the key findings and conclusions from my experiments. I observed that transfer learning methods, especially with pre-trained models, tend to outperform custom-built models in terms of both speed and accuracy, particularly when working with limited datasets. However, building a model from scratch offered more control over the architecture and allowed for creative solutions specific to the task.

For future work, I plan to experiment with larger datasets and deeper, more complex architectures to further enhance model performance. Additionally, extending the classification to more detailed categories of skin lesions and adding more classes could be an interesting direction for future research.

1- Transfer learning approach:

1.1- Model 1: with frozen layers

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.models as models
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset
import pandas as pd
from PIL import Image

# Defining the dataset class for training and validation

class SkinLesionDataset(Dataset):
    def __init__(self, data_dir, df, transform=None):
        self.data_dir = data_dir
        self.df = df
        self.transform = transform

    def __getitem__(self, index):
        image_id = self.df.iloc[index]["image_id"]
        image_path = f"{self.data_dir}/{image_id}.jpg"
        image = Image.open(image_path).convert("RGB")

        if self.transform is not None:
            image = self.transform(image)

        label_str = self.df.iloc[index]["label"]
        label = 1 if label_str == "malignant" else 0

        return image, label

    def __len__(self):
        return len(self.df)

class SkinLesionDatasetForTest(Dataset):
    def __init__(self, data_dir, df, transform=None):
        self.data_dir = data_dir
        self.df = df
        self.transform = transform

    def __getitem__(self, index):
        image_id = self.df.iloc[index]["image_id"]
        image_path = f"{self.data_dir}/{image_id}.jpg"
        image = Image.open(image_path).convert("RGB")

        if self.transform is not None:
            image = self.transform(image)

        # The test ground-truth CSV stores labels as floats (0.0 / 1.0)
        label_str = self.df.iloc[index]["label"]
        label = 1 if label_str == 1.0 else 0

        return image, label

    def __len__(self):
        return len(self.df)

# Read the CSV files
train_csv_file = "ISBI2016_ISIC_Training_GroundTruth.csv"
val_csv_file = "ISBI2016_ISIC_Validation_GroundTruth.csv"
test_csv_file = "ISBI2016_ISIC_Test_GroundTruth.csv"

train_df = pd.read_csv(train_csv_file, delimiter=",", header=None, names=["image_id", "label"])
val_df = pd.read_csv(val_csv_file, delimiter=",", header=None, names=["image_id", "label"])
test_df = pd.read_csv(test_csv_file, delimiter=",", header=None, names=["image_id", "label"])

# Directories for the training, validation, and test images
train_data_dir = "ISBI2016_ISIC_Training_Data"
val_data_dir = "ISBI2016_ISIC_Validation_Data"
test_data_dir = "ISBI2016_ISIC_Test_Data"

# Define the transformations
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Create the dataset objects
train_dataset = SkinLesionDataset(train_data_dir, train_df, transform=transform)
val_dataset = SkinLesionDataset(val_data_dir, val_df, transform=transform)
test_dataset = SkinLesionDatasetForTest(test_data_dir, test_df, transform=transform)

# Create the data loaders
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Load the pretrained ResNet50 model (default ImageNet weights, ResNet50_Weights.IMAGENET1K_V1)
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze all pretrained layers so that only the new classifier layer is trained
for param in model.parameters():
    param.requires_grad = False

# Replace the last fully connected layer
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 2)  # 2 classes: benign and malignant

# Move the model to the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Define the loss function and optimizer (only parameters that require gradients are optimized)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.001)

# Training and evaluation loop
num_epochs = 10

for epoch in range(num_epochs):
    # Training
    model.train()
    train_loss = 0.0
    train_correct = 0

    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()

        outputs = model(images)
        loss = criterion(outputs, labels)

        _, preds = torch.max(outputs, 1)
        train_correct += torch.sum(preds == labels.data)

        loss.backward()
        optimizer.step()

        train_loss += loss.item() * images.size(0)

    train_loss = train_loss / len(train_dataset)
    train_acc = train_correct.double() / len(train_dataset)

    # Validation
    model.eval()
    val_loss = 0.0
    val_correct = 0

    with torch.no_grad():
        for images, labels in val_loader:
            images = images.to(device)
            labels = labels.to(device)

            outputs = model(images)
            loss = criterion(outputs, labels)

            _, preds = torch.max(outputs, 1)
            val_correct += torch.sum(preds == labels.data)

            val_loss += loss.item() * images.size(0)

    val_loss = val_loss / len(val_dataset)
    val_acc = val_correct.double() / len(val_dataset)

    print(f"Epoch {epoch+1}/{num_epochs}:"
          f" Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f},"
          f" Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")

# Evaluation on the test set
test_correct = 0

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)

        _, preds = torch.max(outputs, 1)
        test_correct += torch.sum(preds == labels.data)

test_acc = test_correct.double() / len(test_dataset)
print(f"Test Accuracy: {test_acc:.4f}")
Epoch 1/10: Train Loss: 0.5386, Train Acc: 0.7901, Val Loss: 13.0946, Val Acc: 0.8667
Epoch 2/10: Train Loss: 0.4343, Train Acc: 0.8111, Val Loss: 0.8220, Val Acc: 0.6111
Epoch 3/10: Train Loss: 0.4542, Train Acc: 0.7914, Val Loss: 0.4609, Val Acc: 0.8667
Epoch 4/10: Train Loss: 0.4548, Train Acc: 0.8037, Val Loss: 0.4562, Val Acc: 0.8333
Epoch 5/10: Train Loss: 0.4845, Train Acc: 0.8012, Val Loss: 5.2942, Val Acc: 0.8667
Epoch 6/10: Train Loss: 0.4583, Train Acc: 0.8086, Val Loss: 0.6526, Val Acc: 0.7333
Epoch 7/10: Train Loss: 0.4698, Train Acc: 0.8037, Val Loss: 0.3620, Val Acc: 0.8778
Epoch 8/10: Train Loss: 0.4457, Train Acc: 0.8025, Val Loss: 0.4424, Val Acc: 0.8778
Epoch 9/10: Train Loss: 0.4188, Train Acc: 0.8173, Val Loss: 0.5997, Val Acc: 0.9000
Epoch 10/10: Train Loss: 0.4061, Train Acc: 0.8210, Val Loss: 0.6453, Val Acc: 0.8778
Test Accuracy: 0.8206

1.2- Model 2: with partially open layers

# Load the pretrained ResNet50 model with the default weights (ResNet50_Weights.IMAGENET1K_V1)
model_2 = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze everything first, then unfreeze the last half of the model (layer3 and layer4)
for param in model_2.parameters():
    param.requires_grad = False
for param in model_2.layer3.parameters():
    param.requires_grad = True
for param in model_2.layer4.parameters():
    param.requires_grad = True

# Replace the last fully connected layer
num_features = model_2.fc.in_features
model_2.fc = nn.Linear(num_features, 2)  # 2 classes: benign and malignant

# Move the model to the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_2 = model_2.to(device)

# Define the loss function and optimizer (only parameters that require gradients are optimized)
criterion = nn.CrossEntropyLoss()
optimizer_2 = optim.Adam(filter(lambda p: p.requires_grad, model_2.parameters()), lr=0.001)

# Training and evaluation loop for Model 2
for epoch in range(num_epochs):
    # Training
    model_2.train()
    train_loss = 0.0
    train_correct = 0

    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)

        optimizer_2.zero_grad()

        outputs = model_2(images)
        loss = criterion(outputs, labels)

        _, preds = torch.max(outputs, 1)
        train_correct += torch.sum(preds == labels.data)

        loss.backward()
        optimizer_2.step()

        train_loss += loss.item() * images.size(0)

    train_loss = train_loss / len(train_dataset)
    train_acc = train_correct.double() / len(train_dataset)

    # Validation
    model_2.eval()
    val_loss = 0.0
    val_correct = 0

    with torch.no_grad():
        for images, labels in val_loader:
            images = images.to(device)
            labels = labels.to(device)

            outputs = model_2(images)
            loss = criterion(outputs, labels)

            _, preds = torch.max(outputs, 1)
            val_correct += torch.sum(preds == labels.data)

            val_loss += loss.item() * images.size(0)

    val_loss = val_loss / len(val_dataset)
    val_acc = val_correct.double() / len(val_dataset)

    print(f"Model 2 - Epoch {epoch+1}/{num_epochs}:"
          f" Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f},"
          f" Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")

# Evaluation on the test set for Model 2
test_correct_2 = 0

model_2.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model_2(images)

        _, preds = torch.max(outputs, 1)
        test_correct_2 += torch.sum(preds == labels.data)

test_acc_2 = test_correct_2.double() / len(test_dataset)
print(f"Model 2 - Test Accuracy: {test_acc_2:.4f}")
Model 2 - Epoch 1/10: Train Loss: 0.6450, Train Acc: 0.7728, Val Loss: 1.4463, Val Acc: 0.8000
Model 2 - Epoch 2/10: Train Loss: 0.4924, Train Acc: 0.8025, Val Loss: 0.7435, Val Acc: 0.7111
Model 2 - Epoch 3/10: Train Loss: 0.4801, Train Acc: 0.7914, Val Loss: 1.4627, Val Acc: 0.7778
Model 2 - Epoch 4/10: Train Loss: 0.4401, Train Acc: 0.8049, Val Loss: 0.7477, Val Acc: 0.6333
Model 2 - Epoch 5/10: Train Loss: 0.4602, Train Acc: 0.8062, Val Loss: 0.5359, Val Acc: 0.8556
Model 2 - Epoch 6/10: Train Loss: 0.4094, Train Acc: 0.8259, Val Loss: 1.3895, Val Acc: 0.4333
Model 2 - Epoch 7/10: Train Loss: 0.4227, Train Acc: 0.8160, Val Loss: 0.5916, Val Acc: 0.8889
Model 2 - Epoch 8/10: Train Loss: 0.4044, Train Acc: 0.8173, Val Loss: 0.6277, Val Acc: 0.7889
Model 2 - Epoch 9/10: Train Loss: 0.3749, Train Acc: 0.8407, Val Loss: 0.5674, Val Acc: 0.8778
Model 2 - Epoch 10/10: Train Loss: 0.3800, Train Acc: 0.8383, Val Loss: 0.4910, Val Acc: 0.8667
Model 2 - Test Accuracy: 0.8100

1.3- Model 3: with reduced architecture

# Load the pretrained ResNet50 model with the default weights (ResNet50_Weights.IMAGENET1K_V1)
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Keep only the first half of ResNet50 (conv1 through layer2); the second half is removed
first_half = nn.Sequential(*list(resnet.children())[:6])

# Reduced model: truncated backbone + global average pooling + classifier layer
class ReducedResNet(nn.Module):
    def __init__(self, backbone, num_classes=2):  # 2 classes: benign and malignant
        super().__init__()
        self.backbone = backbone
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, num_classes)  # layer2 of ResNet50 outputs 512 channels

    def forward(self, x):
        x = self.backbone(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.fc(x)

model_3 = ReducedResNet(first_half)

# Freeze the first half of the reduced model (conv1 through layer1);
# layer2 and the classifier layer are fine-tuned during training
for child in list(model_3.backbone.children())[:5]:
    for param in child.parameters():
        param.requires_grad = False

# Move the model to the device
model_3 = model_3.to(device)

# Define the loss function and optimizer (only parameters that require gradients are optimized)
criterion = nn.CrossEntropyLoss()
optimizer_3 = optim.Adam(filter(lambda p: p.requires_grad, model_3.parameters()), lr=0.001)

# Training and evaluation loop for Model-3
for epoch in range(num_epochs):
    # Training
    model_3.train()
    train_loss = 0.0
    train_correct = 0

    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)

        optimizer_3.zero_grad()

        outputs = model_3(images)
        loss = criterion(outputs, labels)

        _, preds = torch.max(outputs, 1)
        train_correct += torch.sum(preds == labels.data)

        loss.backward()
        optimizer_3.step()

        train_loss += loss.item() * images.size(0)

    train_loss = train_loss / len(train_dataset)
    train_acc = train_correct.double() / len(train_dataset)

    # Validation
    model_3.eval()
    val_loss = 0.0
    val_correct = 0

    with torch.no_grad():
        for images, labels in val_loader:
            images = images.to(device)
            labels = labels.to(device)

            outputs = model_3(images)
            loss = criterion(outputs, labels)

            _, preds = torch.max(outputs, 1)
            val_correct += torch.sum(preds == labels.data)

            val_loss += loss.item() * images.size(0)

    val_loss = val_loss / len(val_dataset)
    val_acc = val_correct.double() / len(val_dataset)

    print(f"Model 3 - Epoch {epoch+1}/{num_epochs}:"
          f" Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f},"
          f" Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")

# Evaluation on the test set for Model-3
test_correct_3 = 0

model_3.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model_3(images)

        _, preds = torch.max(outputs, 1)
        test_correct_3 += torch.sum(preds == labels.data)

test_acc_3 = test_correct_3.double() / len(test_dataset)
print(f"Model-3 Test Accuracy: {test_acc_3:.4f}")
Model 3 - Epoch 1/10: Train Loss: 0.5055, Train Acc: 0.7975, Val Loss: 0.4412, Val Acc: 0.8556
Model 3 - Epoch 2/10: Train Loss: 0.4295, Train Acc: 0.8148, Val Loss: 0.5671, Val Acc: 0.7556
Model 3 - Epoch 3/10: Train Loss: 0.4247, Train Acc: 0.8148, Val Loss: 0.3774, Val Acc: 0.8667
Model 3 - Epoch 4/10: Train Loss: 0.3726, Train Acc: 0.8370, Val Loss: 0.4966, Val Acc: 0.7889
Model 3 - Epoch 5/10: Train Loss: 0.3781, Train Acc: 0.8346, Val Loss: 0.4125, Val Acc: 0.8667
Model 3 - Epoch 6/10: Train Loss: 0.3361, Train Acc: 0.8506, Val Loss: 0.3548, Val Acc: 0.8889
Model 3 - Epoch 7/10: Train Loss: 0.3577, Train Acc: 0.8519, Val Loss: 0.3344, Val Acc: 0.9000
Model 3 - Epoch 8/10: Train Loss: 0.3788, Train Acc: 0.8457, Val Loss: 0.4833, Val Acc: 0.8000
Model 3 - Epoch 9/10: Train Loss: 0.3386, Train Acc: 0.8617, Val Loss: 0.3358, Val Acc: 0.9000
Model 3 - Epoch 10/10: Train Loss: 0.3262, Train Acc: 0.8580, Val Loss: 0.3409, Val Acc: 0.9000
Model-3 Test Accuracy: 0.8153

1.4- Results of Models

Correct training run (these are the results from the latest, correctly trained versions of the models):

Model 1 - Test Accuracy: 0.8206

Train Loss: 0.5386, 0.4343, 0.4542, 0.4548, 0.4845, 0.4583, 0.4698, 0.4457, 0.4188, 0.4061
Train Acc: 0.7901, 0.8111, 0.7914, 0.8037, 0.8012, 0.8086, 0.8037, 0.8025, 0.8173, 0.8210
Val Loss: 13.0946, 0.8220, 0.4609, 0.4562, 5.2942, 0.6526, 0.3620, 0.4424, 0.5997, 0.6453
Val Acc: 0.8667, 0.6111, 0.8667, 0.8333, 0.8667, 0.7333, 0.8778, 0.8778, 0.9000, 0.8778

Model 2 - Test Accuracy: 0.8100

Train Loss: 0.6450, 0.4924, 0.4801, 0.4401, 0.4602, 0.4094, 0.4227, 0.4044, 0.3749, 0.3800
Train Acc: 0.7728, 0.8025, 0.7914, 0.8049, 0.8062, 0.8259, 0.8160, 0.8173, 0.8407, 0.8383
Val Loss: 1.4463, 0.7435, 1.4627, 0.7477, 0.5359, 1.3895, 0.5916, 0.6277, 0.5674, 0.4910
Val Acc: 0.8000, 0.7111, 0.7778, 0.6333, 0.8556, 0.4333, 0.8889, 0.7889, 0.8778, 0.8667

Model-3 Test Accuracy: 0.8153

Train Loss: 0.5055, 0.4295, 0.4247, 0.3726, 0.3781, 0.3361, 0.3577, 0.3788, 0.3386, 0.3262
Train Acc: 0.7975, 0.8148, 0.8148, 0.8370, 0.8346, 0.8506, 0.8519, 0.8457, 0.8617, 0.8580
Val Loss: 0.4412, 0.5671, 0.3774, 0.4966, 0.4125, 0.3548, 0.3344, 0.4833, 0.3358, 0.3409
Val Acc: 0.8556, 0.7556, 0.8667, 0.7889, 0.8667, 0.8889, 0.9000, 0.8000, 0.9000, 0.9000

import matplotlib.pyplot as plt

train_loss_1 = [0.5386, 0.4343, 0.4542, 0.4548, 0.4845, 0.4583, 0.4698, 0.4457, 0.4188, 0.4061]
train_acc_1 =  [0.7901, 0.8111, 0.7914, 0.8037, 0.8012, 0.8086, 0.8037, 0.8025, 0.8173, 0.8210]
val_loss_1 =   [13.0946,0.8220, 0.4609, 0.4562, 5.2942, 0.6526, 0.3620, 0.4424, 0.5997, 0.6453]
val_acc_1 =    [0.8667, 0.6111, 0.8667, 0.8333, 0.8667, 0.7333, 0.8778, 0.8778, 0.9000, 0.8778]

train_loss_2 = [0.6450, 0.4924, 0.4801, 0.4401, 0.4602, 0.4094, 0.4227, 0.4044, 0.3749, 0.3800]
train_acc_2 =  [0.7728, 0.8025, 0.7914, 0.8049, 0.8062, 0.8259, 0.8160, 0.8173, 0.8407, 0.8383]
val_loss_2 =   [1.4463, 0.7435, 1.4627, 0.7477, 0.5359, 1.3895, 0.5916, 0.6277, 0.5674, 0.4910]
val_acc_2 =    [0.8000, 0.7111, 0.7778, 0.6333, 0.8556, 0.4333, 0.8889, 0.7889, 0.8778, 0.8667]

train_loss_3 = [0.5055, 0.4295, 0.4247, 0.3726, 0.3781, 0.3361, 0.3577, 0.3788, 0.3386, 0.3262]
train_acc_3 =  [0.7975, 0.8148, 0.8148, 0.8370, 0.8346, 0.8506, 0.8519, 0.8457, 0.8617, 0.8580]
val_loss_3 =   [0.4412, 0.5671, 0.3774, 0.4966, 0.4125, 0.3548, 0.3344, 0.4833, 0.3358, 0.3409]
val_acc_3 =    [0.8556, 0.7556, 0.8667, 0.7889, 0.8667, 0.8889, 0.9000, 0.8000, 0.9000, 0.9000]


plt.plot(train_loss_1, label='Model-1 Training Loss')
plt.plot(val_loss_1, label='Model-1 Validation Loss')
plt.plot(train_loss_2, label='Model-2 Training Loss')
plt.plot(val_loss_2, label='Model-2 Validation Loss')
plt.plot(train_loss_3, label='Model-3 Training Loss')
plt.plot(val_loss_3, label='Model-3 Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curves')
plt.legend()
plt.show()

^^ THE RESULTS ABOVE ARE FROM THE FINAL, CORRECTLY TRAINED MODELS ^^

(I kept the block below in my notebook because the model results kept changing across the many reruns, especially the validation loss. I eventually suspected that this instability was due to the 10% validation split: I used the first 10% of the training data as the validation data. One set of results from an earlier training run is preserved here.)
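
A randomized, stratified split would likely have given a less erratic validation loss than taking the first 10% of the rows. A minimal sketch of that alternative (full_train_df is a hypothetical name for the unsplit training DataFrame; this is not the split used for the results in this notebook):

from sklearn.model_selection import train_test_split

# Shuffle before splitting and preserve the benign/malignant ratio in both parts
train_df, val_df = train_test_split(
    full_train_df,
    test_size=0.10,
    stratify=full_train_df["label"],
    random_state=42,
)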

Model 1 - Test Accuracy: 0.8047

Train Loss:0.5435, 0.4601, 0.4975, 0.4708, 0.4537, 0.4823, 0.4608, 0.4963, 0.4737, 0.4382
Train Acc: 0.7963, 0.8062, 0.7951, 0.7988, 0.8025, 0.7975, 0.8074, 0.7938, 0.8123, 0.8074
Val Loss: 4.8731, 2.9035, 0.5113, 0.4200, 1.2071, 13.4837, 0.6148, 4.5843, 0.4652, 0.6397
Val Acc: 0.8222, 0.8667, 0.8778, 0.8778, 0.4778, 0.8222, 0.7667, 0.8556, 0.8778, 0.8667

Model 2 - Test Accuracy: 0.8100

Train Loss: 0.5781, 0.5239, 0.4941, 0.4409, 0.4269, 0.4372, 0.4563, 0.4408, 0.4256, 0.3935
Train Acc: 0.7741, 0.7877, 0.7963, 0.7975, 0.8247, 0.8086, 0.8210, 0.8062, 0.8210, 0.8247
Val Loss: 0.9259, 1.4470, 0.4640, 0.6841, 0.5459, 0.6064, 1.7164, 1.4912, 1.7312, 0.6335
Val Acc: 0.7778, 0.8667, 0.8222, 0.7000, 0.8333, 0.8000, 0.8556, 0.6889, 0.8667, 0.8889

Model-3 Test Accuracy: 0.8364

Train Loss: 0.5033, 0.4161, 0.4176, 0.3682, 0.3681, 0.3344, 0.3283, 0.3475, 0.3082, 0.3123
Train Acc: 0.7877, 0.8123, 0.8173, 0.8469, 0.8494, 0.8519, 0.8593, 0.8519, 0.8716, 0.8654
Val Loss: 0.3961, 0.3818, 0.5023, 0.3751, 0.4320, 0.4086, 0.4572, 0.3512, 0.4083, 0.3692
Val Acc: 0.8667, 0.8667, 0.8111, 0.8778, 0.8222, 0.8556, 0.7889, 0.8889, 0.8333, 0.8778

####### LOSS DATA FROM THE PREVIOUS (WRONG) TRAINING ATTEMPT ########
first_train_loss_1 = [0.5435, 0.4601, 0.4975, 0.4708, 0.4537, 0.4823, 0.4608, 0.4963, 0.4737, 0.4382]
first_train_acc_1 =  [0.7963, 0.8062, 0.7951, 0.7988, 0.8025, 0.7975, 0.8074, 0.7938, 0.8123, 0.8074]
first_val_loss_1 =   [4.8731, 2.9035, 0.5113, 0.4200, 1.2071, 13.4837,0.6148, 4.5843, 0.4652, 0.6397]
first_val_acc_1 =    [0.8222, 0.8667, 0.8778, 0.8778, 0.4778, 0.8222, 0.7667, 0.8556, 0.8778, 0.8667]

first_train_loss_2 = [0.5781, 0.5239, 0.4941, 0.4409, 0.4269, 0.4372, 0.4563, 0.4408, 0.4256, 0.3935]
first_train_acc_2 =  [0.7741, 0.7877, 0.7963, 0.7975, 0.8247, 0.8086, 0.8210, 0.8062, 0.8210, 0.8247]
first_val_loss_2 =   [0.9259, 1.4470, 0.4640, 0.6841, 0.5459, 0.6064, 1.7164, 1.4912, 1.7312, 0.6335]
first_val_acc_2 =    [0.7778, 0.8667, 0.8222, 0.7000, 0.8333, 0.8000, 0.8556, 0.6889, 0.8667, 0.8889]

first_train_loss_3 = [0.5033, 0.4161, 0.4176, 0.3682, 0.3681, 0.3344, 0.3283, 0.3475, 0.3082, 0.3123]
first_train_acc_3 =  [0.7877, 0.8123, 0.8173, 0.8469, 0.8494, 0.8519, 0.8593, 0.8519, 0.8716, 0.8654]
first_val_loss_3 =   [0.3961, 0.3818, 0.5023, 0.3751, 0.4320, 0.4086, 0.4572, 0.3512, 0.4083, 0.3692]
first_val_acc_3 =    [0.8667, 0.8667, 0.8111, 0.8778, 0.8222, 0.8556, 0.7889, 0.8889, 0.8333, 0.8778]

plt.plot(first_train_loss_1, label='Model-1 TL')
plt.plot(first_val_loss_1, label='Model-1 VL')
plt.plot(first_train_loss_2, label='Model-2 TL')
plt.plot(first_val_loss_2, label='Model-2 VL')
plt.plot(first_train_loss_3, label='Model-3 TL')
plt.plot(first_val_loss_3, label='Model-3 VL')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Previous model loss curves (from the old training run)')
plt.legend()
plt.show()

Model 1 loss curves

train_loss_1 = [0.5386, 0.4343, 0.4542, 0.4548, 0.4845, 0.4583, 0.4698, 0.4457, 0.4188, 0.4061]
train_acc_1 =  [0.7901, 0.8111, 0.7914, 0.8037, 0.8012, 0.8086, 0.8037, 0.8025, 0.8173, 0.8210]
val_loss_1 =   [13.0946,0.8220, 0.4609, 0.4562, 5.2942, 0.6526, 0.3620, 0.4424, 0.5997, 0.6453]
val_acc_1 =    [0.8667, 0.6111, 0.8667, 0.8333, 0.8667, 0.7333, 0.8778, 0.8778, 0.9000, 0.8778]
plt.plot(train_loss_1, label='Model-1 Training Loss')
plt.plot(val_loss_1, label='Model-1 Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curves of Model-1')
plt.legend()
plt.show()

Model 2 loss curves

train_loss_2 = [0.6450, 0.4924, 0.4801, 0.4401, 0.4602, 0.4094, 0.4227, 0.4044, 0.3749, 0.3800]
train_acc_2 =  [0.7728, 0.8025, 0.7914, 0.8049, 0.8062, 0.8259, 0.8160, 0.8173, 0.8407, 0.8383]
val_loss_2 =   [1.4463, 0.7435, 1.4627, 0.7477, 0.5359, 1.3895, 0.5916, 0.6277, 0.5674, 0.4910]
val_acc_2 =    [0.8000, 0.7111, 0.7778, 0.6333, 0.8556, 0.4333, 0.8889, 0.7889, 0.8778, 0.8667]
plt.plot(train_loss_2, label='Model-2 Training Loss')
plt.plot(val_loss_2, label='Model-2 Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curves of Model-2')
plt.legend()
plt.show()

Model 3 loss curves

train_loss_3 = [0.5055, 0.4295, 0.4247, 0.3726, 0.3781, 0.3361, 0.3577, 0.3788, 0.3386, 0.3262]
train_acc_3 =  [0.7975, 0.8148, 0.8148, 0.8370, 0.8346, 0.8506, 0.8519, 0.8457, 0.8617, 0.8580]
val_loss_3 =   [0.4412, 0.5671, 0.3774, 0.4966, 0.4125, 0.3548, 0.3344, 0.4833, 0.3358, 0.3409]
val_acc_3 =    [0.8556, 0.7556, 0.8667, 0.7889, 0.8667, 0.8889, 0.9000, 0.8000, 0.9000, 0.9000]

plt.plot(train_loss_3, label='Model-3 Training Loss')
plt.plot(val_loss_3, label='Model-3 Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curves of Model-3')
plt.legend()
plt.show()

1.5- Reporting of Step 1: Transfer Learning Approach

1.5.1- Confusion matrix, Classification Report: precision, recall, f1-score, and support

1.5.1.1- Confusion matrix, Classification Report: precision, recall, f1-score, and support of Model 1

from sklearn.metrics import confusion_matrix, classification_report
import numpy as np

# Evaluation on the test set
test_predictions = []
test_labels = []

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)

        _, preds = torch.max(outputs, 1)
        test_predictions.extend(preds.cpu().numpy())
        test_labels.extend(labels.cpu().numpy())

test_predictions = np.array(test_predictions)
test_labels = np.array(test_labels)

# Compute metrics
print("Classification Report:")
print(classification_report(test_labels, test_predictions))

# Generate confusion matrix
print("Confusion Matrix:")
cm = confusion_matrix(test_labels, test_predictions)
print(cm)
Classification Report:
              precision    recall  f1-score   support

           0       0.84      0.96      0.90       304
           1       0.62      0.24      0.35        75

    accuracy                           0.82       379
   macro avg       0.73      0.60      0.62       379
weighted avg       0.79      0.82      0.79       379

Confusion Matrix:
[[293  11]
 [ 57  18]]
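
Reading the matrix row-wise (rows are true classes, columns are predictions): 293 of 304 benign samples but only 18 of 75 malignant samples are classified correctly, so the malignant recall is 18 / 75 = 0.24, matching the classification report. The low recall on the minority (malignant) class is the main weakness here.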

1.5.1.2- Confusion matrix, Classification Report: precision, recall, f1-score, and support of Model 2

from sklearn.metrics import confusion_matrix, classification_report
import numpy as np

# Evaluation on the test set
test_predictions = []
test_labels = []

model_2.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model_2(images)

        _, preds = torch.max(outputs, 1)
        test_predictions.extend(preds.cpu().numpy())
        test_labels.extend(labels.cpu().numpy())

test_predictions = np.array(test_predictions)
test_labels = np.array(test_labels)

# Compute metrics
print("Classification Report:")
print(classification_report(test_labels, test_predictions))

# Generate confusion matrix
print("Confusion Matrix:")
cm = confusion_matrix(test_labels, test_predictions)
print(cm)
Classification Report:
              precision    recall  f1-score   support

           0       0.82      0.98      0.89       304
           1       0.59      0.13      0.22        75

    accuracy                           0.81       379
   macro avg       0.70      0.56      0.55       379
weighted avg       0.77      0.81      0.76       379

Confusion Matrix:
[[297   7]
 [ 65  10]]

1.5.1.3- Confusion matrix, Classification Report: precision, recall, f1-score, and support of Model 3

from sklearn.metrics import confusion_matrix, classification_report
import numpy as np

# Evaluation on the test set
test_predictions = []
test_labels = []

model_3.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model_3(images)

        _, preds = torch.max(outputs, 1)
        test_predictions.extend(preds.cpu().numpy())
        test_labels.extend(labels.cpu().numpy())

test_predictions = np.array(test_predictions)
test_labels = np.array(test_labels)

# Compute metrics
print("Classification Report:")
print(classification_report(test_labels, test_predictions))

# Generate confusion matrix
print("Confusion Matrix:")
cm = confusion_matrix(test_labels, test_predictions)
print(cm)
Classification Report:
              precision    recall  f1-score   support

           0       0.82      0.98      0.90       304
           1       0.67      0.13      0.22        75

    accuracy                           0.82       379
   macro avg       0.74      0.56      0.56       379
weighted avg       0.79      0.82      0.76       379

Confusion Matrix:
[[299   5]
 [ 65  10]]
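
Pulling the test accuracies and the malignant-class (class 1) metrics from the three classification reports above gives the side-by-side comparison referred to in the report:

Model     Test Acc   Precision (1)   Recall (1)   F1 (1)
Model-1   0.8206     0.62            0.24         0.35
Model-2   0.8100     0.59            0.13         0.22
Model-3   0.8153     0.67            0.13         0.22

Model-1 reaches the best malignant recall and F1, while Model-3 attains the highest malignant precision.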

1.5.2- Example of Correctly Classified Samples and Incorrectly Classified Samples

1.5.2.1- Example of Correctly Classified Samples and Incorrectly Classified Samples of Model 1

# Display images (tensors are un-normalized before plotting so that imshow
# receives valid RGB values in [0, 1] and produces no clipping warnings)
def show_images(images, labels, predicted_labels, title):
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

    fig, axes = plt.subplots(2, min(5, len(images)), figsize=(12, 6))
    fig.suptitle(title, fontsize=14)

    for i, ax in enumerate(axes.flat):
        if i < len(images):
            img = (images[i].cpu() * std + mean).clamp(0, 1)
            ax.imshow(img.permute(1, 2, 0))
            ax.axis('off')
            ax.set_title(f"True: {int(labels[i])}\nPred: {int(predicted_labels[i])}")
        else:
            ax.axis('off')

# Get a batch of test images and labels
images, labels = next(iter(test_loader))

# Move images to device
images = images.to(device)

# Get model predictions (moved back to the CPU so they can be compared
# with the CPU label tensor and used for indexing)
model.eval()
with torch.no_grad():
    outputs = model(images)
    _, preds = torch.max(outputs, 1)
preds = preds.cpu()

# Get indices of correctly and incorrectly classified images
correct_indices = (preds == labels).nonzero().squeeze()
incorrect_indices = (preds != labels).nonzero().squeeze()

# Select correctly classified images
num_correct = min(5, len(correct_indices))
correct_images = images[correct_indices][:num_correct]
correct_labels = labels[correct_indices][:num_correct]
correct_preds = preds[correct_indices][:num_correct]

# Select incorrectly classified images
num_incorrect = min(5, len(incorrect_indices))
incorrect_images = images[incorrect_indices][:num_incorrect]
incorrect_labels = labels[incorrect_indices][:num_incorrect]
incorrect_preds = preds[incorrect_indices][:num_incorrect]

# Show the selected images
show_images(correct_images, correct_labels, correct_preds, "Correctly Classified Samples")
show_images(incorrect_images, incorrect_labels, incorrect_preds, "Incorrectly Classified Samples")

# Display the images
plt.show()

1.5.2.2- Example of Correctly Classified Samples and Incorrectly Classified Samples of Model 2

# Reuse the show_images helper defined above for Model 1

# Get a batch of test images and labels
images, labels = next(iter(test_loader))

# Move the images to the device
images = images.to(device)

# Get model predictions (moved back to the CPU for comparison and plotting)
model_2.eval()
with torch.no_grad():
    outputs = model_2(images)
    _, preds = torch.max(outputs, 1)
preds = preds.cpu()

# Get indices of correctly and incorrectly classified images (limited to 5 below)
correct_indices = (preds == labels).nonzero().squeeze()
incorrect_indices = (preds != labels).nonzero().squeeze()

# Select correctly classified images
num_correct = min(5, len(correct_indices))
correct_images = images[correct_indices][:num_correct]
correct_labels = labels[correct_indices][:num_correct]
correct_preds = preds[correct_indices][:num_correct]

# Select incorrectly classified images
num_incorrect = min(5, len(incorrect_indices))
incorrect_images = images[incorrect_indices][:num_incorrect]
incorrect_labels = labels[incorrect_indices][:num_incorrect]
incorrect_preds = preds[incorrect_indices][:num_incorrect]

# Show the selected images
show_images(correct_images, correct_labels, correct_preds, "Correctly Classified Samples")
show_images(incorrect_images, incorrect_labels, incorrect_preds, "Incorrectly Classified Samples")

# Display the images
plt.show()

1.5.2.3- Example of Correctly Classified Samples and Incorrectly Classified Samples of Model 3

# Reuse the show_images helper defined above for Model 1

# Get a batch of test images and labels
images, labels = next(iter(test_loader))

# Move the images to the device
images = images.to(device)

# Get model predictions (moved back to the CPU for comparison and plotting)
model_3.eval()
with torch.no_grad():
    outputs = model_3(images)
    _, preds = torch.max(outputs, 1)
preds = preds.cpu()

# Get indices of correctly and incorrectly classified images (limited to 5 below)
correct_indices = (preds == labels).nonzero().squeeze()
incorrect_indices = (preds != labels).nonzero().squeeze()

# Select correctly classified images
num_correct = min(5, len(correct_indices))
correct_images = images[correct_indices][:num_correct]
correct_labels = labels[correct_indices][:num_correct]
correct_preds = preds[correct_indices][:num_correct]

# Select incorrectly classified images
num_incorrect = min(5, len(incorrect_indices))
incorrect_images = images[incorrect_indices][:num_incorrect]
incorrect_labels = labels[incorrect_indices][:num_incorrect]
incorrect_preds = preds[incorrect_indices][:num_incorrect]

# Show the selected images
show_images(correct_images, correct_labels, correct_preds, "Correctly Classified Samples")
show_images(incorrect_images, incorrect_labels, incorrect_preds, "Incorrectly Classified Samples")

# Display the images
plt.show()

Step 2: Building an Image Classification Model From Scratch

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
import torchvision.transforms as transforms
from PIL import Image
import pandas as pd

# ----------------------------------- custom CNN architecture ----------------------------------- #

class SkinLesionClassifier(nn.Module):
    def __init__(self, num_classes):
        super(SkinLesionClassifier, self).__init__()

        # Convolutional Layers: 
        # The features sequential module contains three sets of operations: 
        # a 2D convolutional layer, a ReLU activation function, and a max pooling layer. 
        # This pattern is repeated three times to extract features from the input images at 
        # different spatial resolutions.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        
        
        
        # Fully Connected Layers: The classifier sequential module consists of two fully connected layers.
        # The first fully connected layer has 128 * 28 * 28 input features, which corresponds to the
        # flattened output of the last convolutional layer: three stride-2 max-pooling layers reduce
        # the 224x224 input to 224/2/2/2 = 28x28, with 128 output channels. It applies a ReLU
        # activation function and dropout regularization. The final fully connected layer reduces the
        # feature dimensionality to num_classes, the number of classes in the classification task.
        self.classifier = nn.Sequential(
            nn.Linear(128 * 28 * 28, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),

            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        # The forward method defines the forward pass of the model, 
        # where the input x flows through the convolutional layers (features) and is then flattened.
        x = self.features(x)
        x = x.view(x.size(0), -1)
        
        # Flattened tensor passes through the fully connected layers (classifier) to produce the output logits.
        x = self.classifier(x)
        return x
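
# Quick shape sanity check for the architecture above (illustrative, not part
# of the recorded training run, hence left commented out):
# dummy = torch.randn(1, 3, 224, 224)  # batch of one 224x224 RGB image
# print(SkinLesionClassifier(num_classes=2)(dummy).shape)  # -> torch.Size([1, 2])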

# Defining the transformation pipeline for image preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# Defining the dataset class for training and validation
class SkinLesionDataset(Dataset):
    def __init__(self, data_dir, df, transform=None):
        self.data_dir = data_dir
        self.df = df
        self.transform = transform

    def __getitem__(self, index):
        image_id = self.df.iloc[index]["image_id"]
        image_path = f"{self.data_dir}/{image_id}.jpg"
        image = Image.open(image_path).convert("RGB")

        if self.transform is not None:
            image = self.transform(image)

        label_str = self.df.iloc[index]["label"]
        label = 1 if label_str == "malignant" else 0

        return image, label

    def __len__(self):
        return len(self.df)

# Defining a separate dataset class for testing because the test ground truth stores labels as floats (0.0 / 1.0)
class SkinLesionDatasetForTest(Dataset):
    def __init__(self, data_dir, df, transform=None):
        self.data_dir = data_dir
        self.df = df
        self.transform = transform

    def __getitem__(self, index):
        image_id = self.df.iloc[index]["image_id"]
        image_path = f"{self.data_dir}/{image_id}.jpg"
        image = Image.open(image_path).convert("RGB")

        if self.transform is not None:
            image = self.transform(image)

        label_str = self.df.iloc[index]["label"]
        label = 1 if label_str == 1.0 else 0

        return image, label

    def __len__(self):
        return len(self.df)

# Reading the CSV files
train_csv_file = "ISBI2016_ISIC_Training_GroundTruth.csv"
val_csv_file = "ISBI2016_ISIC_Validation_GroundTruth.csv"
test_csv_file = "ISBI2016_ISIC_Test_GroundTruth.csv"

train_df = pd.read_csv(train_csv_file, delimiter=",", header=None, names=["image_id", "label"])
val_df = pd.read_csv(val_csv_file, delimiter=",", header=None, names=["image_id", "label"])
test_df = pd.read_csv(test_csv_file, delimiter=",", header=None, names=["image_id", "label"])

# Setting the image directories for the training, validation, and test splits
train_data_dir = "ISBI2016_ISIC_Training_Data"
val_data_dir = "ISBI2016_ISIC_Validation_Data"
test_data_dir = "ISBI2016_ISIC_Test_Data"

# Creating the dataset objects
train_dataset = SkinLesionDataset(train_data_dir, train_df, transform=transform)
val_dataset = SkinLesionDataset(val_data_dir, val_df, transform=transform)
test_dataset = SkinLesionDatasetForTest(test_data_dir, test_df, transform=transform)

# Creating the data loaders
batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
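
Malignant images are a small minority (75 of 379 in the test split, as the classification report later shows), so each shuffled batch is dominated by benign samples. One common remedy I did not apply in this run is a WeightedRandomSampler that oversamples the rare class; a minimal sketch:

# Optional: oversample malignant images during training (not used in this run).
from torch.utils.data import WeightedRandomSampler

train_labels = [1 if l == "malignant" else 0 for l in train_df["label"]]
class_counts = [train_labels.count(0), train_labels.count(1)]
weights = [1.0 / class_counts[l] for l in train_labels]  # rare class gets a larger weight
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
# balanced_loader = DataLoader(train_dataset, batch_size=batch_size, sampler=sampler)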

# Creating an instance of the skin lesion classifier
model = SkinLesionClassifier(num_classes=2)

# Selecting the device (GPU if available, otherwise CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Defining the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
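
An alternative (or complement) to resampling is to weight the loss itself so that mistakes on malignant samples cost more. A minimal sketch using the `weight` argument of CrossEntropyLoss; the 4:1 ratio below is my own rough assumption about the benign:malignant imbalance, not a tuned value:

# Optional: penalize errors on the minority class more heavily (not used in this run).
# The 4:1 ratio is an assumed approximation of the benign:malignant imbalance.
class_weights = torch.tensor([1.0, 4.0], device=device)
weighted_criterion = nn.CrossEntropyLoss(weight=class_weights)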

# Training and validation loop
num_epochs = 10

for epoch in range(num_epochs):
    # Training
    model.train()
    train_loss = 0.0
    train_correct = 0

    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()

        outputs = model(images)
        loss = criterion(outputs, labels)

        _, preds = torch.max(outputs, 1)
        train_correct += torch.sum(preds == labels.data)

        loss.backward()
        optimizer.step()

        train_loss += loss.item() * images.size(0)

    train_loss = train_loss / len(train_dataset)
    train_acc = train_correct.double() / len(train_dataset)

    # Validation
    model.eval()
    val_loss = 0.0
    val_correct = 0

    with torch.no_grad():
        for images, labels in val_loader:
            images = images.to(device)
            labels = labels.to(device)

            outputs = model(images)
            loss = criterion(outputs, labels)

            _, preds = torch.max(outputs, 1)
            val_correct += torch.sum(preds == labels.data)

            val_loss += loss.item() * images.size(0)

    val_loss = val_loss / len(val_dataset)
    val_acc = val_correct.double() / len(val_dataset)

    print(f"Epoch {epoch+1}/{num_epochs}:"
          f" Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f},"
          f" Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")

# Evaluation on the test set
test_correct = 0

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)

        _, preds = torch.max(outputs, 1)
        test_correct += torch.sum(preds == labels.data)

test_acc = test_correct.double() / len(test_dataset)
print(f"Test Accuracy: {test_acc:.4f}")
Epoch 1/10: Train Loss: 0.6883, Train Acc: 0.7840, Val Loss: 1.0978, Val Acc: 0.8667
Epoch 2/10: Train Loss: 0.4775, Train Acc: 0.8012, Val Loss: 0.7613, Val Acc: 0.8667
Epoch 3/10: Train Loss: 0.4766, Train Acc: 0.8012, Val Loss: 1.1496, Val Acc: 0.8667
Epoch 4/10: Train Loss: 0.4573, Train Acc: 0.8012, Val Loss: 0.5120, Val Acc: 0.8667
Epoch 5/10: Train Loss: 0.4660, Train Acc: 0.8025, Val Loss: 2.1202, Val Acc: 0.8667
Epoch 6/10: Train Loss: 0.4578, Train Acc: 0.7975, Val Loss: 2.8478, Val Acc: 0.8667
Epoch 7/10: Train Loss: 0.4420, Train Acc: 0.8123, Val Loss: 0.4304, Val Acc: 0.8667
Epoch 8/10: Train Loss: 0.4364, Train Acc: 0.8160, Val Loss: 0.5747, Val Acc: 0.8667
Epoch 9/10: Train Loss: 0.4234, Train Acc: 0.8099, Val Loss: 1.2480, Val Acc: 0.7333
Epoch 10/10: Train Loss: 0.4211, Train Acc: 0.8185, Val Loss: 1.7306, Val Acc: 0.8778
Test Accuracy: 0.8047

At first I thought the model never predicted class 1, but after looking at a few more batches I saw that it does; the all-benign first batch was just a coincidence.

# Checking whether the model ever predicts class 1; it does, just rarely.
test_correct = 0

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)

        x, preds = torch.max(outputs, 1)  # x holds the winning logit for each sample
        print(x)  # printing the max logits batch by batch to inspect the predictions
        test_correct += torch.sum(preds == labels.data)

test_acc = test_correct.double() / len(test_dataset)
print(f"Test Accuracy: {test_acc:.4f}")
tensor([ 0.0937,  2.1015,  3.6010,  2.4113,  1.9821,  1.7362,  1.2605,  0.8603,
         1.7896,  5.0396,  5.2531, 50.8374,  2.6454,  1.1652,  1.7367,  0.9046,
         1.6091,  1.8293,  2.7449,  3.1041,  0.3036,  3.9430,  1.1789,  1.3454,
         2.3033,  1.4861,  2.7959,  1.0077,  1.6401,  2.3750,  0.9363,  0.5674])
tensor([2.5817, 0.9873, 1.3215, 1.5310, 2.2378, 1.4115, 1.6017, 0.3990, 1.4902,
        0.4133, 1.6955, 0.1113, 0.2570, 1.0877, 1.4184, 0.9930, 2.4602, 2.8013,
        1.4107, 1.5000, 0.6790, 1.3751, 2.0535, 0.2843, 0.2412, 2.1956, 1.1011,
        0.7443, 1.7058, 1.3067, 1.9446, 0.0089])
tensor([ 0.8349,  0.3133,  1.3670,  3.3524,  3.5384,  0.2645,  1.5857,  3.4304,
        -0.4512, -0.3601,  0.1897,  2.4914, -0.0942,  0.1356,  2.0123, -0.0498,
         0.3288,  1.7425,  0.9330,  1.4010,  1.8567,  1.6810,  2.2702,  0.1694,
        -0.0960,  2.9003,  0.4485, -0.2392,  2.4882,  0.7698,  0.7547,  1.2207])
tensor([1.5043, 1.0157, 1.2613, 3.0613, 0.2635, 1.1433, 1.2761, 0.9536, 0.7417,
        1.4428, 2.0577, 1.7440, 0.3943, 2.7956, 1.1047, 1.3567, 1.5960, 1.1970,
        2.2807, 1.1686, 1.8387, 0.1625, 1.2236, 1.0147, 0.8317, 0.5501, 0.3410,
        0.8569, 1.2969, 0.1061, 1.9943, 0.7279])
tensor([ 0.8545,  0.8127,  1.2135,  4.1292,  1.6904,  0.9264, -0.2408,  1.1856,
         1.8489, 87.4981,  1.4319,  5.9103, 20.6343, -0.2789, 16.8895,  4.6065,
         3.1078, 42.5545,  4.8674, 43.5815, 79.5309, 29.3269,  4.9300, 19.0257,
        29.0336, 17.5688,  1.6997,  9.8101,  2.4479, 24.4624, 16.1752, 79.0457])
tensor([101.2009,   3.0091,  39.1507,   3.8212,   5.1475,   7.5182,  32.3012,
         35.0839,   1.6544,   1.4746,   1.1124,   9.7696,  88.0368,  77.9080,
          9.4977,  10.5813,  11.4202,   0.9530,   0.6935,   3.1215,   1.0884,
          1.7153,   1.3366,   0.5971,   1.6596,   1.7849,   0.3785,   2.0836,
          1.2670,   1.7869,   1.4597,   1.7633])
tensor([2.2995, 2.1081, 2.3648, 1.3362, 1.8183, 1.3060, 0.8119, 1.6908, 2.5642,
        3.0446, 0.6321, 2.1565, 2.3138, 1.0462, 1.9045, 2.1485, 0.3267, 2.2342,
        1.2976, 1.7640, 2.1171, 2.6122, 1.3607, 1.1441, 2.0413, 1.1485, 1.0785,
        1.8160, 1.3603, 1.3085, 2.5379, 1.6335])
tensor([2.0473, 0.9899, 0.3702, 1.3680, 2.0031, 1.6296, 0.4149, 1.6536, 0.7690,
        1.8846, 1.7417, 0.9885, 0.6942, 2.6520, 0.6851, 1.3870, 0.9074, 2.0471,
        1.0763, 1.4366, 2.3153, 0.6569, 1.2559, 1.0868, 0.7112, 1.0420, 1.3834,
        1.2426, 1.1627, 1.6373, 1.2623, 1.0946])
tensor([1.1277, 3.2547, 1.3066, 2.1425, 1.4845, 2.0286, 1.7710, 2.6210, 0.5834,
        1.3838, 1.3659, 1.0842, 0.8454, 1.5239, 2.1263, 2.7938, 1.1597, 1.8536,
        1.4809, 1.0453, 1.6131, 0.9616, 1.7907, 2.0963, 0.8363, 0.1313, 1.2552,
        2.6176, 3.3169, 0.9542, 0.6622, 1.2455])
tensor([0.9635, 2.2767, 2.1864, 1.4981, 0.0593, 2.2318, 2.6336, 1.8114, 2.5235,
        2.0843, 1.5744, 0.9195, 3.2828, 1.3356, 0.4239, 0.6484, 1.5783, 0.6217,
        2.2783, 1.1738, 0.7524, 1.6400, 0.4848, 1.1484, 2.0387, 0.7545, 1.7639,
        2.5285, 1.9325, 0.0385, 1.1693, 0.6602])
tensor([2.2509, 0.7275, 1.0698, 0.9149, 1.8650, 1.4176, 1.0211, 1.6861, 1.6237,
        2.1444, 1.6880, 0.9437, 2.0135, 1.0426, 1.4953, 0.3369, 1.3841, 1.7618,
        1.7872, 0.5326, 3.0921, 1.6711, 1.5916, 0.6846, 1.4660, 1.1372, 1.4315,
        2.6840, 1.2566, 2.0639, 1.0983, 2.5201])
tensor([1.1694, 2.6556, 1.7032, 1.5781, 1.4159, 0.9517, 2.5772, 2.2222, 2.1363,
        3.0863, 0.3688, 1.4876, 1.6623, 2.0530, 1.3271, 2.1487, 1.0181, 0.6534,
        2.2778, 0.9635, 1.3127, 0.2119, 1.9036, 0.0226, 2.7261, 1.7074, 0.9662])
Test Accuracy: 0.7995
# Same check again, this time counting the correct predictions per batch.
test_correct = 0

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)

        _, preds = torch.max(outputs, 1)
        
        print(torch.sum(preds == labels.data))
        test_correct += torch.sum(preds == labels.data)

test_acc = test_correct.double() / len(test_dataset)
print(f"Test Accuracy: {test_acc:.4f}")
print(len(test_dataset))
tensor(25)
tensor(23)
tensor(24)
tensor(24)
tensor(23)
tensor(30)
tensor(25)
tensor(28)
tensor(27)
tensor(25)
tensor(26)
tensor(23)
Test Accuracy: 0.7995
379

Loss curve of the CNN model.

Train Loss: 0.8324, 0.5082, 0.4658, 0.4420, 0.4376, 0.4262, 0.3820, 0.3857, 0.3818, 0.3714
Train Acc:  0.7457, 0.8062, 0.8049, 0.8111, 0.8210, 0.8160, 0.8259, 0.8272, 0.8407, 0.8469
Val Loss:   0.6008, 0.9038, 1.7440, 1.2597, 3.0783, 3.4121, 3.4937, 8.7321, 0.6095, 2.0479
Val Acc:    0.8778, 0.8667, 0.8667, 0.8889, 0.8889, 0.8778, 0.8778, 0.8778, 0.8444, 0.8556
import matplotlib.pyplot as plt

train_loss_1 = [0.8324, 0.5082, 0.4658, 0.4420, 0.4376, 0.4262, 0.3820, 0.3857, 0.3818, 0.3714]
train_acc_1 =  [0.7457, 0.8062, 0.8049, 0.8111, 0.8210, 0.8160, 0.8259, 0.8272, 0.8407, 0.8469]
val_loss_1 =   [0.6008, 0.9038, 1.7440, 1.2597, 3.0783, 3.4121, 3.4937, 8.7321, 0.6095, 2.0479]
val_acc_1 =    [0.8778, 0.8667, 0.8667, 0.8889, 0.8889, 0.8778, 0.8778, 0.8778, 0.8444, 0.8556]

plt.plot(train_loss_1, label='CNN Model Training Loss')
plt.plot(val_loss_1, label='CNN Model Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curve for CNN Model')
plt.legend()
plt.show()
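
The validation loss logged above swings between 0.60 and 8.73 within ten epochs, so the final weights are not necessarily the best ones. A simple guard I did not use here is to checkpoint the model whenever the validation loss improves and test with that checkpoint; a minimal sketch (the filename is my own placeholder):

# Optional: keep the weights from the best validation epoch (not done in the run above).
best_val_loss = float("inf")

# Inside the epoch loop, right after `val_loss` is computed:
#     if val_loss < best_val_loss:
#         best_val_loss = val_loss
#         torch.save(model.state_dict(), "best_model.pt")

# After training, evaluate the best checkpoint instead of the last epoch:
# model.load_state_dict(torch.load("best_model.pt"))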

Confusion matrix and classification report of the CNN model.

from sklearn.metrics import confusion_matrix, classification_report
import numpy as np

# Evaluation on the test set
test_predictions = []
test_labels = []

model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)

        _, preds = torch.max(outputs, 1)
        test_predictions.extend(preds.cpu().numpy())
        test_labels.extend(labels.cpu().numpy())

test_predictions = np.array(test_predictions)
test_labels = np.array(test_labels)

# Compute metrics
print("Classification Report:")
print(classification_report(test_labels, test_predictions))

# Generate confusion matrix
print("Confusion Matrix:")
cm = confusion_matrix(test_labels, test_predictions)
print(cm)
Classification Report:
              precision    recall  f1-score   support

           0       0.81      0.98      0.89       304
           1       0.46      0.08      0.14        75

    accuracy                           0.80       379
   macro avg       0.64      0.53      0.51       379
weighted avg       0.74      0.80      0.74       379

Confusion Matrix:
[[297   7]
 [ 69   6]]
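
The report's class-1 numbers can be read straight off the matrix: 7 + 6 = 13 images were predicted malignant, so precision is 6/13 ≈ 0.46 and recall is 6/75 = 0.08. The same arithmetic from the `cm` array computed above:

# Deriving the class-1 metrics from the confusion matrix computed above.
tn, fp, fn, tp = cm.ravel()            # [[297, 7], [69, 6]] -> 297, 7, 69, 6
precision_1 = tp / (tp + fp)           # 6 / 13  ~ 0.46
recall_1 = tp / (tp + fn)              # 6 / 75  = 0.08
print(f"Class 1 precision: {precision_1:.2f}, recall: {recall_1:.2f}")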
# Displaying a grid of images with their true and predicted labels (at most 2 x 5)
def show_images(images, labels, predicted_labels, title):
    fig, axes = plt.subplots(2, min(5, len(images)), figsize=(12, 6))
    fig.suptitle(title, fontsize=14)

    for i, ax in enumerate(axes.flat):
        if i < len(images):
            ax.imshow(images[i].permute(1, 2, 0))  # CHW -> HWC for matplotlib
            ax.axis('off')
            ax.set_title(f"True: {labels[i]}\nPred: {predicted_labels[i]}")
        else:
            ax.axis('off')  # hide any unused axes in the grid

# Get a batch of test images and labels
images, labels = next(iter(test_loader))

# Move images to the device
images = images.to(device)

# Get model predictions
model.eval()
with torch.no_grad():
    outputs = model(images)
    _, preds = torch.max(outputs, 1)

# Move tensors back to the CPU so the label comparison and matplotlib work on any device
images, preds = images.cpu(), preds.cpu()

# Get indices of correctly and incorrectly classified images
correct_indices = (preds == labels).nonzero().squeeze()
incorrect_indices = (preds != labels).nonzero().squeeze()

# Select correctly classified images
num_correct = len(correct_indices)
correct_images = images[correct_indices][:num_correct]
correct_labels = labels[correct_indices][:num_correct]
correct_preds = preds[correct_indices][:num_correct]

# Select incorrectly classified images
num_incorrect = len(incorrect_indices)
incorrect_images = images[incorrect_indices][:num_incorrect]
incorrect_labels = labels[incorrect_indices][:num_incorrect]
incorrect_preds = preds[incorrect_indices][:num_incorrect]

# Show the selected images
show_images(correct_images, correct_labels, correct_preds, "Correctly Classified Samples")
show_images(incorrect_images, incorrect_labels, incorrect_preds, "Incorrectly Classified Samples")

# Display the images
plt.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers). (warning repeated once per displayed image)
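
The warning is harmless but means the displayed colors are distorted: the tensors are still ImageNet-normalized, so many pixel values fall outside [0, 1] and get clipped. Undoing the normalization before imshow would fix both; a minimal sketch of a helper (my own addition, not used in the run above):

# Undo the ImageNet normalization so imshow receives values in [0, 1].
def denormalize(img, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    img = img.clone()
    for c in range(3):
        img[c] = img[c] * std[c] + mean[c]
    return img.clamp(0, 1)

# Inside show_images, this would replace the raw tensor:
# ax.imshow(denormalize(images[i]).permute(1, 2, 0))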

The samples above confirm what I noted earlier: the model does predict class 1, just rarely, and the first all-benign batch was a coincidence. The confusion matrix agrees; its second column sums to 7 + 6 = 13 images predicted as class 1, of which only 6 of the 75 truly malignant lesions are caught.