This code implements a Convolutional Neural Network (CNN) for image classification using TensorFlow/Keras. It extends a basic fully connected network with convolutional layers that extract spatial features.
1. CNN Architecture (create_model)
kernel_sz = (3,3): Defines the size of the sliding window (kernel) used to extract features from images.
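Conv2D (First Layer): Uses 32 filters and a stride of 2. With a stride of 2 the kernel moves two pixels at a time, so the output height and width are roughly half those of the input (exactly half for an even-sized input with padding='same', slightly less with the default padding='valid'). Each filter learns to respond to a specific pattern such as an edge or a texture.
kernel_initializer='he_normal': This initializer is chosen because it pairs well with the 'relu' activation function. It draws the initial weights from a truncated normal distribution with standard deviation sqrt(2 / fan_in), which keeps the variance of activations roughly constant from layer to layer and reduces the risk of gradients vanishing or exploding early in training.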
Conv2D (Second Layer): Increases the number of filters to 64. As the network gets deeper, it learns more complex, high-level features from the patterns identified in the first layer.
Flatten and Dense: After the convolutional layers extract spatial features, the resulting feature maps are "flattened" into a 1D vector and passed to a fully connected layer whose Softmax activation outputs one probability per class.
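The arithmetic behind the stride and he_normal descriptions above can be sketched in plain Python. This is an illustration under stated assumptions, not the original code: the 32x32 RGB input, the default padding='valid', and a stride of 1 in the second layer are all assumed, and the helper names conv_output_size and he_normal_stddev are hypothetical.

```python
import math

def conv_output_size(size, kernel, stride, padding="valid"):
    """Output length along one spatial axis, following the Keras Conv2D rules."""
    if padding == "same":
        return math.ceil(size / stride)
    # "valid": no padding, so the window must fit entirely inside the input
    return (size - kernel) // stride + 1

def he_normal_stddev(kernel, in_channels):
    """Standard deviation used by he_normal: sqrt(2 / fan_in)."""
    fan_in = kernel * kernel * in_channels
    return math.sqrt(2.0 / fan_in)

# Assumed input: 32x32 RGB images, padding="valid" in both layers.
side = 32
after_conv1 = conv_output_size(side, kernel=3, stride=2)         # first layer, stride 2
after_conv2 = conv_output_size(after_conv1, kernel=3, stride=1)  # stride 1 is an assumption
flattened = after_conv2 * after_conv2 * 64                       # 64 filters in the second layer

print(after_conv1, after_conv2, flattened)   # 15 13 10816
print(round(he_normal_stddev(3, 3), 3))      # 0.272
```

Under these assumptions the Flatten layer hands a vector of 10816 values to the Dense layer, which shows why the stride in the first layer matters for the size of the final fully connected layer.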
2. Data Handling and Preprocessing
load_labels: Opens a CSV file and reads class labels line-by-line into a NumPy array.
load_images: Uses the PIL (Pillow) library to open PNG files from a folder and convert them into a 4D NumPy array format required by Keras (count, height, width, channels).
normalize_dataset: Rescales raw pixel values to a small fixed range (typically 0 to 1 or -1 to 1), which improves training stability and speed.
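A minimal sketch of two of the helpers described above, for illustration only. The signatures are assumptions: load_labels here takes an already opened file object with one integer label per row, and normalize_dataset assumes 8-bit pixels (0 to 255) with a hypothetical symmetric flag to choose between the two ranges. The original functions may differ.

```python
import csv
import io

import numpy as np

def load_labels(csv_file):
    """Read one integer class label per row from an open CSV file object."""
    reader = csv.reader(csv_file)
    return np.array([int(row[0]) for row in reader if row], dtype=np.int64)

def normalize_dataset(images, symmetric=False):
    """Rescale uint8 pixels from [0, 255] to [0, 1], or to [-1, 1] if symmetric."""
    scaled = images.astype(np.float32) / 255.0
    return scaled * 2.0 - 1.0 if symmetric else scaled

# In-memory stand-ins for the CSV file and the image folder.
labels = load_labels(io.StringIO("3\n7\n0\n"))
batch = np.array([[[[0, 128, 255]]]], dtype=np.uint8)  # (count, height, width, channels)

print(labels)                                   # [3 7 0]
print(normalize_dataset(batch).shape)           # (1, 1, 1, 3)
print(normalize_dataset(batch, symmetric=True).min())  # -1.0
```

Converting to float32 before dividing avoids integer division on the uint8 array and matches the default floating-point type Keras uses.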
3. Training Execution and Performance
model.compile: The model is compiled with the Adam optimizer and a small learning rate (around 3e-5) for stable convergence.
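The compile step could look roughly like the sketch below, which rebuilds the architecture from section 1 so that it runs on its own. Several details are assumptions rather than facts from the original code: the 32x32x3 input shape, 10 output classes, a stride of 1 and default padding in the second layer, and the sparse_categorical_crossentropy loss (which presumes integer class labels).

```python
from tensorflow.keras import layers, models, optimizers

def create_model(input_shape=(32, 32, 3), num_classes=10):
    """Two Conv2D layers followed by Flatten and a softmax Dense layer."""
    kernel_sz = (3, 3)
    return models.Sequential([
        layers.Input(shape=input_shape),
        # 32 filters, stride 2: roughly halves height and width
        layers.Conv2D(32, kernel_sz, strides=2, activation="relu",
                      kernel_initializer="he_normal"),
        # 64 filters; stride 1 is an assumption
        layers.Conv2D(64, kernel_sz, activation="relu",
                      kernel_initializer="he_normal"),
        layers.Flatten(),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = create_model()
model.compile(
    optimizer=optimizers.Adam(learning_rate=3e-5),
    loss="sparse_categorical_crossentropy",  # assumed: labels are integer class ids
    metrics=["accuracy"],
)
model.summary()
```

If the labels were one-hot encoded instead, categorical_crossentropy would be the matching loss.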
Training Log Analysis:
The logs show that by Epoch 2 the training accuracy reached approximately 58.7%, well above the 10% expected from random guessing across the 10 classes of the CIFAR-10 dataset, which suggests the CNN is capturing useful spatial patterns.
Note on Overfitting: If training accuracy (e.g., 0.71) is much higher than validation accuracy (e.g., 0.45), the model is overfitting, and you should increase regularization strength or use dropout.
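The comparison in the note above can be expressed as a small check. This is only a rule-of-thumb sketch: the helper names and the 0.10 threshold are hypothetical, and the second pair of accuracies is made up for contrast rather than taken from the logs.

```python
def generalization_gap(train_acc, val_acc):
    """Difference between training and validation accuracy."""
    return train_acc - val_acc

def looks_overfit(train_acc, val_acc, threshold=0.10):
    """Flag a run whose training accuracy exceeds validation by more than threshold."""
    return generalization_gap(train_acc, val_acc) > threshold

print(looks_overfit(0.71, 0.45))  # True: the gap is about 0.26
print(looks_overfit(0.60, 0.58))  # False: the gap is about 0.02
```

A check like this is best applied across several epochs, since a gap that keeps widening is a stronger sign of overfitting than a single snapshot.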