EnsNet - TensorFlow Implementation

This repository contains a third-party TensorFlow implementation of EnsNet, a novel convolutional neural network (CNN) architecture augmented with Fully Connected Subnetworks (FCSNs), as described in "Ensemble Learning in CNN Augmented with Fully Connected Subnetworks" (Hirata and Takahashi, 2020).

EnsNet is designed to enhance image recognition performance by combining a base CNN with multiple FCSNs, improving accuracy through ensemble learning and a majority vote over the subnetwork predictions.

EnsNet Architecture

💡 Introduction

EnsNet introduces an innovative approach to deep learning models for image recognition tasks. By dividing the feature maps generated by the last convolutional layer of a base CNN among multiple Fully Connected Subnetworks (FCSNs), EnsNet leverages ensemble learning within a single model architecture. This method significantly enhances the model's predictive accuracy on challenging datasets like MNIST, Fashion-MNIST, and CIFAR-10.

This architecture begins with a foundational Convolutional Neural Network (CNN), which subsequently branches into 10 distinct subnetworks. These subnetworks are trained in parallel alongside the original CNN, utilizing an ensemble learning approach to enhance model robustness and prediction accuracy.

During the inference phase, a majority voting system is used among the outputs of the subnetworks. This method consolidates the individual predictions into a single, final prediction, leveraging the collective intelligence of the ensemble to improve decision-making accuracy.
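
As a minimal sketch of this voting step (assuming each of the 10 subnetworks emits a softmax vector over the 10 classes; the helper below is illustrative, not the repository's exact code):

import numpy as np

def majority_vote(subnet_probs):
    # subnet_probs: (num_subnets, batch_size, num_classes), e.g. (10, N, 10) for MNIST.
    votes = np.argmax(subnet_probs, axis=-1)        # each subnetwork votes for its argmax class
    num_classes = subnet_probs.shape[-1]
    counts = np.apply_along_axis(                   # per-sample vote counts, shape (num_classes, batch_size)
        lambda v: np.bincount(v, minlength=num_classes), 0, votes)
    return np.argmax(counts, axis=0)                # most-voted class per sample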

This architecture can be trained and evaluated on MNIST, Fashion-MNIST, and CIFAR-10, where the approach demonstrates significant improvements in predictive performance, showcasing the strength of ensemble learning in deep learning applications.

⚙️ Installation

To install the dependencies needed to run this implementation, follow the steps below:

pip install tensorflow tensorflow-addons
pip install keras-preprocessing
pip install cloud-tpu-client
pip install dropconnect-tensorflow
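
As a quick sanity check (not part of the original instructions), the core imports can be verified with:

python -c "import tensorflow as tf, tensorflow_addons as tfa; print(tf.__version__, tfa.__version__)"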


🧠 Architecture Overview

Overview

Convolutional Layers

  • Notation: Conv⟨receptive field size⟩-⟨number of channels⟩
    • Receptive Field Size: Specifies the dimensions of the filter window (e.g., 3 for a 3x3 filter), dictating the input area each convolution operation examines.
    • Number of Channels: Indicates the layer's depth or the number of filters used, with each channel targeting different features from the input.

Fully Connected Layers

  • Notation: FC-⟨number of nodes⟩
    • Number of Nodes: Denotes the count of neurons within the layer, fully interconnected to all neurons in the previous layer, enabling the network to synthesize features learned in earlier stages.

Activation Function

  • Note: Each layer employs the ReLU (Rectified Linear Unit) activation function by default, fostering non-linear learning capabilities crucial for deciphering complex patterns within the data.
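
For instance, under this notation a Conv3-64 layer followed by an FC-512 layer would map onto Keras layers roughly as below (an illustration of the notation only, not the full EnsNet definition):

from tensorflow.keras import layers

conv3_64 = layers.Conv2D(filters=64, kernel_size=3, padding='same', activation='relu')  # Conv3-64: 3x3 filters, 64 channels
fc_512 = layers.Dense(units=512, activation='relu')                                     # FC-512: 512 fully connected nodes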

🚀 Dataset Preprocessing and Training Summary

Optimizer

The parameters of the Adam optimizer were set as follows: α = 0.001, β₁ = 0.9, β₂ = 0.999, ε = 10⁻⁸, and decay = 0 for MNIST and Fashion-MNIST.

import tensorflow_addons as tfa

# AdamW with weight_decay=0.0 reduces to plain Adam with the parameters above.
adamw_optimizer = tfa.optimizers.AdamW(
        learning_rate=0.001,
        beta_1=0.9,
        beta_2=0.999,
        epsilon=1e-8,
        weight_decay=0.0
)

This project leverages models trained on three benchmark datasets: MNIST, Fashion-MNIST, and CIFAR-10, employing extensive data augmentation and carefully selected training parameters to enhance model performance.

📄 MNIST Dataset

  • Description: A collection of 28x28 grayscale images of handwritten digits (0-9).
  • Dataset Size: 60,000 training images, 10,000 test images.
  • Data Augmentation:
    • Rotation: ±10°
    • Scaling: 0.8-1.2x
    • Shifting: ±8% of width/height
    • Shearing: ±0.3°
  • Training Parameters:
    • Batch Size: 100
    • Epochs: 1,300
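
A rough sketch of these MNIST augmentation and batch settings with keras-preprocessing's ImageDataGenerator (the exact parameter mapping, e.g. treating the shear value as degrees, is an assumption):

from keras_preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=10,        # rotation: +/-10 degrees
    zoom_range=0.2,           # scaling: 0.8-1.2x
    width_shift_range=0.08,   # shifting: +/-8% of width
    height_shift_range=0.08,  # shifting: +/-8% of height
    shear_range=0.3,          # shearing: +/-0.3 degrees
)
# train_generator = datagen.flow(x_train, y_train, batch_size=100)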

👗 Fashion-MNIST Dataset

  • Description: A set of 28x28 grayscale images across 10 fashion categories.
  • Dataset Size: 60,000 training images, 10,000 test images.
  • Data Augmentation:
    • Rotation: ±5°
  • Training Parameters:
    • Batch Size: 100
    • Epochs: 600

🦌 CIFAR-10 Dataset

  • Description: A collection of 32x32 color images across 10 categories.
  • Dataset Size: 50,000 training images, 10,000 test images.
  • Data Augmentation:
    • Rotation: ±10°
    • Scaling: 0.8-1.2x
    • Shifting: ±8% of width/height
    • Shearing: ±0.3°
  • Training Parameters:
    • Batch Size: 100
    • Epochs: 200
    • Learning Rate Decay: 0.1 every 100 epochs
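
The 0.1 decay every 100 epochs can be expressed as a step schedule with a Keras callback, for example (a sketch consistent with the parameters above, not necessarily the repository's implementation):

import tensorflow as tf

def step_decay(epoch, lr):
    # Start at 0.001 and multiply the learning rate by 0.1 every 100 epochs.
    return 0.001 * (0.1 ** (epoch // 100))

lr_schedule = tf.keras.callbacks.LearningRateScheduler(step_decay)
# model.fit(x_train, y_train, batch_size=100, epochs=200, callbacks=[lr_schedule])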

Each dataset underwent specific data augmentation before training to improve the robustness and generalization of the models. The batch sizes and epoch counts are tailored to each dataset, balancing computational efficiency and model accuracy.

References

Daiki Hirata and Norikazu Takahashi, "Ensemble Learning in CNN Augmented with Fully Connected Subnetworks," arXiv:2003.08562, 2020. https://arxiv.org/abs/2003.08562

L. Wan, M. Zeiler, S. Zhang, Y. LeCun, and R. Fergus, "Regularization of Neural Networks using DropConnect," in International Conference on Machine Learning (ICML), 2013, pp. 1058–1066. https://proceedings.mlr.press/v28/wan13.html
