Welcome to the NumPy Neural Network project! This repository contains a Python implementation of a neural network framework using basic Python and NumPy libraries. The framework is designed as a personal project aimed at revisiting machine learning concepts and practicing coding skills. It offers a modular architecture that can be easily customized for testing new solutions and architectures.
The NumPy Neural Network project is a personal initiative to dive back into machine learning concepts. The goal is to create a flexible and user-friendly neural network framework that serves as a hands-on learning experience. The framework is designed to be modular and easily adaptable for experimenting with new ideas and approaches.
To explore and experiment with the NumPy Neural Network framework, follow these steps:
- Clone this repository to your local machine:
    git clone https://github.com/RECHE23/NumPy-Neural-Network.git
- Navigate to the project directory:
    cd NumPy-Neural-Network
- Set up a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows, use "venv\Scripts\activate".
- Install the required packages:
    pip install -r requirements.txt
    pip install -r optional-requirements.txt  # If you intend to run the tests and examples.
Please note that, while opt_einsum is optional, it provides a great increase in speed.
The framework allows you to build, train, and evaluate neural network models using basic Python and NumPy libraries. The NeuralNetwork class provides an intuitive interface for constructing models and training them on your data.
Here's a basic example of how to use the framework:
from neural_network import *
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Create a dataset:
X, y = make_classification()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Create a neural network:
nn = NeuralNetwork(
    Linear(in_features=20, out_features=64),
    ReLU(),
    Linear(in_features=64, out_features=32),
    ReLU(),
    Linear(in_features=32, out_features=2),
    ReLU(),
    SoftmaxCategoricalCrossEntropy()
)
# Train the neural network:
nn.fit(X_train, y_train, epochs=20, batch_size=5, shuffle=True)
# Make predictions:
y_pred = nn.predict(X_test)
# Evaluate the neural network:
score = accuracy_score(y_test, y_pred)
print(f"Accuracy score on the test set: {score:.2%}")
The NumPy Neural Network project provides a range of features to help you build, train, and experiment with neural network models. These features include:
The framework supports various types of layers that can be combined to create complex neural network architectures:
- Linear: A dense (fully connected) layer similar to PyTorch's Linear or TensorFlow's Dense.
- Conv2d: A 2D convolutional layer similar to PyTorch's Conv2d or TensorFlow's Conv2D.
- BatchNorm2d: A 2D batch normalization layer similar to PyTorch's BatchNorm2d or TensorFlow's BatchNormalization.
- AvgPool2d: An average pooling layer similar to PyTorch's AvgPool2d or TensorFlow's AveragePooling2D.
- MaxPool2d: A max pooling layer similar to PyTorch's MaxPool2d or TensorFlow's MaxPooling2D.
- Dropout: A dropout layer similar to PyTorch's Dropout or TensorFlow's Dropout.
- Activation layers: A collection of activation functions: ReLU, Sigmoid, Tanh, LeakyReLU, Swish, ELU, SELU, Softplus, GELU, SiLU, CELU, ArcTan, BentIdentity, Mish, Gaussian.
- Output layers: A special type of layer that combines an activation function with a loss function:
  - OutputLayer: A generic output layer with a specified activation function and loss function.
  - SoftmaxBinaryCrossEntropy / SoftminBinaryCrossEntropy: Outputs a probability distribution with a binary cross-entropy loss.
  - SoftmaxCategoricalCrossEntropy / SoftminCategoricalCrossEntropy: Outputs a probability distribution with a categorical cross-entropy loss.
- Shape manipulation layers: Ancillary layers that reshape the data for compatibility between layers:
  - Reshape: A layer for reshaping the input data to a specified shape.
  - Flatten: A layer for flattening the input data between specified start and end dimensions.
  - Unflatten: A layer for unflattening the input data.
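To make the idea of a dense layer concrete, here is a minimal NumPy sketch of a forward and backward pass followed by a ReLU. The class name `DenseSketch`, its method signatures, and the weight scaling are assumptions made for illustration; they are not taken from this repository's `Linear` implementation.

```python
import numpy as np


class DenseSketch:
    """Hypothetical fully connected layer computing y = x @ W + b."""

    def __init__(self, in_features, out_features, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights; the 0.1 scale is an arbitrary choice for this sketch.
        self.W = rng.normal(0.0, 0.1, size=(in_features, out_features))
        self.b = np.zeros(out_features)

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out, learning_rate=0.01):
        # Gradients of the loss with respect to the parameters and the input.
        grad_W = self.x.T @ grad_out
        grad_b = grad_out.sum(axis=0)
        grad_in = grad_out @ self.W.T  # computed before the weights change
        # Plain gradient descent step.
        self.W -= learning_rate * grad_W
        self.b -= learning_rate * grad_b
        return grad_in


def relu(x):
    return np.maximum(x, 0.0)


layer = DenseSketch(in_features=20, out_features=64)
batch = np.ones((5, 20))                        # 5 samples, 20 features
hidden = relu(layer.forward(batch))             # shape (5, 64)
grad_in = layer.backward(np.ones_like(hidden))  # shape (5, 20)
```

The gradient passed to `backward` here is a placeholder of ones; in a real network it would come from the next layer or from the loss.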
The framework provides several loss functions for training neural networks:
- binary_cross_entropy: Cross-entropy loss for binary classification.
- categorical_cross_entropy: Standard cross-entropy loss for multi-class classification.
- mean_absolute_error: L1 loss suitable for robust regression.
- mean_squared_error: Standard MSE loss for regression tasks.
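For reference, the four losses listed above can be written in a few lines of NumPy. This is a sketch under assumptions: the function names mirror the list, but the signatures, the averaging over the batch, and the clipping constant `EPS` are illustrative choices and may differ from the framework's own code.

```python
import numpy as np

EPS = 1e-12  # avoids log(0); the value is an arbitrary choice for this sketch


def binary_cross_entropy(y_true, y_prob):
    # y_true holds 0/1 labels, y_prob holds predicted probabilities of class 1.
    p = np.clip(y_prob, EPS, 1.0 - EPS)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))


def categorical_cross_entropy(y_onehot, y_prob):
    # Both arrays have shape (n_samples, n_classes).
    p = np.clip(y_prob, EPS, 1.0)
    return float(np.mean(-np.sum(y_onehot * np.log(p), axis=1)))


def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def mean_squared_error(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))


targets = np.array([0.0, 2.0])
predictions = np.array([1.0, 0.0])
mae = mean_absolute_error(targets, predictions)
mse = mean_squared_error(targets, predictions)
```

With errors of 1 and 2, the squared error weighs the larger miss more heavily than the absolute error does, which is why the L1 loss is described as the more robust choice.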
The framework provides a collection of optimization methods to fine-tune neural network parameters:
- Stochastic Gradient Descent (SGD): The classic SGD optimizer with a customizable learning rate and momentum.
- Momentum: Accelerates convergence by incorporating a moving average of past gradients.
- Nesterov Momentum: Improves on standard momentum with Nesterov accelerated gradient (NAG) for smoother convergence.
- Adagrad: Adaptive gradient optimization that adjusts the learning rate of each parameter individually.
- RMSprop: Adapts learning rates using a moving average of squared gradient magnitudes.
- Adadelta: Adapts learning rates using moving averages of squared gradients and squared parameter updates.
- Adam and Adamax: Combine features of momentum optimization and RMSprop for faster convergence.
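The update rules behind two of these optimizers are short enough to sketch directly. The functions below are hypothetical helpers, not the framework's optimizer classes; they use the standard textbook formulations of SGD with momentum and Adam, with hyperparameter defaults chosen only for this toy problem.

```python
import numpy as np


def sgd_momentum_step(param, grad, velocity, learning_rate=0.1, momentum=0.9):
    # The velocity is a decaying sum of past gradient steps.
    velocity = momentum * velocity - learning_rate * grad
    return param + velocity, velocity


def adam_step(param, grad, m, v, t, learning_rate=0.1,
              beta1=0.9, beta2=0.999, eps=1e-8):
    # First and second moment estimates of the gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    # Bias correction for the zero-initialised moments (t starts at 1).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return param - learning_rate * m_hat / (np.sqrt(v_hat) + eps), m, v


# Toy problem: minimise f(w) = ||w||^2, whose gradient is 2 * w.
w_sgd = np.array([1.0, -2.0])
velocity = np.zeros_like(w_sgd)
w_adam = np.array([1.0, -2.0])
m = np.zeros_like(w_adam)
v = np.zeros_like(w_adam)
for t in range(1, 201):
    w_sgd, velocity = sgd_momentum_step(w_sgd, 2.0 * w_sgd, velocity)
    w_adam, m, v = adam_step(w_adam, 2.0 * w_adam, m, v, t)
```

Both optimizers keep per-parameter state (`velocity`, or `m` and `v`) between steps, which is the main structural difference from plain gradient descent.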
- Modular Architecture: Design your own custom neural network architectures by combining different layers and activation functions.
- Easy-to-Use Interface: Use the intuitive NeuralNetwork class to create, train, and evaluate models with minimal coding effort.
- Performance Metrics: Evaluate model performance using metrics such as accuracy, precision, recall, F1-score, and confusion matrices.
- Batch Training: Train models using mini-batch gradient descent for improved convergence and memory efficiency.
- Callbacks: Add your own callbacks for monitoring the neural network during training.
- Customizable Parameters: Customize various hyperparameters, such as learning rates and batch sizes, to fine-tune the training process.
- Example Projects: Explore the examples directory for detailed usage examples, including image classification, XOR gate learning, and more.
These features collectively enable you to construct, train, and evaluate neural network models across various domains while gaining insights into machine learning concepts and techniques.
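As an illustration of the performance metrics mentioned in the feature list, the sketch below derives accuracy, precision, recall, and F1-score from a confusion matrix using NumPy only. The helper names and signatures are hypothetical and are not the framework's metric functions.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    # Rows index the true class, columns index the predicted class.
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix


def precision_recall_f1(matrix, positive=1):
    tp = matrix[positive, positive]
    fp = matrix[:, positive].sum() - tp
    fn = matrix[positive, :].sum() - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


y_true = np.array([0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1])
cm = confusion_matrix(y_true, y_pred, n_classes=2)
accuracy = np.trace(cm) / cm.sum()
precision, recall, f1 = precision_recall_f1(cm)
```

In this small example four of the six predictions are correct, and the single false positive and single false negative make precision and recall equal.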
Here are the next features I intend to implement:
- Weight initialization module with support for:
  - Xavier Glorot uniform
  - Xavier Glorot normal
  - Kaiming He uniform
  - Kaiming He normal
  - Orthogonal
- Additional loss functions:
  - Huber loss: Combines MSE and MAE to be less sensitive to outliers.
  - Hinge loss: A loss function used for "maximum-margin" classification.
- Regularization methods:
  - L1 regularization: Adds a penalty equal to the absolute value of the weights to the loss function.
  - L2 regularization: Adds a penalty equal to the square of the weights to the loss function.
  - Elastic net regularization: Combines L1 and L2 regularization.
- Additional callbacks:
  - Early stopping: Stop training early if model performance stops improving on a validation set.
  - Model checkpoint: Save model checkpoints during training at defined intervals.
  - Learning rate scheduler: Dynamically adjust the learning rate at different epochs using a schedule.
- Additional normalization modules:
  - Layer normalization: Normalization across the features and channels for each sample in a batch.
  - Instance normalization: Normalization across each channel for each sample in a batch.
  - Group normalization: Splits channels into groups and normalizes within each group for each sample in a batch.
- Additional modules:
  - 1D convolution, batch normalization, and pooling.
  - Recurrent layers such as RNN, LSTM, and GRU.
- Better and more detailed examples:
  - Jupyter notebooks with various models and datasets.
  - Comparison of performance between NumPy Neural Network, PyTorch, and TensorFlow.
- Support for regression.
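Since the weight initialization module is not implemented yet, the following is only a sketch of what two of the planned schemes compute, using their standard definitions. The function names, the `(fan_in, fan_out)` weight layout, and the use of a NumPy `Generator` are assumptions for illustration.

```python
import numpy as np


def glorot_uniform(fan_in, fan_out, rng):
    # Xavier Glorot uniform: U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def he_normal(fan_in, fan_out, rng):
    # Kaiming He normal: N(0, std^2) with std = sqrt(2 / fan_in), suited to ReLU layers.
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))


rng = np.random.default_rng(0)
W_glorot = glorot_uniform(20, 64, rng)  # weights for a 20 -> 64 layer
W_he = he_normal(64, 32, rng)           # weights for a 64 -> 32 layer
```

Both schemes scale the initial weights by the layer width so that activations and gradients keep a similar magnitude from one layer to the next.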
While contributions are not the primary focus of this personal project, suggestions and feedback are always welcome. If you have ideas for improvements or spot any issues, feel free to create an issue or reach out.
This project is licensed under the MIT License. Feel free to explore and modify the code as a learning exercise.
This project was created by René Chenard, a computer scientist and mathematician with a degree from Université Laval.
You can contact the author at: rene.chenard.1@ulaval.ca