ML & DL — Convolutional Neural Networks (Part 6)
Convolutional neural networks extract the most useful information for the task at hand.
In this article, you will find:
- A brief introduction to convolutional neural networks,
- Parameters and layers,
- Regularization,
- Implementation of a convolutional neural network with Keras in a Jupyter Notebook,
- Partial summary.
Convolutional neural networks
Convolutional neural networks, or ConvNets, are neural networks that use convolution in place of general matrix multiplication in at least one of their layers [1].
Convolution is a mathematical operation that describes a rule for combining two functions or pieces of information.
S(i,j) = (I*K)(i,j)
where I is the input (the feature map), K is the convolution kernel, and S(i,j) is the resulting map of transformed features.
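The formula above is what deep-learning libraries compute in a convolutional layer (strictly speaking, cross-correlation: the kernel is slid over the input without flipping). A minimal NumPy sketch of a valid 2-D convolution, using a made-up 4x4 input and a 3x3 averaging kernel:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel over the image,
    multiply elementwise, and sum each window."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1      # valid output size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

I = np.arange(16, dtype=float).reshape(4, 4)   # toy input I
K = np.ones((3, 3)) / 9.0                      # toy averaging kernel K
S = conv2d(I, K)                               # feature map S(i, j)
print(S.shape)  # (2, 2)
```

Each output entry S(i, j) is one kernel-sized window of I weighted by K and summed, exactly as in the definition above.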
Parameters
Convolutional layers have parameters that are learned so that these filters are automatically adjusted to extract the most useful information for the task at hand.
- Input is a multidimensional array of data,
- Kernel is a multidimensional array of parameters,
These multidimensional arrays are tensors:
- Time series: a 1D grid of samples at regular time intervals,
- Image data: a 2D grid of pixels.
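As a concrete illustration of these tensor ranks (the shapes here are hypothetical, with MNIST-sized images):

```python
import numpy as np

# 1-D grid: a time series with 1000 regularly spaced steps and 1 channel
series = np.zeros((1000, 1))

# 2-D grid: a 28x28 grayscale image with 1 channel
image = np.zeros((28, 28, 1))

print(series.shape, image.shape)
```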
Layers
- Convolution: extract features from the image,
- Pooling: reduce the size of the input, and
- Dense/Fully connected: connect layers.
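To see how pooling reduces the input size, here is a small NumPy sketch of non-overlapping 2x2 max pooling (the input values are made up):

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling: each size x size window keeps its maximum."""
    h, w = x.shape
    x = x[:h - h % size, :w - w % size]  # crop to a multiple of the window size
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [0, 1, 3, 2],
              [6, 2, 0, 1]], dtype=float)
pooled = max_pool2d(x)
print(pooled)  # [[4. 5.]
               #  [6. 3.]]
```

A 4x4 input shrinks to 2x2: each output entry is the maximum of one 2x2 window.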
Regularization
It is used to overcome the problem of overfitting.
In regularization, we penalize the loss by adding an L1 (LASSO) or L2 (Ridge) norm of the weight vector W. These penalties are incorporated into the loss function that the network optimizes.
- L1: the sum of the absolute values of the coefficients.
- L2: the sum of the squared values of the coefficients.
- Dropout: randomly sets a fraction of input units to 0 at each update during training.
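The two norm penalties can be written out directly; a toy NumPy sketch with a made-up weight vector and regularization strength:

```python
import numpy as np

w = np.array([0.5, -1.0, 0.25])   # toy weight vector W
lam = 0.01                        # regularization strength (hyperparameter)

l1_penalty = lam * np.sum(np.abs(w))   # L1 / LASSO term added to the loss
l2_penalty = lam * np.sum(w ** 2)      # L2 / Ridge term added to the loss
print(l1_penalty, l2_penalty)
```

In Keras, these correspond to passing `kernel_regularizer=regularizers.l1(0.01)` or `regularizers.l2(0.01)` to a layer, and to adding a `Dropout(rate)` layer.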
Github code
In this repository, you will find the implementation of a convolutional neural network, step by step, with Keras in Jupyter Notebook.
Model training and evaluation
1. Data: MNIST dataset
from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
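The fit step below expects one-hot labels (`y_train_cat`) and images with an explicit channel axis; a minimal preprocessing sketch, run here on random stand-in data so it does not need to download MNIST:

```python
import numpy as np

# Stand-in for mnist.load_data(); real MNIST images are 28x28 uint8.
X_train = np.random.randint(0, 256, size=(100, 28, 28)).astype('float32')
y_train = np.random.randint(0, 10, size=100)

X_train = X_train.reshape(-1, 28, 28, 1) / 255.0   # add channel axis, scale to [0, 1]
input_shape = (28, 28, 1)                          # passed to the first Conv2D layer

# One-hot encode the labels (keras.utils.to_categorical does the same).
y_train_cat = np.eye(10)[y_train]
print(X_train.shape, y_train_cat.shape)
```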
2. Model:
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential()
# Add the input layer and hidden layer 1
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
# Add hidden layer 2
model.add(Conv2D(64, (3, 3), activation='relu'))
# Flatten convolutional output
model.add(Flatten())
# Add hidden layer 3
model.add(Dense(128, activation='relu'))
# Add the output layer
model.add(Dense(10, activation='softmax'))
3. Compile:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
metrics=['categorical_accuracy'])
4. Fit:
history = model.fit(X_train, y_train_cat,
batch_size=256, epochs=50,
validation_data=(X_test, y_test_cat))
5. Evaluate:
[test_loss, test_acc] = model.evaluate(X_test, y_test_cat)
MNIST results
For all trained models: epochs: 50, batch size: 256, optimizer: RMSprop, and output layer: 10 softmax units.
Partial summary
All theoretical and practical implementations: Linear regression, Logistic regression, Artificial neural networks, Deep neural networks, and Convolutional neural networks.
For those looking for all the articles in our ML & DL series, here is the link.
References
[1] Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.