ML & DL — Artificial Neural Networks (Part 4)
Artificial neural networks are called networks because they are represented by the composition of several different functions.
In this article, you will find:
- A brief introduction to artificial neural networks,
- Graphic representation,
- Activation and cost function,
- Implementation of artificial neural networks with Keras in the Jupyter Notebook,
- Partial summary.
Artificial neural networks
Most real-world problems are not linearly separable. One of the most computationally efficient ways to compute nonlinear hypotheses is to connect many small units, each of which performs a "logistic regression".
Information in the network flows through these functions connected in a chain [1]:
$$\hat{y} = f^{(2)}(f^{(1)}(x))$$

where $x$ is the input layer, $f^{(1)}$ is the hidden layer, and $f^{(2)}$ is the output layer.

Each layer $f^{(i)}$ computes a logistic regression:

$$f^{(i)}(h) = \sigma^{(i)}\left(W^{(i)} h + b^{(i)}\right)$$

where $\sigma^{(i)}$ is the activation function, $W^{(i)}$ is the weight matrix, and $b^{(i)}$ is the bias vector.

That is:

$$\hat{y} = \sigma^{(2)}\left[W^{(2)}\,\sigma^{(1)}\left(W^{(1)} x + b^{(1)}\right) + b^{(2)}\right]$$
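To make the chained composition concrete, here is a minimal numpy sketch (an illustration, not code from this series' repository) of the two-layer forward pass, with hand-picked weights so the network computes XOR, a function no single linear unit can represent:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hand-picked parameters: hidden unit 1 ~ OR, hidden unit 2 ~ AND,
# output unit ~ (OR and not AND), i.e. XOR
W1 = np.array([[20.0, 20.0],
               [20.0, 20.0]])
b1 = np.array([-10.0, -30.0])
W2 = np.array([[20.0, -20.0]])
b2 = np.array([-10.0])

def forward(x):
    h = sigmoid(W1 @ x + b1)      # f(1): hidden layer
    return sigmoid(W2 @ h + b2)   # f(2): output layer

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, np.round(forward(np.array(x, dtype=float)), 3))
# (0, 0) -> ~0, (0, 1) -> ~1, (1, 0) -> ~1, (1, 1) -> ~0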
Graphic representation
Activation function
Activation functions introduce non-linearity into artificial neural networks.

For regression problems, the output activation $\sigma^{(2)}$ is the identity function:

$$\sigma(z) = z$$

For classification problems, $\sigma^{(2)}$ is a sigmoid-type function, such as (see the sketch after this list):
- Logistic,
- Tanh,
- Softmax,
- Read more about activation functions.
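To make these options concrete, here is a minimal numpy sketch (illustrative, not from the original article) of the three activations:

import numpy as np

def logistic(z):
    # Squashes any real input into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes any real input into (-1, 1)
    return np.tanh(z)

def softmax(z):
    # Maps a vector of scores to a probability distribution;
    # subtracting the max improves numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(logistic(z))  # [0.119 0.5   0.953]
print(tanh(z))      # [-0.964  0.     0.995]
print(softmax(z))   # [0.006 0.047 0.946]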
Cost function
For regression problems, minimize the mean squared error (MSE):

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

For classification problems, minimize the cross-entropy:

$$E(y, \hat{y}) = -\sum_{i} y_i \log(\hat{y}_i)$$
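Both costs are straightforward to compute by hand; a minimal numpy sketch (illustrative, not from the article):

import numpy as np

def mse(y, y_hat):
    # Mean squared error for regression
    return np.mean((y - y_hat) ** 2)

def cross_entropy(y, y_hat, eps=1e-12):
    # Cross-entropy for classification: y is a one-hot label,
    # y_hat are predicted probabilities; eps guards against log(0)
    return -np.sum(y * np.log(y_hat + eps))

print(mse(np.array([3.0, 5.0]), np.array([2.5, 5.5])))  # 0.25

y = np.array([0.0, 1.0, 0.0])        # one-hot true label
y_hat = np.array([0.1, 0.8, 0.1])    # predicted probabilities
print(cross_entropy(y, y_hat))       # ~0.223 (= -log(0.8))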
GitHub code
In this repository, you will find the implementation of an artificial neural network, step by step, with Keras in the Jupyter Notebook.
Model training and evaluation
1. Data: MNIST dataset
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
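The fit and evaluate steps below reference flattened 784-dimensional inputs and one-hot targets named y_train_cat and y_test_cat, which the snippets do not create; a minimal preprocessing sketch, assuming the standard Keras to_categorical utility:

from keras.utils import to_categorical

# Flatten the 28x28 images into 784-dimensional vectors, scaled to [0, 1]
X_train = X_train.reshape(-1, 784).astype('float32') / 255.0
X_test = X_test.reshape(-1, 784).astype('float32') / 255.0

# One-hot encode the digit labels (10 classes), as required by
# categorical_crossentropy
y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)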
2. Model:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# Add the input layer and hidden layer 1 (32 sigmoid units)
model.add(Dense(32, input_shape=(784,), activation='sigmoid'))
# Add the output layer (10 softmax units, one per digit class)
model.add(Dense(10, activation='softmax'))
3. Compile:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy',
metrics=['categorical_accuracy'])
4. Fit:
history = model.fit(X_train, y_train_cat,
batch_size=256, epochs=50,
validation_data=(X_test, y_test_cat))
5. Evaluate:
test_loss, test_acc = model.evaluate(X_test, y_test_cat)
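As a quick check, the held-out metrics can be printed and the learning curves plotted from the history object returned by fit; a minimal sketch, assuming Keras records the metric under the keys categorical_accuracy and val_categorical_accuracy:

import matplotlib.pyplot as plt

print(f'Test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}')

# Training vs. validation accuracy per epoch
plt.plot(history.history['categorical_accuracy'], label='train')
plt.plot(history.history['val_categorical_accuracy'], label='validation')
plt.xlabel('Epoch')
plt.ylabel('Categorical accuracy')
plt.legend()
plt.show()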
MNIST results
All models were trained with the same setup: 50 epochs, batch size 256, the RMSProp optimizer, and an output layer of 10 softmax units.
Partial summary
This series covers theoretical and practical implementations of linear regression, logistic regression, artificial neural networks, deep neural networks, and convolutional neural networks.
For those looking for all the articles in our ML & DL series, here is the link.
References
[1] Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.