ML & DL — Artificial Neural Networks (Part 4)

fernanda rodríguez
3 min read · May 9, 2020


Artificial neural networks are called networks because they are represented by the composition of several different functions.

Photo by Clint Adair on Unsplash

In this article, you will find:

  • A brief introduction to artificial neural networks,
  • Graphic representation,
  • Activation and cost function,
  • Implementation of artificial neural networks with Keras in a Jupyter Notebook,
  • Partial summary.

Artificial neural networks

Most real-world problems are not linearly separable. One of the most computationally efficient ways to compute nonlinear hypotheses is to connect small units that each perform a “logistic regression”.

Artificial neural networks are called networks because they are built by composing several different functions.

Information in the network flows through functions connected in a chain [1]:

  • ŷ = f⁽²⁾(f⁽¹⁾(x)), the functions connected in a chain, i.e. the artificial neural network, where:
  • x is the input layer,
  • f⁽¹⁾ is the hidden layer, and
  • f⁽²⁾ is the output layer.

For each layer fᵢ:

  • fᵢ(h) = σᵢ(Wᵢ h + bᵢ), a logistic regression, where:
  • σᵢ is the activation function,
  • Wᵢ is the weight matrix, and
  • bᵢ is the bias vector.

That is:

  • ŷ = σ⁽²⁾(W⁽²⁾ σ⁽¹⁾(W⁽¹⁾ x + b⁽¹⁾) + b⁽²⁾)
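
To make the chain concrete, here is a minimal NumPy sketch of this forward pass (the layer sizes and random weights are illustrative assumptions, not values from the article):

import numpy as np

def sigmoid(z):
    # σ(z) = 1 / (1 + e^(−z)), applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative shapes: 4 inputs -> 3 hidden units -> 1 output
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

x = rng.normal(size=4)        # input layer (x)
h = sigmoid(W1 @ x + b1)      # hidden layer f⁽¹⁾
y_hat = sigmoid(W2 @ h + b2)  # output layer f⁽²⁾, i.e. ŷ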

Graphic representation

Graphical representation of an artificial neural network by ma. fernanda rodríguez r.

Activation function

Activation functions introduce non-linearity into artificial neural networks.

Some common activation functions.

For regression problems, σ⁽²⁾ is the identity function:

  • σ(z) = z

For classification problems, σ⁽²⁾ is the sigmoid function:

  • σ(z) = 1 / (1 + e^(−z))
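
A quick NumPy sketch of these two output activations (the input values are illustrative):

import numpy as np

def identity(z):
    # regression output: σ(z) = z
    return z

def sigmoid(z):
    # classification output: squashes z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
print(identity(z))  # [-2.  0.  2.]
print(sigmoid(z))   # approximately [0.1192 0.5 0.8808]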

Cost function

For regression problems:

  • MSE = (1/n) ∑ᵢ (yᵢ − ŷᵢ)², minimization with mean squared error.

For classification problems:

  • E(y, ŷ) = −∑ᵢ yᵢ log(ŷᵢ), minimization with cross-entropy.
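
A minimal NumPy sketch of both cost functions, with illustrative targets and predictions:

import numpy as np

def mse(y, y_hat):
    # mean squared error: (1/n) ∑ (yᵢ − ŷᵢ)²
    return np.mean((y - y_hat) ** 2)

def cross_entropy(y, y_hat):
    # categorical cross-entropy: −∑ yᵢ log(ŷᵢ)
    return -np.sum(y * np.log(y_hat))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.5])))  # 0.25

# one-hot target, probability-like prediction
print(cross_entropy(np.array([0.0, 1.0, 0.0]),
                    np.array([0.1, 0.8, 0.1])))  # ≈ 0.223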

Github code

In this repository, you will find a step-by-step implementation of an artificial neural network with Keras in a Jupyter Notebook.

Model training and evaluation

1. Data: MNIST dataset

from keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()
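
The fit and evaluate steps below use flattened 784-dimensional inputs and one-hot labels (y_train_cat, y_test_cat); a minimal preprocessing sketch, assuming keras.utils.to_categorical:

from keras.utils import to_categorical

# Flatten the 28×28 images into 784-dim vectors and scale pixels to [0, 1]
X_train = X_train.reshape(-1, 784).astype('float32') / 255
X_test = X_test.reshape(-1, 784).astype('float32') / 255

# One-hot encode the 10 digit classes for categorical cross-entropy
y_train_cat = to_categorical(y_train, 10)
y_test_cat = to_categorical(y_test, 10)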

2. Model:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# Add the input layer and hidden layer 1
model.add(Dense(32, input_shape=(784,), activation='sigmoid'))
# Add the output layer
model.add(Dense(10, activation='softmax'))

3. Compile:

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])

4. Fit:

history = model.fit(X_train, y_train_cat,
                    batch_size=256, epochs=50,
                    validation_data=(X_test, y_test_cat))
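
The returned history object records the per-epoch metrics; a small sketch (assuming matplotlib is available) to visualize them:

import matplotlib.pyplot as plt

# Plot training vs. validation accuracy per epoch
plt.plot(history.history['categorical_accuracy'], label='train')
plt.plot(history.history['val_categorical_accuracy'], label='validation')
plt.xlabel('epoch')
plt.ylabel('categorical accuracy')
plt.legend()
plt.show()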

5. Evaluate:

test_loss, test_acc = model.evaluate(X_test, y_test_cat)
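
For example, the returned pair can be printed directly:

print(f'test loss: {test_loss:.4f}, test accuracy: {test_acc:.4f}')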

MNIST results

All models were trained with the same settings: 50 epochs, batch size 256, RMSProp optimizer, and an output layer of 10 softmax units.

Results of model training by ma. fernanda rodríguez r.

Partial summary

Partial summary artificial neural networks by ma. fernanda rodríguez r.

All theoretical and practical implementations: Linear regression, Logistic regression, Artificial neural networks, Deep neural networks, and Convolutional neural networks.
