Day 19: Neural Network Model

Neural Network Layer

The fundamental building block of most modern neural networks is a layer of neurons.

In this section, we will learn how to construct a layer of neurons. Once we have that down, we will be able to take those building blocks and put them together to form a larger neural network.

Recall the following neural network from yesterday:

One final optional step is to decide if we want a binary prediction: yes(1) or no(0).

We can set a threshold as we have seen in logistic regression and set it at 0.5.

So, if a[2] >= 0.5, the prediction is yes (1); otherwise it is no (0).
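The thresholding step above can be sketched in a few lines of NumPy. The value chosen for the pre-activation is made up purely for illustration:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Suppose the output layer produced this final activation a[2]
# (the pre-activation 1.2 is an arbitrary example value)
a2 = sigmoid(1.2)

# Threshold at 0.5 to get a binary prediction, as in logistic regression
yhat = 1 if a2 >= 0.5 else 0
print(yhat)  # sigmoid(1.2) is about 0.77, so this prints 1
```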

More complex neural network

By convention, when we say a neural network has 4 layers, that count includes all the hidden layers and the output layer but excludes the input layer. Take a look at an example of a neural network with 4 layers:

Let's zoom in to hidden layer 3, which is the third and final hidden layer, to look at the computations of that layer.

Layer 3 inputs a vector a[2] that was computed by the previous layer and outputs a[3], which is another vector. What is the computation that layer 3 does in order to go from a[2] to a[3]?

If it has 3 neurons, or as we call them, 3 hidden units, then it has parameters w1, b1, w2, b2, w3, b3, and each unit computes an activation a. Let's take a closer look at them (in this post, bolded parameters are vectors, and non-bolded ones are scalars):


a1[3] = activation unit 1 in layer 3

Also note that a1[3] uses a[2] during its computation: a1[3] = g(w1[3] . a[2] + b1[3])
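This single-unit computation can be sketched directly in NumPy. The values for a[2], w1[3], and b1[3] below are made-up example numbers, not values from the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Example values (made up for illustration)
a2 = np.array([0.7, 0.3])   # a[2], the activation vector from layer 2
w1 = np.array([2.0, -1.0])  # w1[3], the weights of unit 1 in layer 3
b1 = 0.5                    # b1[3], the bias of unit 1 in layer 3

# a1[3] = g(w1[3] . a[2] + b1[3])
a1_3 = sigmoid(np.dot(w1, a2) + b1)
```

Because g here is the sigmoid, the resulting activation is always a scalar strictly between 0 and 1.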

Let's write the general form of this equation for an arbitrary layer l and unit j; this is also the notation we'll be using: aj[l] = g(wj[l] . a[l-1] + bj[l])

Please note the difference between the letter l and the number 1 in the notation. For example, in "output of layer 'l - 1' (the previous layer)", 'l - 1' is the letter l minus the number 1.
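Computing that general form for every unit j at once amounts to one matrix-vector product per layer. Here is a minimal sketch, assuming a sigmoid activation and made-up example numbers; the function name `dense` is my own choice, not notation from the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_prev, W, b):
    """Compute one layer's activations.

    a_prev : a[l-1], the previous layer's activation vector, shape (n_prev,)
    W      : weight matrix whose row j holds wj[l], shape (n_units, n_prev)
    b      : bias vector whose entry j holds bj[l], shape (n_units,)
    Returns a[l], shape (n_units,), where aj[l] = g(wj[l] . a[l-1] + bj[l]).
    """
    return sigmoid(W @ a_prev + b)

# A layer with 3 hidden units fed by a 2-dimensional a[l-1] (example values)
a_prev = np.array([0.7, 0.3])
W = np.array([[ 2.0, -1.0],
              [ 1.0,  0.5],
              [-0.5,  1.5]])
b = np.array([0.5, -0.2, 0.1])

a_out = dense(a_prev, W, b)  # a[l], one activation per unit, shape (3,)
```

Stacking the three wj[l] vectors as rows of W lets `W @ a_prev` compute all three dot products in one step.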

The computation we see in this post flows from left to right, starting at the input x (= a[0]): a[0] -> a[1] -> a[2] -> a[3] -> a[4].

This algorithm is called forward propagation, as we're propagating the activations of the neurons. This is in contrast to a different propagation called backward propagation.
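Forward propagation is just this layer computation applied in a chain, a[0] -> a[1] -> ... -> a[4]. A minimal sketch, assuming sigmoid activations and randomly initialized example parameters (the layer sizes 2 -> 3 -> 3 -> 2 -> 1 are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_prev, W, b):
    """One layer: a[l] = g(W a[l-1] + b)."""
    return sigmoid(W @ a_prev + b)

def forward_prop(x, params):
    """Propagate x (= a[0]) through every layer: a[0] -> a[1] -> ... -> a[L]."""
    a = x
    for W, b in params:
        a = dense(a, W, b)
    return a

# A tiny 4-layer network with example shapes 2 -> 3 -> 3 -> 2 -> 1
rng = np.random.default_rng(0)
params = [(rng.standard_normal((3, 2)), np.zeros(3)),
          (rng.standard_normal((3, 3)), np.zeros(3)),
          (rng.standard_normal((2, 3)), np.zeros(2)),
          (rng.standard_normal((1, 2)), np.zeros(1))]

x = np.array([0.5, -1.0])        # input features, a[0]
a4 = forward_prop(x, params)     # final activation, a[4]
yhat = 1 if a4[0] >= 0.5 else 0  # optional binary prediction
```

Note that each layer's output becomes the next layer's input, which is exactly the left-to-right flow described above.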

Note: Back-propagation was an optional topic that was briefly covered in the ML specialization, but I will not be covering it in my notes for the ML specialization. Instead, I will cover back-propagation in the Deep Learning specialization notes, which I'll start posting once I've finished all the notes for the ML specialization.
