Neural Network Layer
The fundamental building block of most modern neural networks is a layer of neurons. Today, we'll learn how to construct a layer of neurons; once we have that down, we can put those building blocks together to form a larger neural network.
Let's look at some terminology:
x = input features
a (activation) = a term borrowed from neuroscience; it refers to how strongly a neuron is sending output to the neurons downstream of it.
We'll use a to denote the output of each neuron. Take a look at the following image of a single neuron with a single input feature, applying an activation function to output a prediction.
This unit can be thought of as a very simplified model of a biological neuron.
What the neuron does is:
it takes an input feature (e.g., price)
then it computes the activation (a weighted sum of the input passed through an activation function)
then it outputs the prediction (e.g., the probability of an item being a top seller)
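The three steps above can be sketched as a tiny function. This is a minimal illustration, not a full implementation: the sigmoid activation, the single feature, and the weight/bias values are all assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), so the output reads as a probability.
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # A single neuron: weighted sum of the input, then the activation function.
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.8])    # one input feature, e.g. a normalized price (made up)
w = np.array([-2.0])   # weight (hypothetical value)
b = 1.0                # bias (hypothetical value)

a = neuron(x, w, b)    # activation: predicted probability of "top seller"
print(a)
```

With these made-up parameters the neuron computes sigmoid(-0.6), a probability a bit above one third; changing w and b changes how the input maps to that probability.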
Given this description of a single neuron, building a neural network only requires taking a bunch of these neurons and wiring them together.
A layer is a grouping of neurons that takes as input the same (or similar) features and, in turn, outputs a few numbers together. A layer can have multiple neurons or just a single neuron.
Let's now take a look at a simple neural network with one hidden layer:
Layer 2 is called the output layer because the output of this final neuron is the probability predicted by the neural network.
The outputs of layer 1 are known as "activations" in neural network terminology.
This particular neural network carries out computations as follows:
it inputs 4 numbers; layer 1 then uses those four numbers
to compute its outputs, aka "activation values"
then the final layer, the output layer of the neural network, uses those 3 numbers to compute one number.
Each neuron in a given layer (e.g., the layer in the middle) has access to every value from the previous layer. Essentially, that's all a neural network is: a few layers, where each layer inputs a vector and outputs another vector of numbers.
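The "vector in, vector out" view above can be sketched as one reusable layer function, then applied twice to mirror the 4-input, 3-hidden-unit, 1-output network in the text. This is a sketch under assumptions: the sigmoid activation and the randomly drawn weights are illustrative, not trained values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(a_in, W, b):
    # One row of W per neuron: each neuron sees the FULL input vector.
    return sigmoid(W @ a_in + b)

# Shapes match the example: 4 inputs -> 3 hidden units -> 1 output.
rng = np.random.default_rng(0)          # random weights, purely illustrative
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

x = np.array([197.0, 0.7, 1.2, 0.5])    # 4 made-up input features
a1 = dense(x, W1, b1)                   # layer 1: vector of 3 activations
a2 = dense(a1, W2, b2)                  # layer 2 (output): one probability
print(a1.shape, a2.shape)               # prints: (3,) (1,)
```

Note that the whole forward pass is just function composition: the hidden layer's output vector becomes the output layer's input vector.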
The middle layer is called the hidden layer. In a training set, you get to observe both x and y, so the data shows you what the correct inputs and outputs are, but it doesn't tell you what the correct values are for the middle layer; those values are hidden, hence the name.
One way to think of a neural network is that it can learn its own features, which makes it easier to make accurate predictions. To summarize, a neural network:
starts with the input features, which form the input layer (a vector of features)
feeds them into the hidden layer, which outputs a vector of activations
the output layer then takes that vector of activations and outputs one number: the final activation, the neural network's prediction.