Introduction to Neural Networks

This guide will introduce you to the basics of neural networks using examples and drawings to hopefully demystify one of the coolest principles in machine learning. A neural network can – just like other machine learning models – learn to perform tasks from examples rather than being explicitly programmed. What really makes a neural network stand out from other models is its ability to digest vast amounts of unstructured data like video, audio or text and learn from it.

An artificial neural network is inspired by the biological neural networks in our brains. That does not mean that a neural network model is a perfect recreation of the human brain, but merely that it utilizes some of the same basic concepts for learning that our brains do.

Your brain is constantly processing tons of information that you come across every day. The first couple of times in your life that you are presented with a basic task, like tying your shoes, it will seem challenging or even tedious. However, the more times your brain is presented with tasks like these, the easier it will be to solve similar problems in the future. That is because you are learning!

But how does this relate to a neural network?

Let me try to explain it with an example. Take a look at this handwritten word.

Most of you will see the word “Hello”. In fact, you probably instantly knew that you were reading the word “Hello”. You did not have to analyze each letter one at a time to realize you were presented with “H”, “e”, “l”, “l” and “o”, and not something like “He110”. The reason for this is quite simple: throughout your life, you have come across this word many, many times, and your brain will therefore automatically recognize it – even if other possibilities with similar appearances exist.

Likewise, most people will read this as “economics 101” rather than “economics lol”!

Just like us, a neural network is able to learn to recognize patterns (make predictions) based on a series of previous examples (a dataset).

Let’s have a look at how a neural network makes predictions in the next section!


Neural Networks: Without the Math

Here is a sketch of a biological neuron.

Our brains consist of approximately 100 billion neurons. Each of these works in a remarkably simple way: upon receiving an input, the neuron makes a calculation and returns an output that is passed on to one or more new neurons. We will now apply the same idea to build an artificial neural network. Here is a common representation of a neuron.

The question remains: how does this little thing help us make a decision? I’ll get to that – don’t worry!

To understand how a neuron in a network makes decisions (or predictions), imagine a neuron whose sole purpose is to figure out whether some animal is dangerous or not. Now, if this neuron is presented with an input like the height of the animal, it will be able to make a prediction of whether the animal is dangerous or not. It may look something like this.

or

You’ve probably already pointed out the obvious. This model is way too simple. Based upon height alone, a horse may be considered dangerous while a piranha will be perfectly safe. That’s where the network of neurons comes into play. By expanding the network to consist of multiple neurons, it will be able to use multiple features and the interconnections between these features to make decisions. The neural network can be expanded to consider height, weight, teeth size, number of legs, etc. when making a prediction.

A typical neural network could therefore end up looking like this.

Take a second to study the image above. Notice how the network likely would predict “dangerous” if only given the first two inputs, number of legs and claw type. Since the network knew the weight of the animal as well, however, it was able to make a better prediction.

This idea of expanding the network to build more complex models has proven to be extremely useful. The next section will discuss the mathematics behind a neural network, and how expanding the network allows for more precise models.

Neural Networks: The Math

As we have already talked about, a neural network is built up of neurons. Each of these neurons receives one or more inputs and processes them to produce an output. This output is created mathematically by the function f(x) as follows.

The most common way for a neuron to produce an output is by weighing the input and adding a bias to it. The result is then given as input to an activation function that ultimately decides whether or not the neuron “fires”, i.e. produces an output. This can be illustrated as follows.

Here, w is the input weight, b is the bias, z = w*x + b, and a(z) is the activation function. The activation function is somewhat arbitrary, but popular choices are the sigmoid function, the hyperbolic tangent (tanh), or even a step function.
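To make this concrete, here is a minimal sketch of a single neuron in plain Python, using the sigmoid as the activation function. The input, weight and bias values are made up purely for illustration.

```python
import math

def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1 / (1 + math.exp(-z))

def neuron(x, w, b):
    """A single neuron: weigh the input, add a bias, apply the activation."""
    z = w * x + b          # z = w*x + b, as described above
    return sigmoid(z)      # a(z) decides how strongly the neuron "fires"

# Made-up values: input x = 2.0, weight w = 0.8, bias b = -1.0
print(neuron(2.0, 0.8, -1.0))  # roughly 0.65
```

In a real network the weight and bias are not hand-picked like this; they are learned from data, which is exactly what the later section on training covers.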

A standard neural network consists of an input layer, one or multiple hidden layers and an output layer. Neurons are divided across the hidden layers and are responsible for doing the calculations of the network. This sketch illustrates a basic neural network architecture:

X1 and X2 are inputs while Y is the output. Each neuron in the hidden layers and the output layer contains a function that contributes to calculating Y given the inputs X1 and X2. Let’s take another example.
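As a rough sketch of how X1 and X2 flow through such a network, here is a tiny forward pass written with NumPy. The layer sizes and the random weights are illustrative only; they are not trained values.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)

# Two inputs (X1, X2), one hidden layer with three neurons, one output Y.
x = np.array([0.5, -1.2])                        # X1, X2
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)    # hidden layer weights and biases
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)    # output layer weights and bias

hidden = sigmoid(W1 @ x + b1)   # each hidden neuron computes a(w*x + b)
y = sigmoid(W2 @ hidden + b2)   # the output neuron combines the hidden activations
print(y)                        # Y, a value between 0 and 1
```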

We own a weather station, and our job is to predict whether or not it will snow tomorrow given today’s weather conditions: temperature and air humidity. Previously, temperature has proven to be a more important factor than humidity when calculating the risk of snow. This means that your network should pay more attention to the temperature feature when predicting snow conditions. This is done by tweaking all the neurons and their connections appropriately.
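One way to picture “paying more attention” to temperature is simply giving that input a weight with larger magnitude. The toy function below does this with invented numbers; in a real network these weights would be learned, not hand-set.

```python
import math

def snow_score(temperature_c, humidity):
    """Toy single-neuron 'snow predictor'; weights are invented, not learned."""
    w_temp, w_hum, b = -0.9, 0.3, 0.5   # temperature gets the larger magnitude
    z = w_temp * temperature_c + w_hum * humidity + b
    return 1 / (1 + math.exp(-z))       # closer to 1 means "snow more likely"

print(snow_score(-5.0, 0.8))   # cold and humid -> score near 1
print(snow_score(15.0, 0.8))   # warm and humid -> score near 0
```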

How do neural networks learn?

Let’s continue with the weather station example. We want our neural network to be able to predict the weather without telling it explicitly how to do so. Instead, we want to show it old weather data and let it figure it out itself!

This can be done using backpropagation (backward propagation of errors), a commonly used algorithm for supervised machine learning. Rather than feeding input features into a pretrained network to calculate an output, matching input and output data is fed into the network and used to “learn” the weights and biases, thereby training the network. A good network architecture and a good amount of data will usually improve the results.
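To give a rough idea of what this looks like in practice, here is a minimal Keras sketch that learns weights from historical (input, output) pairs via backpropagation. The data arrays are placeholders, not real weather records, and the layer sizes are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Placeholder history: rows of (temperature, humidity) and whether it snowed (1) or not (0).
X = np.array([[-5.0, 0.8], [2.0, 0.6], [10.0, 0.4], [-1.0, 0.9]], dtype="float32")
y = np.array([1, 0, 0, 1], dtype="float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),                      # two inputs: temperature, humidity
    tf.keras.layers.Dense(4, activation="tanh"),     # hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output: snow probability
])

# compile() picks the loss to minimize; fit() runs backpropagation to adjust weights and biases.
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=200, verbose=0)

# After training: predict tomorrow's snow risk from today's conditions.
print(model.predict(np.array([[-3.0, 0.7]], dtype="float32")))
```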

After training, the neural network can be presented with today’s temperature and air humidity to make a prediction for tomorrow’s weather.

If you are interested in building your own neural network with TensorFlow in Python, check out this guide!

Summary

Hopefully, this introduction to neural networks was useful to some of you! If you enjoy the site and you want the guides to keep coming, feel free to leave a comment or follow us on Facebook.
