How I got started with neural networks

An artificial neural network can be thought of as a set of interconnected nodes, loosely modeled on the neurons in a brain. Such networks can be used to analyze or categorize data, for example recognizing handwritten digits (see the image above) or faces in images.

Here is how I started teaching myself the basics of computer neural networks. The resources I used were these:

I did most of my learning on a MacBook Pro laptop running macOS High Sierra. I also explored some of the programming exercises in a VMware Fusion virtual machine running the Raspbian Linux distribution. I believe all the required tools can also be used on Windows.

The Python programming examples in the books require that you install the numpy, matplotlib, and tensorflow Python modules. The Thonny Python IDE can also be used to run the programming examples.
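Before starting on the examples, it can help to confirm that the three modules are actually installed. The following is a minimal sketch that uses only the Python standard library. The function name `find_missing` is my own, and the suggested pip command assumes that pip is available for your Python 3 installation.

```python
import importlib.util

# The three modules the book examples rely on.
REQUIRED_MODULES = ["numpy", "matplotlib", "tensorflow"]


def find_missing(modules):
    """Return the names in `modules` that Python cannot find for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Try: python3 -m pip install " + " ".join(missing))
    else:
        print("All required modules are installed.")
```

Run it with the same Python interpreter that your IDE uses, since each interpreter has its own set of installed modules.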

The first half of book #1, by Tariq Rashid, is a technical overview of a simple algorithm that a neural network uses to process data. This part of the book has many nice diagrams that show how a computer neural network works. I went through it slowly, and from it I got the theory behind an algorithm that trains the network using gradient descent. The application is recognizing the digits in a set of images from the MNIST database of handwritten digits. The network takes as input the 784 pixel values of a 28x28 pixel image, and its output is the digit contained in the image. There is a single hidden layer between the input and output nodes of the network.
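To make the shape of such a network concrete, here is a minimal sketch of a three-layer network with 784 inputs, one hidden layer, and 10 outputs, trained by gradient descent on the squared error. It is not the book's code. It uses only the standard library so that it runs anywhere, while the book's examples rely on numpy. The class name `TinyNetwork`, the 20 hidden nodes, the learning rate, and the random stand-in "image" are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Input layer -> one hidden layer -> output layer, sigmoid activations."""

    def __init__(self, n_in, n_hidden, n_out, learning_rate=0.05, seed=0):
        rng = random.Random(seed)
        # Small random starting weights, scaled by the number of incoming links.
        self.w_ih = [[rng.gauss(0.0, n_in ** -0.5) for _ in range(n_in)]
                     for _ in range(n_hidden)]
        self.w_ho = [[rng.gauss(0.0, n_hidden ** -0.5) for _ in range(n_hidden)]
                     for _ in range(n_out)]
        self.lr = learning_rate

    def forward(self, inputs):
        hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
                  for row in self.w_ih]
        outputs = [sigmoid(sum(w * h for w, h in zip(row, hidden)))
                   for row in self.w_ho]
        return hidden, outputs

    def train(self, inputs, targets):
        """One gradient descent step; returns the squared error before the step."""
        hidden, outputs = self.forward(inputs)
        errors = [t - o for t, o in zip(targets, outputs)]
        # Error signal at each output node: error times the sigmoid slope.
        out_delta = [e * o * (1.0 - o) for e, o in zip(errors, outputs)]
        # Send the output signals back through the weights to the hidden nodes.
        hid_delta = [h * (1.0 - h) * sum(self.w_ho[k][j] * out_delta[k]
                                         for k in range(len(out_delta)))
                     for j, h in enumerate(hidden)]
        # Nudge every weight a little in the direction that lowers the error.
        for k, row in enumerate(self.w_ho):
            for j in range(len(row)):
                row[j] += self.lr * out_delta[k] * hidden[j]
        for j, row in enumerate(self.w_ih):
            for i in range(len(row)):
                row[i] += self.lr * hid_delta[j] * inputs[i]
        return sum(e * e for e in errors)


# Made-up stand-in for one 28x28 image: 784 pixel values scaled into 0.01..1.0.
rng = random.Random(1)
pixels = [rng.uniform(0.01, 1.0) for _ in range(784)]
targets = [0.01] * 10
targets[3] = 0.99  # pretend the image shows the digit 3

net = TinyNetwork(784, 20, 10)
first_error = net.train(pixels, targets)
for _ in range(30):
    last_error = net.train(pixels, targets)
_, outputs = net.forward(pixels)
print("squared error, first step:", first_error, "last step:", last_error)
```

Training on real MNIST data works the same way, except that the loop runs over many labeled images instead of repeating a single made-up one.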

In the second half of book #1, a series of Python programs is built up in Jupyter notebooks. The end result is a Python program that faithfully implements the neural network algorithm presented in the first half of the book. I also used copy and paste to combine the notebook code into a single Python script that I could modify and run in IDLE.

Book #2, by Michael Taylor, describes a neural network algorithm very similar to that of book #1, so I skimmed that part of the book. The second half of book #2 describes a Python program that uses the TensorFlow module to implement the algorithm. TensorFlow is an open-source machine learning framework. It allows you to code many variations of neural networks, but I found it a little too general-purpose for someone just getting started with neural networks.

Follow-on activities

Once I had put together the Python program from book #1 by hand, I modified it to handle another application. This application has a single input, a number “x” between 0.0 and pi, and the output is the value of sin(x). You can see from the following plot how good the result was:

Neural network sin(x) 0 to pi
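For readers who want to try something similar, here is a rough sketch of the same idea: one input, one small hidden layer, and one sigmoid output trained to approximate sin(x) between 0 and pi. It is not my modified program from book #1. The bias terms, the 8 hidden nodes, the learning rate, and the epoch count are all assumptions picked to keep the example short, and it uses only the standard library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_net(n_hidden, seed=0):
    """Random starting weights for a 1 -> n_hidden -> 1 network with biases."""
    rng = random.Random(seed)
    return {
        "w1": [rng.uniform(-1, 1) for _ in range(n_hidden)],  # input -> hidden
        "b1": [rng.uniform(-1, 1) for _ in range(n_hidden)],  # hidden biases
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],  # hidden -> output
        "b2": 0.0,                                            # output bias
    }


def predict(net, x):
    hidden = [sigmoid(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"])
    return hidden, out


def train(net, samples, epochs, lr, seed=0):
    """Gradient descent on squared error, one sample at a time."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, target in samples:
            hidden, out = predict(net, x)
            out_delta = (target - out) * out * (1.0 - out)
            for j, h in enumerate(hidden):
                # Use the old output weight before it is updated below.
                hid_delta = out_delta * net["w2"][j] * h * (1.0 - h)
                net["w2"][j] += lr * out_delta * h
                net["w1"][j] += lr * hid_delta * x
                net["b1"][j] += lr * hid_delta
            net["b2"] += lr * out_delta


def mean_abs_error(net, samples):
    return sum(abs(predict(net, x)[1] - t) for x, t in samples) / len(samples)


# 41 evenly spaced training points between 0 and pi.
xs = [math.pi * i / 40 for i in range(41)]
samples = [(x, math.sin(x)) for x in xs]

net = make_net(n_hidden=8)
error_before = mean_abs_error(net, samples)
train(net, samples, epochs=2000, lr=0.3)
error_after = mean_abs_error(net, samples)
print(f"mean |error| before training: {error_before:.3f}, after: {error_after:.3f}")
```

On this range sin(x) stays between 0 and 1, which matches what a sigmoid output node can produce, so no rescaling of the targets is needed here.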

To test the model further, I changed the range of inputs to 0.0 to 1.5*pi. As you can see, in this case the model does not do well at all! So you have to be careful in applying neural networks.

Neural network sin(x) 0.0 to 3*pi/2
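One thing worth checking in a case like this is the range of the targets. Between pi and 1.5*pi the value of sin(x) is negative, while a sigmoid output node can only produce values between 0 and 1. I have not confirmed that this explains the poor result above, so treat the following as a sketch of one possible contributing factor. The helper names `to_unit_range` and `from_unit_range` are made up for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def to_unit_range(y):
    """Map a target from [-1, 1] into [0, 1] so a sigmoid output can reach it."""
    return (y + 1.0) / 2.0


def from_unit_range(s):
    """Undo to_unit_range: map a network output in [0, 1] back to [-1, 1]."""
    return 2.0 * s - 1.0


x = 1.5 * math.pi
raw_target = math.sin(x)                    # negative at this end of the range
scaled_target = to_unit_range(raw_target)   # now inside the sigmoid's range

# However large or small its input, a sigmoid stays strictly between 0 and 1.
lowest = sigmoid(-30.0)
highest = sigmoid(30.0)

print("raw target:", raw_target)
print("scaled target:", scaled_target)
print("sigmoid output stays between", lowest, "and", highest)
```

With the targets rescaled this way, the network would be trained on `to_unit_range(sin(x))` and its outputs converted back with `from_unit_range`. A network trained only on 0 to pi may still do poorly when asked about inputs outside that range.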

Now that I have gotten this far, I plan next to learn more about TensorFlow and also to watch the Google Machine Learning Crash Course lectures online.