Torch7 and Neural Networks

Posted on October 18, 2016 in Machine Learning, Programming

This week I wanted to experiment with Torch7, a popular machine-learning framework implemented in C and LuaJIT. Its seamless CUDA support is another point in its favour.

I downloaded and installed Torch7 and related packages, as described here. If you need CUDA support, it is important to also install the cunn and cutorch packages, which I did.

Before dashing off into more complex convolutional neural nets (CNNs), I wanted to start with the basics and implement a simple fully-connected, feed-forward network. In Torch7, this is called a Sequential Model.

I decided to implement a 3-layer network, with each layer followed by its own transfer function. Each layer is of type Linear, the most common fully-connected layer. For the first two layers, I chose the Sigmoid and Tanh transfer functions. For the output layer, I chose LogSoftMax, Torch7's log-domain variant of SoftMax.

Instead of hard-coding each layer and transfer function, I wanted to add a little flexibility by setting up a table that defines the layers and their transfer functions. I wrote a Lua function to convert this table into the desired Sequential model.

Different Layers
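A minimal sketch of what such a table-driven builder might look like (the table fields and the layer sizes here are my own assumptions, not the original listing):

```lua
require 'nn'

-- Table describing the network: layer sizes and transfer functions.
local layers = {
    { inputs = 10, outputs = 20, transfer = 'Sigmoid' },
    { inputs = 20, outputs = 20, transfer = 'Tanh' },
    { inputs = 20, outputs = 5,  transfer = 'LogSoftMax' },
}

-- Add one Linear layer plus its transfer function to the model.
-- Indexing the nn table by name (nn[spec.transfer]) instantiates
-- the transfer module dynamically from its string name.
local function buildLayer(model, spec)
    model:add(nn.Linear(spec.inputs, spec.outputs))
    model:add(nn[spec.transfer]())
end

-- Convert the whole table into a Sequential model.
local function buildNet(layers)
    local model = nn.Sequential()
    for _, spec in ipairs(layers) do
        buildLayer(model, spec)
    end
    return model
end
```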

As you can see, a simple trick is needed in the buildLayer function to create the layer dynamically.

Because I also wanted to test with CUDA support enabled, I wrote a wrapper function to convert a Tensor into CudaTensor as needed (inspired by this blog).

CUDA Wrapper
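The wrapper can be sketched roughly as follows (the function and flag names are my assumptions):

```lua
-- Move a tensor to the GPU only when CUDA is enabled;
-- otherwise return it unchanged.
local function cudify(useCuda, t)
    if useCuda then
        require 'cunn'       -- also pulls in cutorch
        return t:cuda()      -- converts to torch.CudaTensor
    end
    return t
end
```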

The rest of the code is fairly straightforward. First, I set up a flag to decide whether the code should run on CUDA. This depends on an extra argument passed to the Lua script, as well as on the availability of a GPU on the machine. Next, I create the network, and then measure the elapsed time for a forward pass of the network with random input data. The actual time taken is finally printed.

Building and Testing the Network
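The main flow described above might look roughly like this self-contained sketch (the layer sizes and batch size are illustrative, not the original values):

```lua
require 'nn'

-- Any extra command-line argument enables CUDA, provided
-- cutorch can actually be loaded on this machine.
local useCuda = (arg[1] ~= nil) and pcall(require, 'cutorch')

-- A small fully-connected, feed-forward network.
local net = nn.Sequential()
net:add(nn.Linear(1000, 2000)):add(nn.Sigmoid())
net:add(nn.Linear(2000, 2000)):add(nn.Tanh())
net:add(nn.Linear(2000, 10)):add(nn.LogSoftMax())

-- Random batch of input data.
local input = torch.rand(128, 1000)

if useCuda then
    require 'cunn'
    net = net:cuda()
    input = input:cuda()
end

-- Time the forward pass.
local timer = torch.Timer()
local output = net:forward(input)
if useCuda then cutorch.synchronize() end  -- wait for GPU kernels
print(string.format('elapsed time: %.4f s', timer:time().real))
```

Note the cutorch.synchronize() call: GPU kernels launch asynchronously, so without it the timer can stop before the forward pass has actually finished.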

The following figure shows the output I get without CUDA option enabled:

Result Without CUDA

Now, with CUDA enabled (by passing any dummy argument to the script), here is the output:

Result With CUDA

You can see that when I enable CUDA, the program runs about 5 times faster!

This experiment was done on a machine with the following configuration:

- 3.4 GHz
- Ubuntu 14.04

Today’s example does not attempt to train the network with real samples and test the prediction accuracy. That will be the focus of a future post.

You can download the Lua code from here.

Thanks for visiting!
