Deep Neural Networks - Notes for lecture 2a

For the course by Geoffrey Hinton on Coursera

Notes for the Deep Learning course, focusing on types of neural network architectures
deep learning
neural networks
notes
coursera
Author

Oren Bochman

Published

Monday, July 17, 2017

{{< pdf lec2.pdf class="column-margin" >}}

In this lecture we covered the main types of neural network architectures studied in the course.

Lecture 2a: Types of neural network architectures

Feed-forward neural networks


  • Feed-forward networks are the subject of the first half of the course.
  • These are the most common type of neural network.
  • The first layer is the input and the last layer is the output.
    • If there is more than one hidden layer, we call them “deep” neural networks.
  • They compute a series of transformations that change the similarities between cases.
    • The activities of the neurons in each layer are a non-linear function of the activities in the layer below (a minimal sketch follows this list).
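
A minimal sketch of this forward pass, assuming a tiny NumPy setup of my own (the layer sizes and the tanh non-linearity are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    return rng.normal(0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def forward(x, layers):
    """Each layer's activities are a non-linear function of the layer below."""
    a = x
    for W, b in layers:
        a = np.tanh(a @ W + b)  # non-linear transformation of the activities below
    return a

layers = [init_layer(4, 8), init_layer(8, 3)]  # input(4) -> hidden(8) -> output(3)
x = rng.normal(size=4)                         # one input case
print(forward(x, layers))                      # output-layer activities
```

Adding more entries to `layers` gives more than one hidden layer, which is the "deep" case mentioned above.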

Recurrent networks

  • These have directed cycles in their connection graph.
    • That means you can sometimes get back to where you started by following the arrows.
  • They can have complicated dynamics and this can make them very difficult to train.
    • There is a lot of interest at present in finding efficient ways of training recurrent nets.
  • They are more biologically realistic.

Recurrent neural networks for modeling sequences

  • Recurrent neural networks are a very natural way to model sequential data:
    • They are equivalent to very deep nets with one hidden layer per time slice.
    • Except that they use the same weights at every time slice and they get input at every time slice (see the sketch after this list).
  • They have the ability to remember information in their hidden state for a long time.
    • But it's very hard to train them to use this potential.
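
A minimal sketch of the unrolling idea, using a toy NumPy recurrence of my own (the names `Wxh`, `Whh` and the sizes are illustrative): the same weights are reused at every time slice, and the hidden state carries information forward in time.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 5, 16
Wxh = rng.normal(0, 0.1, size=(n_in, n_hid))   # input -> hidden, reused at every time slice
Whh = rng.normal(0, 0.1, size=(n_hid, n_hid))  # hidden -> hidden (the directed cycle)
b = np.zeros(n_hid)

def run(inputs):
    """Unroll the recurrence h_t = tanh(x_t @ Wxh + h_{t-1} @ Whh + b)."""
    h = np.zeros(n_hid)               # hidden state that can remember information over time
    for x_t in inputs:                # one "layer" per time slice, same weights each time
        h = np.tanh(x_t @ Wxh + h @ Whh + b)
    return h

sequence = rng.normal(size=(10, n_in))  # a sequence of 10 input vectors
print(run(sequence))                    # final hidden state after the whole sequence
```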

An example of what RNNs can now do

  • In (Sutskever, Martens, and Hinton 2011) the authors trained a special type of RNN to predict the next character in a sequence.
  • After training for a long time on a string of half a billion characters from English Wikipedia, they got it to generate new text.
    • It generates by predicting the probability distribution for the next character and then sampling a character from this distribution (a toy version of this loop is sketched after this list).
    • The next slide shows an example of the kind of text it generates. Notice how much it knows!
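
A toy sketch of that generation loop (the `vocab` and the `next_char_distribution` stand-in are hypothetical; the real distribution came from Sutskever's trained RNN): predict a distribution over the next character, sample one character from it, append it, and repeat.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = np.array(list("abcdefghijklmnopqrstuvwxyz "))

def next_char_distribution(context):
    """Stand-in for the trained RNN: p(next character | context so far)."""
    logits = rng.normal(size=len(vocab))   # a real model would compute these from `context`
    p = np.exp(logits - logits.max())
    return p / p.sum()

def generate(n_chars, seed_text="the "):
    text = seed_text
    for _ in range(n_chars):
        p = next_char_distribution(text)   # predict a distribution over the next character
        text += rng.choice(vocab, p=p)     # sample one character from it and append
    return text

print(generate(40))   # gibberish here, since the stand-in model is untrained
```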

Sample text generated one character at a time by Ilya Sutskever’s RNN

In 1974 Northern Denver had been overshadowed by CNL, and several Irish intelligence agencies in the Mediterranean region. However, on the Victoria, Kings Hebrew stated that Charles decided to escape during an alliance. The mansion house was completed in 1882, the second in its bridge are omitted, while closing is the proton reticulum composed below it aims, such that it is the blurring of appearing on any well-paid type of box printer.

Symmetrically connected networks

  • These are like recurrent networks, but the connections between units are symmetrical (they have the same weight in both directions).
    • John Hopfield (and others) realized that symmetric networks are much easier to analyze than recurrent networks.
      • They are also more restricted in what they can do because they obey an energy function.
        • For example, they cannot model cycles.
  • In (Hopfield 1982), the author introduced symmetrically connected nets without hidden units that are now called Hopfield networks (a toy sketch follows this list).
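
A toy sketch of a Hopfield net along these lines (my own illustration, assuming the standard Hebbian storage rule): symmetric weights, no hidden units, and asynchronous updates that never increase the energy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
patterns = rng.choice([-1, 1], size=(3, n))   # three binary patterns to memorize

W = np.zeros((n, n))
for p in patterns:                            # Hebbian storage rule
    W += np.outer(p, p)
np.fill_diagonal(W, 0)                        # symmetric weights, no self-connections

def energy(s):
    return -0.5 * s @ W @ s                   # the energy function the net obeys

def recall(s, steps=200):
    """Asynchronous binary updates; each flip never increases the energy."""
    s = s.copy()
    for _ in range(steps):
        i = rng.integers(n)                   # pick one unit at random
        s[i] = 1 if W[i] @ s >= 0 else -1     # set it to its lower-energy state
    return s

probe = patterns[0].copy()
probe[:5] *= -1                               # corrupt part of a stored pattern
restored = recall(probe)
print(energy(probe), energy(restored))        # energy drops as the net settles
print(np.array_equal(restored, patterns[0]))  # usually recovers the stored pattern
```

This content-addressable recall is the "associative memory" use of Hopfield nets listed in the summary below.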

Symmetrically connected networks with hidden units

  • These are called Boltzmann machines.
    • They are much more powerful models than Hopfield nets.
    • They are less powerful than recurrent neural networks.
    • They have a beautifully simple learning algorithm.
  • We will cover Boltzmann machines towards the end of the course.

Summary of Network Architectures

  • Feed-forward nets - regression and classification for images and tabular data.
  • Recurrent nets - sequence-to-sequence mapping.
  • Hopfield nets - associative memory using symmetric nets with no hidden units.
  • Boltzmann machines - symmetric nets with hidden units.

credit: images from The Neural Network Zoo

References

Hopfield, J. J. 1982. “Neural Networks and Physical Systems with Emergent Collective Computational Abilities.” Proceedings of the National Academy of Sciences 79 (8): 2554–58. https://doi.org/10.1073/pnas.79.8.2554.
Sutskever, Ilya, James Martens, and Geoffrey E Hinton. 2011. “Generating Text with Recurrent Neural Networks.” In Proceedings of the 28th International Conference on Machine Learning (ICML-11), 1017–24. https://www.cs.toronto.edu/~jmartens/docs/RNN_Language.pdf.

Reuse

CC SA BY-NC-ND

Citation

BibTeX citation:
@online{bochman2017,
  author = {Bochman, Oren},
  title = {Deep {Neural} {Networks} - {Notes} for Lecture 2a},
  date = {2017-07-17},
  url = {https://orenbochman.github.io/notes/dnn/dnn-02/l02a.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2017. “Deep Neural Networks - Notes for Lecture 2a.” July 17, 2017. https://orenbochman.github.io/notes/dnn/dnn-02/l02a.html.