Deep Neural Networks - Notes for Lesson 11

Lecture 11a: Hopfield Nets

(Hopfield 1982)

Now, we leave behind the feedforward deterministic networks that are trained with backpropagation gradients. We’re going to see quite a variety of different neural networks now. These networks do not have output units. These networks have units that can only be in states 0 and 1. These networks do not have units of which the state is simply a function of the state of other units. These networks are, instead, governed by an “energy function”. Best way to really understand Hopfield networks: Go through the example of the Hopfield network finding a low energy state, by yourself. Better yet, think of different weights, and do the exercise with those. Typically, we’ll use Hopfield networks where the units have state 0 or 1; not -1 or 1.

Lecture 11b: Dealing with spurious minima

The last in-video question is not easy. Try to understand how the perceptron learning procedure is used in a Hopfield net; it’s not very thoroughly explained.

Lecture 11c: Hopfield nets with hidden units

This video introduces some sophisticated concepts, and is not entirely easy. An “excitatory connection” is a connection of which the weight is positive. “inhibitory”, likewise, means a negative weight. We look for an energy minimum, “given the state of the visible units”. That means that we look for a low energy configuration, and we’ll consider only configurations in which the visible units are in the state that’s specified by the data. So we’re only going to consider flipping the states of the hidden units. Be sure to really understand the last two sentences that Geoffrey speaks in this video.

Lecture 11d: Using stochastic units to improve search

We’re still working with a mountain landscape analogy. This time, however, it’s not an analogy for parameter space, but for state space. A particle is, therefore, not a weight vector, but a configuration. What’s the same is that we’re, in a way, looking for low points in the landscape. We’re also using the physics analogy of systems that can be in different states, each with their own energy, and subject to a temperature. This analogy is introduced in slide 2. This is the analogy that originally inspired Hopfield networks. The idea is that at a high temperature, the system is more inclined to transition into configurations with high energy, even though it still prefers low energy.

3:25: “the amount of noise” means the extent to which the decisions are random. 4:20: If T really were 0, we’d have division by zero, which is not good. What we really mean here is “as T gets really, really small (but still positive)”. For mathematicians: it’s the limit as T goes to zero from above. Thermal equilibrium, and this whole random process of exploring states, is much like the exploration of weight vectors that we can use in Bayesian methods. It’s called a Markov Chain, in both cases.

Lecture 11e: How a Boltzmann machine models data

Now, we’re making a generative model of binary vectors. In contrast, mixtures of Gaussians are a generative model of real-valued vectors. 4:38: Try to understand how a mixture of Gaussians is also a causal generative model. 4:58: A Boltzmann Machine is an energy-based generative model. 5:50: Notice how this is the same as the earlier definition of energy. What’s new is that it’s mentioning visible and hidden units separately, instead of treating all units the same way.

References

Hopfield, John J. 1982. “Neural Networks and Physical Systems with Emergent Collective Computational Abilities.” Proceedings of the National Academy of Sciences 79 (8): 2554–58.

Reuse

CC SA BY-NC-ND

Citation

BibTeX citation:

@online{bochman2017,
  author = {Bochman, Oren},
  title = {Deep {Neural} {Networks} - {Notes} for {Lesson} 11},
  date = {2017-10-21},
  url = {https://orenbochman.github.io/notes/dnn/dnn-11/l_11.html},
  langid = {en}
}

For attribution, please cite this work as:

Bochman, Oren. 2017. “Deep Neural Networks - Notes for Lesson 11.” October 21, 2017. https://orenbochman.github.io/notes/dnn/dnn-11/l_11.html.