Lecture 1e: Three types of learning
The three main types of learning machine learning:
- Supervised learning
- Learn to predict an output given an input vector
- Reinforcement learning
- Learn to select an action to maximize payoff.
- Unsupervised learning
- Discover a good internal representation of the input.
- Semi supervised learning
- Semi-supervised uses a small amount of supervised data and large amount of unsupervised elarning
- Few/one shot learning
- Supervised learning with inference from one or a few examples
- Zero shot learning
- Supervised learning with inference for inputs not seen in training - usually based on learned structrure
- Transfer learning
- Learning something from one data set and use it on another
Two types of supervised learning
- Each training case consists of an input vector x and a target output t.
- Regression: The target output is a real number or a whole vector of real numbers.
- The price of a stock in 6 months time.
- The temperature at noon tomorrow.
- Classification: The target output is a class label.
- The simplest case is a choice between 1 and 0.
- We can also have multiple alternative labels.
How supervised learning typically works
- We start by choosing a model-class:
- A model-class, f, is a way of using some numerical y=f(x;W) parameters, W, to map each input vector, x, into a predicted output y.
- Learning usually means adjusting the parameters to reduce the discrepancy between the target output, t, on each training case and the actual output, y, produced by the model.
- For regression, \frac{1}{2}(y-t)^2is often a sensible measure of the discrepancy.
- For classification there are other measures that are generally more sensible (they also work better).
Reinforcement learning
- In reinforcement learning, the output is an action or sequence of actions and the only supervisory signal is an occasional scalar reward.
- The goal in selecting each action is to maximize the expected sum of the future rewards.
- We usually use a discount factor for delayed rewards so that we don’t have to look too far into the future.
- Reinforcement learning is difficult:
- The rewards are typically delayed so its hard to know where we went wrong (or right).
- A scalar reward does not supply much information.
- This course cannot cover everything and reinforcement learning is one of the important topics we will not cover.
Unsupervised learning
- For about 40 years, unsupervised learning was largely ignored by the machine learning community
- Some widely used definitions of machine learning actually excluded it.
- Many researchers thought that clustering was the only form of unsupervised learning.
- It is hard to say what the aim of unsupervised learning is.
- One major aim is to create an internal representation of the input that is useful for subsequent supervised or reinforcement learning.
- You can compute the distance to a surface by using the disparity between two images. But you don’t want to learn to compute disparities by stubbing your toe thousands of times.
Other goals for unsupervised learning
- It provides a compact, low-dimensional representation of the input.
- High-dimensional inputs typically live on or near a lowdimensional manifold (or several such manifolds).
- Principal Component Analysis is a widely used linear method for finding a low-dimensional representation.
- It provides an economical high-dimensional representation of the input in terms of learned features.
- Binary features are economical. – So are real-valued features that are nearly all zero.
- It finds sensible clusters in the input.
- This is an example of a very sparse code in which only one of the features is non-zero.
Reuse
CC SA BY-NC-ND
Citation
BibTeX citation:
@online{bochman2017,
author = {Bochman, Oren},
title = {Deep {Neural} {Networks} - {Notes} for Lecture 1e},
date = {2017-07-06},
url = {https://orenbochman.github.io/notes/dnn/dnn-01/l01e.html},
langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2017. “Deep Neural Networks - Notes for Lecture
1e.” July 6, 2017. https://orenbochman.github.io/notes/dnn/dnn-01/l01e.html.