01.08.2016 | etc

As part of my OpenNews fellowship I have been catching up on the theory behind machine learning and artificial intelligence. My fellowship is ending soon, so today I'm releasing the notes from the past year of study to bookend my experience.

To get a good grasp of a topic, I usually went through one or two courses on it (and a lot of supplemental explanations and overviews). Some courses were fantastic, some were real slogs to get through - teaching styles were significantly different.

The best courses have a few things in common:

1. introduce intuition before formalism
2. provide convincing motivation before each concept
3. aren't shy about providing concrete examples
4. provide an explicit structure of how different concepts are hierarchically related (e.g. "these techniques belong to the same family of techniques, because they all conceptualize things this way")
5. provide clear roadmaps through each unit, almost in a narrative-like way

The worst courses, on the other hand:

1. lean too much (or exclusively) on formalism
2. seem allergic to concrete examples
3. sometimes take inexplicit/unannounced detours into separate topics
4. just march through concepts without explaining how they are connected or categorized

### Intuition before formalism

In machine learning, how you represent a problem or data is hugely important for successfully teaching the machine what you want it to learn. The same is, of course, true for people. How we encode concepts into language, images, or other sensory modalities is critical to effective learning. Likewise, how we represent problems - that is, how we frame or describe them - can make all the difference in how easily we can solve them. It can even change whether or not we can solve them.
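To make the point about representation concrete, here is a small sketch of my own (it is not taken from any of the courses, and the ring data and the helper names `make_rings` and `best_threshold_accuracy` are invented for illustration). Two classes of points sit on concentric circles. Described by their $x$ coordinate, no single cutoff separates them; described by their distance from the origin, a single cutoff separates them perfectly.

```python
import math

def make_rings(n=50):
    """Toy data: n points on an inner circle (label 0, radius 1)
    and n points on an outer circle (label 1, radius 3)."""
    points, labels = [], []
    for i in range(n):
        angle = 2 * math.pi * i / n
        for label, radius in ((0, 1.0), (1, 3.0)):
            points.append((radius * math.cos(angle), radius * math.sin(angle)))
            labels.append(label)
    return points, labels

def best_threshold_accuracy(values, labels):
    """Best accuracy achievable by a rule of the form
    'predict 1 if value > t' (or its flipped version) on a single feature."""
    n = len(values)
    best = 0.0
    for t in sorted(set(values)):
        correct = sum((v > t) == bool(y) for v, y in zip(values, labels))
        # n - correct is the accuracy of the flipped rule
        best = max(best, correct / n, (n - correct) / n)
    return best

points, labels = make_rings()

# Representation 1: the raw x coordinate
acc_x = best_threshold_accuracy([x for x, _ in points], labels)

# Representation 2: distance from the origin (the radius in polar coordinates)
acc_r = best_threshold_accuracy([math.hypot(x, y) for x, y in points], labels)

print(f"best single-threshold accuracy on x:      {acc_x:.2f}")
print(f"best single-threshold accuracy on radius: {acc_r:.2f}")
```

The data and the learner (a single threshold) are the same in both cases; only the description of the data changes, and that alone decides whether the problem is solvable.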

One of the courses I enjoyed the most was Patrick Winston's MIT 6.034 (Fall 2010): Artificial Intelligence. He structures the entire course around the importance of the right representation, and for people, the right representation is often stories (he has a short series of videos, How to Speak, where he explains his lecturing methods - worth checking out).

Many technical fields benefit from presenting difficult concepts as stories and analogies - for example, many concepts in cryptography are taught with "Alice and Bob" stories, and introductory economics courses involve a lot of hypothetical beer and pizza. For some reason concepts are easier to grasp if we anthropomorphize them or otherwise put them into "human" terms, endowing them with volition and motivations (though this approach is not without its shortcomings). Presenting concepts as stories is especially useful because the concept's motivation is often clearly presented as part of the story itself, and the stories themselves function as concrete examples.

Introducing concepts as stories also allows us to leverage existing intuitions and experiences, in some sense "bootstrapping" our understanding of these concepts. George Polya, in his Mathematics and Plausible Reasoning, argues that mathematical breakthroughs begin outside of formalism, in the fuzzy space of intuition and what he calls "plausible reasoning", and I would say that understanding in general also follows this same process.

On the other hand, when concepts are introduced as a dense mathematical formalism, it might make internal sense (e.g. "$a$ has this property so naturally it follows that $b$ has this property"), but it makes no sense in the grander scheme of life and the problems I care about. Then I start to lose interest.

### Concept hierarchies

Leaning on existing intuition is much easier when a clear hierarchy of concepts is presented. Many classes felt like, to paraphrase, "one fucking concept after another", and it was hard to know where understanding from another subject or idea could be applied. However, once the hierarchy is made clear, I can build off of existing understanding. For instance: if I'm told that this method and that method belong to the family of Bayesian learning methods, then I know that I can apply Bayesian model selection techniques and can build off of my existing understanding of Bayesian statistics. Then I know to look at the method or problem in Bayesian terms, i.e. with the assumptions made in that framework, and suddenly things start to make sense.

## Some learning tips

Aside from patience and persistence (which are really important), two techniques I found helpful for better grasping the material were explaining things to others and reading many sources on the same concept.

### Explain things to other people

When trying to explain something, you often reveal your own conceptual leaps and gaps and assumptions and can then work on filling them in. Sometimes it takes a layperson to ask "But why is that?" for you to realize that you don't know either.

Most ideas are intuitive if they are explained well. They do not necessarily need to be convoluted and dense (though they are often taught this way!). If you can explain a concept well, you understand it well.