Hacker News Comments on "Connections between physics and deep learning" (MITCBMM · YouTube) · 130 HN points · 6 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.

An excellent video lecture on this by Max himself, which is brilliant and very intuitive: https://youtu.be/5MdSE-N0bxs
Here's a related 2016 talk by Max Tegmark (second author) on connections between deep learning and physics: https://www.youtube.com/watch?v=5MdSE-N0bxs
The gist of it is that physical data tends to have symmetries, and these symmetries make descriptions of the data very compressible into relatively small neural circuits. Random data does not have this property, and cannot be learned easily. Super fascinating.
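As a rough illustration of the compressibility point (a toy sketch of my own, not from the lecture): a generic linear map on an n-sample signal needs n² free weights, but if the data has translational symmetry, a single shared k-tap filter, i.e. a convolution, suffices.

```python
# Toy illustration: symmetry (here, translation invariance) makes a
# linear map highly compressible. A generic map on n samples needs n*n
# weights; a translation-equivariant one is a convolution with only k
# shared weights.

def circular_conv(x, w):
    # Translation-equivariant linear map: the same k weights are
    # reused at every position (weight sharing).
    n, k = len(x), len(w)
    return [sum(w[j] * x[(i + j) % n] for j in range(k)) for i in range(n)]

n, k = 64, 3
dense_params = n * n          # 4096 free parameters for a generic map
conv_params = k               # 3 parameters once symmetry is imposed

x = [0.0] * n
x[10] = 1.0                   # an impulse
y = circular_conv(x, [1.0, 2.0, 3.0])
```

Shifting the impulse shifts the output identically, which is exactly the symmetry that lets the n² weights collapse to k.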
⬐ 3pt14159
In that case, the fact that our minds run on similar substrates is probably non-coincidental.
⬐ akvadrako
Indeed, the connections seem profound. It seems to be a general-purpose optimal algorithm for, well, optimisation. And that would explain why the universe, our brains and AIs all trend toward it. It could also be just that intelligence tends to mirror the outside world, but that seems a bit arbitrary.
⬐ vanderZwan
> It could also be just that intelligence tends to mirror the outside world, but that seems a bit arbitrary.

We are part of the universe, so why would it be arbitrary if our brains were structured in ways that match typical structures found in this universe?
⬐ sitkack
https://en.wikipedia.org/wiki/Panpsychism
⬐ vanderZwan
I was thinking more along the lines that if our minds process outside information in a way that makes sense of that information, partially through simulating it, it does not seem so strange if the structures end up matching the outside structures through some form of convergent evolution. From a cursory reading of that article I do not see it argue the same thing.
⬐ sitkack
That the universe is the best simulator of itself? That, say, when simulating water flowing through a pipe, the system doing the simulation re-formulates itself into something that resembles the pipe and the water?
⬐ vanderZwan
> That the universe is the best simulator of itself?

What is this even supposed to mean? Also, "pipe" and "water" are ridiculously high-level constructs, categorisations made by humans. Neither says anything about structures inherent to the universe.
I mean that when working with symmetries, information flow, and fundamental building blocks, certain structures just tend to pop up naturally. Hence fractals and geometric shapes in places where you might not expect them. Or how the laws of thermodynamics seem to be everywhere in biology now that we have started looking[0][1].
[0] http://ecologicalsociology.blogspot.de/2010/11/geoffrey-west...
[1] https://www.quantamagazine.org/a-thermodynamic-answer-to-why...
⬐ yodon
Thank you!!! As a Physics PhD, that was one of the first videos I found on deep learning, and having no idea what a big deal his insights were, I promptly forgot who the speaker was (remembering only that it was a name I knew intimately from my time in Physics). I have frequently gone back to try to find this, unsuccessfully.
My random stream-of-consciousness reference reactions...

Tegmark & Lin's discussion of how deep learning maps onto the physical world: "Why does deep and cheap learning work so well?" [1][2]
The work of Aerts et al. more than 10 years ago, including how vector space models of human categorization show QM structure [3].
Lucien Hardy's exquisite teasing out of the difference between classical and quantum probability: Quantum Theory From Five Reasonable Axioms [4]
It turns out that the innovation, power and strangeness of QM are related to separating physical processes into:
1. Linear continuous unitary reversible evolution relying on complex amplitudes (wavefunction propagation).
2. Non-linear discontinuous irreversible 'collapse' or 'measurement' or 'entanglement' or 'correlated branching', with probabilities based on the Born Rule (real values from the square of wave function amplitudes).
Neural nets also decompose a problem into successive alternating layers of reversible continuous functions and discrete irreversible categorical decisions (e.g. softmax/sigmoid logistic classifiers).
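That alternation can be sketched in a few lines (a hypothetical toy of mine, not anyone's actual model): a continuous, generically invertible affine map, a continuous softmax squashing, then a discrete argmax decision that irreversibly throws the amplitudes away.

```python
import math

# Toy sketch of the alternation described above: continuous reversible
# evolution (linear map) followed by a discrete irreversible decision
# (softmax + argmax).

def linear(x, W, b):
    # Continuous, (generically) invertible affine transformation.
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def softmax(z):
    # Continuous squashing of scores into probabilities.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def decide(p):
    # Discrete, irreversible choice: only the index survives; the
    # probabilities themselves are discarded (the "collapse"-like step).
    return max(range(len(p)), key=p.__getitem__)

x = [1.0, -2.0]
W, b = [[2.0, 0.0], [0.0, 1.0]], [0.0, 0.5]
h = linear(x, W, b)          # reversible continuous evolution
y = decide(softmax(h))       # irreversible categorical decision
```

The analogy is only structural, of course: nothing here is unitary, and softmax probabilities are not Born-rule squares.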
An even more obscure tangent is that most of the financial market is now constructed from 'options', which track continuous values with a continuous payoff, but offer a choice of execution to collapse the option and create a real financial result, e.g. see N. N. Taleb's discussion of optionality [5].
[1] http://arxiv.org/pdf/1608.08225.pdf
[2] http://www.youtube.com/watch?v=5MdSE-N0bxs
[3] http://en.wikipedia.org/wiki/Diederik_Aerts#Quantum_structur...
⬐ cs702
Thank you for posting this. Similar thoughts have been percolating in my stream-of-consciousness for a while, especially after coming across an earlier version of Tegmark & Lin's paper some months ago.
My take is that deep neural nets work so well in practice because not all distributions of natural data are equally likely (and therefore the no-free-lunch theorems, which assume all distributions are equally likely, don't hold in the real world!); and that, in turn, is because the distribution of natural data is a consequence of the laws of Physics and the symmetries of the universe in which we happen to live.
PS. You will enjoy the following papers too: "An exact mapping between the Variational Renormalization Group and Deep Learning" - https://arxiv.org/abs/1410.3831 ; and "Mutual Information, Neural Networks and the Renormalization Group" - https://arxiv.org/abs/1704.06279 .
I think the argument for using physics models as constraints in a DL system was a clearer one: https://www.youtube.com/watch?v=5MdSE-N0bxs
⬐ intjk
Max Tegmark! I love his book "Our Mathematical Universe". This video was a lot of fun to watch--I'll have to watch it a few more times before I understand it though :P
⬐ nickeleres
SO GOOD. Really rare insight into the problem-solving processes of top-level research physicists.
⬐ oneman
ahh, the metasystem reimplements itself
⬐ deepnotderp
Oh yeah, this paper was super fun :) Refreshing departure from the total reliance upon the spin-glass model.
⬐ throwaway000002
Thanks for sharing this lecture by Max Tegmark. He comments on the video, linking to two papers on arXiv which relate to the material in the lecture: https://arxiv.org/abs/1608.08225 and https://arxiv.org/abs/1606.06737
According to a talk by Max Tegmark[0] (and its associated paper[1]), neural nets (particularly LSTMs) might be inherently better at this sort of thing due to the way they model mutual information.

Markov models are best suited to situations where an observation k steps in the past gives exponentially less information about the present[2] (decaying according to something like λ^k for 0 <= λ < 1). Intuitively, the amount of context imparted by a word or phrase decays somewhat more slowly. That is, if I know the previous five words, I can make a good prediction about the next one, and likely the next one, and slightly less confidently the one after that, whereas in a Markovian setting my confidence in my predictions should decay much more quickly.
So in answer to the grandparent, such a thing should be reasonably straightforward to build if it doesn't exist already, and it may offer improvements over a similar model based on Markov chains.
---
0. https://www.youtube.com/watch?v=5MdSE-N0bxs
1. https://arxiv.org/abs/1606.06737
2. Why is this? Lin & Tegmark offer details in the paper, but it comes from the fact that the singular values of the transition matrix are all less than or equal to one (an aperiodic & ergodic transition matrix has only one singular value equal to one), and so the other singular vectors fall away exponentially quickly, with the exponent's base being their corresponding singular value.
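The geometric decay in that footnote is easy to see numerically with a made-up two-state chain (my own toy example, with the transition matrix chosen so its second eigenvalue is 0.7):

```python
# Hypothetical two-state Markov chain illustrating footnote 2: the
# influence of the initial state dies off geometrically, with rate set
# by the second eigenvalue of the transition matrix (here lambda_2 = 0.7).
P = [[0.9, 0.1],
     [0.2, 0.8]]

def step(dist, P):
    # One step of the chain: row vector times transition matrix.
    return [sum(dist[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P[0]))]

dist = [1.0, 0.0]           # start certain we are in state 0
pi = [2 / 3, 1 / 3]         # stationary distribution of P (pi @ P == pi)
gaps = []
for k in range(1, 6):
    dist = step(dist, P)
    gaps.append(abs(dist[0] - pi[0]))

# Successive gaps shrink by the same factor each step:
ratios = [gaps[k + 1] / gaps[k] for k in range(len(gaps) - 1)]
# each ratio is ~0.7, matching lambda_2
```

So after k steps the memory of the starting state scales like 0.7^k, which is exactly the λ^k decay the parent comment describes.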
⬐ tfgg
It sounds like Tegmark is pointing out a pretty obvious and deliberately designed property of LSTMs... the entire point of them is to avoid exponentially decaying/exploding gradients and allow propagation of information over longer time-scales.
You should watch Max Tegmark's talk "Connections between Physics and Deep Learning" if this interests you [1].

Additionally, a paper that has everyone excited about deep connections between the mathematical analysis of physical systems and the hierarchical feature-learning paradigm speaks of the connection in terms of the Renormalization Group [2].
Regardless of its practical utility, the philosophical implications do tickle the intellect. On a dreamy note, I wonder if it would be possible to draw category-theoretic parallels between some physical theories and statistical learning theory. There is so much to learn, and I am trying my best to teach myself (on the side) the mathematics that they don't teach in my CS grad school. So much to learn, so little time. :)