Having heard Geoffrey Hinton’s somewhat dismissive account of physicists’ contributions to machine learning in his MOOC, I found it interesting to listen to one of those physicists, Naftali Tishby, here at PI:
The surprising success of learning with deep neural networks poses two fundamental challenges: understanding why these networks work so well and what this success tells us about the nature of intelligence and our biological brain. Our recent Information Theory of Deep Learning shows that large deep networks achieve the optimal tradeoff between training size and accuracy, and that this optimality is achieved through the noise in the learning process.
In this talk, I will focus on the statistical physics aspects of our theory and the interaction between the stochastic dynamics of the training algorithm (Stochastic Gradient Descent) and the phase structure of the Information Bottleneck problem. Specifically, I will describe the connections between the phase transition and the final location and representation of the hidden layers, and the role of these phase transitions in determining the weights of the network.
Based partly on joint works with Ravid Shwartz-Ziv, Noga Zaslavsky, and Shlomi Agmon.
(See also Steve Hsu’s discussion of a similar talk Tishby gave in Berlin, plus other notes on history.)
I was familiar with the general concept of over-fitting, but I hadn’t realized you could talk about it quantitatively by looking at the mutual information between the output of a network and all the information in the training data that isn’t the target label.…
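To make the mutual-information picture concrete, here is a minimal toy sketch. It is not Tishby's method or code: the function name `mutual_information`, the four-point dataset, and the two hand-written "representations" are all assumptions made purely for illustration. It uses a plug-in (frequency-count) estimate on discrete samples, which is only sensible for tiny examples like this one.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(A;B) in bits from a list of (a, b) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # p(a,b) / (p(a) p(b)) simplifies to c * n / (count_a * count_b)
        mi += p_ab * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Hypothetical toy data: four equally likely inputs, label = upper bit of x.
xs = [0, 1, 2, 3]
ys = [x // 2 for x in xs]

# Two made-up representations T of the input:
t_memorize = list(xs)   # copies the input exactly
t_compress = list(ys)   # keeps only what determines the label

for name, ts in [("memorize", t_memorize), ("compress", t_compress)]:
    i_xt = mutual_information(list(zip(xs, ts)))  # information kept about the input
    i_ty = mutual_information(list(zip(ts, ys)))  # information kept about the label
    print(f"{name}: I(X;T) = {i_xt:.1f} bits, I(T;Y) = {i_ty:.1f} bits")
```

Both representations carry the full 1 bit about the label, but the memorizing one also keeps a second bit about the input that says nothing about the label. In this toy case (where the label is a deterministic function of the input) that leftover, I(X;T) minus I(T;Y), is the quantity the over-fitting remark above is pointing at.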
Lots of matter interference experiments this time, because they are awesome.
We propose and analyze an all-magnetic scheme to perform a Young’s double slit experiment with a micron-sized superconducting sphere of mass
amu. We show that its center of mass could be prepared in a spatial quantum superposition state with an extent of the order of half a micrometer. The scheme is based on magnetically levitating the sphere above a superconducting chip and letting it skate through a static magnetic potential landscape where it interacts for short intervals with quantum circuits. In this way a protocol for fast quantum interferometry is passively implemented. Such a table-top earth-based quantum experiment would operate in a parameter regime where gravitational energy scales become relevant. In particular we show that the faint parameter-free gravitationally-induced decoherence collapse model, proposed by Diósi and Penrose, could be unambiguously falsified.
An extremely exciting and ambitious proposal. I have no ability to assess the technical feasibility, and my prior is that this is too hard, but the authors are solid. Their formalism and thinking are very clean, and hence quite abstracted away from the nitty-gritty of the experiment.
Do the laws of quantum physics still hold for macroscopic objects -- this is at the heart of Schrodinger's cat paradox -- or do gravitation or yet unknown effects set a limit for massive particles? What is the fundamental relation between quantum physics and gravity? Ground-based experiments addressing these questions may soon face limitations due to limited free-fall times and the quality of vacuum and microgravity.
…