Having heard Geoffrey Hinton’s somewhat dismissive account, in his online MOOC, of the contribution physicists have made to machine learning, I found it interesting to listen to one of those physicists, Naftali Tishby, here at PI:
The surprising success of learning with deep neural networks poses two fundamental challenges: understanding why these networks work so well and what this success tells us about the nature of intelligence and our biological brain. Our recent Information Theory of Deep Learning shows that large deep networks achieve the optimal tradeoff between training size and accuracy, and that this optimality is achieved through the noise in the learning process.
In this talk, I will focus on the statistical physics aspects of our theory and the interaction between the stochastic dynamics of the training algorithm (Stochastic Gradient Descent) and the phase structure of the Information Bottleneck problem. Specifically, I will describe the connections between these phase transitions and the final location and representation of the hidden layers, and the role of the phase transitions in determining the weights of the network.
Based partly on joint works with Ravid Shwartz-Ziv, Noga Zaslavsky, and Shlomi Agmon.
(See also Steve Hsu’s discussion of a similar talk Tishby gave in Berlin, plus other notes on history.)
I was familiar with the general concept of over-fitting, but I hadn’t realized you could talk about it quantitatively by looking at the mutual information between the output of a network and all the information in the training data that isn’t the target label.… [continue reading]
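That quantitative handle rests on estimating mutual information between parts of a learning system from samples. As a toy illustration (my own sketch, not the estimator used in Tishby's work; the function name, binning choice, and test signals are mine), here is the simple plug-in histogram estimator often used in information-plane analyses:

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Estimate I(X;Y) in bits from paired samples by histogram binning.

    This is the naive plug-in estimator: bin the samples, form the
    empirical joint distribution, and sum p(x,y) log2[p(x,y)/(p(x)p(y))].
    It is biased upward for small samples, so treat it as illustrative.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()            # empirical joint p(x, y)
    px = pxy.sum(axis=1, keepdims=True)  # marginal p(x), shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal p(y), shape (1, bins)
    nz = pxy > 0                         # skip zero cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Toy check: a noisy copy of X shares substantial information with X,
# while an independent signal shares (nearly) none.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x + 0.5 * rng.normal(size=100_000)  # correlated with x
z = rng.normal(size=100_000)            # independent of x
print(mutual_information(x, y))  # large (order 1 bit)
print(mutual_information(x, z))  # near zero
```

The same estimator, applied to (binned) hidden-layer activations versus inputs and labels, is what produces the information-plane plots; in practice the binning bias is the main technical headache.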
Lots of matter interference experiments this time, because they are awesome.
We propose and analyze an all-magnetic scheme to perform a Young’s double slit experiment with a micron-sized superconducting sphere of mass amu. We show that its center of mass could be prepared in a spatial quantum superposition state with an extent of the order of half a micrometer. The scheme is based on magnetically levitating the sphere above a superconducting chip and letting it skate through a static magnetic potential landscape where it interacts for short intervals with quantum circuits. In this way a protocol for fast quantum interferometry is passively implemented. Such a table-top, Earth-based quantum experiment would operate in a parameter regime where gravitational energy scales become relevant. In particular, we show that the faint, parameter-free, gravitationally-induced decoherence collapse model proposed by Diósi and Penrose could be unambiguously falsified.
An extremely exciting and ambitious proposal. I have no ability to assess the technical feasibility, and my prior is that this is too hard, but the authors are solid. Their formalism and thinking are very clean, and hence quite abstracted away from the nitty-gritty of the experiment.
Do the laws of quantum physics still hold for macroscopic objects -- this is at the heart of Schrödinger's cat paradox -- or do gravitation or other as-yet-unknown effects set a limit for massive particles? What is the fundamental relation between quantum physics and gravity? Ground-based experiments addressing these questions may soon face limitations due to limited free-fall times and the quality of vacuum and microgravity.
… [continue reading]
I’m trying out a new type of post: a selection of abstracts I thought were particularly interesting this month (though not necessarily released this month). Some papers I’ll have read in detail, some not. I would be especially interested in hearing commentary on them.
Note that although the OTIMA matter interferometers require an amount of time proportional to the superposed mass to confirm a superposition, the proportionality constant is vastly larger than the one apparently demonstrated here.
We consider a thought experiment where the preparation of a macroscopically massive or charged particle in a quantum superposition, and the associated dynamics of a distant test particle, apparently allow for superluminal communication. We give a solution to the paradox based on the following fundamental principle: any local experiment discriminating a coherent superposition from an incoherent statistical mixture necessarily requires a minimum time proportional to the mass (or charge) of the system. For a charged particle, we consider two examples of such experiments and show that both are consistent with this limitation. In the first, the measurement requires accelerating the charge, which can become entangled with the emitted photons. In the second, the limitation can be ascribed to the quantum vacuum fluctuations of the electromagnetic field. On the other hand, when applied to massive particles our result provides indirect evidence for the existence of gravitational vacuum fluctuations and for the possibility of entangling a particle with quantum gravitational radiation.
In this talk we explain the elements of symplectic geometry, and sketch the proof of one of its foundational results — Gromov’s nonsqueezing theorem — using J-holomorphic curves.
… [continue reading]