Tishby on physics and deep learning

Having heard Geoffrey Hinton’s somewhat dismissive account of the contribution by physicists to machine learning in his online MOOC, it was interesting to listen to one of those physicists, Naftali Tishby, here at PI:

(See also Steve Hsu’s discussion of a similar talk Tishby gave in Berlin, plus other notes on history.)

I was familiar with the general concept of over-fitting, but I hadn’t realized you could talk about it quantitatively by looking at the mutual information between the output of a network and all the information in the training data that isn’t the target label.
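As a toy illustration of that quantity (my own sketch, not Tishby's actual estimator): compare a predictor whose output depends only on the true label with one that has memorized an irrelevant noise bit in the training data. The mutual information between the predictor's output and the noise bit tells them apart.

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Discrete mutual information I(X;Y) in bits, estimated from (x, y) samples."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

random.seed(0)
# Each training example carries a true label and an irrelevant noise bit.
data = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(10_000)]

# A "generalizing" predictor depends only on the label...
gen_out = [label for label, noise in data]
# ...while an "overfit" predictor's output also reveals the memorized noise bit.
fit_out = [(label, noise) for label, noise in data]

noise_bits = [noise for _, noise in data]
print(mutual_information(list(zip(gen_out, noise_bits))))  # ≈ 0 bits
print(mutual_information(list(zip(fit_out, noise_bits))))  # ≈ 1 bit
```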

One often hears the refrain that a lot of ML techniques were known for decades but only became useful when big computational power and huge datasets arrived relatively recently. The unreasonable effectiveness of data is often described as a surprise, but Tishby claims that (part of?) this was predicted by the physicists based on large-N limits of statistical mechanics models, but that this was ignored by the computer scientists. I don’t know nearly enough about this topic to assess.

He clearly has a chip on his shoulder — which naturally makes me like him. His “information bottleneck” paper with Pereira and Bialek was posted to the arXiv in 2000 and apparently rejected by the major CS conferences, but has since accumulated fourteen hundred citations.… [continue reading]

Abstracts for July 2017

  • It is well known that, despite the misleading imagery conjured by the name, entanglement in a multipartite system cannot be understood in terms of pairwise entanglement of the parts. Indeed, there are only N(N-1)/2 pairs among N systems, but the number of qualitatively distinct types of entanglement scales exponentially in N. A good way to think about this is to recognize that a quantum state of a multipartite system is, in terms of parameters, much more akin to a classical probability distribution than a classical state. When we ask about the information stored in a probability distribution, there are lots and lots of “types” of information, and correlations can be much more complex than just knowing all the pairwise correlations. (“It’s not just that A knows something about B, it’s that A knows something about B conditional on a state of C, and that information can only be unlocked by knowing information from either D or E, depending on the state of F…”).

    However, Gaussian distributions (both quantum and classical) are described by a number of parameters that grows only quadratically with the number of variables. The pairwise correlations really do tell you everything there is to know about the quantum state or classical distribution. The above paper makes me wonder to what extent we can understand multipartite Gaussian entanglement in terms of pairs of modes. They have shown that this works at a single level, that entanglement across a bipartition can be decomposed into modewise entangled pairs. But since this doesn’t work for mixed states, it’s not clear how to proceed in understanding the remaining entanglement within a partition. My intuition is that there is a canonical decomposition of the Gaussian state that, in some sense, lays bare all the multipartite entanglement it has in any possible partitioning, in much the same way that the eigendecomposition of a matrix exposes its inner workings.
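    As a classical illustration of how completely the pairwise data fix a Gaussian (my own sketch, not from the paper): the mutual information between any two groups of jointly Gaussian variables is a simple determinant formula in blocks of the covariance matrix, with no further parameters to specify.

```python
import numpy as np

rng = np.random.default_rng(1)
# A random 4-variable Gaussian, specified entirely by its covariance matrix.
A = rng.standard_normal((4, 4))
cov = A @ A.T + 4 * np.eye(4)  # symmetric positive definite

def gaussian_mi(cov, part_a, part_b):
    """I(A;B) in nats for jointly Gaussian variables, from pairwise covariances alone."""
    sa = cov[np.ix_(part_a, part_a)]
    sb = cov[np.ix_(part_b, part_b)]
    sab = cov[np.ix_(part_a + part_b, part_a + part_b)]
    return 0.5 * np.log(np.linalg.det(sa) * np.linalg.det(sb) / np.linalg.det(sab))

# Correlations across any grouping follow from the same covariance data:
print(gaussian_mi(cov, [0], [1]))
print(gaussian_mi(cov, [0, 1], [2, 3]))
```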

[continue reading]

Abstracts for March 2017

  • The technique of using a “laser grating”, in place of a physical grating (slits), for producing spatial interference of molecules relies on the laser’s ability to ionize the molecule. (Once ionized, standing electric fields can sweep it out of the way.) But for some molecules, especially large nanoparticles, this is ineffective. Solution: attach a molecular tag to the nanoparticle that reliably cleaves in the presence of a laser, allowing the nanoparticle to be vacuumed up. Rad.

  • Berry points out that the \hbar \to 0 limit of quantum mechanics is singular, implying that things like Ehrenfest’s theorem and the canceling of the path integral are not adequate to describe the quantum-classical transition. A similar situation can be found with critical points in statistical mechanics, where the N \to \infty limit similarly becomes ill-defined. If you think that the huge intellectual investment in understanding critical points is justified by their fundamental significance (regardless of practical applications), I claim you should think similarly about the quantum-classical limit.

    Even in what philosophers might regard as the simplest reductions, between different areas within physics, the detailed working-out of how one theory can contain another has been achieved in only a few cases and involves sophisticated ideas on the forefront of physics and mathematics today….It should be clear from the foregoing that a subtle and sophisticated understanding of the relation between theories within physics requires real mathematics, and not only verbal, conceptual and logical analysis as currently employed by philosophers.

    For introductions, see these popular and non-technical treatments.

  • I have no intelligent comments about this, and have no idea if the paper is interesting. It’s just a crazy long coherence time.

  • (Note that the ACM version is a much shorter “abstract”, missing most of the content.)

  • (H/t Sean Carroll.) Jarzynski’s equality and the Crooks fluctuation theorem are recent and important strengthenings of the second law of thermodynamics.
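    A minimal numerical check of Jarzynski’s equality (my own toy example, not from the linked work): suddenly quench the stiffness of a classical harmonic oscillator drawn from thermal equilibrium. Even though the process is violently out of equilibrium, the exponentially averaged work recovers the equilibrium free-energy difference, \langle e^{-\beta W} \rangle = e^{-\beta \Delta F}.

```python
import numpy as np

rng = np.random.default_rng(0)
beta, k0, k1 = 1.0, 1.0, 2.0  # inverse temperature; stiffness before/after the quench

# Sample initial positions from thermal equilibrium at stiffness k0 ...
x = rng.normal(0.0, 1.0 / np.sqrt(beta * k0), size=500_000)
# ... then quench k0 -> k1 instantaneously; the work done is the energy jump.
work = 0.5 * (k1 - k0) * x**2

# Jarzynski: <exp(-beta W)> over nonequilibrium trials equals exp(-beta dF).
lhs = np.exp(-beta * work).mean()
rhs = np.sqrt(k0 / k1)  # exp(-beta dF), since dF = ln(k1/k0) / (2 beta) here
print(lhs, rhs)  # both ≈ 0.707
```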

[continue reading]

Abstracts for October 2016

  • [arXiv:math-ph/0411058]

    One of the key subtleties about trying to study quantum information in a field theory is that you can’t formally decompose the Hilbert space into a tensor product of spatially local subsystems. The reasons are technical, and rarely explained well. This paper is an exception, giving an excellent introduction to the key ideas, in a manner accessible to a quantum (non-field) information theorist. (See related work by Yngvason, this blogpost by Tobias Osborne, and my previous discussion re: the Reeh-Schlieder theorem.)

  • [PDF.]

    This paper is Dieter Zeh’s in-line commentary on what might be Feynman’s most explicit exposition of his interpretation of quantum mechanics:

    As far as I know, Feynman never participated in the published debate about interpretational problems, such as quantum measurements. So I was surprised when I recently discovered a little known report about a conference regarding the role of gravity and the need for its quantization, held at the University of North Carolina in Chapel Hill in 1957, since it led at some point to a discussion of the measurement problem and of the question about the existence and meaning of macroscopic superpositions. This session was dominated by Feynman’s presentation of a version of Schrodinger’s cat, in which the cat with its states of being dead or alive is replaced by a macroscopic massive ball being centered at two different positions with their distinguishable gravitational fields. I found this part of the report so remarkable for historical reasons that I am here quoting it in detail for the purpose of discussing and commenting it from a modern point of view….The discussion to be quoted below certainly deserves to become better known and discussed because of the influence it seems to have had on several later developments.

[continue reading]

Abstracts for July-August 2016

  • (arXiv:1404.0686) This is the most readable introduction to many-body localization I’ve found. Gratifyingly, it describes the universal connection to thermalization and the ETH, rather than casting localization as just some peculiar wave phenomena.

  • David Wallace investigates the weird way that quantum mechanics is actually put into practice as compared with how people attempt to formalize it: the collapse postulate (leading to an updated wavefunction) is almost never used in practice, even if the Born rule is. Some striking observations that motivate this:

    Firstly, collapse is conspicuously absent from second courses in QM, and in particular in courses on relativistic QM. This ought to strike a student as peculiar… the point is not that collapse is unsatisfactory in the relativistic regime. Of course it is;…But relativistic QM textbooks contain, not an unsatisfactory collapse rule, but no collapse rule at all. One concludes that the theory must be applicable without any mention of collapse….Secondly, the theoretical physics community has been worrying for forty years about the so-called “black hole information loss paradox”…At its heart, the paradox is simply that black hole decay is non-unitary and as such can’t be described within the Schrodinger-equation framework. But state vector collapse is also non-unitary!… One has the clear impression that (at least this part of) the theoretical physics community does not in fact think that dynamics is non-unitary in any other contexts in physics, rendering black hole decay uniquely problematic. Tempting though it might be for this advocate of the Everett interpretation to claim that the community has adopted the many-worlds theory en masse, a more mundane account is simply that (what they regard as) orthodox QM does not include the collapse postulate…Thirdly, modern quantum field theory largely abandons Hamiltonian methods in favour of the path-integral approach.

[continue reading]

Abstracts for May-June 2016

Lots of matter interference experiments this time, because they are awesome.

  • An extremely exciting and ambitious proposal. I have no ability to assess the technical feasibility, and my prior is that this is too hard, but the authors are solid. Their formalism and thinking are very clean, and hence quite abstracted away from the nitty gritty of the experiment.

  • This is the other proposal for a super-large quantum superposition.

  • Rad.

  • Classical E&M with traditional boundary conditions is not the closest classical approximation to QED. Much better is classical E&M with boundary conditions corresponding to \hbar-size randomly fluctuating incoming radiation. Timothy Boyer shows that this appears to recover all sorts of phenomena that are claimed[1] to be distinctly quantum, such as the Casimir effect, the stability of the Bohr-radius atom, and (I think?) Bell experiments with unclosed detection loopholes.

    (H/t Godfrey Miller.)



  1. Just more evidence that the canonical way of teaching quantum mechanics is 80% lies…
[continue reading]

Abstracts for March-April 2016

  • See also the related works by Gooding and Unruh, which connect to Pikovski et al. (blogged here).

  • The systematic approach of this paper is very gratifying. It also comes with an accessible introduction on Sean’s blog by his grad student Grant.

  • [arXiv.]

    The authors construct the Wigner function in what seems to me to be the most sensible way: instead of position and momentum, the quasiprobability distribution is a function over field amplitudes and conjugate momenta. From scanning the paper it looks like everything behaves as you’d expect. Here are slides from a talk.

    Other approaches to Wigner functions for fields seem to get caught up on the idea that QFTs are often Lorentz covariant, and so your Wigner object has to be too. But the Wigner functions only break Lorentz covariance in the same way as Hamiltonian treatments of QFTs or GR (the ADM formalism). Yes, a Hamiltonian formulation misses some of the beauty and simplicity found in a formalism where time and space are on equal footing, but there’s nothing broken or wrong with it. It’s just another, equally valid representation, like the Wigner function is in single-particle, nonrelativistic quantum mechanics. Even within fundamental physics, if you’re operating somewhere that dramatically breaks Lorentz symmetry — like the entire field of cosmology — then this shouldn’t bother you.
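    For a single mode, the nonrelativistic object being generalized is just a Gaussian on phase space, which makes the sanity checks easy. A sketch (my own, with \hbar = 1 and an assumed squeezing parameter): the Wigner function of squeezed vacuum, checked for normalization and the expected quadrature variance \langle x^2 \rangle = e^{-2r}/2.

```python
import numpy as np

r = 0.5  # squeezing parameter (assumed for illustration); units with hbar = 1
xs = np.linspace(-8, 8, 801)
ps = np.linspace(-8, 8, 801)
X, P = np.meshgrid(xs, ps)
dx = xs[1] - xs[0]

# Wigner function of squeezed vacuum: still a Gaussian on phase space,
# squeezed in x and stretched in p.
W = np.exp(-np.exp(2 * r) * X**2 - np.exp(-2 * r) * P**2) / np.pi

norm = W.sum() * dx * dx            # should be ~1
var_x = (X**2 * W).sum() * dx * dx  # should be exp(-2r)/2
print(norm, var_x, np.exp(-2 * r) / 2)
```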

  • Philosophically related to the previous abstract. Phase-space formulations don’t have to break Lorentz covariance! (H/t Peter Woit.)

  • [arXiv.] (H/t Elliot Nelson.)

    Wavelets give a surprisingly illuminating window into quantum field theory. See here for the Daubechies wavelets, which are discretely indexed and orthonormal.
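    As a quick illustration of that orthonormality (my own check, not from the paper): the four Daubechies D4 filter coefficients have unit norm, are orthogonal to their own even shifts, and are orthogonal to the complementary high-pass filter.

```python
import numpy as np

# Daubechies D4 low-pass filter coefficients (exact closed form).
s = np.sqrt(3.0)
h = np.array([1 + s, 3 + s, 3 - s, 1 - s]) / (4 * np.sqrt(2.0))

# Orthonormality: unit norm, and orthogonality to the filter's own shift by 2.
print(np.dot(h, h))          # = 1
print(np.dot(h[:2], h[2:]))  # sum of h[n] h[n+2] = 0

# The complementary high-pass filter g[n] = (-1)^n h[3-n] is orthogonal to h.
g = np.array([h[3], -h[2], h[1], -h[0]])
print(np.dot(h, g))          # = 0
```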

[continue reading]

Abstracts for February 2016

  • Galve and collaborators recognize that the recent Nat. Comm. paper by Brandao et al. is not as universal as it is sometimes interpreted, because the records that are proved to exist can be trivial (no info). So Galve et al. correctly emphasize that Darwinism is dependent on the particular dynamics found in our universe, and the effectiveness of record production is in principle an open question.

    Their main model is a harmonic oscillator in an oscillator bath (with bilinear spatial couplings, as usual) and with a spectral density that is concentrated as a hump in some finite window. (See black line with grey shading in Fig 3.) They then vary the system’s frequency with respect to this window. Outside the window, the system and environment decouple and nothing happens. Inside the window, there is good production of records and Darwinism. At the edges of the window, there is non-Markovianity as information about the system leaks into the environment but then flows back into the system from time to time. They measure non-Markovianity as the periods when the fidelity between the system’s state at two different times increases (rather than decreasing monotonically, as it must for completely positive dynamics).

  • Although this little paper has several non sequiturs suggesting it’s been assembled like Frankenstein’s monster (Z_i is the Z Pauli operator/error for the ith qubit; don’t worry the first time he mentions Shor’s code without explaining it; etc.), it’s actually a very nice little introduction. Gottesman introduces several key ideas very quickly and logically. Good for beginners like me. See also “Operator quantum error correction” (arXiv:quant-ph/0504189) by Kribs et al.
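    For concreteness, here’s a sketch (mine, not Gottesman’s) of the simplest stabilizer idea he covers, using the 3-qubit bit-flip code that is a building block of Shor’s code: the eigenvalues of the Z-type stabilizers form a syndrome that locates an X error without measuring the encoded information.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.diag([1, -1])

def op(single, i, n=3):
    """Single-qubit operator `single` acting on qubit i of n qubits."""
    return reduce(np.kron, [single if j == i else I2 for j in range(n)])

# Logical |0> of the 3-qubit bit-flip code is |000>.
psi = np.zeros(8)
psi[0] = 1.0

# An X error strikes the middle qubit ...
psi = op(X, 1) @ psi

# ... and is located by the Z1 Z2 and Z2 Z3 stabilizer eigenvalues (the syndrome).
syndrome = [int(round(psi @ (op(Z, i) @ op(Z, i + 1)) @ psi)) for i in (0, 1)]
print(syndrome)  # [-1, -1]: both stabilizers flipped, so the error is on qubit 1

# Applying X to the flagged qubit restores the encoded state.
recovered = op(X, 1) @ psi
print(np.allclose(recovered, np.eye(8)[0]))  # True
```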

  • (H/t Martin Ganahl.)

    Another excellent introduction, this time to matrix product states and the density-matrix renormalization group, albeit as part of a much larger review.

[continue reading]