Approach to equilibrium in a pure-state universe

(This post is vague, and sheer speculation.)

Following a great conversation with Miles Stoudenmire here at PI, I went back and read a paper I had forgotten about: “Entanglement and the foundations of statistical mechanics” by Popescu et al. [S. Popescu, A. Short, and A. Winter, Nature Physics 2, 754–758 (2006)]. This is one of those papers that has a great simple idea, where you’re not sure if it’s profound or trivial, and whether it’s well known or novel. (They cite their references 3–6 as “significant results along similar lines”; let me know if you’ve read any of these and think they’re more useful.) Anyways, here’s some background on how I think about this.

If a pure quantum state \vert \psi \rangle is drawn at random (according to the Haar measure) from a d_S d_E-dimensional vector space \mathcal{H}, then the entanglement entropy

    \[S(\rho_S) = -\mathrm{Tr}[\rho_S \mathrm{log} \rho_S], \qquad \rho_S = \mathrm{Tr}_E[\vert \psi \rangle \langle \psi \vert]\]

across a tensor decomposition into system \mathcal{S} and environment \mathcal{E} is highly likely to be almost the maximum

    \[S_{\mathrm{max}} = \mathrm{log}_2(\mathrm{min}(d_S,d_E)) \,\, \mathrm{bits},\]

for any such choice of decomposition \mathcal{H} = \mathcal{S} \otimes \mathcal{E}. More precisely, if we fix d_S/d_E and let d_S\to \infty, then the fraction of the Haar volume of states that have entanglement entropy more than an exponentially small (in d_S) amount away from the maximum is suppressed exponentially (in d_S). This was known as Page’s conjecture [D. Page, “Average entropy of a subsystem”], and was later proved [S. Foong and S. Kanno, “Proof of Page’s conjecture on the average entropy of a subsystem”; J. Sánchez-Ruiz, “Simple proof of Page’s conjecture on the average entropy of a subsystem”]; it is a straightforward consequence of the concentration of measure phenomenon.… [continue reading]
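This concentration is easy to see numerically. Here is a quick sketch (my own illustration, not from the post; the dimensions d_S = 8, d_E = 64 are arbitrary, and the comparison value uses Page's standard asymptotic average log2(d_S) − d_S/(2 d_E ln 2) for d_S ≤ d_E):

```python
import numpy as np

def haar_random_state(d, rng):
    """Haar-random pure state: a normalized complex Gaussian vector."""
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

def entanglement_entropy_bits(psi, d_S, d_E):
    """Entropy of rho_S = Tr_E |psi><psi|, via the Schmidt (singular) values."""
    s = np.linalg.svd(psi.reshape(d_S, d_E), compute_uv=False)
    p = s**2                      # eigenvalues of rho_S
    p = p[p > 1e-15]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
d_S, d_E = 8, 64
S = entanglement_entropy_bits(haar_random_state(d_S * d_E, rng), d_S, d_E)
S_max = np.log2(min(d_S, d_E))                  # 3 bits
S_page = S_max - d_S / (2 * d_E * np.log(2))    # Page's average, d_S <= d_E
print(S, S_max, S_page)
```

A single sample already lands within a few hundredths of a bit of the Page average, and the spread across samples shrinks rapidly as the dimensions grow.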

Undetected photon imaging

Lemos et al. have a relatively recent letter in Nature [G. Lemos, V. Borish, G. Cole, S. Ramelow, R. Lapkiewicz, and A. Zeilinger, “Quantum imaging with undetected photons”, Nature 512, 409 (2014), arXiv:1401.4318] where they describe a method of imaging with undetected photons. (An experiment with the same essential quantum features was performed way back in 1991 by Zou et al. [X. Y. Zou, L. J. Wang, and L. Mandel, “Induced coherence and indistinguishability in optical interference”, Phys. Rev. Lett. 67, 318 (1991)], but Lemos et al. have emphasized its implications for imaging.) The idea is conceptually related to decoherence detection, and I want to map one onto the other to flesh out the connection. Their figure 1 gives a schematic of the experiment and is copied below.

Figure 1 from Lemos et al.: ''Schematic of the experiment. Laser light (green) splits at beam splitter BS1 into modes a and b. Beam a pumps nonlinear crystal NL1, where collinear down-conversion may produce a pair of photons of different wavelengths called signal (yellow) and idler (red). After passing through the object O, the idler reflects at dichroic mirror D2 to align with the idler produced in NL2, such that the final emerging idler f does not contain any information about which crystal produced the photon pair. Therefore, signals c and e combined at beam splitter BS2 interfere. Consequently, signal beams g and h reveal idler transmission properties of object O.''

The first two paragraphs of the letter contain all the meat, encrypted and condensed into an opaque nugget of the kind that Nature loves; it stands as a good example of the lamentable way many quantum experimental articles are written.… [continue reading]
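As a decoder ring for that nugget, here is a toy sketch of the standard induced-coherence prediction (my own illustration, not taken from the letter): if the object O has amplitude transmission T for the idler, the signal rate at one output of BS2 goes as (1 + |T| cos φ)/2, so the visibility of the signal interference equals |T|, the idler transmission the signal never directly sampled.

```python
import numpy as np

def signal_prob_g(T, phi):
    """Probability of a signal photon at output g of BS2, for idler (amplitude)
    transmission T through the object and interferometer phase phi."""
    return 0.5 * (1 + abs(T) * np.cos(phi + np.angle(T)))

def visibility(T, n=201):
    """Standard fringe visibility (max - min)/(max + min) over one period."""
    phis = np.linspace(0, 2 * np.pi, n)
    p = signal_prob_g(T, phis)
    return (p.max() - p.min()) / (p.max() + p.min())

for T in [1.0, 0.5, 0.0]:
    print(T, visibility(T))   # visibility tracks |T|
```

Blocking the idler path (T = 0) kills the signal interference entirely, which is the "which-crystal" information argument of the caption in miniature.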

Quantum Brownian motion: Definition

In this post I’m going to give a clean definition of idealized quantum Brownian motion and give a few entry points into the literature surrounding its abstract formulation. A follow-up post will give an interpretation to the components in the corresponding dynamical equation, and some discussion of how the model can be generalized to take into account the ways the idealization may break down in the real world.

I needed to learn this background for a paper I am working on, and I was motivated to compile it here because the idiosyncratic results returned by Google searches, and especially this MathOverflow question (which I’ve answered), made it clear that a bird’s eye view is not easy to find. All of the material below is available in the work of other authors, but not logically developed in the way I would prefer.


Quantum Brownian motion (QBM) is a prototypical and idealized case of a quantum system \mathcal{S}, consisting of a continuous degree of freedom, that is interacting with a large multipartite environment \mathcal{E}, in general leading to varying degrees of dissipation, dispersion, and decoherence of the system. Intuitively, the distinguishing characteristic of QBM is Markovian dynamics induced by the cumulative effect of an environment with many independent, individually weak, and (crucially) “phase-space local” components. We will define QBM as a particular class of ways that a density matrix may evolve, which may be realized (or approximately realized) by many possible system-environment models. There is a more-or-less precise sense in which QBM is the simplest quantum model capable of reproducing classical Brownian motion in the \hbar \to 0 limit.

In words to be explained: QBM is a class of possible dynamics for an open, quantum, continuous degree of freedom in which the evolution is specified by a quadratic Hamiltonian and linear Lindblad operators.… [continue reading]
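For orientation, the classical dynamics that QBM is supposed to recover in the \hbar \to 0 limit is the Langevin equation. A minimal Euler–Maruyama sketch for a free Brownian particle (my own illustration; all parameters are set to 1 and are not from the post), checking that the momentum thermalizes to the equipartition value m k T:

```python
import numpy as np

# Langevin dynamics for a free particle:
#   dx = (p/m) dt,   dp = -gamma*p dt + sqrt(2*gamma*m*kT) dW.
# The stationary momentum variance should be m*kT (equipartition).
rng = np.random.default_rng(1)
m = kT = gamma = 1.0
dt, n_steps = 0.01, 200_000
sigma = np.sqrt(2 * gamma * m * kT * dt)
noise = rng.normal(size=n_steps)

x = p = 0.0
ps = np.empty(n_steps)
for i in range(n_steps):
    p += -gamma * p * dt + sigma * noise[i]
    x += (p / m) * dt
    ps[i] = p

var_p = ps[n_steps // 2:].var()   # discard the transient
print(var_p)                      # should be close to m*kT = 1
```

The quantum problem replaces this stochastic trajectory with a density matrix evolving under a quadratic Hamiltonian and linear Lindblad operators, but the dissipative fixed point plays the same role.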

In what sense is the Wigner function a quasiprobability distribution?

For the umpteenth time I have read a paper introducing the Wigner function essentially like this:

The Wigner representation of a quantum state \rho is a real-valued function on phase space defined (with \hbar=1) as [actually, they usually use a more confusing definition; see my post on the intuitive definition of the Wigner function]

(1)   \begin{align*} W_\rho(x,p) \equiv \frac{1}{2\pi}\int \! \mathrm{d}\Delta x \, e^{i p \Delta x} \langle x+\Delta x /2 \vert \rho \vert x-\Delta x /2 \rangle. \end{align*}

It’s sort of like a probability distribution because the marginals reproduce the probabilities for position and momentum measurements:

(2)   \begin{align*} P(x) \equiv \langle x \vert \rho \vert x \rangle = \int \! \mathrm{d}p \, W_\rho(x,p) \end{align*}


(3)   \begin{align*} P(p) \equiv  \langle p\vert \rho \vert p \rangle = \int \! \mathrm{d}x \, W_\rho(x,p). \end{align*}

But the reason it’s not a real probability distribution is that it can be negative.

The fact that W_\rho(x,p) can be negative is obviously a reason you can’t think of it as a true PDF, but the marginals property is a terribly weak justification for thinking of W_\rho as a “quasi-PDF”. There are all sorts of functions one could write down that would have this same property but wouldn’t encode much information about actual phase-space structure, e.g., the Jigner function [“Jess” + “Wigner” = “Jigner”. Ha!] J_\rho(x,p) \equiv P(x)P(p) = \langle x \vert \rho \vert x \rangle \langle p \vert \rho \vert p \rangle, which tells us nothing whatsoever about how position relates to momentum.
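Both facts (the marginals property and the negativity) can be checked directly from the definition by brute-force quadrature. A sketch (my own illustration, using the 1/2π-normalized convention needed for the marginals to come out right, and the textbook harmonic-oscillator wavefunctions):

```python
import numpy as np

def wigner(psi, x, p, L=20.0, n=4001):
    """W(x,p) = (1/2pi) * Integral dDx e^{i p Dx} psi(x+Dx/2) conj(psi(x-Dx/2)),
    evaluated by direct quadrature; psi is a pure-state wavefunction (callable)."""
    dxs = np.linspace(-L, L, n)
    d = dxs[1] - dxs[0]
    f = np.exp(1j * p * dxs) * psi(x + dxs / 2) * np.conj(psi(x - dxs / 2))
    return (f.sum() * d / (2 * np.pi)).real

psi0 = lambda x: np.pi**-0.25 * np.exp(-x**2 / 2)                     # HO ground state
psi1 = lambda x: np.pi**-0.25 * np.sqrt(2.0) * x * np.exp(-x**2 / 2)  # first excited

# The marginal over p reproduces the position distribution |psi0(x)|^2 at x = 0.5 ...
ps = np.linspace(-8, 8, 801)
dp = ps[1] - ps[0]
marginal = sum(wigner(psi0, 0.5, p) for p in ps) * dp
print(marginal, abs(psi0(0.5))**2)   # these agree

# ... yet the Wigner function of the first excited state is negative at the origin
# (exact value -1/pi), so it cannot be a true probability density.
print(wigner(psi1, 0.0, 0.0))
```

The Jigner function J(x,p) = P(x)P(p), by contrast, is nonnegative by construction for every state, which is exactly why nonnegativity plus correct marginals is too weak a standard.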

Here is the real reason you should think the Wigner function W_\rho is almost, but not quite, a phase-space PDF for a state \rho:

  1. Consider an arbitrary length scale \sigma_x, which determines a corresponding momentum scale \sigma_p = 1/(2\sigma_x) and a corresponding set [not just a set of states, actually, but a Parseval tight frame] of coherent states \{ \vert \alpha \rangle \}. They have characteristic spatial and momentum widths \sigma_x and \sigma_p, and are indexed by \alpha = (x,p) as it ranges over phase space.
  2. If a measurement is performed on \rho with the POVM of coherent states \{ \vert \alpha \rangle \langle \alpha \vert \}, then the probability of obtaining outcome \alpha is given by the Husimi Q function representation of \rho:

    (4)   \begin{align*} Q_\rho(\alpha) = \langle \alpha \vert \rho \vert \alpha \rangle. \end{align*}

  3. If \rho can be constructed as a mixture of the coherent states \{ \vert \alpha \rangle \}, then [of course, the P function cannot always be defined, and sometimes it can be defined but only if it takes negative values]
[continue reading]

Planck, BICEP2, dust, and science news

The Planck Collaboration has released a paper describing the dust polarization in the CMB for the patch of sky used recently by BICEP2 to announce evidence for primordial gravitational waves. Things look bleak for BICEP2’s claims. See Peter Woit, Sean Carroll, Quanta, Nature, and the New York Times.

In the comments, Peter Woit criticizes the asymmetric way the whole story is likely to be reported:

I think it’s completely accurate at this point to say that BICEP2 has provided zero evidence for primordial gravitational waves, instead is seeing pretty much exactly the expected dust signal.

This may change in the future, based on Planck data, new BICEP2 data, and a joint analysis of the two data sets (although seeing a significant signal this way doesn’t appear very likely), but that’s a separate issue. I don’t think it’s fair to use this possibility to try and evade the implications of the bad science that BICEP2 has done, promoted by press conference, and gotten on the front pages of prominent newspapers and magazines.

This is a perfectly good example of normal science: a group makes claims, they are checked and found to be incorrect. What’s not normal is a massive publicity campaign for an incorrect result, and the open question is what those responsible will now do to inform the public of what has happened. “Science communicators” often are very interested in communicating over-hyped news of a supposed great advance in science, much less interested in explaining that this was a mistake. Some questions about what happens next:

1. Will the New York Times match their front page story “Space Ripples Reveal Big Bang’s Smoking Gun” with a new front page story “Sorry, these guys had it completely wrong?” Or will they bury it in the specialized “Science” section tomorrow with some sort of mealy-mouthed headline like the BBC’s today that BICEP just “underestimated” a problem?

[continue reading]

Links for September 2014

  • In discussions about the dangers of increasing the prevalence of antibiotic-resistant bacteria by treating farm animals with antibiotics, it’s a common (and understandable) misconception that antibiotics serve the same purpose for animals as for people: to prevent disease. In fact, antibiotics serve mainly as a way to increase animal growth. We know that this arises from their effect on bacteria (and not, say, from the effect of the antibiotic molecule on the animal’s cells), but it is not because antibiotics are reducing visible illness among the animals:

    Studies conducted in germ free animals have shown that the actions of these AGP [antimicrobial growth promoters] substances are mediated through their antibacterial activity. There are four hypotheses to explain their effect (Butaye et al., 2003). These include: 1) antibiotics decrease the toxins produced by the bacteria; 2) nutrients may be protected against bacterial destruction; 3) increase in the absorption of nutrients due to a thinning of the intestinal wall; and 4) reduction in the incidence of sub clinical infections. However, no study has pinpointed the exact mechanism by which the AGP work in the animal intestine. [More.]

  • You’ve probably noticed that your brain will try to reconcile contradictory visual info. Showing a different image to each eye will cause someone to essentially see only one or the other at a time (although it will switch back and forth). Various other optical illusions bring out the brain’s attempts to solve visual puzzles. But did you know the brain jointly reconciles visual info with audio info? Behold, the McGurk effect:

  • The much-hyped nanopore technique for DNA sequencing is starting to mature. Eventually this should dramatically lower the cost and difficulty of DNA sequencing in the field, but the technology is still buggy.

[continue reading]

State-independent consistent sets

In May, Losada and Laura wrote a paper [M. Losada and R. Laura, Annals of Physics 344, 263 (2014)] pointing out the equivalence between two conditions on a set of “elementary histories”, i.e. fine-grained histories. [Gell-Mann and Hartle usually use the term “fine-grained set of histories” to refer to a set generated by the finest possible partitioning of histories in the path integral (i.e. a point in space for every point in time), but this is overly specific. As far as the consistent histories framework is concerned, the key mathematical property that defines a fine-grained set is that it’s an exhaustive and exclusive set where each history is constructed by choosing exactly one projector from a fixed orthogonal resolution of the identity at each time.] Let the elementary histories \alpha = (a_1, \dots, a_N) be defined by projective decompositions of the identity P^{(i)}_{a_i}(t_i) at time steps t_i (i=1,\ldots,N), so that

(1)   \begin{align*} P^{(i)}_a &= (P^{(i)}_a)^\dagger \quad \forall i,a \\ P^{(i)}_a P^{(i)}_b &= \delta_{a,b} P^{(i)}_a \quad \forall i,a,b\\ \sum_{a} P^{(i)}_a (t_i) &= I \quad  \forall i \\ C_\alpha &= P^{(N)}_{a_N} (t_N) \cdots P^{(1)}_{a_1} (t_1) \\ I &= \sum_\alpha C_\alpha = \sum_{a_1}\cdots \sum_{a_N} C_\alpha \end{align*}

where the C_\alpha are the class operators. Then Losada and Laura showed that the following two conditions are equivalent:

  1. The set is consistent [“medium decoherent” in Gell-Mann and Hartle’s terminology; also note that Losada and Laura actually work with the obsolete condition of “weak decoherence”, but this turns out to be an unimportant difference. For a summary of these sorts of consistency conditions, see my round-up.] for any state: D(\alpha,\beta) = \mathrm{Tr}[C_\alpha \rho C_\beta^\dagger] = 0 \quad \forall \alpha \neq \beta, \forall \rho.
  2. The Heisenberg-picture projectors at all times commute: [P^{(i)}_{a} (t_i),P^{(j)}_{b} (t_j)]=0 \quad \forall i,j,a,b.
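Both directions of the equivalence are easy to poke at numerically. A sketch (my own illustration; the 4-dimensional Hilbert space and the particular binary decompositions are arbitrary choices): commuting projectors give a diagonal decoherence functional for a random ρ, while rotating one decomposition into a non-commuting basis generically breaks it.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

def random_density_matrix(d, rng):
    """Random full-rank density matrix via a Wishart-type construction."""
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def decoherence_functional(projs1, projs2, rho):
    """D(alpha, beta) = Tr[C_alpha rho C_beta^dagger] for elementary histories
    alpha = (a_1, a_2), with class operators C_alpha = P2_{a_2} P1_{a_1}."""
    C = [P2 @ P1 for P1 in projs1 for P2 in projs2]
    return np.array([[np.trace(Ca @ rho @ Cb.conj().T) for Cb in C] for Ca in C])

# Two binary projective decompositions of the identity, both diagonal
# (and hence mutually commuting):
P1 = [np.diag([1, 1, 0, 0.]), np.diag([0, 0, 1, 1.])]
P2 = [np.diag([1, 0, 1, 0.]), np.diag([0, 1, 0, 1.])]

rho = random_density_matrix(d, rng)
D = decoherence_functional(P1, P2, rho)
off_diag = np.max(np.abs(D - np.diag(np.diag(D))))
print(off_diag)   # commuting projectors: vanishes for every rho

# Rotate the second decomposition into a generic non-commuting basis:
U = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))[0]
P2_rot = [U @ P @ U.conj().T for P in P2]
D2 = decoherence_functional(P1, P2_rot, rho)
off_diag2 = np.max(np.abs(D2 - np.diag(np.diag(D2))))
print(off_diag2)  # generically nonzero
```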

However, this is not as general as one would like because assuming the set of histories is elementary is very restrictive. (It excludes branch-dependent sets, sets with inhomogeneous histories, and many more types of sets that we would like to work with.) Luckily, their proof can be extended a bit.

Let’s forget that we have any projectors P^{(i)}_{a} and just consider a consistent set \{ C_\alpha \}.… [continue reading]

Links for August 2014

  • Jester (Adam Falkowski) on physics breakthroughs:

    This year’s discoveries follow the well-known 5-stage Kübler-Ross pattern: 1) announcement, 2) excitement, 3) debunking, 4) confusion, 5) depression. While BICEP is approaching the end of the cycle, the sterile neutrino dark matter signal reported earlier this year is now entering stage 3.

  • The ultimate bounds on possible nuclides are more-or-less known from first principles.
  • UPower Technologies is a nuclear power start-up backed by Y-Combinator.
  • It is not often appreciated that “[s]mallpox eradication saved more than twice the number of people 20th century world peace would have achieved.” Malaria eradication would be much harder, but the current prospects are encouraging. Relatedly, the method for producing live but attenuated viruses is super neat:

    Attenuated vaccines can be made in several different ways. Some of the most common methods involve passing the disease-causing virus through a series of cell cultures or animal embryos (typically chick embryos). Using chick embryos as an example, the virus is grown in different embryos in a series. With each passage, the virus becomes better at replicating in chick cells, but loses its ability to replicate in human cells. A virus targeted for use in a vaccine may be grown through—“passaged” through—upwards of 200 different embryos or cell cultures. Eventually, the attenuated virus will be unable to replicate well (or at all) in human cells, and can be used in a vaccine. All of the methods that involve passing a virus through a non-human host produce a version of the virus that can still be recognized by the human immune system, but cannot replicate well in a human host.

    When the resulting vaccine virus is given to a human, it will be unable to replicate enough to cause illness, but will still provoke an immune response that can protect against future infection.

[continue reading]

Grade inflation and college investment incentives

Here is Raphael Boleslavsky and Christopher Cotton discussing their model of grade deflation in selective undergraduate programs:

Grade inflation is widely viewed as detrimental, compromising the quality of education and reducing the information content of student transcripts for employers. This column argues that there may be benefits to allowing grade inflation when universities’ investment decisions are taken into account. With grade inflation, student transcripts convey less information, so employers rely less on transcripts and more on universities’ reputations. This incentivises universities to make costly investments to improve the quality of their education and the average ability of their graduates. [Link. h/t Ben Kuhn.]

I’ve only read the column rather than the full paper, but it sounds like their model simply posits that “schools can undertake costly investments to improve the quality of education that they provide, increasing the average ability of graduates”.

But if you believe folks like Bryan Caplan, then you think colleges add very little value. (Even if you think the best schools do add more value than worse schools, it doesn’t at all follow that this can be increased in a positive-sum way by additional investment. It could be that all the value-added is from being around other smart students, who can only be drawn away from other schools.) Under Boleslavsky and Cotton’s model, schools are only incentivized to increase the quality of their exiting graduates, and this seems much easier to accomplish by doing better advertising to prospective students than by actually investing more in the students that matriculate.

Princeton took significant steps to curb grade inflation, with some success. However, they now look to be relaxing the only part of the policy that had teeth.… [continue reading]

How to think about Quantum Mechanics—Part 2: Vacuum fluctuations

[Other parts in this series: 1,2,3,4,5,6,7.]

Although it is possible to use the term “vacuum fluctuations” in a consistent manner, referring to well-defined phenomena, people are usually way too sloppy. Most physicists never think clearly about quantum measurements, so the term is widely misunderstood and should be avoided if possible. Maybe the most dangerous result of this is the confident, unexplained use of the term by experienced physicists talking to students; it has the awful effect of giving these students the impression that their inevitable confusion is normal and not indicative of a deep misunderstanding. [“Professor, where do the wiggles in the cosmic microwave background come from?” “Quantum fluctuations.” “Oh, um…OK.” Yudkowsky has usefully called this a “curiosity-stopper”, although I’m sure there’s another term for this used by philosophers of science.]

Here is everything you need to know:

  1. A measurement is specified by a basis, not by an observable. (If you demand to think in terms of observables, just replace “measurement basis” with “eigenbasis of the measured observable” in everything that follows.)
  2. Real-life processes amplify microscopic phenomena to macroscopic scales all the time, thereby effectively performing a quantum measurement. (This includes inducing the implied wave-function collapse.) These processes do not need to involve a physicist in a lab, but the basis being measured must be an orthogonal one. [W. H. Zurek, Phys. Rev. A 76, 052110 (2007), arXiv:quant-ph/0703160]
  3. “Quantum fluctuations” are when any measurement (whether involving a human or not) is made in a basis which doesn’t commute with the initial state of the system.
  4. A “vacuum fluctuation” is when the ground state of a system is measured in a basis that does not include the ground state; it’s merely a special case of a quantum fluctuation.
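Point 4 in miniature (my own illustration, with \hbar = m = \omega = 1): measure the harmonic-oscillator ground state in the position basis, which does not contain the ground state. Each measurement gives a random outcome, with mean 0 and variance 1/2, even though the state itself is stationary; that spread is the "vacuum fluctuation".

```python
import numpy as np

# Born-rule probabilities for a position measurement of the HO ground state,
# |psi_0(x)|^2 = exp(-x^2)/sqrt(pi), discretized on a grid:
xs = np.linspace(-6, 6, 1201)
dx = xs[1] - xs[0]
prob = np.exp(-xs**2) / np.sqrt(np.pi) * dx

# Moments of the measurement-outcome distribution:
mean = np.sum(prob * xs)
var = np.sum(prob * xs**2) - mean**2
print(mean, var)   # approximately 0 and 1/2
```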
[continue reading]

Risk aversion of class-action lawyers

The two sides in the potentially massive class-action lawsuit by silicon-valley engineers against Google, Apple, and other big tech companies reached an agreement, but that settlement was rejected by the judge. New York Times:

After the plaintiffs’ lawyers took their 25 percent cut, the settlement would have given about $4,000 to every member of the class.

Judge Koh said that she believed the case was stronger than that, and that the plaintiffs’ lawyers were taking the easy way out by settling. The evidence against the defendants was compelling, she said.

(Original court order.)

I would like to be able to explain this by understanding the economic/sociological motivations of the lawyers. People often complain about a huge chunk of the money going to the class-action lawyers who are too eager to settle, but the traditional argument is that a fixed percentage structure (rather than an hourly or flat rate) gives the lawyers the proper incentive to pursue the interests of the class by tying their compensation directly to the legal award. So this should lead to maximizing the award to the plaintiffs.

My best guess, doubtlessly considered by many others, is this: lawyers, like most people, are risk averse for sufficiently large amounts of money. (They would rather have $10 million for sure than a 50% chance at $50 million.) On the other hand, the legal award will be distributed over many more plaintiffs. Since it will be much smaller per person, the plaintiffs are significantly less risk averse. So the lawyers settle even though it’s not in the best interests of the plaintiffs.
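A toy expected-utility calculation makes the asymmetry concrete (a sketch under loud assumptions: log utility over total wealth, hypothetical baseline wealths of $1M for the lawyers and $100k per plaintiff, and a hypothetical per-plaintiff gamble of 50% at $20k vs. $4k for sure):

```python
import math

def prefers_settlement(wealth, sure, big, p=0.5, u=math.log):
    """Compare expected utility of a sure settlement against a p-chance
    at a larger award (and nothing otherwise), on top of baseline wealth."""
    return u(wealth + sure) > p * u(wealth + big) + (1 - p) * u(wealth)

# Lawyers: $10M for sure vs. 50% of $50M (the numbers from the post).
lawyer = prefers_settlement(wealth=1e6, sure=10e6, big=50e6)
# Each plaintiff: $4k for sure vs. a hypothetical 50% of $20k.
plaintiff = prefers_settlement(wealth=1e5, sure=4e3, big=20e3)
print(lawyer, plaintiff)   # True, False under these assumptions
```

With the same concave utility function, the large stakes make the lawyers prefer the settlement while the small per-person stakes leave each plaintiff preferring the gamble, which is the claimed misalignment.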

This suggests the following speculative solution for correctly aligning the incentives of the lawyers and the class action plaintiffs: Ensure that the person with the final decision-making power for the plaintiff legal team receives a percentage of the award that is small enough for that person’s utility function to be roughly as linear as the plaintiffs’.… [continue reading]

Lindblad Equation is differential form of CP map

The master equation in Lindblad form (aka the Lindblad equation) is the most general possible evolution of an open quantum system that is Markovian and time-homogeneous. Markovian means that the way in which the density matrix evolves is determined completely by the current density matrix. This is the assumption that there are no memory effects, i.e. that the environment does not store information about earlier states of the system which can influence the system in the future. [Here’s an example of a memory effect: an atom immersed in an electromagnetic field can be in one of two states, excited or ground. If it is in an excited state then, during a time interval, it has a certain probability of decaying to the ground state by emitting a photon. If it is in the ground state then it also has a chance of becoming excited by the ambient field. The situation where the atom is in a space of essentially infinite size would be Markovian, because the emitted photon (which embodies a record of the atom’s previous state of excitement) would travel away from the atom, never to interact with it again. The atom might still become excited because of the ambient field, but its chance of doing so isn’t influenced by its previous state. But if the atom is in a container with reflecting walls, then the photon might be reflected back towards the atom, changing the probability that it becomes excited during a later period.] Time-homogeneous just means that the rule for stochastically evolving the system from one time to the next is the same for all times.

Given an arbitrary orthonormal basis L_n of the space of operators on the N-dimensional Hilbert space of the system (according to the Hilbert-Schmidt inner product \langle A,B \rangle = \mathrm{Tr}[A^\dagger B]), the Lindblad equation takes the following form:

(1)   \begin{align*} \frac{\mathrm{d}}{\mathrm{d}t} \rho=- i[H,\rho]+\sum_{n,m = 1}^{N^2-1} h_{n,m}\left(L_n\rho L_m^\dagger-\frac{1}{2}\left(\rho L_m^\dagger L_n + L_m^\dagger L_n\rho\right)\right) , \end{align*}

with \hbar=1.… [continue reading]
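A minimal numerical sketch of Eq. (1), in the diagonalized form where the matrix h_{n,m} has been absorbed into a single jump operator (my own example, not from the post): a qubit with L = √γ σ₋ (amplitude damping), integrated with crude Euler steps. The evolution preserves the trace and sends the excited state to the ground state at rate γ.

```python
import numpy as np

def lindblad_rhs(rho, H, Ls):
    """Right-hand side of the Lindblad equation (hbar = 1, diagonalized form)."""
    out = -1j * (H @ rho - rho @ H)
    for L in Ls:
        LdL = L.conj().T @ L
        out += L @ rho @ L.conj().T - 0.5 * (rho @ LdL + LdL @ rho)
    return out

gamma = 1.0
H = np.diag([0.0, 1.0]).astype(complex)        # qubit Hamiltonian
L = np.sqrt(gamma) * np.array([[0, 1], [0, 0]], dtype=complex)  # sigma_minus
rho = np.diag([0.0, 1.0]).astype(complex)      # start in the excited state

dt, T = 0.001, 5.0
for _ in range(int(T / dt)):
    rho = rho + dt * lindblad_rhs(rho, H, [L])

print(np.trace(rho).real)   # trace preserved: stays 1
print(rho[1, 1].real)       # excited population decayed, ~ e^{-gamma T}
```

A production calculation would of course use a higher-order integrator (or vectorize the Lindbladian and exponentiate), but the Euler step already shows the two structural features: the dissipator is traceless, and the fixed point is the ground state.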

Potentials and the Aharonov–Bohm effect

[This post was originally “Part 1” of my HTTAQM series. However, it’s old, haphazardly written, and not a good starting point. Therefore, I’ve removed it from that series, which now begins with “Measurements are about bases”. Other parts are here: 1,2,3,4,5,6,7. I hope to re-write this post in the future.]

It’s often remarked that the Aharonov–Bohm (AB) effect says something profound about the “reality” of potentials in quantum mechanics. In one version of the relevant experiment, charged particles are made to travel coherently along two alternate paths, such as in a Mach-Zehnder interferometer. At the experimenter’s discretion, an external electromagnetic potential (either vector or scalar) can be applied so that the two paths are at different potentials yet still experience zero magnetic and electric field. The paths are recombined, and the size of the potential difference determines the phase of the interference pattern. The effect is often interpreted as a demonstration that the electromagnetic potential is physically “real”, rather than just a useful mathematical concept.

The magnetic Aharonov–Bohm effect. The wavepacket of an electron approaches from the left and is split coherently over two paths, L and R. The red solenoid in between contains magnetic flux \Phi. The region outside the solenoid has zero field, but there is a non-zero curl to the vector potential as measured along the two paths. The relative phase between the L and R wavepackets is given by \Theta = e \Phi/c \hbar.

However, Vaidman recently pointed out that this is a mistaken interpretation which is an artifact of the semi-classical approximation used to describe the AB effect. Although it is true that the superposed test charges experience zero field, it turns out that the source charges creating that macroscopic potential do experience a non-zero field, and that the strength of this field is dependent on which path is taken by the test charges.… [continue reading]
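The phase formula in the figure caption is easy to play with numerically (a sketch of my own; here in SI units, where \Theta = e\Phi/\hbar and the Gaussian-units factor of c is absorbed): enclosing one flux quantum \Phi_0 = h/e shifts the interference pattern by exactly 2π, i.e. by one full period.

```python
import math

e = 1.602176634e-19    # elementary charge, C
h = 6.62607015e-34     # Planck constant, J s
hbar = h / (2 * math.pi)

def ab_phase(flux):
    """Relative Aharonov-Bohm phase between the two paths (SI units)."""
    return e * flux / hbar

flux_quantum = h / e   # the (normal-metal) flux quantum h/e
ratio = ab_phase(flux_quantum) / (2 * math.pi)
print(ratio)           # exactly one period of the interference pattern
```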

A dark matter model for decoherence detection

[Added 2015-1-30: The paper is now in print and has appeared in the popular press.]

One criticism I’ve had to address when proselytizing the indisputable charms of using decoherence detection methods to look at low-mass dark matter (DM) is this: I’ve never produced a concrete model that would be tested. My analysis (arXiv:1212.3061) addressed the possibility of using matter interferometry to rule out a large class of dark matter models characterized by a certain range for the DM mass and the nucleon-scattering cross section. However, I never constructed an explicit model as a representative of this class to demonstrate in detail that it was compatible with all existing observational evidence. This is a large and complicated task, and not something I could accomplish on my own.

I tried hard to find an existing model in the literature that met my requirements, but without luck. So I had to argue (with referees and with others) that this was properly beyond the scope of my work, and that the idea was interesting enough to warrant publication without a model. This ultimately was successful, but it was an uphill battle. Among other things, I pointed out that new experimental concepts can inspire theoretical work, so it is important that they be disseminated.

I’m thrilled to say this paid off in spades. Bateman, McHardy, Merle, Morris, and Ulbricht have posted their new pre-print “On the Existence of Low-Mass Dark Matter and its Direct Detection” (arXiv:1405.5536). Here is the abstract:

Dark Matter (DM) is an elusive form of matter which has been postulated to explain astronomical observations through its gravitational effects on stars and galaxies, gravitational lensing of light around these, and through its imprint on the Cosmic Microwave Background (CMB).

[continue reading]