Branches and matrix-product states

I’m happy to use this bully pulpit to advertise that the following paper has been deemed “probably not terrible”, i.e., published.

When the wave function of a large quantum system unitarily evolves away from a low-entropy initial state, there is strong circumstantial evidence it develops “branches”: a decomposition into orthogonal components that is indistinguishable from the corresponding incoherent mixture with feasible observations. Is this decomposition unique? Must the number of branches increase with time? These questions are hard to answer because there is no formal definition of branches, and most intuition is based on toy models with arbitrarily preferred degrees of freedom. Here, assuming only the tensor structure associated with spatial locality, I show that branch decompositions are highly constrained just by the requirement that they exhibit redundant local records. The set of all redundantly recorded observables induces a preferred decomposition into simultaneous eigenstates unless their records are highly extended and delicately overlapping, as exemplified by the Shor error-correcting code. A maximum length scale for records is enough to guarantee uniqueness. Speculatively, objective branch decompositions may speed up numerical simulations of nonstationary many-body states, illuminate the thermalization of closed systems, and demote measurement from fundamental primitive in the quantum formalism.

Here’s the figure[1] and caption:

Spatially disjoint regions with the same coloring (e.g., the solid blue regions \mathcal{F}, \mathcal{F}', \ldots) denote different records for the same observable (e.g., \Omega_a = \{\Omega_a^{\mathcal{F}},\Omega_a^{\mathcal{F}'},\ldots\}). (a) The spatial record structure of the Shor-code family of states, which can exhibit arbitrary redundancy (in this case four-fold) for two incompatible observables. (b) The solid orange observable pair-covers the hashed blue observable because the top two orange records overlap all blue records. However, if one of the top two orange records is dropped, then neither observable pair-covers the other, and hence both are compatible, despite many overlaps of individual records. (c) Any spatially bounded set of records can be contained inside a single record of a sufficiently dilated but otherwise identical set of records for an incompatible observable; such a state is given in Eq. (9). (d) Any observable with records satisfying the hypothesis of the Corollary for some length \ell cannot pair-cover, or be pair-covered by, any other such observable.

It is my highly unusual opinion that identifying a definition for the branches in the wavefunction is the most conceptually important problem in physics. The reasoning is straightforward: (1) quantum mechanics is the most profound thing we know about the universe, (2) the measurement process is at the heart of the weirdness, and (3) the critical roadblock to analysis is a definition of what we’re talking about. (Each step is of course highly disputed, and I won’t defend the reasoning here.) In my biased opinion, the paper represents the closest yet anyone has gotten to giving a mathematically precise definition.

On the last page of the paper, I speculate on the possibility that branch finding may have practical (!) applications for speeding up numerical simulations of quantum many-body systems using matrix-product states (MPS), or tensor networks in general. The rough idea is this: Generic quantum systems are exponentially hard to simulate, but classical systems (even stochastic ones) are not. A definition of branches would identify which degrees of freedom of a quantum system could be accurately simulated classically, and when. Although classical computational transitions are understood in certain special cases, our macroscopic observations of the real world strongly suggest that all systems we study admit classical descriptions on large enough scales.[2] There is reason to think a general (abstract) description of the quantum-classical transition is possible, and would allow us to go beyond special cases.

In the rest of this blog post I’m going to construct a simple MPS state featuring branches with records that are arbitrarily redundant. The state’s key features will be that it is translationally invariant and has finite correlation length. Translational invariance makes the state much easier to study and compare with the literature, and is a property shared with the simple inflationary model I’m studying with Elliot Nelson. Finite correlation length eliminates the trivial solution of a generalized GHZ state, guarantees that there are an infinite number of branches (if there are any at all), and is in some sense more natural.

An MPS with branches

Our strategy will be to build the state up from a classical probability distribution that already has the key features. Let \vec{b} = (b_1, \dots, b_N) be a classical state of a finite 1D lattice of N bits (b_n = \pm 1), and consider the canonical ensemble starting with a uniform distribution and adding an energetic penalty of \gamma for misaligned nearest-neighbors:[3]

(1)   \begin{align*} P(\vec{b}) &= Z^{-1}  e^{\frac{\gamma}{2}\sum_n b_n b_{n+1}} \propto \exp\left[-\gamma\sum_n \left(\frac{b_{n+1}-b_n}{2}\right)^2\right] \end{align*}

with Z = [2\, \mathrm{cosh} (\gamma/2)]^N. It’s not hard to check that the classical expectation value obeys \langle b_n \rangle = 0 and

(2)   \begin{align*} \langle b_n b_{n+m} \rangle = \left(\frac{e^\gamma -1}{e^\gamma +1}\right)^m . \end{align*}

If we define our correlation length \ell by the asymptotic behavior \vert \langle b_n b_{n+m} \rangle\vert \sim e^{-m/\ell}, then for large \vert\gamma\vert (large \ell) we have

(3)   \begin{align*} \ell = \frac{e^{\vert \gamma \vert}}{2} \end{align*}
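As a quick sanity check (not from the paper), the two-point function in Eq. (2) can be verified numerically with a standard transfer-matrix computation; the matrix T and the diagonal operator S below are my own notation:

```python
import numpy as np

# Transfer matrix T_{b b'} = exp[(gamma/2) b b'] for b, b' = +1, -1,
# and S = diag(1, -1), which inserts a factor of b_n inside the trace.
gamma, N = 1.5, 200
T = np.array([[np.exp(gamma / 2), np.exp(-gamma / 2)],
              [np.exp(-gamma / 2), np.exp(gamma / 2)]])
S = np.diag([1.0, -1.0])

# Exact partition function of the periodic chain: Z = Tr[T^N]
Z = np.trace(np.linalg.matrix_power(T, N))

for m in [1, 2, 5]:
    # <b_n b_{n+m}> = Tr[S T^m S T^(N-m)] / Z
    corr = np.trace(S @ np.linalg.matrix_power(T, m)
                    @ S @ np.linalg.matrix_power(T, N - m)) / Z
    predicted = ((np.exp(gamma) - 1) / (np.exp(gamma) + 1)) ** m
    assert abs(corr - predicted) < 1e-10  # agrees up to O(tanh(gamma/2)^N)
```

The finite-size corrections are of order \tanh(\gamma/2)^N, which is utterly negligible at N = 200.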

The idea is to define our quantum state as a superposition of configurations of a spin chain weighted by this distribution: \vert \Psi \rangle \equiv \sum_{\vec{b}} P(\vec{b}) \bigotimes_n \vert \chi(b_n) \rangle_n. Contiguous regions of size \ell will be highly correlated. The coarse-grained variables that are being recorded are something like “whether this local region is mostly +1 or mostly -1”, but we don’t necessarily want each qubit to have a perfect record of this information. Rather, we choose

(4)   \begin{align*}  \vert \chi(b_n) \rangle_n &\equiv \cos(b_n\theta) \vert \uparrow \rangle_n + \sin(b_n\theta) \vert \downarrow \rangle_n \\ &= \cos(\theta) \vert \uparrow \rangle_n + b_n\sin(\theta) \vert \downarrow \rangle_n \end{align*}

for some fixed (typically small) angle \theta. For \vert\theta\vert < \pi/4 one can’t reliably infer the classical variable from a measurement on a single spin, but one can make an arbitrarily reliable inference by measuring lots of them: \vert\langle +1 \vert -1 \rangle\vert^m = (1-2 \sin^2 \theta)^m. If we define our “recording length” r by the asymptotic behavior \vert\langle +1 \vert -1 \rangle\vert^m \sim e^{-m/r}, then for small \theta we have

(5)   \begin{align*} r = \frac{1}{2\theta^2} \end{align*}
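To make the recording length concrete, here is a small sketch (mine, not from the paper) that computes the per-site overlap of the two conditional spin states in Eq. (4) and checks the decay against r = 1/(2\theta^2):

```python
import numpy as np

# Conditional spin states chi(+1) and chi(-1) from Eq. (4)
theta = 0.1
chi_plus = np.array([np.cos(theta), np.sin(theta)])
chi_minus = np.array([np.cos(theta), -np.sin(theta)])

# Per-site overlap: <chi(+1)|chi(-1)> = cos^2 - sin^2 = 1 - 2 sin^2(theta)
overlap = abs(chi_plus @ chi_minus)
assert abs(overlap - (1 - 2 * np.sin(theta) ** 2)) < 1e-12

# Over m sites the branch overlap decays like exp(-m/r), r = 1/(2 theta^2)
r = 1 / (2 * theta ** 2)  # = 50 for theta = 0.1
m = 500
assert abs(overlap ** m - np.exp(-m / r)) < 1e-2
```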

Thus our final (unnormalized) state is

(6)   \begin{align*} \vert \Psi (\ell,r)\rangle \equiv \sum_{\vec{b}} \exp\left[\frac{\ln (2 \ell)}{2}\sum_n b_n b_{n+1}\right] \bigotimes_n \Big[\cos \sqrt{1/2r}\, \vert \uparrow \rangle_n + b_n\sin \sqrt{1/2r}\, \vert \downarrow \rangle_n\Big] \end{align*}

for arbitrary positive parameters \ell and r. The state inherits from the classical probability distribution the properties of finite correlation length and translational invariance. Any contiguous set of m \ll \ell spins is very likely to all be in the same state, and by measuring a subset of m' \gg r spins one can distinguish between the cases \vert +1 \rangle and \vert -1 \rangle with high reliability. Therefore in the limit r \ll \ell it makes sense to define the redundancy R \sim \ell/r; this is the approximate number of disjoint records that are available about each branch, where there is a (binary) branch density of \ell^{-1}.
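For small N the state (6) can be built by brute force, which makes its translational invariance easy to verify directly. Here is a hedged sketch in Python; the helper psi is my own (not from the paper), and since it sums over all 2^N classical configurations it is only feasible for a handful of sites:

```python
import numpy as np
from itertools import product

def psi(ell, r, N):
    """Brute-force the (unnormalized) state of Eq. (6) on a periodic chain,
    using theta = sqrt(1/2r) and gamma = ln(2 ell) from Eqs. (3) and (5)."""
    theta, gamma = np.sqrt(1 / (2 * r)), np.log(2 * ell)
    state = np.zeros(2 ** N)
    for b in product([1, -1], repeat=N):
        weight = np.exp((gamma / 2) * sum(b[n] * b[(n + 1) % N]
                                          for n in range(N)))
        spin = np.array([1.0])  # tensor product of single-site states chi(b_n)
        for bn in b:
            spin = np.kron(spin, [np.cos(theta), bn * np.sin(theta)])
        state += weight * spin
    return state

N = 6
v = psi(ell=4.0, r=2.0, N=N)

# Translating the chain by one site cyclically rotates the bit pattern of
# the amplitude index; translational invariance means v is unchanged.
perm = [(i >> 1) | ((i & 1) << (N - 1)) for i in range(2 ** N)]
assert np.allclose(v, v[perm])
```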

Now let’s turn this into a matrix-product state.[4] That is, it should take the form[5][6]

(7)   \begin{align*} \vert \Psi \rangle &= \sum_{i_1,\dots,i_N}\sum_{\alpha_1,\dots,\alpha_N} A^{i_1}_{\alpha_1 \alpha_2}A^{i_2}_{\alpha_2 \alpha_3}\dots A^{i_{N-1}}_{\alpha_{N-1} \alpha_{N}} A^{i_N}_{\alpha_{N} \alpha_{1}} \vert i_1\dots i_N\rangle\\ &= \sum_{i_1,\dots,i_N} \mathrm{Tr}\left[ A^{i_1} A^{i_2}\dots A^{i_{N-1}}A^{i_N}\right] \vert i_1\dots i_N\rangle \end{align*}

[Figure: tensor-network diagram of the MPS, a ring of A tensors connected by bond lines.]

where the i_n = \uparrow,\downarrow range over the local Hilbert spaces of the qubits and the \alpha_n are the “bonds” representing contraction of matrices of as-yet unspecified dimension (depicted in the figure as lines connecting the A’s horizontally). The trick is to pretend that our classical probability distribution was achieved by starting with a quantum state of appropriately weighted bras,

(8)   \begin{align*} \langle P \vert \equiv \sum_{\vec{b}} P(\vec{b}) \left[ \bigotimes_n \, {}_n \! \langle b_n \vert \right], \end{align*}

and contracting (i.e., multiplying from the right to inner-product over the fictional bras) with a specially designed state that picks out the correct conditional states of the original Hilbert space:

(9)   \begin{align*} \vert \Omega \rangle &\equiv \bigotimes_n \left[\sum_{b_n = \pm 1} \vert b_n \rangle_n \otimes \vert \chi(b_n) \rangle_n \right]\\ &= \bigotimes_n \Big[\vert +1 \rangle_n \left(\cos\theta \vert \uparrow \rangle_n + \sin\theta \vert \downarrow \rangle_n \right) + \vert -1 \rangle_n \left(\cos\theta \vert \uparrow \rangle_n - \sin\theta \vert \downarrow \rangle_n \right)\Big]. \end{align*}

The object

(10)   \begin{align*} \langle P \vert \Omega \rangle = \sum_{i_1,\dots,i_N}\sum_{b_1,\dots,b_N} A^{i_1}_{b_1 b_2}\dots  A^{i_N}_{b_{N} b_{1}} \vert i_1\dots i_N\rangle \end{align*}

then takes the form of (7) and is equal to \vert \Psi(\ell,r)\rangle so long as we choose A to satisfy

(11)   \begin{align*} A^{i_n}_{b_n b_{n+1}} &= e^{\frac{\gamma}{2} b_n b_{n+1}}\, \langle i_n \vert \chi(b_n) \rangle_n \\ &= e^{\frac{\gamma}{2} b_n b_{n+1}} \big[\delta_{b_n,+1} \left(\delta_{i_n,\uparrow} \cos \theta + \delta_{i_n,\downarrow} \sin \theta\right) + \delta_{b_n,-1} \left(\delta_{i_n,\uparrow} \cos \theta - \delta_{i_n,\downarrow} \sin \theta\right)\big], \end{align*}

or, more cleanly,

(12)   \begin{align*} A^\uparrow = \cos \sqrt{\frac{1}{2r}} \left(\begin{array}{cc}\sqrt{2 \ell} &  1/\sqrt{2 \ell}\\ \phantom{-}1/\sqrt{2 \ell} &  \phantom{-}\sqrt{2 \ell} \end{array}\right), \\  A^\downarrow = \sin \sqrt{\frac{1}{2r}} \left(\begin{array}{cc}\sqrt{2 \ell} &  1/\sqrt{2 \ell}\\ -1/\sqrt{2 \ell} &  -\sqrt{2 \ell} \end{array}\right). \end{align*}
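As a check on the construction, the following sketch (mine, not from the paper) contracts the MPS tensors of Eq. (12) via the trace formula of Eq. (7) on a small periodic chain and compares against the brute-force sum over classical configurations in Eq. (6):

```python
import numpy as np
from itertools import product

ell, r, N = 3.0, 2.0, 5
theta, gamma = np.sqrt(1 / (2 * r)), np.log(2 * ell)

# MPS tensors of Eq. (12); rows/columns are indexed by b = +1, -1
s = np.sqrt(2 * ell)  # = exp(gamma/2)
A = {0: np.cos(theta) * np.array([[s, 1 / s], [1 / s, s]]),    # A^up
     1: np.sin(theta) * np.array([[s, 1 / s], [-1 / s, -s]])}  # A^down

# Amplitudes from the trace formula of Eq. (7)
mps = np.array([np.trace(np.linalg.multi_dot([A[i] for i in idx]))
                for idx in product([0, 1], repeat=N)])

# Direct (unnormalized) sum over classical configurations, Eq. (6)
direct = np.zeros(2 ** N)
for b in product([1, -1], repeat=N):
    w = np.exp((gamma / 2) * sum(b[n] * b[(n + 1) % N] for n in range(N)))
    spin = np.array([1.0])
    for bn in b:
        spin = np.kron(spin, [np.cos(theta), bn * np.sin(theta)])
    direct += w * spin

assert np.allclose(mps, direct)
```

The bond dimension here is only 2, because the bond indices literally are the classical bits b_n.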

[I thank Martin Ganahl and Guifre Vidal for discussions.]


Footnotes

  1. The editor tried to convince me that this figure appeared on the cover for purely aesthetic reasons and this does not mean my letter is the best thing in the issue…but I know better!
  2. Note that whether certain degrees of freedom admit a classical effective description is a computational question. The existence of such a description is incompatible with them exhibiting quantum supremacy, but it is compatible with them encoding circumstantial evidence (at the macro scale) for quantum mechanics (at the micro scale).
  3. We assume periodic boundary conditions: b_{N+1} \equiv b_1
  4. This will not be a general algorithm for building MPSs since it exploits the special form of the state (6) we’re trying to build, which allows us to identify the b’s with the \alpha’s.
  5. Again, periodic boundary conditions
  6. The tikzpicture code was obtained from Piotr Migdał.


  1. I have a stupid(?) question. The criterion you give in the paper doesn’t involve time evolution in any way. Intuitively, it seems that the reason we can treat branches separately is that they will evolve roughly independently of each other: we have U(B1 + B2) = U(B1) + U(B2), at least on the time scales we care about. Would a satisfactory abstract criterion for identifying branches make use of the form of the time-evolution operator?

    • You’re right that the criterion for defining branches I give in the paper doesn’t make explicit reference to time evolution. Rather, the criterion for branches at a given time is based only on the wavefunction at that time, so that, in principle, the branch decomposition could change dramatically from one moment in time to another.

      Our expectation (or hope) is that the number of branches increases monotonically in time and that, furthermore, the branches at an earlier time, when evolved forward, are just a coarse-graining of the branches at a later time. Here, “coarse-graining” reflects the fact that individual branches may subdivide (e.g., when a measurement is made), but if you add the sub-branches together you should recover the parent branches (suitably time-evolved). More precisely, we expect that if the branch decomposition is \vert \psi(t_1) \rangle = \sum_i \vert \phi_i \rangle at time t_1, and is \vert \psi(t_2) \rangle = \sum_j \vert \chi_j \rangle at some later time t_2, then \exp[-i(t_2-t_1)H] \vert \phi_i \rangle = \sum_{j \in J_i}  \vert \chi_j \rangle, where the J_i form a partition of the set \{j\}, i.e., \bigcup_i J_i = \{j\} and J_i \cap J_{i'} = \emptyset for i \neq i'. Let us call this property “branch fine-graining under time evolution” (BFGUTE).

      BFGUTE is merely an unproven desideratum. Indeed, one way to describe the way in which our understanding of quantum mechanics is incomplete is that we have never proven BFGUTE, which is basically the statement that we haven’t proven the Copenhagen and Everett interpretations equivalent. The first step in proving BFGUTE is to find a precise definition of wavefunction branches. Although it would be very nice if we could find a definition of branches that automatically implied BFGUTE, a little thought shows that the BFGUTE property is not enough, on its own, to define branches.

      I’m not sure if that answered your questions, but maybe it clarified things enough that you could re-ask?

      • Right, so to argue that the branches evolve independently, one must first define what branches are. That makes sense.

        I guess my question might be, is it necessarily the case that the branch decomposition can be derived from the tensor product structure/ entanglement alone? It seems that more structure is needed, as in the preferred length scale in the paper. The time evolution operator is a natural source of such a structure, e.g. if the qubits were fixed in some grid such that only neighbours can influence each other, that would be reflected in the Hamiltonian. So then the ‘correct’ branch decomposition would be a function of the Hamiltonian/additional structure, not just the state itself. Is this basically correct?

        • Great question. It is not obvious that we should define branches from the tensor structure alone, but here are two considerations:

          (1) Strategically, we want to start with the accepted recipe for measurement. As shown by Zurek (discussed here), measurements are really about amplification, and the best way I know how to formalize amplification is in terms of copies of information. Although I’m very interested in finding another formalization involving time (e.g., divergent information flow?), the simplest mathematical criterion seems to involve just identifying correlated information at some fixed time after the measurement.

          (2) Intuitively, if you hand me the macroscopic wavefunction describing Schrödinger’s cat, it seems to me that we can identify the branches without any reference to the historical evolution that generated that state. All I need to do is just look at the entanglement structure of a single time-slice and it’s obvious.

          That said, it’s conceivable a toy model with the following features could be found: On a single late-time slice, there are two incompatible candidate branch decompositions with equivalent spatial entanglement structure, but earlier time evolution unambiguously picks out one branch decomposition as the “correct” one. (This might be based on one decomposition having the BFGUTE property.)

          I have spent some time trying to find an additional criterion — grounded in the Hamiltonian, the lattice spacing, or the time evolution — that would unambiguously pick out a unique decomposition (unlike in the paper), but everything I tried was ugly/ad-hoc. Obviously, you can just declare a criterion, but I’d like something that was as compelling as the idea (used in the paper) that anything that deserves to be called a measurement must make at least three copies.

          • Thanks. Do you know of any systems, toy or real, where we can identify a naturally emerging branch structure?

            I would be interested in trying to run a simulation of a simple system with some of the features that seem important to branching in the real world, such as locality. A QCA for instance. Seeing such a system evolve, and hopefully form branch-like structures, might be helpful for trying to figure out what ‘branches’ really are (although of course we could only do it for a handful of particles/sites). Has anybody tried to do anything like this? Do you know what sort of software/methods could be used to implement such a simulation?

            • This prompted me to take an old email where I listed some models of branching and convert it into a blog post here.

              I think the most promising model would be a 1D chain of spins with some sort of local interactions. See especially the Brun-Halliwell model [arXiv:quant-ph/9601004]. The idea is that hydrodynamic variables (i.e., locally conserved densities) are excellent candidates for variables that would be redundantly recorded, and would follow quasiclassical trajectories on short timescales.

              I don’t know anyone who has tried to do a numerical simulation to find branches based on spatial entanglement structure (rather than by imposing a system-environment distinction by hand). This is something I’ve wanted to do for a while. Email me if you want to discuss more:

              My technical experience here doesn’t go beyond some simple Mathematica models, but I imagine this could only feasibly be done for a reasonably large spin chain by using a tensor network (in this case a matrix-product state).

              A quantum cellular automaton would be very interesting. One drawback would (I think) be that there wouldn’t be quasiclassical evolution, which would be ideal for convincingly deriving the appearance of classical trajectories.
