How to think about Quantum Mechanics—Part 1: Measurements are about bases

[This post was originally “Part 0”, but it’s been moved. Other parts in this series: 1,2,3,4,5,6.]

In an ideal world, the formalism that you use to describe a physical system is in one-to-one correspondence with the physically distinct configurations of the system. But sometimes it is useful to introduce additional descriptions, in which case it is very important to understand the unphysical over-counting (e.g., gauge freedom). A scalar potential V(x) is a very convenient way of representing the vector force field, F(x) = -\partial V(x), but any constant shift in the potential, V(x) \to V(x) + V_0, yields forces and dynamics that are indistinguishable, and hence the value of the potential on an absolute scale is unphysical.
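As a quick sanity check of the over-counting in the potential, here is a minimal sketch that compares the force obtained from a potential and from a shifted copy of it. The harmonic potential, the offset `V0`, and the finite-difference helper `force` are all assumptions chosen for illustration, and the force is taken with the usual sign convention F = -dV/dx.

```python
# Hypothetical 1D example: a harmonic potential and a constant-shifted copy.
def force(potential, x, h=1e-5):
    """Force F = -dV/dx, estimated with a central finite difference."""
    return -(potential(x + h) - potential(x - h)) / (2 * h)

def V(x):
    return 0.5 * x**2

V0 = 7.0  # arbitrary constant offset (illustrative assumption)

def V_shifted(x):
    return V(x) + V0

xs = [-2.0, -0.5, 0.0, 1.0, 3.0]
forces = [force(V, x) for x in xs]
forces_shifted = [force(V_shifted, x) for x in xs]

# The two force fields agree up to floating-point roundoff.
max_diff = max(abs(a - b) for a, b in zip(forces, forces_shifted))
print(max_diff)
```

The offset drops out of the difference quotient, which is the numerical shadow of the statement that only differences of V are physical.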

One often hears that a quantum experiment measures an observable, but this is wrong, or at least very misleading, because it vastly over-counts the physically distinct sorts of measurements that are possible. It is much more precise to say that a given apparatus, with a given setting, simultaneously measures all observables with the same eigenvectors. More compactly, an apparatus measures an orthogonal basis – not an observable.[1] You can probably start to see this by just noting that there’s no actual, physical difference between measuring X and X^3; the apparatuses that would perform the two measurements are identical.
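To make the X versus X^3 remark concrete, here is a small sketch on a two-dimensional real Hilbert space. The angle `theta`, the eigenvalues, and the state `psi` are arbitrary choices of mine, not anything from Zurek; the point is only that X and X^3 share eigenvectors, so the outcome probabilities depend on the basis alone.

```python
import math

theta = 0.3  # arbitrary angle fixing a (hypothetical) measurement basis
basis = [(math.cos(theta), math.sin(theta)),
         (-math.sin(theta), math.cos(theta))]
eigenvalues = [2.0, -1.0]  # arbitrary distinct labels for the two outcomes

def operator(labels):
    """Build sum_i labels[i] |s_i><s_i| as a 2x2 nested list (real case)."""
    return [[sum(l * s[r] * s[c] for l, s in zip(labels, basis))
             for c in range(2)] for r in range(2)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def apply(A, v):
    return [sum(A[r][k] * v[k] for k in range(2)) for r in range(2)]

def is_eigenvector(A, v, lam, tol=1e-12):
    return all(abs(a - lam * b) < tol for a, b in zip(apply(A, v), v))

X = operator(eigenvalues)
X3 = matmul(X, matmul(X, X))

# Every basis vector is an eigenvector of X (value x) and of X^3 (value x^3).
same_eigenvectors = all(
    is_eigenvector(X, s, lam) and is_eigenvector(X3, s, lam**3)
    for lam, s in zip(eigenvalues, basis))

# Born-rule probabilities |<s_i|psi>|^2 never reference the eigenvalues.
psi = (0.6, 0.8)
probs = [sum(a * b for a, b in zip(s, psi)) ** 2 for s in basis]
print(same_eigenvectors, probs)
```

Nothing in `probs` knows whether the readout is labeled by x or by x^3; only the relabeling of the dial changes.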

In the rest of this post, I’ll lay things out very explicitly. I’m going to show how simply acknowledging that a measurement is carried out by a physical apparatus is enough to infer

  1. that the set of possible eigenstate outcomes (the basis) is all that physically matters,
  2. that the basis must be orthogonal, and consequently that,
  3. it’s just as sensible to talk about the measurement of non-Hermitian normal operators as traditional observables (Hermitian operators).

I’ll mostly be following Zurek,[2] who first pointed out (2). For simplicity we’ll assume a finite-dimensional Hilbert space. None of this requires you to adopt a many-worlds interpretation or anything; feel free to just stick with Copenhagen and pull the Heisenberg cut up a bit higher so the apparatus is contained within the quantum description.

Toy measurement model

Consider what a physical measuring apparatus \mathcal{A} actually does when it measures a system \mathcal{S}. From some “ready” state \vert A_0 \rangle, initially unentangled with the system, the apparatus interacts unitarily such that different possible states \vert S_i \rangle of the system are recorded in distinct conditional out-states \vert A_i \rangle of the apparatus. These out-states will correspond, at the least, to different macroscopic configurations of the apparatus’s readout system (the “pointer”), e.g., the macroscopic arrangements of atoms in a screen interpreted as “up” rather than “down”.

Let us first assume the apparatus can make a non-disturbing measurement. Then for each i, the unitary U_{\mathrm{M}} describing the measurement process must act in this manner:

(1)   \begin{align*} \vert S_i \rangle \vert A_0 \rangle \quad \overset{U_{\mathrm{M}}}\to \quad U_{\mathrm{M}} \left(\vert S_i \rangle \vert A_0 \rangle \right) =   \vert S_i \rangle \vert A_i \rangle \end{align*}

A defining characteristic of unitaries is that they preserve the inner product between vectors, so

(2)   \begin{align*} \langle S_i \vert S_j \rangle \langle A_0 \vert A_0 \rangle =   \langle S_i \vert S_j \rangle \langle A_i \vert A_j \rangle. \end{align*}

Since \langle A_0 \vert A_0 \rangle = 1, Eq. (2) can be rearranged to \langle S_i \vert S_j \rangle \left(1 - \langle A_i \vert A_j \rangle\right) = 0. The requirement that the measuring device evolves into distinct states, \langle A_i \vert A_j \rangle \neq 1 for different outcomes i \neq j, then immediately implies that \langle S_i \vert S_j \rangle = \delta_{ij} (for normalized states), i.e., that the set of system states being distinguished must be orthogonal.
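Here is a minimal numerical check of the inner-product argument, using a single qubit as the system and a single qubit as the apparatus. The particular states (`zero`, `one`, `plus`) and the helper `overlaps` are my own illustrative choices. The sketch evaluates the two sides of Eq. (2) for an orthogonal pair of system states and for a non-orthogonal pair.

```python
import math

def inner(u, v):
    """<u|v> for vectors given as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def kron(u, v):
    """Tensor product |u>|v> as a flat list."""
    return [a * b for a in u for b in v]

zero, one = [1, 0], [0, 1]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
A_ready = zero  # apparatus "ready" state |A_0>

def overlaps(S_i, S_j, A_i, A_j):
    """Inner products before and after the would-be measurement map (1)."""
    before = inner(kron(S_i, A_ready), kron(S_j, A_ready))
    after = inner(kron(S_i, A_i), kron(S_j, A_j))
    return before, after

# Orthogonal system states with distinct records: both sides agree.
before_orth, after_orth = overlaps(zero, one, zero, one)

# Non-orthogonal system states with distinct records: the sides disagree,
# so no unitary can implement this map.
before_non, after_non = overlaps(zero, plus, zero, one)
print(before_orth, after_orth, before_non, after_non)
```

In the second case the overlap before the interaction is 1/\sqrt{2} while the overlap afterwards vanishes, which is exactly the contradiction with unitarity used above.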

Now, let’s relax the assumption that the measurement is non-disturbing. Instead, we will appeal to the key characteristic of a measuring apparatus — that it must amplify. More precisely, the apparatus must contain many parts \mathcal{A} = \otimes_n \mathcal{A}^{(n)} in which the outcome is recorded distinctly. For simplicity, let us simply define \mathcal{A}^{(n)} for n = 1,\dots, N to be the minimal degrees of freedom which are put into a distinct state conditional on the outcome of the measurement, and let \mathcal{A}^{(\varnothing)} be the (messy) rest of the apparatus, which will generally become entangled with the system. Then for each i we must have

(3)   \begin{align*} \vert S_i \rangle \vert A_0 \rangle = \vert S_i \rangle \vert A_0^{(\varnothing)} \rangle \vert A_0^{(1)} \rangle \vert A_0^{(2)} \rangle \cdots   \quad \overset{U_{\mathrm{M}}}\to \quad  \vert SA^{(\varnothing)}_i \rangle \vert A_i^{(1)} \rangle \vert A_i^{(2)}\rangle \cdots \end{align*}

where \vert SA^{(\varnothing)}_i \rangle is an arbitrary, possibly entangled joint state of \mathcal{S}\otimes \mathcal{A}^{(\varnothing)}. We again have not assumed a priori that the \vert S_i \rangle or the \vert A_i^{(n)} \rangle are orthogonal, just that they are distinct states. Nonetheless, unitary evolution preserves the inner product between states, so

(4)   \begin{align*} \langle S_i\vert S_j \rangle =  \langle SA^{(\varnothing)}_i \vert SA^{(\varnothing)}_j \rangle \left( \prod_{n=1}^N \langle A_i^{(n)} \vert A_j^{(n)} \rangle\right). \end{align*}

Then, regardless of the value of \langle SA^{(\varnothing)}_i \vert SA^{(\varnothing)}_j \rangle (whose modulus is at most 1), we must have that \langle S_i\vert S_j \rangle \to \delta_{ij} as N grows unless \vert \langle A_i^{(n)} \vert A_j^{(n)} \rangle \vert \to 1 for almost all n. Since a functioning amplifier must produce many distinct copies (records) of the amplified information, we conclude that the system states we are distinguishing, \{\vert S_i \rangle\}, are orthogonal.
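To get a feel for how quickly amplification forces orthogonality, here is a tiny sketch of the product in Eq. (4). It assumes, purely for illustration, that every record has the same overlap modulus (0.9 below); since the messy factor has modulus at most 1, the product bounds \vert\langle S_i \vert S_j \rangle\vert from above.

```python
def record_overlap_product(per_record_overlap, n_records):
    """Upper bound on |<S_i|S_j>| from Eq. (4), assuming identical overlaps."""
    return abs(per_record_overlap) ** n_records

# Even weakly distinguishable records (overlap 0.9 each, a hypothetical value)
# drive the bound toward zero as the number of records grows.
for n in (1, 10, 100, 1000):
    print(n, record_overlap_product(0.9, n))
```

The bound falls off exponentially in the number of records, so a macroscopic pointer leaves essentially no room for non-orthogonal system states.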

Note that we have not lost the generality of our argument by assuming that the various components of the apparatus \mathcal{A}^{(n)} end up in pure states, unentangled with the rest of the apparatus and system. Our only requirement is that, for something to be a proper amplifier, one can choose some tensor structure \mathcal{A} = \otimes_n \mathcal{A}^{(n)} in which this is so, and that’s always possible even if the natural, intuitive parts of the system in which the copies of the information are stored (e.g., the atoms of the macroscopic pointer readout) are in mixed states (so long as the mixed states are distinct). See Zurek for details.

Implications

So it’s clear from what a measuring apparatus actually does that there is no physical difference between measuring two observables with the same eigenvectors, for the same reason that, even classically, there’s no physical difference between measuring in centimeters and inches; it’s just the labeling on your ruler. The only thing that is meaningful is the orthogonal basis \{\vert S_i \rangle\} defining the measurement process. All that talking about observables adds to this is naming the eigenstates.[3]
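The contrast between X^3 (same measurement as X) and X^2 (a genuinely different, coarser measurement) can be sketched by grouping eigenstates according to the value the apparatus would report. The three eigenvalues below are a hypothetical toy spectrum, and `outcome_partition` is a helper I made up for this illustration.

```python
from collections import defaultdict

x_values = [-1.0, 0.0, 1.0]  # hypothetical eigenvalues of X in its eigenbasis

def outcome_partition(f):
    """Group eigenstate indices by the reported value f(x)."""
    groups = defaultdict(list)
    for index, x in enumerate(x_values):
        groups[f(x)].append(index)
    return sorted(groups.values())

partition_X = outcome_partition(lambda x: x)
partition_X3 = outcome_partition(lambda x: x**3)
partition_X2 = outcome_partition(lambda x: x**2)

print(partition_X)   # [[0], [1], [2]]
print(partition_X3)  # [[0], [1], [2]]
print(partition_X2)  # [[0, 2], [1]]
```

X and X^3 induce the same partition into one-dimensional eigenspaces, whereas X^2 lumps the states with x = \pm 1 into a single degenerate outcome, so its apparatus ends up in the same out-state for both.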

In fact, it makes as much sense to measure a normal operator as a Hermitian one. Recall that normal and Hermitian operators are defined by the conditions [\mathcal{O},\mathcal{O}^\dagger]=0 and \mathcal{O} = \mathcal{O}^\dagger, respectively. (Obviously, Hermitian operators are a subset of normal operators.) Equivalently, we can say that normal operators are exactly those that can be diagonalized in an orthonormal basis of eigenvectors, while Hermitian operators must additionally have real eigenvalues. It’s perfectly sensible to say, when we are determining the amplitude and phase of a macroscopic electromagnetic field, that we are measuring a single normal operator whose eigenvalues are complex. And if we wanted to be ornery, we could point out that there’s really nothing objectionable about measuring an operator

(5)   \begin{align*} \mathcal{O} = \eta_\mathrm{red}\vert S_\mathrm{red} \rangle \langle S_\mathrm{red} \vert + \eta_\mathrm{green}\vert S_\mathrm{green} \rangle \langle S_\mathrm{green} \vert \end{align*}

where \eta_\mathrm{red} and \eta_\mathrm{green} are elements of some (possibly finite) field which is neither the reals nor the complex numbers. In all these cases, the only thing that matters is the set of states \{\vert S_i \rangle\}.
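As a concrete instance of a normal but non-Hermitian operator, here is a sketch that attaches complex eigenvalues to an orthonormal basis of a two-dimensional Hilbert space and checks the defining conditions. The basis vectors and the eigenvalues are arbitrary choices for illustration.

```python
import math

rt = 1 / math.sqrt(2)
basis = [(rt, 1j * rt), (rt, -1j * rt)]   # an orthonormal basis of C^2 (arbitrary)
eigenvalues = [1 + 2j, -1j]               # complex outcome labels (arbitrary)

def build_operator(labels, vectors):
    """O = sum_i labels[i] |s_i><s_i| as a 2x2 nested list."""
    return [[sum(l * s[r] * s[c].conjugate() for l, s in zip(labels, vectors))
             for c in range(2)] for r in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[A[c][r].conjugate() for c in range(2)] for r in range(2)]

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

O = build_operator(eigenvalues, basis)
O_dag = dagger(O)
OOd, OdO = matmul(O, O_dag), matmul(O_dag, O)

# Largest entry of [O, O^dagger] and of O - O^dagger, respectively.
commutator_norm = max(abs(OOd[r][c] - OdO[r][c])
                      for r in range(2) for c in range(2))
hermiticity_gap = max(abs(O[r][c] - O_dag[r][c])
                      for r in range(2) for c in range(2))

print(commutator_norm)  # ~0: O commutes with its adjoint, so it is normal
print(hermiticity_gap)  # clearly nonzero: O is not Hermitian
```

The measurement defined by `basis` is the same one whether we decorate its outcomes with these complex numbers, with real numbers, or with the words “red” and “green”.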

(Note that there are still plenty of places in quantum mechanics where the Hermiticity of an operator is critical, such as the Hamiltonian. But then the meaningfulness of the reality of the eigenvalues is connected to the fact that the Hamiltonian is not just something that can be measured, but is used to generate time translation, in which case the eigenvalues are “doing work”.)

Blame

Why are the above simple observations not known by undergraduates, or even by professors? I tentatively blame the axiomatic approaches to quantum mechanics as put forth by the titans like Dirac and von Neumann, or at least their typical presentation to other physicists. In particular, when you take

  • The expectation value of an observable A for a system in a state \psi is given by \langle \psi \vert A \vert \psi \rangle.

as an irreducible axiom of the universe, you obscure a great deal. This seems to be grounded in early formulations of Copenhagen, where the measurement operation was a definitive event, linking the quantum description with observed classical variables at a time and place. (This is to be contrasted with modern Copenhagen approaches where arbitrarily large objects can in principle be given quantum descriptions and the Heisenberg cut is fluid…as long as it is placed somewhere.[4])

Of course, it’s clear that von Neumann had deep, deep insights about the completeness of the quantum description and the problems with hidden variables,[5] and that this was achieved by linking what could actually be discovered about a system to complete sets of observables (maximal sets of commuting Hermitian operators). Nonetheless, there is a danger in taking these mathematical objects too seriously, and not taking seriously enough the fundamentally quantum nature of an apparatus.

Footnotes

  1. We can also allow for the measured observable to be degenerate, in which case the apparatus simultaneously measures all observables with the same degenerate eigenspaces. To be abstract, you could say it measures a commuting subalgebra, with the nondegenerate case corresponding to the subalgebra having maximum dimensionality (i.e., the same number of dimensions as the Hilbert space). Commuting subalgebras with maximum dimension are in one-to-one correspondence with orthonormal bases, modulo multiplying the vectors by pure phases.
  2. Wojciech H. Zurek, Phys. Rev. A 76, 052110 (2007), [arXiv:quant-ph/0703160]; Phys. Rev. A 87, 052111 (2013) [arXiv:1212.3245].
  3. Of course there is a physical difference between measuring X and X^2, since the latter would imply an apparatus that moves into the exact same conditional out-state if the system starts in either eigenstate \vert X = x_0\rangle or \vert X = -x_0\rangle.
  4. Heisenberg: “The dividing line between the system to be observed and the measuring apparatus is immediately defined by the nature of the problem but it obviously signifies no discontinuity of the physical process. For this reason there must, within limits, exist complete freedom in choosing the position of the dividing line.” See Schlosshauer and Camilleri (2011).
  5. This work strongly contributed to Bell’s theorem. There is disagreement as to whether von Neumann’s proof against hidden variables was foolish or whether von Neumann understood the limitations of his conclusions.