# How to think about Quantum Mechanics—Part 1: Measurements are about bases

[This post was originally “Part 0”, but it’s been moved. Other parts in this series: 1,2,3,4,5,6,7.]

In an ideal world, the formalism that you use to describe a physical system is in a one-to-one correspondence with the physically distinct configurations of the system. But sometimes it can be useful to introduce additional descriptions, in which case it is very important to understand the unphysical over-counting (e.g., gauge freedom). A scalar potential $V(x)$ is a very convenient way of representing the vector force field, $\vec{F}(x) = -\nabla V(x)$, but any constant shift in the potential, $V(x) \to V(x) + V_0$, yields forces and dynamics that are indistinguishable, and hence the value of the potential on an absolute scale is unphysical.

One often hears that a quantum experiment measures an observable, but this is wrong, or very misleading, because it vastly over-counts the physically distinct sorts of measurements that are possible. It is much more precise to say that a given apparatus, with a given setting, simultaneously measures all observables with the same eigenvectors. More compactly, an apparatus measures an orthogonal basis, not an observable.[1] You can probably start to see this by just noting that there’s no actual, physical difference between measuring $A$ and $A^3$; the apparatuses that would perform the two measurements are identical.

In the rest of this post, I’ll lay things out very explicitly. I’m going to show how simply acknowledging that a measurement is carried out by a physical apparatus is enough to infer

1. that the set of possible eigenstate outcomes (the basis) is all that physically matters,
2. that the basis must be orthogonal, and consequently that,
3. it’s just as sensible to talk about the measurement of non-Hermitian normal operators as traditional observables (Hermitian operators).

I’ll mostly be following Zurek,[2] who first pointed out (2). For simplicity we’ll assume a finite-dimensional Hilbert space. None of this requires you to adopt a many-worlds interpretation or anything; feel free to just stick with Copenhagen and pull the Heisenberg cut up a bit higher so the apparatus is contained within the quantum description.

#### Toy measurement model

Consider what a physical measuring apparatus $\mathcal{A}$ actually does when it measures a system $\mathcal{S}$. From some “ready” state $|A_0\rangle$, initially unentangled with the system, the apparatus interacts unitarily such that different possible states of the system are recorded in distinct conditional out-states of the apparatus. These out-states will correspond, at the least, to different macroscopic configurations of the apparatus’s readout system (the “pointer”), e.g., the macroscopic arrangements of atoms in a screen interpreted as “up” rather than “down”.

Let us first assume the apparatus can make a non-disturbing measurement. Then for each $|S_i\rangle$, the unitary describing the measurement process must act in this manner:

$$|S_i\rangle |A_0\rangle \to |S_i\rangle |A_i\rangle \tag{1}$$

A defining characteristic of unitaries is that they preserve the inner product between vectors, so

$$\langle S_i | S_j \rangle \langle A_0 | A_0 \rangle = \langle S_i | S_j \rangle \langle A_i | A_j \rangle \tag{2}$$

Since $\langle A_0 | A_0 \rangle = 1$, the requirement that the measuring device evolves into distinct states, $\langle A_i | A_j \rangle \neq 1$, for different outcomes $i \neq j$ immediately implies that $\langle S_i | S_j \rangle = 0$, i.e., that the set of system states being distinguished must be orthogonal.
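The argument above can be checked numerically. A unitary can only map one set of vectors to another if both sets have the same matrix of inner products (the Gram matrix). The sketch below is only an illustration with qubit states of my own choosing; the helper names `gram` and `preserves_inner_products` are hypothetical, and the "system" and "pointer" are each a single qubit. It compares the Gram matrix of the inputs (system state times ready state) with that of the desired non-disturbing outputs (system state times conditional pointer state).

```python
import numpy as np

def gram(vectors):
    """Matrix of inner products <v_i|v_j> for a list of state vectors."""
    V = np.array(vectors)
    return V.conj() @ V.T

def preserves_inner_products(system_states, ready, pointer_states):
    """Could |S_i>|ready> -> |S_i>|A_i> be implemented by a unitary?

    Unitaries preserve every inner product, so the Gram matrices of the
    inputs and of the desired outputs must agree.
    """
    inputs = [np.kron(s, ready) for s in system_states]
    outputs = [np.kron(s, a) for s, a in zip(system_states, pointer_states)]
    return bool(np.allclose(gram(inputs), gram(outputs)))

# Illustrative qubit states (assumptions, not taken from the post).
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
plus = (up + down) / np.sqrt(2)   # not orthogonal to `up`

# Orthogonal system states, distinct pointer states: consistent.
orthogonal_case = preserves_inner_products([up, down], up, [up, down])
# Non-orthogonal system states, distinct pointer states: inconsistent.
nonorthogonal_case = preserves_inner_products([up, plus], up, [up, down])
# Non-orthogonal system states, identical pointer states: consistent,
# but then the apparatus has recorded nothing.
no_record_case = preserves_inner_products([up, plus], up, [up, up])
```

Here `orthogonal_case` and `no_record_case` come out `True` while `nonorthogonal_case` is `False`: distinguishing non-orthogonal states without disturbing them is incompatible with unitarity.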

Now, let’s relax the assumption that the measurement is non-disturbing. Instead, we will appeal to the key characteristic of a measuring apparatus: that it must amplify. More precisely, the apparatus must contain many parts in which the outcome is recorded distinctly. For simplicity, let us define $\mathcal{A}^{(n)}$ for $n = 1, \ldots, N$ to be the minimal degrees of freedom which are put into a distinct state conditional on the outcome of the measurement, and let $\mathcal{A}'$ be the (messy) rest of the apparatus, which will generally become entangled with the system. Then for each $|S_i\rangle$ we must have

$$|S_i\rangle |A_0\rangle \to |\Psi_i\rangle |A_i^{(1)}\rangle |A_i^{(2)}\rangle \cdots |A_i^{(N)}\rangle \tag{3}$$

where $|\Psi_i\rangle$ is an arbitrary, possibly entangled joint state of $\mathcal{S} \otimes \mathcal{A}'$. We again have not assumed a priori that the $|S_i\rangle$ or the $|A_i^{(n)}\rangle$ are orthogonal, just that they are distinct states. Nonetheless, unitary evolution preserves the inner product between states, so

$$\langle S_i | S_j \rangle = \langle \Psi_i | \Psi_j \rangle \prod_{n=1}^{N} \langle A_i^{(n)} | A_j^{(n)} \rangle \tag{4}$$

Then, regardless of the value of $\langle \Psi_i | \Psi_j \rangle$, we must have that $\langle S_i | S_j \rangle \approx 0$ unless $|\langle A_i^{(n)} | A_j^{(n)} \rangle| \approx 1$ for almost all $n$. Since a functioning amplifier must produce many distinct copies (records) of the amplified information, we conclude that the system states we are distinguishing, $|S_i\rangle$, are orthogonal.
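To get a feel for how quickly amplification forces orthogonality, here is a minimal sketch. It assumes, purely for illustration, that every record has the same overlap magnitude of 0.99 between two outcomes; the function name `system_overlap_bound` is hypothetical. Because the overlap of the leftover joint states has magnitude at most 1, the product of the record overlaps is an upper bound on the magnitude of the overlap between the two system states.

```python
import numpy as np

def system_overlap_bound(record_overlaps):
    """Upper bound on the overlap magnitude of two system states.

    record_overlaps holds the overlaps between the two conditional states
    of each record. Inner-product preservation, together with the fact that
    the remaining joint-state overlap has magnitude <= 1, bounds the system
    overlap by the product of the record overlap magnitudes.
    """
    magnitudes = np.abs(np.asarray(record_overlaps, dtype=complex))
    return float(np.prod(magnitudes))

# Assumed, illustrative value: each record is only barely distinguishable.
bounds = {n: system_overlap_bound(np.full(n, 0.99)) for n in (1, 10, 100, 1000)}
```

Even with records that are individually 99% overlapping, a thousand of them push the bound below $10^{-4}$. The bound only stays near 1 if almost every record fails to distinguish the outcomes.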

Note that we have not lost the generality of our argument by assuming that the various components of the apparatus end up in pure states, unentangled with the rest of the apparatus and system. Our only requirement is that, for something to be a proper amplifier, one can choose some tensor structure in which this is so, and that’s always possible even if the natural, intuitive parts of the system in which the copies of the information are stored (e.g., the atoms of the macroscopic pointer readout) are in mixed states (so long as the mixed states are distinct). See Zurek for details.

#### Implications

So it’s clear from what a measuring apparatus actually does that there is no physical difference between measuring two observables with the same eigenvectors, for the same reason that, even classically, there’s no physical difference between measuring in centimeters and inches; it’s just the labeling on your ruler. The only thing that is meaningful is the orthogonal basis defining the measurement process. All that talking about observables adds to this is naming the eigenstates.[3]
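A small numerical illustration of the relabeling point, as a sketch under assumptions of my own: a three-level Hermitian operator `A` with eigenvalues −1, +1 and 2 in a randomly chosen orthonormal basis, and a hypothetical helper `outcome_probabilities`. Cubing the operator changes only the labels, so the outcome statistics are identical; squaring it merges two outcomes into one, which is a physically different (coarser) measurement.

```python
import numpy as np

def outcome_probabilities(observable, state):
    """Eigenvalues (ascending) and Born probabilities for a Hermitian observable."""
    eigenvalues, eigenvectors = np.linalg.eigh(observable)
    amplitudes = eigenvectors.conj().T @ state
    return eigenvalues, np.abs(amplitudes) ** 2

rng = np.random.default_rng(0)
# Illustrative orthonormal basis: the columns of Q from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
A = Q @ np.diag([-1.0, 1.0, 2.0]) @ Q.T   # assumed eigenvalues -1, +1, 2
A_cubed = np.linalg.matrix_power(A, 3)    # same basis, labels -1, +1, 8
A_squared = A @ A                         # labels 1, 1, 4: two outcomes merged

state = rng.normal(size=3)
state /= np.linalg.norm(state)

labels_A, probs_A = outcome_probabilities(A, state)
labels_A3, probs_A3 = outcome_probabilities(A_cubed, state)

# Same basis, same statistics; only the eigenvalue labels differ.
same_statistics = bool(np.allclose(probs_A, probs_A3))

# The squared operator can only distinguish two outcomes, not three.
distinct_outcomes_A2 = len(np.unique(np.round(np.linalg.eigvalsh(A_squared), 8)))
```

The comparison works outcome by outcome because cubing preserves the ordering of the eigenvalues, so `eigh` returns the shared eigenvectors in the same order for both operators.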

In fact, it makes as much sense to measure a normal operator as a Hermitian one. Recall that normal and Hermitian operators are defined by the conditions $[A, A^\dagger] = 0$ and $A = A^\dagger$, respectively. (Obviously, Hermitian operators are a subset of normal operators.) Equivalently, we can say that normal operators are defined by the fact that they have an orthonormal basis of eigenvectors, while Hermitian operators must additionally have real eigenvalues. It’s perfectly sensible to say, when we are determining the amplitude and phase of a macroscopic electromagnetic field, that we are measuring a single normal operator whose eigenvalues are complex. And if we wanted to be ornery, we could point out that there’s really nothing objectionable about measuring an operator

$$A = \sum_i a_i |S_i\rangle \langle S_i| \tag{5}$$

where the $a_i$ are elements of some (possibly finite) field which is neither the reals nor the complex numbers. In all these cases, the only thing that matters is the set of states $\{|S_i\rangle\}$.
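As a quick check of the claim about normal operators, the sketch below builds an operator from an orthonormal basis and complex outcome labels. The basis, the labels, and all variable names are arbitrary choices of mine for illustration. It verifies that the operator is normal but not Hermitian, that its eigenvectors are orthonormal, and that the outcome probabilities depend only on the basis.

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative orthonormal basis: columns of a unitary from a QR decomposition.
basis, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
labels = np.array([1.0 + 2.0j, -0.5j, 3.0])   # assumed complex outcome labels

# Operator with the chosen eigenvectors and complex eigenvalues.
N = basis @ np.diag(labels) @ basis.conj().T

is_normal = bool(np.allclose(N @ N.conj().T, N.conj().T @ N))
is_hermitian = bool(np.allclose(N, N.conj().T))

# With distinct eigenvalues, the eigenvectors of a normal matrix are orthogonal.
eigenvalues, eigenvectors = np.linalg.eig(N)
eigenvectors_orthonormal = bool(
    np.allclose(eigenvectors.conj().T @ eigenvectors, np.eye(3))
)

# Outcome probabilities involve only the basis, never the labels.
state = rng.normal(size=3) + 1j * rng.normal(size=3)
state /= np.linalg.norm(state)
probabilities = np.abs(basis.conj().T @ state) ** 2
```

Swapping `labels` for any other set of distinct values leaves `probabilities` untouched, which is the sense in which the eigenvalues are only names for the outcomes.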

(Note that there are still plenty of places in quantum mechanics where the Hermiticity of an operator is critical, such as the Hamiltonian. But then the meaningfulness of the reality of the eigenvalues is connected to the fact that the Hamiltonian is not just something that can be measured, but is used to generate time translation, in which case the eigenvalues are “doing work”.)

#### Blame

Why are the above simple observations not known by undergraduates, or even by professors? I tentatively blame the axiomatic approaches to quantum mechanics as put forth by the titans like Dirac and von Neumann, or at least their typical presentation to other physicists. In particular, when you take

• The expectation value of an observable $A$ for a system in a state $|\psi\rangle$ is given by $\langle \psi | A | \psi \rangle$.

as an irreducible axiom of the universe, you obscure a great deal. This seems to be grounded in early formulations of Copenhagen, where the measurement operation was a definitive event, linking the quantum description with observed classical variables at a time and place. (This is to be contrasted with modern Copenhagen approaches where arbitrarily large objects can in principle be given quantum descriptions and the Heisenberg cut is fluid…as long as it is placed somewhere.[4])

Of course, it’s clear that von Neumann made deep, deep insights about the completeness of the quantum description and the problems with hidden variables,[5] and that this was achieved by linking what could actually be discovered about a system to complete sets of observables (maximal sets of commuting Hermitian operators). Nonetheless, there is a danger in taking these mathematical objects too seriously, and not taking seriously enough the fundamentally quantum nature of an apparatus.

### Footnotes

1. We can also allow for the measured observable to be degenerate, in which case the apparatus simultaneously measures all observables with the same degenerate eigenspaces. To be abstract, you could say it measures a commuting subalgebra, with the nondegenerate case corresponding to the subalgebra having maximum dimensionality (i.e., the same number of dimensions as the Hilbert space). Commuting subalgebras with maximum dimension are in one-to-one correspondence with orthonormal bases, modulo multiplying the vectors by pure phases.
2. Wojciech H. Zurek, Phys. Rev. A 76, 052110 (2007), [arXiv:quant-ph/0703160]; Phys. Rev. A 87, 052111 (2013) [arXiv:1212.3245].
3. Of course there is a physical difference between measuring $A$ and $A^2$, since the latter would imply an apparatus that moves into the exact same conditional out-state if the system starts in either eigenstate $|a\rangle$ or $|{-a}\rangle$.
4. Heisenberg: “The dividing line between the system to be observed and the measuring apparatus is immediately defined by the nature of the problem but it obviously signifies no discontinuity of the physical process. For this reason there must, within limits, exist complete freedom in choosing the position of the dividing line.” See Schlosshauer and Camilleri (2011).
5. This work strongly contributed to Bell’s theorem. There is disagreement as to whether von Neumann’s proof against hidden variables was foolish or whether von Neumann understood the limitations of his conclusions.

1. Peter Morgan

I wonder whether you’re OK with one particular consequence of allowing normal operators, that the sum of two normal operators, $N_1 + N_2$, may not be a normal operator, $[N_1 + N_2, (N_1 + N_2)^\dagger] \neq 0$, whereas the sum of Hermitian operators is Hermitian, $(H_1 + H_2)^\dagger = H_1 + H_2$?

Secondly, if we measure the spectra of Hermitian operators $A$, $B$, and $A + \lambda B$, comparison of the latter (taking various real values of $\lambda$) with the first two gives us some information about the relative orientation of the eigenvectors of $A$ and of $B$ (and more so if we consider more Hermitian operators, though the computation looks daunting), so the eigenvalues are significant at least to that extent?

Though I agree that a focus on the spectra and the relative orientations of eigenbases of measurements would not be amiss, these two aspects taken together might perhaps be enough to make it distracting to introduce your “simple observations” to undergraduates?

Operationally, however, I’m curious whether I know how to measure $A + B$ just because I know how to measure both $A$ and $B$?

• Thanks Peter, these are great questions for refining my position.

Yea, I’m very comfortable with the fact that the set of things we measure with an apparatus is not closed under addition for the same reason that I’m fine that the sum of a position and a momentum is undefined. Operationally, two things that I can measure don’t have to have a well defined sum. And mathematically, it shouldn’t bother us that the sum of two different bases of the same vector space isn’t defined.

The thing I’m trying to do is (1) identify the set of mathematical structures that correspond to what we can physically measure and (2) point out that the Hermitian operators are not — and are not in 1-to-1 correspondence with — that set. In particular, I didn’t claim that normal operators are measurable, I claimed that they are just as measurable as Hermitian operators. But maybe it’s better to simply emphasize that measurements should be identified with PVMs/POVMs.

<rant>Regarding undergraduates: I remember as an undergraduate being utterly baffled by the bizarre declaration that Hermitian operators corresponded to observables. (In some sense, it was the ultimate distraction since a decade later I am still following the path it sent me down!) Such an axiom had no counterpart in classical mechanics, and it was never justified. In my opinion, the people who managed to avoid this distraction were just learning to accept things without understanding them. This ability to suspend disbelief is great if you want to create good graduate student slaves for doing computations, but not so good for training the next generation of physicists. </rant>

The question of whether you can operationalize the notion of measuring the sum of two normal operators is an interesting one. I don’t know the answer. I haven’t even seen someone try to operationalize the sum of two Hermitian operators. If you have a device that can measure spin Z and spin X, there doesn’t seem to me to be any reason for the sum of those two to be measurable with that device. I interpret this as more evidence that “the set of things we can measure” should not be identified with the Hermitian operators.

Indeed, it was only after I had been in graduate school for a while that I discovered a bit of the historical background behind the mathematical importance of Hermitian operators, especially as elucidated by von Neumann. The mistake wasn’t the emphasis on understanding these objects, but rather the terrible attempt to identify them with the physical process of measurement.

• Peter Morgan

FWIW, your answers seem OK to me, though I suppose different materials are needed for different undergraduates.
My questions came out of a current interest in digital (that is, binary, on a computer) records of experimental results. An Analogue-Digital Conversion is effectively constructed as a set of projections [is the recorded value associated with the real-valued observable $A$ in the range $[a, b]$, or more generally in a set $S$ (a set in the complex plane if $A$ is normal)? Answer=0 or 1; lots of such questions gives us, say, a 16-bit recorded result in computer memory; any such process is determinedly neither linear nor continuous, yet they’re routine, millions of times over, at CERN, say].
If we have an apparatus that we say measures $A$ and $B$ as two bits, we might think of that as measuring $A + 2B$, which is OK if $[A, B] = 0$, however if $[A, B] \neq 0$ the eigenvalues of $A + 2B$ in general will not be $0, 1, 2, 3$. We could say that $n$ represents the $n$’th eigenvalue of $A + 2B$, or we could say that we have applied a second nonlinear ADC process to obtain our two bit record (or we might say something altogether different?). [Again at CERN, computer records are as often clearly of events that are at time-like separation and hence a priori do not commute.]
It has always seemed to me rather more interesting to know what we have measured than what the measurement results are, which is as if to say in theory-speak that the eigenvectors of an operator give us more information than the eigenvalues, which is, I take it, a crude paraphrase of part of your blog post.
Perhaps I need <rant>…</rant> round all that? Answer any part of it that you find it useful to contemplate, but leave it alone otherwise.

• I’m pretty sure that the eigenvalues of $A + 2B$ are not in general $0, 1, 2, 3$ for $[A, B] \neq 0$, so measuring that composite variable just doesn’t have a simple interpretation as measuring two binary variables. One way to try to make such a measurement would be to measure $A$ followed by $B$, but then the order becomes important.

Even when two measurements are not space-like separated, they very often commute (to very high accuracy) simply because the coupling between them is very weak. Different detectors within ATLAS often correspond to commuting measurements for the same reason that time-like separated measurements made on different continents do. And even when they don’t, there’s an objective ordering guaranteed by their time-like separation.

• Peter Morgan

Apparently my last paragraph “rant”…”/rant” was converted to just … . Hey ho.

• Yea, I had to use a raw unicode character for the less-than signs to keep it from trying to parse it as html. This is typed in as

& # 60 ;

only without any spaces. Took me a couple of tries on my first attempt.

Hope you don’t mind that I edited your comment to fix this for you :)

2. Peter Morgan

This morning, we have this in a Springer notification e-mail, https://link.springer.com/article/10.1007/s40509-016-0098-2, “Are observables necessarily Hermitian?” I haven’t looked at the paper yet past the abstract, which says, almost exactly on this post’s topic, that “observables should be reformulated as normal operators including Hermitian operators as a subclass” (the arXiv version is https://arxiv.org/abs/1601.04287, which predates your post).

• Thanks Peter! Although their paper (Jan 2016) predates this post (Nov 2016), this post is actually just a fleshed out version of the same point which I have had on my website since 2013:

> Wrong: Observables are “represented” (?) by Hermitian operators.
> Right: Measurements necessarily amplify, and therefore (!) are associated
> with an orthogonal basis. This is the Schmidt basis of the entangled
> joint state of the measuring apparatus and the measured system.
> More: Wojciech H. Zurek, Phys. Rev. A 76, 052110 (2007),
> [arXiv:quant-ph/0703160]. Also: [arXiv:1212.3245].
> Implication: Observables can be associated with normal, not just
> Hermitian, operators.

Furthermore, the idea that normal operators are observables follows almost immediately from Wojciech’s 2007 paper (which Hu et al. cite). That paper was one of the reasons I sought him out as my advisor, physically moving from California to New Mexico, and I probably just picked the idea up from him during discussion while I was there.

Of course, the idea itself is much more important than priority. I’d be very glad if their publication makes this idea common knowledge taught in introductory quantum mechanics courses. But I’m not holding my breath…

3. Pingback: Blogs I Follow – EGO PON's Blog

4. Teddy Parker

In your “Implications” section, when you say that a normal operator is “non-singular”, do you mean “non-defective”? A normal operator can certainly be singular.

• Thanks Teddy. Honestly I can’t remember what I was trying to say with that condition back when I wrote this. You’re right of course that normal (and Hermitian) operators can perfectly well be singular (i.e., not have an inverse); they just need zero eigenvalues. I’ve now removed that condition from the post.