Here is the abstract:
There is a vast number of people who will live in the centuries and millennia to come. In all probability, future generations will outnumber us by thousands or millions to one; of all the people who we might affect with our actions, the overwhelming majority are yet to come. In the aggregate, their interests matter enormously. So anything we can do to steer the future of civilization onto a better trajectory, making the world a better place for those generations who are still to come, is of tremendous moral importance. Political science tells us that the practices of most governments are at stark odds with longtermism. In addition to the ordinary causes of human short-termism, which are substantial, politics brings unique challenges of coordination, polarization, short-term institutional incentives, and more. Despite the relatively grim picture of political time horizons offered by political science, the problems of political short-termism are neither necessary nor inevitable. In principle, the State could serve as a powerful tool for positively shaping the long-term future. In this chapter, we make some suggestions about how we should best undertake this project. We begin by explaining the root causes of political short-termism. Then, we propose and defend four institutional reforms that we think would be promising ways to increase the time horizons of governments: 1) government research institutions and archivists; 2) posterity impact assessments; 3) futures assemblies; and 4) legislative houses for future generations. We conclude with five additional reforms that are promising but require further research. To fully resolve the problem of political short-termism we must develop a comprehensive research program on effective longtermist political institutions.
In the rest of the post, I am going to ask a few pointed questions and make comments. Fair warning: I am trying to get back into frequent low-overhead blogging, so this post is less polished by design, and won’t be very useful if you don’t read the paper (since I don’t summarize it). My comments are largely critical, but needless to say I usually only bother to comment on the tiny minority of papers that I think are important and interesting, which this certainly is.
I know this is just an early attempt at formalizing these ideas, but I would want to see substantially more discussion of the public choice problems that will arise with all these proposals, not just the legislative house. I think such problems are immediate and large (i.e., not just a perturbation that can be handled later), and would strongly drive the best solution. In particular:
Lastly, some tangents:
Although the way this problem tends to be formalized varies with context, I don't think we have confidence in any of the formalizations. The different versions are very tightly related, so that a solution in one context is likely to give, or at least strongly point toward, solutions for the others.
As a time-saving device, I will just quote a few paragraphs from existing papers that review the literature, along with the relevant part of their list of references. I hope to update this from time to time, and perhaps turn it into a proper review article of its own one day. If you have a recommendation for this bibliography (either a single citation, or a paper I should quote), please do let me know.
From “Quantum Mereology: Factorizing Hilbert Space into Subsystems with Quasi-Classical Dynamics”, arXiv:2005.12938:
While this question has not frequently been addressed in the literature on quantum foundations and emergence of classicality, a few works have highlighted its importance and made attempts to understand it better. Brun and Hartle [2] studied the emergence of preferred coarse-grained classical variables in a chain of quantum harmonic oscillators. Efforts to address the closely related question of identifying a classical set of histories (also known as the “Set Selection” problem) in the Decoherent Histories formalism [3–7, 10] have also been undertaken. Tegmark [9] has approached the problem from the perspective of information processing ability of subsystems and Piazza [8] focuses on emergence of spatially local subsystem structure in a field theoretic context. Hamiltonian-induced factorization of Hilbert space which exhibits k-local dynamics has also been studied by Cotler et al. [14]. The idea that tensor product structures and virtual subsystems can be identified with algebras of observables was originally introduced by Zanardi et al. in [15, 16] and was further extended in Kabernik, Pollack and Singh [17] to induce more general structures in Hilbert space. In a series of papers (e.g. [18–21]; see also [22]) Castagnino, Lombardi, and collaborators have developed the self-induced decoherence (SID) program, which conceptualizes decoherence as a dynamical process which identifies the classical variables by inspection of the Hamiltonian, without the need to explicitly identify a set of environment degrees of freedom. Similar physical motivations but different mathematical methods have led Kofler and Brukner [23] to study the emergence of classicality under restriction to coarse-grained measurements.
[1] S. M. Carroll and A. Singh, “Mad-Dog Everettianism: Quantum Mechanics at Its Most Minimal,” arXiv:1801.08132 [quant-ph].
[2] T. A. Brun and J. B. Hartle, “Classical dynamics of the quantum harmonic chain,” Physical Review D 60 no. 12, (1999) 123503.
[3] M. Gell-Mann and J. Hartle, “Alternative decohering histories in quantum mechanics,” arXiv preprint arXiv:1905.05859 (2019) .
[4] F. Dowker and A. Kent, “On the consistent histories approach to quantum mechanics,” Journal of Statistical Physics 82 no. 5-6, (1996) 1575–1646.
[5] A. Kent, “Quantum histories,” Physica Scripta 1998 no. T76, (1998) 78.
[6] C. Jess Riedel, W. H. Zurek, and M. Zwolak, “The rise and fall of redundancy in decoherence and quantum Darwinism,” New Journal of Physics 14 no. 8, (Aug, 2012) 083010, arXiv:1205.3197[quant-ph].
[7] R. B. Griffiths, “Consistent histories and the interpretation of quantum mechanics,” J. Statist. Phys. 36 (1984) 219.
[8] F. Piazza, “Glimmers of a pre-geometric perspective,” Found. Phys. 40 (2010) 239–266, arXiv:hep-th/0506124 [hep-th].
[9] M. Tegmark, “Consciousness as a state of matter,” Chaos, Solitons & Fractals 76 (2015) 238–270.
[10] J. P. Paz and W. H. Zurek, “Environment-induced decoherence, classicality, and consistency of quantum histories,” Physical Review D 48 no. 6, (1993) 2728.
[11] N. Bao, S. M. Carroll, and A. Singh, “The Hilbert Space of Quantum Gravity Is Locally Finite-Dimensional,” arXiv:1704.00066 [hep-th].
[12] T. Banks, “QuantuMechanics and CosMology.” Talk given at the festschrift for L. Susskind, Stanford University, May 2000, 2000.
[13] W. Fischler, “Taking de Sitter Seriously.” Talk given at Role of Scaling Laws in Physics and Biology (Celebrating the 60th Birthday of Geoffrey West), Santa Fe, Dec., 2000.
[14] J. S. Cotler, G. R. Penington, and D. H. Ranard, “Locality from the spectrum,” Communications in Mathematical Physics 368 no. 3, (2019) 1267–1296.
[15] P. Zanardi, “Virtual quantum subsystems,” Phys. Rev. Lett. 87 (2001) 077901, arXiv:quant-ph/0103030 [quant-ph].
[16] P. Zanardi, D. A. Lidar, and S. Lloyd, “Quantum tensor product structures are observable induced,” Phys. Rev. Lett. 92 (2004) 060402, arXiv:quant-ph/0308043 [quant-ph].
[17] O. Kabernik, J. Pollack, and A. Singh, “Quantum State Reduction: Generalized Bipartitions from Algebras of Observables,” Phys. Rev. A 101 no. 3, (2020) 032303, arXiv:1909.12851 [quant-ph].
[18] M. Castagnino and O. Lombardi, “Self-induced decoherence: a new approach,” Studies in the History and Philosophy of Modern Physics 35 no. 1, (Jan, 2004) 73–107.
[19] M. Castagnino, S. Fortin, O. Lombardi, and R. Laura, “A general theoretical framework for decoherence in open and closed systems,” Class. Quant. Grav. 25 (2008) 154002, arXiv:0907.1337 [quant-ph].
[20] O. Lombardi, S. Fortin, and M. Castagnino, “The problem of identifying the system and the environment in the phenomenon of decoherence,” in EPSA Philosophy of Science: Amsterdam 2009, H. W. de Regt, S. Hartmann, and S. Okasha, eds., pp. 161–174. Springer Netherlands, Dordrecht, 2012.
[21] S. Fortin, O. Lombardi, and M. Castagnino, “Decoherence: A Closed-System Approach,” Brazilian Journal of Physics 44 no. 1, (Feb, 2014) 138–153, arXiv:1402.3525 [quant-ph].
[22] M. Schlosshauer, “Self-induced decoherence approach: Strong limitations on its validity in a simple spin bath model and on its general physical relevance,” Phys. Rev. A 72 no. 1, (Jul, 2005) 012109, arXiv:quant-ph/0501138 [quant-ph].
[23] J. Kofler and C. Brukner, “Classical World Arising out of Quantum Physics under the Restriction of Coarse-Grained Measurements,” Phys. Rev. Lett. 99 no. 18, (Nov, 2007) 180403, arXiv:quant-ph/0609079 [quant-ph].
From “The Objective past of a quantum universe: Redundant records of consistent histories”, arXiv:1312.0331:
“Into what mixture does the wavepacket collapse?” This is the preferred basis problem in quantum mechanics [1]. It launched the study of decoherence [2, 3], a process central to the modern view of the quantum-classical transition [4–9]. The preferred basis problem has been solved exactly for so-called pure decoherence [1, 10]. In this case, a well-defined pointer basis [1] emerges whose origins can be traced back to the interaction Hamiltonian between the quantum system and its environment [1, 2, 4]. An approximate pointer basis exists for many other situations (see, e. g., Refs. [11–17]).
The consistent (or decoherent) histories framework [18–21] was originally introduced by Griffiths. It has evolved into a mathematical formalism for applying quantum mechanics to completely closed systems, up to and including the whole universe. It has been argued that quantum mechanics within this framework would be a fully satisfactory physical theory only if it were supplemented with an unambiguous mechanism for identifying a preferred set of histories corresponding, at the least, to the perceptions of observers [22–29] (but see counterarguments [30–35]). This would address the Everettian [36] question: “What are the branches in the wavefunction of the Universe?” This defines the set selection problem, the global analog to the preferred basis problem.
It is natural to demand that such a set of histories satisfy the mathematical requirement of consistency, i.e., that their probabilities are additive. The set selection problem still looms large, however, as almost all consistent sets bear no resemblance to the classical reality we perceive [37–39]. Classical reasoning can only be done relative to a single consistent set [20, 31, 32]; simultaneous reasoning from different sets leads to contradictions [22–24, 40, 41]. A preferred set would allow one to unambiguously compute probabilities^{1} for all observations from first principles, that is, from (1) a wavefunction of the Universe and (2) a Hamiltonian describing the interactions.
To agree with our expectations, a preferred set would describe macroscopic systems via coarse-grained variables that approximately obey classical equations of motion, thereby constituting a “quasiclassical domain” [14, 23, 24, 40, 49, 50]. Various principles for its identification have been explored, both within the consistent histories formalism [15, 26, 39, 49, 51–56] and outside it [57–61]. None have gathered broad support.
^{1}We take Born’s rule for granted, putting aside the question of whether it should be derived from other principles [9, 36, 42–48] or simply assumed. That issue is independent of (and cleanly separated from) the topic of this paper.
[1] W. H. Zurek, Phys. Rev. D 24, 1516 (1981).
[2] W. H. Zurek, Phys. Rev. D 26, 1862 (1982).
[3] E. Joos and H. D. Zeh, Zeitschrift für Physik B Condensed Matter 59, 223 (1985).
[4] H. D. Zeh, Foundations of Physics 3, 109 (1973).
[5] W. H. Zurek, Physics Today 44, 36 (1991).
[6] W. H. Zurek, Rev. Mod. Phys. 75, 715 (2003).
[7] E. Joos, H. D. Zeh, C. Kiefer, D. Giulini, J. Kupsch, and I.-O. Stamatescu, Decoherence and the Appearance of a Classical World in Quantum Theory, 2nd ed. (Springer-Verlag, Berlin, 2003).
[8] M. Schlosshauer, Decoherence and the Quantum-to-Classical Transition (Springer-Verlag, Berlin, 2008); in Handbook of Quantum Information, edited by M. Aspelmeyer, T. Calarco, and J. Eisert (Springer, Berlin/Heidelberg, 2014).
[9] W. H. Zurek, Physics Today 67, 44 (2014).
[10] M. Zwolak, C. J. Riedel, and W. H. Zurek, Physical Review Letters 112, 140406 (2014).
[11] J. R. Anglin and W. H. Zurek, Physical Review D 53, 7327 (1996); D. A. R. Dalvit, J. Dziarmaga, and W. H. Zurek, Physical Review A 72, 062101 (2005).
[12] O. Kübler and H. D. Zeh, Annals of Physics 76, 405 (1973).
[13] W. H. Zurek, S. Habib, and J. P. Paz, Phys. Rev. Lett. 70, 1187 (1993).
[14] M. Gell-Mann and J. B. Hartle, Phys. Rev. D 47, 3345 (1993).
[15] M. Gell-Mann and J. B. Hartle, Phys. Rev. A 76, 022104 (2007).
[16] J. J. Halliwell, Phys. Rev. D 58, 105015 (1998).
[17] J. Paz and W. H. Zurek, Phys. Rev. Lett. 82, 5181 (1999).
[18] R. B. Griffiths, Journal of Statistical Physics 36, 219 (1984).
[19] R. Omnès, The Interpretation of Quantum Mechanics (Princeton University Press, Princeton, NJ, 1994).
[20] R. B. Griffiths, Consistent Quantum Theory (Cambridge University Press, Cambridge, UK, 2002).
[21] J. J. Halliwell, in Fundamental Problems in Quantum Theory, Vol. 775, edited by D. Greenberger and A. Zeilinger (Blackwell Publishing Ltd, 1995) arXiv:gr-qc/9407040.
[22] F. Dowker and A. Kent, Phys. Rev. Lett. 75, 3038 (1995).
[23] F. Dowker and A. Kent, Journal of Statistical Physics 82, 1575 (1996).
[24] A. Kent, Phys. Rev. A 54, 4670 (1996).
[25] A. Kent, Phys. Rev. Lett. 78, 2874 (1997).
[26] A. Kent and J. McElwaine, Phys. Rev. A 55, 1703 (1997).
[27] A. Kent, in Bohmian Mechanics and Quantum Theory: An Appraisal, edited by J. T. Cushing, A. Fine, and S. Goldstein (Kluwer Academic Press, Dordrecht, 1996) arXiv:quant-ph/9511032.
[28] E. Okon and D. Sudarsky, Stud. Hist. Philos. Sci. B 48, Part A, 7 (2014).
[29] E. Okon and D. Sudarsky, arXiv:1504.03231 (2015).
[30] R. B. Griffiths and J. B. Hartle, Physical Review Letters 81, 1981 (1998).
[31] R. B. Griffiths, Physical Review A 57, 1604 (1998).
[32] R. B. Griffiths, Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics 44, 93 (2013).
On microscopic scales, sound is air pressure $p(t)$ fluctuating in time $t$. Taking the Fourier transform of $p(t)$ gives the frequency distribution $\tilde{p}(\omega)$, but in an eternal way, applying to the entire time interval $-\infty < t < \infty$.
Yet on macroscopic scales, sound is described as having a frequency distribution as a function of time, i.e., a note has both a pitch and a duration. There are many formalisms for describing this (e.g., wavelets), but a well-known limitation is that the frequency of a note is only well-defined up to an uncertainty that is inversely proportional to its duration: $\Delta\omega \gtrsim 1/\Delta t$.
At the mathematical level, a given wavefunction $\psi(x)$ is almost exactly analogous: macroscopically a particle seems to have a well-defined position and momentum, but microscopically there is only the wavefunction. The mapping of the analogy^{a } is $t \leftrightarrow x$, $\omega \leftrightarrow p/\hbar$, $p(t) \leftrightarrow \psi(x)$. Wavefunctions can of course be complex, but we can restrict ourselves to a real-valued wavefunction without any trouble; we are not worrying about the dynamics of wavefunctions, so you can pretend the Hamiltonian vanishes if you like.
In order to get the acoustic analog of Planck’s constant $\hbar$, it helps to imagine going back to a time when the pitch of a note was measured with a unit that did not have a known connection to absolute frequency, i.e., to inverse-time units. To my (very limited) understanding, by the 6th century BC it was already understood that an octave was the difference between two notes when one is vibrating twice as fast as another, but the absolute frequency (oscillations per second) of any particular note — say C♯ — was not known.^{b } Let’s arbitrarily declare one pitch unit to be the interval between pitches C♯ and C in the one-lined octave. Today we know that C♯ and C correspond to 277.18 Hz and 261.63 Hz, respectively, so that one pitch unit corresponds to 15.55 Hz, whose inverse is 64.31 ms, but in the distant past for humanity (or the very recent past for me), this was unknown.
If I listened very carefully, or at least if I built special equipment, I would find that the purity of a note’s pitch begins to degrade as the duration of the note approaches its inverse frequency; this would be a hint about the location of the acoustic microscopic scale. That is, I would find it harder and harder to distinguish between the notes C♯ and C as the duration of the notes approached the order of 60 ms (although it might happen with even longer durations due to imperfections of my ears/equipment). To confidently establish the relationship between perceived pitch and inverse time, I would probably want to listen to sound made by objects whose frequency of vibration I could measure directly. That would be easy today, but very difficult three thousand years ago.
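This pitch-resolution threshold is easy to demonstrate numerically. Below is a quick sketch of mine (the sample rate and durations are arbitrary choices, not from any reference): we take the FFT of pure C and C♯ tones and ask which frequency bin dominates.

```python
import numpy as np

fs = 44_100                      # sample rate in Hz (arbitrary choice)
f_c, f_cs = 261.63, 277.18       # C and C♯ in the one-lined octave

def peak_freq(f, duration):
    """Dominant frequency bin of a pure tone of the given duration."""
    t = np.arange(int(fs * duration)) / fs
    spectrum = np.abs(np.fft.rfft(np.sin(2 * np.pi * f * t)))
    return np.fft.rfftfreq(len(t), 1 / fs)[np.argmax(spectrum)]

# One-second notes: FFT bins are ~1 Hz apart, so the ~15.55 Hz gap
# between C and C♯ is easily resolved.
long_gap = peak_freq(f_cs, 1.0) - peak_freq(f_c, 1.0)

# 20 ms notes: bins are ~50 Hz apart, wider than the gap itself,
# so the measured gap bears little relation to the true 15.55 Hz.
short_gap = peak_freq(f_cs, 0.02) - peak_freq(f_c, 0.02)
```

With these numbers, `long_gap` lands within a couple of hertz of the true 15.55 Hz gap, while `short_gap` does not.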
The analog of $\hbar$, then, would be the conversion factor between pitch units and inverse time, $1\ \text{pitch unit} \approx 15.55\ \mathrm{Hz}$. Whenever someone says “middle C is 261.63 Hz”, they are effectively setting that constant to unity (i.e., measuring pitch in units of inverse-time), just as physicists commonly set $\hbar = 1$ (measuring momentum in units of inverse-distance). But crucially, for understanding the history of science, this is not possible until you have equipment that is sensitive to the microscopic scale. Before this, you needed two separate systems of units that could not (then) be connected in a principled manner.
The quantum-acoustic analogy is not just conceptual, it is a mathematically precise correspondence, to the point that there are book-length treatments that apply almost equally well to both.^{c } In particular, the Wigner function (previous posts: 1,2) for simultaneously representing the position and momentum of a particle can be used fruitfully in acoustics for simultaneously representing the duration and pitch of sounds (which is closely related to the short-time Fourier transform). And, importantly, it is true for both quantum mechanics and acoustics that the macroscopic limit ($\hbar \to 0$) is “singular”: Just as only a few $\hbar$-indexed families of quantum states have a sensible classical limit, only a few $\hbar$-indexed families of acoustic waveforms have a sensible decomposition into notes (“music”). When this limit fails, it’s not a case of bad music, it’s a case of your speakers blowing out.
What makes quantum mechanics “mysterious” then is clearly not things like the form of the uncertainty principle per se. (There is an acoustic uncertainty principle of identical mathematical form.) Rather, it is the interpretation of the wavefunction in terms of probability amplitudes and our inability to probe it except indirectly through disturbing measurements, as opposed to acoustic waves which can be measured to arbitrary precision.^{d } Relatedly, there is, to my understanding, no acoustic analog of a mixed quantum state.
Basic harmonic analysis of acoustics is an interesting topic in elementary physics in its own right. Maybe teaching more of it (especially in a phase-space formulation using a Wigner function) before presenting quantum mechanics would help students more easily see what’s truly unusual about quantum mechanics and what’s just an unfamiliar mathematical framework.
Physicists often define a Lindbladian superoperator $\mathcal{L}$ as one whose action on an operator $B$ can be written as

$$\mathcal{L}[B] = -i\left(HB - BH^\dagger\right) + \sum_k L_k B L_k^\dagger \qquad (1)$$

for some operator $H$ with positive anti-Hermitian part, $i(H - H^\dagger) \ge 0$, and some set of operators $\{L_k\}$. But how does one efficiently check if a given superoperator is Lindbladian? In this post I give an “elementary” proof of a less well-known characterization of Lindbladians: a Hermiticity-preserving superoperator $\mathcal{L}$ is Lindbladian if and only if

$$\langle B, \{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}}[B] \rangle \ge 0$$

for all operators $B$. Here “PT” denotes a partial transpose, $\mathcal{Q}[B] = B - \mathrm{Tr}[B]\, I/N$ is the “superprojector” that removes an operator’s trace, $\mathcal{I}$ is the identity superoperator, and $N$ is the dimension of the space upon which the operators act.
Thus, we can efficiently check if an arbitrary superoperator $\mathcal{L}$ is Lindbladian by diagonalizing $\{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}}$ and seeing if all the eigenvalues are positive.
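Here is a minimal numerical sketch of such a check (my own code, not from any library; it uses the equivalent form of the condition in which the Choi matrix is sandwiched by the projector orthogonal to the maximally entangled state, together with a Hermiticity-preservation check): a generator built from Eq. (1) with a Hermitian Hamiltonian passes, while the generator of the (positive but not completely positive) transpose map fails.

```python
import numpy as np

N = 3
rng = np.random.default_rng(0)

def rand_op(n):
    return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

H = rand_op(N); H = H + H.conj().T        # Hermitian Hamiltonian
L = rand_op(N)                            # a single Lindblad operator
I = np.eye(N)
LdL = L.conj().T @ L

# Matrix of the superoperator acting on row-major-vectorized operators,
# using vec(A B C) = (A ⊗ C^T) vec(B):
lind = (-1j * (np.kron(H, I) - np.kron(I, H.T))
        + np.kron(L, L.conj())
        - 0.5 * (np.kron(LdL, I) + np.kron(I, LdL.T)))

def is_lindbladian(S, N, tol=1e-9):
    """Check Hermiticity preservation and conditional complete positivity."""
    # Choi matrix via partial transpose: C[(a,c),(b,d)] = S[(a,b),(c,d)]
    C = S.reshape(N, N, N, N).transpose(0, 2, 1, 3).reshape(N * N, N * N)
    if not np.allclose(C, C.conj().T, atol=tol):   # Hermiticity preservation
        return False
    # Project out the maximally entangled state |Omega> ∝ vec(I)
    omega = np.eye(N).reshape(-1) / np.sqrt(N)
    P = np.eye(N * N) - np.outer(omega, omega)
    return np.linalg.eigvalsh(P @ C @ P).min() > -tol

# Generator of the transpose map, rho -> rho^T - rho, is not Lindbladian:
T = np.eye(N * N).reshape(N, N, N, N).transpose(0, 1, 3, 2).reshape(N * N, N * N)
not_lind = T - np.eye(N * N)
```

Running `is_lindbladian(lind, N)` succeeds while `is_lindbladian(not_lind, N)` fails, since the projected Choi matrix of the transpose generator retains negative (antisymmetric-subspace) eigenvalues.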
The terms superoperator, completely positive (CP), trace preserving (TP), and Lindbladian are defined below in Appendix A in case you aren’t already familiar with them.
Confusingly, the standard practice is to say a superoperator $\mathcal{E}$ is “positive” when it is positivity preserving: $\mathcal{E}[B] \ge 0$ whenever $B \ge 0$. This condition is logically independent from the property of a superoperator being “positive” in the traditional sense of being a positive operator, i.e., $\langle B, \mathcal{E}[B] \rangle \ge 0$ for all operators (matrices) $B$, where

$$\langle A, B \rangle \equiv \mathrm{Tr}[A^\dagger B]$$

is the Hilbert-Schmidt inner product on the space of $N \times N$ matrices. We will refer frequently to this latter condition, so for clarity we call it op-positivity, and denote it with the traditional notation $\mathcal{E} \ge 0$.
It is reasonably well known by physicists that Lindbladian superoperators, Eq. (1), generate CP time evolution of density matrices, i.e., $e^{t\mathcal{L}}$ is completely positive when $t \ge 0$ and $\mathcal{L}$ satisfies Eq. (1). This evolution is furthermore trace-preserving when $H + \frac{i}{2}\sum_k L_k^\dagger L_k$ is Hermitian.^{a }
Indeed, for any one-parameter family $\phi_t$ of CP maps obeying the semigroup property $\phi_s \phi_t = \phi_{s+t}$ that is differentiable about $t = 0$, the family is necessarily generated by some Lindbladian superoperator: $\phi_t = e^{t\mathcal{L}}$. The Hamiltonian and Lindblad operators defining the Lindbladian superoperator in Eq. (1) can be extracted from the eigendecomposition of $(\phi_t - \mathcal{I})/t$ for small $t$. Although this procedure is highly enlightening, it does not yield an easily “checkable” criterion for when a given superoperator can be put in the Lindbladian form. How could we easily see whether $\mathcal{L}$ satisfies Eq. (1) without searching exhaustively through all choices of $H$ and $\{L_k\}$?
It has long been known, but is not always widely appreciated by physicists^{b }, that CP maps are exactly those superoperators^{c } that are op-positive under the partial transpose operation^{d }: $\mathcal{E}$ is CP if and only if $\mathcal{E}^{\mathrm{PT}} \ge 0$, where $\mathcal{E}^{\mathrm{PT}}$ is called the Choi matrix.^{e } (We use the index convention $\{\mathcal{E}^{\mathrm{PT}}\}_{(ac)(bd)} = \mathcal{E}_{(ab)(cd)}$.) This is just one of several elegant relationships between the most important properties of a superoperator $\mathcal{E}$, considered as a map on density matrices, and its corresponding Choi matrix $\mathcal{E}^{\mathrm{PT}}$:
| Map property | | Choi property |
|---|---|---|
| Map preserves Hermiticity: $\mathcal{E}[B^\dagger] = \mathcal{E}[B]^\dagger$ | ⇔ | Choi is Hermitian: $\{\mathcal{E}^{\mathrm{PT}}\}^\dagger = \mathcal{E}^{\mathrm{PT}}$ |
| Map is CP: $\mathcal{E} \otimes \mathcal{I}_M$ is positivity preserving for all $M$ | ⇔ | Choi is op-positive: $\mathcal{E}^{\mathrm{PT}} \ge 0$ |
| Map is trace-preserving: $\mathrm{Tr}[\mathcal{E}[B]] = \mathrm{Tr}[B]$ | ⇔ | Unit “outer” trace of Choi: $\mathrm{Tr}_{\mathrm{out}}[\mathcal{E}^{\mathrm{PT}}] = I$ |
| Map is unital: $\mathcal{E}[I] = I$ | ⇔ | Unit “inner” trace of Choi: $\mathrm{Tr}_{\mathrm{in}}[\mathcal{E}^{\mathrm{PT}}] = I$ |
Here, we define the “outer” and “inner” partial traces^{f } as, respectively,

$$\mathrm{Tr}_{\mathrm{out}}\left[\mathcal{E}^{\mathrm{PT}}\right]_{cd} = \sum_a \{\mathcal{E}^{\mathrm{PT}}\}_{(ac)(ad)} \qquad \text{and} \qquad \mathrm{Tr}_{\mathrm{in}}\left[\mathcal{E}^{\mathrm{PT}}\right]_{ab} = \sum_c \{\mathcal{E}^{\mathrm{PT}}\}_{(ac)(bc)}.$$
The equivalences in the above table can all be checked explicitly with index manipulation.
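They can also be spot-checked numerically. In this throwaway sketch of mine, we build a random CPTP map from normalized Kraus operators, form the Choi matrix by partial transpose, and confirm Hermiticity, op-positivity, and the unit “outer” trace.

```python
import numpy as np

N = 3
rng = np.random.default_rng(1)

# Random Kraus operators, normalized so that sum_k K^dag K = I (CPTP map).
A = [rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
     for _ in range(4)]
S = sum(a.conj().T @ a for a in A)
w, V = np.linalg.eigh(S)
S_inv_half = V @ np.diag(w ** -0.5) @ V.conj().T
K = [a @ S_inv_half for a in A]

# Superoperator on row-major-vectorized operators: E[rho] = sum_k K rho K^dag
E = sum(np.kron(k, k.conj()) for k in K)

# Choi matrix via partial transpose: C[(a,c),(b,d)] = E[(a,b),(c,d)]
C = E.reshape(N, N, N, N).transpose(0, 2, 1, 3).reshape(N * N, N * N)
C4 = C.reshape(N, N, N, N)

herm = np.allclose(C, C.conj().T)                       # Choi is Hermitian
cp = np.linalg.eigvalsh(C).min() > -1e-9                # Choi is op-positive
tp = np.allclose(np.einsum('acad->cd', C4), np.eye(N))  # unit "outer" trace
```

All three checks pass for this channel; a unital channel would additionally have unit “inner” trace.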
We can use the first two equivalences in the table to show the following.
We define $\mathcal{Q}$ to be the superprojector that removes an operator’s trace, $\mathcal{Q}[B] = B - \mathrm{Tr}[B]\, I/N$, so that $\mathrm{Tr}[\mathcal{Q}[B]] = 0$.
Totally true fact: These two statements are equivalent:

(1) $e^{t\mathcal{L}}$ is completely positive for all $t \ge 0$.

(2) $\{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}}$ is an op-positive superoperator.
Proof: First, we’ll show that (1) implies (2).
If $e^{t\mathcal{L}}$ is CP, then $\{e^{t\mathcal{L}}\}^{\mathrm{PT}} = \{\mathcal{I} + t\mathcal{L} + O(t^2)\}^{\mathrm{PT}} \ge 0$. For this to hold for arbitrarily small $t$, it must also be true when dropping the $O(t^2)$ terms for all sufficiently small $t$, i.e., $\{\mathcal{I} + t\mathcal{L}\}^{\mathrm{PT}} \ge 0$ for all positive $t$ below some threshold. Then we use our first lemma, which is proved in Appendix B.
Lemma 1: The superoperator $\{\mathcal{I} + t\mathcal{L}\}^{\mathrm{PT}}$ is an op-positive superoperator for all sufficiently small $t > 0$ if and only if $\{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}}$ is an op-positive superoperator.
Applying Lemma 1, we conclude that (1) implies (2).
Now we’ll show that (2) implies (1). If $\{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}}$ is op-positive, then by Lemma 1 we know $\{\mathcal{I} + t\mathcal{L}\}^{\mathrm{PT}} \ge 0$ for sufficiently small $t$. Now we make use of our second lemma, also proved in Appendix B.
Lemma 2: If the partial transpose of a superoperator is op-positive, then the partial transposes of all positive powers of that superoperator are also op-positive, i.e., $\mathcal{E}^{\mathrm{PT}} \ge 0$ implies $\{\mathcal{E}^m\}^{\mathrm{PT}} \ge 0$ for all positive integers $m$.
If $\{\mathcal{I} + t\mathcal{L}\}^{\mathrm{PT}} \ge 0$ for sufficiently small $t$, then by Lemma 2 we know, for the same values of $t$, that the object

$$\{(\mathcal{I} + t\mathcal{L})^m\}^{\mathrm{PT}}$$

is an op-positive superoperator (for any positive integer $m$). If we define

$$\mathcal{M}_k = \left( \sum_{m=0}^{k} \frac{1}{m!} \right)^{-1} \sum_{m=0}^{k} \frac{1}{m!} (\mathcal{I} + t\mathcal{L})^m,$$

then one can then check that the mixing weights are normalized and that $\lim_{k\to\infty} \mathcal{M}_k = e^{-1} e^{\mathcal{I} + t\mathcal{L}} = e^{t\mathcal{L}}$. Since $\{\mathcal{M}_k\}^{\mathrm{PT}}$ is a mixture (convex combination) of op-positive superoperators $\{(\mathcal{I} + t\mathcal{L})^m\}^{\mathrm{PT}}$, it itself is an op-positive superoperator for all $k$, and hence its limit

$$\{e^{t\mathcal{L}}\}^{\mathrm{PT}} = \lim_{k\to\infty} \{\mathcal{M}_k\}^{\mathrm{PT}}$$

is also an op-positive superoperator. This makes $e^{t\mathcal{L}}$ completely positive for sufficiently small $t$, but since complete positivity is preserved under composition we conclude that $e^{t\mathcal{L}} = (e^{(t/n)\mathcal{L}})^n$ is CP for all $t \ge 0$. ☐
This justifies taking

$$\{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}} \ge 0 \qquad (2)$$

to be an equivalent definition of the statement that $\mathcal{L}$ is a Lindbladian superoperator. It can be supplemented with the trace-preserving condition (implying $\mathrm{Tr}[\mathcal{L}[B]] = 0$ for all $B$) to define the subset of Lindbladians generating CPTP evolution. Lindblad called this subset “completely dissipative”, and it is equivalent to Eq. (1) with $H + \frac{i}{2}\sum_k L_k^\dagger L_k$ Hermitian.
Although Eq. (2) is not as useful as Eq. (1) for understanding the action of a Lindbladian, it is much easier to use Eq. (2) to check whether a given superoperator is Lindbladian.
With a bit of manipulation, we can re-write Eq. (2) in a more quantum-information-y (and less linear-algebraic) way:

$$P_\Omega^\perp \, (\mathcal{L} \otimes \mathcal{I})\left[\,|\Omega\rangle\langle\Omega|\,\right] \, P_\Omega^\perp \ge 0,$$

where $|\Omega\rangle = N^{-1/2} \sum_{i=1}^{N} |i\rangle|i\rangle$ is some maximally entangled state and $P_\Omega^\perp = I - |\Omega\rangle\langle\Omega|$ projects onto the orthogonal subspace.^{g } (This condition is independent of the choice of basis and hence the choice of maximally entangled state.)
If you’re interested in learning more, Tarasov’s “Quantum Mechanics of Non-Hamiltonian and Dissipative Systems” is the most thorough yet readable monograph I’ve found.
For our purposes, superoperators are just linear operators on the vector space of linear operators.^{h } If we represent finite-dimensional linear operators as $N \times N$ matrices, then superoperators are $N^2 \times N^2$ matrices. A superoperator $\mathcal{E}$ can be indexed as $\mathcal{E}_{(ab)(cd)}$ and its action on an operator $B$ is given by the matrix elements

$$\mathcal{E}[B]_{ab} = \sum_{cd} \mathcal{E}_{(ab)(cd)} B_{cd}.$$

Here, $B_{cd}$ are the matrix elements of $B$ and the parentheses on $(ab)$ just emphasize that we treat this as a joint index (taking $N^2$ values) of $\mathcal{E}$. (It does not denote antisymmetrization.)
Lindbladians are the subset of superoperators that can be put in the form

$$\mathcal{L}[B] = -i[H, B] + \sum_k \left( L_k B L_k^\dagger - \frac{1}{2}\{L_k^\dagger L_k, B\} \right)$$

for some Hermitian operator $H$ (the Hamiltonian) and some set of operators $\{L_k\}$ (the Lindblad operators), where of course $[A, B] = AB - BA$ and $\{A, B\} = AB + BA$. You can express this more elegantly as

$$\mathcal{L} = -i(H \odot I - I \odot H) + \sum_k \left( L_k \odot L_k^\dagger - \frac{1}{2} L_k^\dagger L_k \odot I - \frac{1}{2} I \odot L_k^\dagger L_k \right),$$

where “$\odot$” is just a tensor product with a bit of syntactic sugar: Given any two operators $A$ and $B$, the superoperator $A \odot B$ is defined to have action $(A \odot B)[C] = ACB$.
As explained near the beginning, we distinguish two notions of superoperator “positivity”:

- $\mathcal{E}$ is positivity preserving when $\mathcal{E}[B] \ge 0$ for all $B \ge 0$.
- $\mathcal{E}$ is op-positive when $\langle B, \mathcal{E}[B] \rangle \ge 0$ for all operators $B$.

When the first condition holds, people usually just say that $\mathcal{E}$ is a “positive map”, but this can be confusing because it is logically independent of $\mathcal{E}$ being a “positive operator” when thought of, naturally enough, as an operator acting on the space of $N \times N$ matrices (the second condition).
A superoperator $\mathcal{E}$ is said to be completely positive (CP) when $\mathcal{E} \otimes \mathcal{I}_M$ is positivity preserving for all positive integers $M$, where $\mathcal{I}_M$ is the identity superoperator on a separate space of $M \times M$ matrices. The tensor product on superoperators is naturally defined as $(\mathcal{E} \otimes \mathcal{F})[A \otimes B] = \mathcal{E}[A] \otimes \mathcal{F}[B]$, extended by linearity. (Note that we do not necessarily require CP maps to preserve operator trace.) Complete positivity is a strengthening of positivity preservation (not op-positivity).
Lemma 1: The superoperator $\{\mathcal{I} + t\mathcal{L}\}^{\mathrm{PT}}$ is an op-positive superoperator for all sufficiently small $t > 0$ if and only if $\{\mathcal{Q}\mathcal{L}\mathcal{Q}\}^{\mathrm{PT}}$ is an op-positive superoperator.
Proof. For a fixed vector $v$ and Hermitian operator $M$, consider the family of Hermitian operators $|v\rangle\langle v| + tM$ for all $t > 0$. If another vector $w$ has non-zero overlap with $v$, $\langle w | v \rangle \neq 0$, then

$$\langle w | \left( |v\rangle\langle v| + tM \right) | w \rangle = |\langle w | v \rangle|^2 + t \langle w | M | w \rangle$$

is positive for sufficiently small $t$. Therefore, if there is a $w$ such that $\langle w | (|v\rangle\langle v| + tM) | w \rangle < 0$ for arbitrarily small $t$, we know that the vector $w$ is orthogonal to $v$. Such a vector exists if and only if $P_v^\perp M P_v^\perp$ is not a positive operator, where $P_v^\perp = I - |v\rangle\langle v| / \langle v | v \rangle$. ☐
Lemma 2: If the partial transpose of a superoperator is op-positive, then the partial transposes of all positive powers of that superoperator are also op-positive, i.e., $\mathcal{E}^{\mathrm{PT}} \ge 0$ implies $\{\mathcal{E}^m\}^{\mathrm{PT}} \ge 0$ for all positive integers $m$.
Proof. If $\mathcal{E}^{\mathrm{PT}}$ is positive then it has the eigendecomposition $\mathcal{E}^{\mathrm{PT}} = \sum_i \lambda_i |E_i\rangle\langle E_i|$ or, with indices, $\{\mathcal{E}^{\mathrm{PT}}\}_{(ac)(bd)} = \sum_i \lambda_i (E_i)_{ac} (E_i)^*_{bd}$. The eigenvalues $\lambda_i$ are positive and the eigenoperators $E_i$ are orthonormal under the Hilbert-Schmidt norm. The previous expression also gives us the matrix elements $\mathcal{E}_{(ab)(cd)} = \sum_i \lambda_i (E_i)_{ac} (E_i)^*_{bd}$ for the original superoperator, allowing us to use simple index manipulation to show that

$$\{\mathcal{E}^m\}^{\mathrm{PT}} = \sum_{i_1, \ldots, i_m} \lambda_{i_1} \cdots \lambda_{i_m} \, |E_{i_1} \cdots E_{i_m}\rangle\langle E_{i_1} \cdots E_{i_m}|.$$

This expression is manifestly positive since it’s a mixture (convex combination) of superprojectors, so we conclude $\{\mathcal{E}^m\}^{\mathrm{PT}} \ge 0$ for positive integers $m$. ☐
Remark. This is equivalent to the statement that complete positivity is preserved by composition.
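Lemma 2 can also be spot-checked numerically. In this sketch of mine, we manufacture a superoperator whose partial transpose is a random positive matrix (using that the partial transpose is an involution) and confirm that the partial transposes of its powers stay positive.

```python
import numpy as np

N = 3
rng = np.random.default_rng(2)

def pt(S):
    """Partial transpose: pt(S)[(a,c),(b,d)] = S[(a,b),(c,d)]."""
    return S.reshape(N, N, N, N).transpose(0, 2, 1, 3).reshape(N * N, N * N)

# Start from a random positive semidefinite matrix C and transpose back;
# since pt is an involution, pt(S) = C >= 0 by construction.
G = rng.standard_normal((N * N, N * N)) + 1j * rng.standard_normal((N * N, N * N))
C = G @ G.conj().T
S = pt(C)

# Partial transposes of powers of S should again be positive (Lemma 2).
min_eigs = [np.linalg.eigvalsh(pt(np.linalg.matrix_power(S, m))).min()
            for m in (1, 2, 3, 4)]
```

Up to floating-point error, every entry of `min_eigs` is non-negative, as the lemma (equivalently, closure of CP maps under composition) demands.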
In this post I review the 2010 book “Lifecycle Investing” by Ian Ayres and Barry Nalebuff. (Amazon link here; no commission received.) They argue that a large subset of investors should adopt a (currently) unconventional strategy: One’s future retirement contributions should effectively be treated as bonds in one’s retirement portfolio that cannot be efficiently sold; therefore, early in life one should balance these low-volatility assets by gaining exposure to volatile high-return equities that will generically exceed 100% of one’s liquid retirement assets, necessitating some form of borrowing.
“Lifecycle Investing” was recommended to me by a friend who said the book “is extremely worth reading…like learning about index funds for the first time…Like worth paying >1% of your lifetime income to read if that was needed to get access to the ideas…potentially a lot more”. Ayres and Nalebuff lived up to this recommendation. Eventually, I expect the basic ideas, which are simple, to become so widespread and obvious that it will be hard to remember that they ever required an insight.
In part, what makes the main argument so compelling is that (as shown in the next section) it is closely related to an elegant explanation for something we all knew to be true — you should increase the bond-stock ratio of your portfolio as you get older — yet previously had bad justifications for. It also gives new actionable, non-obvious, and potentially very important advice (buy equities on margin when young) that is appropriately tempered by real-world frictions. And, most importantly, it means I personally feel less bad about already being nearly 100% in stocks when I picked up the book.
My main concerns, which are shared by other reviewers and which are only partially addressed by the authors, are:
By far the best review of this book I’ve found after a bit of Googling is the one by Fredrick Vars, a law professor at the University of Alabama: [PDF]. Read that. I wrote most of my review before Vars, and he anticipated almost all of my concerns while offering illuminating details on some of the legal aspects.
One way to frame the insight, slightly different than as presented in the book, is as arising out of a solution to a basic puzzle.
The vast majority of financial advisors agree that retirement investments should have a higher percentage of volatile assets (stocks, essentially) when the person is young and less when they are old. This is often justified by the argument that volatile returns can be averaged out over the years, but, taken naively, this is flat out wrong. As Alex Tabarrok puts it^{a }
Many people think that uncertainty washes out when you buy and hold for a long period of time. Not so, that is the fallacy of time diversification. Although the average return becomes more certain with more periods you don’t get the average return you get the total payoff and that becomes more uncertain with more periods.
More quantitatively: When a principal $P$ is invested over $N$ years in a fund with a given annual expected return $\mu$ and volatility (standard deviation) $\sigma$, the average of the yearly returns $r_i$ becomes more certain for more years and approaches $\mu$ for the usual central-limit reasons. However, your payout is not the average return! Rather, your payout is the compounded amount^{b }

$$P \prod_{i=1}^{N} (1 + r_i),$$
and the uncertainty of that does not go down with more time…even in percentage terms. That is, the ratio of the standard deviation in payout to the mean payout goes up the larger the number of years $N$ that the principal is invested.
Sometimes when confronted with this mathematical reality people backtrack to a justification like this: If you are young and you take a large downturn, you can adapt to this by absorbing the loss over many years of slightly smaller future consumption (adaptation), but if you are older you must drastically cut back, so the hit to your utility is larger. This is a true but fairly minor consideration. Even if we knew we would be unable to adapt our consumption (say because it was dominated by fixed costs), it would still be much better to be long on stocks when young and less when old.
Another response is to point out that, although absolute uncertainty in stock performance goes up over time, the odds of beating bonds also keep going up. That is, on any given day the odds that stocks outperform bonds are maybe only a bit better than a coin flip, but as the time horizon grows, the odds get progressively better.^{c } This is true, but some thought shows it’s not a good argument. In short, even if the chance of doing worse than bonds keeps falling, the distribution of scenarios where you lose to bonds could get more and more extreme; when you do worse, maybe you do much worse. (For an extensive explanation, see the section “Probability of Shortfall” in John Norstad’s “Risk and Time”, which Tabarrok above linked to as “fallacy of time diversification”.) This, it turns out, is not true — we see below that stocks do in fact get safer over time — but the possibility of extreme distributions shows why the probability-of-beating-bonds-goes-up-over-time argument is unsound.
To neatly resolve this puzzle, the authors make a strong simplifying assumption. (Importantly, the main idea is robust to relaxing this assumption somewhat,^{d } but for now let’s accept it in its idealized form.)
The main assumption is that the portion of your future income that you will be saving for retirement (e.g., your stream of future 401(k) contributions) can be predicted with relative confidence and is financially equivalent to today holding a sequence of bonds that pay off on a regular schedule in the future (but cannot be sold). When we consider how our retirement portfolio today should be split between bonds and stocks, we should include the net present value of our future contributions. That is the main idea.
Under some not-unreasonable simplifying assumptions, Samuelson and Merton showed long ago^{e } that if, counterfactually, you had to live off an initial lump sum of wealth, then the optimal way to invest that sum would be to maintain a constant split between assets of different volatility (e.g., 40% stocks and 60% bonds), with the appropriate split determined by your personal risk tolerance. However, even though you won’t magically receive your future retirement contributions as a lump sum in real life, it follows that if those contributions were perfectly predictable, and if you could borrow money at the risk-free rate, then you should borrow against your future contributions, converting them to their net present value, and keep the same constant fraction of the money in the stock market. Starting today.
Crucially, when you are young your liquid retirement portfolio (the sum of your meager contributions up to that point, plus a bit of accumulated interest) is dwarfed by your expected future contributions. Even if you invest 100% of your retirement account into stocks you are insufficiently exposed to the stock market. In order to get sufficient stock exposure, you should borrow lots of money at the risk-free rate and put it in the stock market. It is only as you get older, when the ratio between your retirement account and the present value of future earnings increases, that you should move more and more of your (visible) retirement account into regular bonds.
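The accounting can be sketched in a few lines with toy numbers of my own (not the authors'): value the predictable contribution stream as a bond, apply a constant target split to total wealth, and back out what stock fraction of the liquid account that implies.

```python
def lifecycle_target(liquid, future_contributions, discount_rate, target_stock_frac):
    """Sketch of the core accounting (illustrative numbers, not the book's):
    treat predictable future retirement contributions as a non-tradable bond,
    and apply the constant target split to total (liquid + bond-like) wealth.

    Returns the stock fraction of the *liquid* portfolio implied by the
    target; values above 1.0 mean buying stocks on margin.
    """
    # Net present value of the contribution stream (paid at year end).
    npv = sum(c / (1 + discount_rate) ** (t + 1)
              for t, c in enumerate(future_contributions))
    total_wealth = liquid + npv
    desired_stock = target_stock_frac * total_wealth
    return desired_stock / liquid

# A 25-year-old with $20k saved and 40 years of $10k contributions ahead:
young = lifecycle_target(20_000, [10_000] * 40, 0.03, 0.5)
# A 60-year-old with $600k saved and 5 contribution years left:
old = lifecycle_target(600_000, [10_000] * 5, 0.03, 0.5)
print(young, old)  # young is far above 1 (leverage); old is near the 50% target
```

The young investor's bond-like future contributions are so large that even 100% stocks in the liquid account leaves them underexposed; the old investor's are nearly negligible, so the liquid split approaches the target split.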
The resolution of the puzzle is that the optimal portfolio (in the idealized case) only looks like it’s stock-heavy early in life because you’re forgetting about your stream of future retirement contributions (a portion of your future salary), which, the authors claim, is essentially like a bond that can’t be traded.
(If the above concept isn’t immediately compelling to you, my introduction has failed. Close this blog and just go read the first couple chapters of their book.)
Most of the book is devoted to fleshing out and defending the implications of this idea for the real world, where there are a variety of complications, most notably that you cannot borrow unlimited amounts at the risk-free rate. Nevertheless, the authors conclude that many people should, when young, buy equities on margin (i.e., with borrowed money) at up to 2:1 leverage, at least if they have access to low enough interest rates to make it worthwhile.
The organization of the chapters is as follows:
In general the authors compare their lifecycle investing strategy to two conventional strategies: the “birthday rule” (aka an “age-in-bonds rule“), where the investor allocates a percentage of their portfolio to stocks given by 100 (or 110) minus their age, and the “constant percentage rule”, where the investor keeps a constant fraction of their portfolio in stocks.
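For concreteness, the two conventional rules amount to something like this (the base of 110 and the 60% constant are illustrative choices on my part, not the book's):

```python
def birthday_rule(age, base=110):
    """The 'age-in-bonds' heuristic: stock percentage = base minus age,
    clamped to [0, 100]. The base of 110 is an assumed convention."""
    return max(0, min(100, base - age)) / 100

def constant_percentage_rule(age, frac=0.6):
    """Keep a constant fraction in stocks regardless of age (frac assumed)."""
    return frac

print(birthday_rule(25), birthday_rule(65))  # 0.85 0.45
```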
In Chapter 3, the authors argue that the lifecycle strategy consistently beats conventional strategies when (a) holding fixed expected return and minimizing variance, (b) holding fixed variance and maximizing expected return, and (c) holding fixed very bad (first-percentile) returns while maximizing expected return. If you look at a hypothetical ensemble of investors on historical data, one retiring during each year between 1914 and 2010 (when the book was published), every single investor would have had more at retirement by adopting the lifecycle strategy, and generally by an enormous 50% or more. Here’s the total return of the investors vs. retirement year depending on whether they followed the lifecycle strategy, birthday rule, or the constant percentage rule:
And here are the quantiles:
Although they rely on historical simulations for this, it’s really grounded in a very simple theoretical idea: your liquid retirement portfolio is extremely small when you’re young, so for any plausible level of risk aversion, you are better off leveraging equities initially.
Chapter 4 considers more testing variations: international stock returns, Monte Carlo simulations with historically anomalous stock performance, higher interest rates, etc. They also show the strategy can easily be modified to incorporate (possibly EMH-violating) beliefs about one’s ability to time the market. (The authors use Robert Shiller’s cyclically adjusted price-to-earnings ratio, which they neither endorse nor reject.)
In Chapter 7, the authors draw on the work of Samuelson and Merton to address the key question: what is the constant fraction in stocks that you should be targeting anyway? Granting some idealizing assumptions, the optimal “Samuelson share” to have invested in stocks is

$$S = \frac{\mu - r}{\gamma \sigma^2}.$$

The variables above are defined as follows: $\mu$ is the expected annual return of equities, $r$ is the risk-free rate, $\sigma$ is the annual volatility (standard deviation) of equity returns, and $\gamma$ is your coefficient of relative risk aversion.
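Plugging in illustrative numbers of my own (not the book's) shows how sharply the share responds to volatility:

```python
def samuelson_share(equity_return, risk_free_rate, volatility, risk_aversion):
    """Samuelson/Merton optimal constant stock fraction:
    (mu - r) / (gamma * sigma^2). Inputs are annualized; risk_aversion
    is the coefficient of relative risk aversion."""
    return (equity_return - risk_free_rate) / (risk_aversion * volatility ** 2)

# Assumed figures: 7% equity return, 2% risk-free rate, 18% volatility,
# relative risk aversion of 2.
normal = samuelson_share(0.07, 0.02, 0.18, 2)
# Doubling the volatility (e.g., a VIX spike during a panic) cuts the
# optimal share by a factor of four:
crisis = samuelson_share(0.07, 0.02, 0.36, 2)
print(normal, crisis)
```

Because volatility enters squared, a doubling of the VIX-style volatility estimate quarters the appropriate stock exposure, which is why the authors emphasize rebalancing when that metric moves.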
The authors give reasons to be wary of taking this formula too seriously, especially because it’s not so easy to know what risk aversion you should choose (discussed more below). However, it is very notable that as equity volatility increases — say, because the world is gripped by a global pandemic — the appropriate amount of the portfolio to have exposed to the stock market drops drastically. The authors suggest using the VIX to estimate the equity volatility, and appropriately rebalancing your portfolio when that metric changes. Continuously hitting the correct Samuelson share without shooting yourself in the foot looks hard in practice, which the authors admit. Still, there is so much to gain from leverage that it’s very likely you can collect a good chunk of the upside even with a conservative and careful approach.
The first general point of caution tempers (but definitely does not eliminate) the suggestion to invest in equities on margin: one’s risk tolerance is not an easy thing to elicit. To a large extent we do this by imagining various outcomes, deciding which outcomes we would prefer, and then inferring (with regularity assumptions) what our risk tolerance must be. Therefore, it would likely be a mistake to immediately take whatever risk tolerance you previously thought you had as deployed in conventional investment strategies and then follow the advice in this book. After introspection, I’ve sorta decided that although I am still less risk averse than the general population, I’m more risk averse than I thought because I was following the intuition (which I can now justify better) that I should be heavy in stocks at my age. The authors address the general difficulty of someone identifying their own risk tolerance (e.g., how dependent it is on framing effects), but they do not discuss how your beliefs about your risk tolerance might be entangled with what investment strategy you have previously been using.
However, this bears repeating: For every level of risk tolerance, there exists a form of this strategy that beats (both in expectation and risk) the best conventional strategy. The fact that, when young, you are buying stocks on margin makes it tempting to interpret this strategy as being good only when one is not very risk averse or when the stock market has a good century. But for any time-homogeneous view you have on what stocks will do in the future, there is a version of this strategy that is better than a conventional strategy. (A large fraction of casual critics seem to miss this point.) The authors muddy this central feature a bit because, on my reading, they are a bit less risk averse than the average person. The book would have been more pointed if they had erred toward risk aversion in their various examples of the lifecycle strategy.
The second point of caution is gestured at in the criticism by Nobel winner Paul Samuelson^{f }. (He was also a mentor of the authors.) The costs of going truly bust would be catastrophic:
The ideas that I have been criticizing do not shrivel up and die. They always come back… Recently I received an abstract for a paper in which a Yale economist and a Yale law school professor advise the world that when you are young and you have many years ahead of you, you should borrow heavily, invest in stocks on margin, and make a lot of money. I want to remind them, with a well-chosen counterexample: I always quote from Warren Buffett (that wise, wise man from Nebraska) that in order to succeed, you must first survive. People who leverage heavily when they are very young do not realize that the sky is the limit of what they could lose and from that point on, they would be knocked out of the game.
The authors respond to these sorts of concerns by emphasizing that (1) the risk of losing everything is highest when you are very young, which is exactly when the amount you have in your retirement account is very small, and (2) they are recommending adding leverage to your retirement account, not all your assets. If you expect the total of your retirement contributions to be roughly $1 million by the time you retire, losing $20,000 and zeroing out your retirement account when you are 25 is not catastrophic (and is still a rare outcome under their strategy). You should still have a rainy day fund, and you’ll just earn more money in the future.
However, I don’t think this response seriously grapples with the best concrete form of the wary intuition many people have to their strategy. I think the main problem is that most people are implicitly using their retirement account not just as a place to save for retirement assuming a normal healthy life, but also as a rainy day fund for a variety of bad events. In the US, 12% of people are disabled; I don’t know how much you can push down those odds knowing you are healthy at a given time, but it seems like you need to allow for a ~3% chance you are partially or totally disabled at some point. Although people buy disability insurance, they also know that if they ever needed to tap into their retirement account they could, possibly with a modest tax penalty. (Likewise for other unforeseen crises.)
Another way to say this: your future earnings are substantially more likely to fail than the US government, so they cannot be idealized as a bond. By purchasing the right insurance, keeping enough in a savings account, etc., I’m sure there’s a way to hedge against this, and I’m confident the core ideas in this book survive this necessary hedging. But I would have liked the authors to discuss how to do that in at least as much concrete detail as they describe the mechanics of how to invest on margin.^{g } If people have been relying on the conventional strategy and have consequently been implicitly enjoying a form of buffer/insurance, it is paramount to highlight this and find a substitute before moving on to an unconventional strategy that lacks that buffer.
Now, if we only had to insure against tail risks, that would be fine, but there is an extreme version of this issue that has the potential to undermine the entire idea: why is my future income stream like a bond rather than a stock? I have a ton of uncertainty about how my income will increase in the future. Indeed, personally, I think I trust the steady growth of the stock market more! The authors do advise against adopting their strategy if your future income stream is highly correlated with the market (e.g., you’re a banker), but they don’t get very quantitative, and they don’t say much about what to do if that stream is highly volatile but not very correlated with stocks. (Sure, if it’s uncorrelated then you’ll want to match your “investment” in your future income stream with some actual investment in stocks for diversification, but how much should this overall high volatility change the strategy?^{h })
It will take some time before I have mulled this around enough to even start assessing whether I should be investing with significant leverage. It seems pretty plausible to me that my future income is much more uncertain than a bond, although that’s something I’ll need to meditate on.
I, like the authors, really wish there was a mutual fund that automatically implemented this strategy, like target-date funds do for (strategies similar to) the birthday rule. At the very least it would induce pointed discussion about the benefits and risks of the strategy. Unfortunately, a decade after this book was released there is no such option and, as the authors admit in the book, concretely implementing the strategy yourself in the real world can be a headache.
However, because of this book I can at least feel less guilty for being overwhelmingly in equities. After finishing this book I finally exchanged much of my remaining Vanguard 2050 target-date funds, which contain bonds, for pure equity index funds. I had been keeping them around in part because going 100% equities felt vaguely dangerous. Now that there is a good argument that the optimal allocation is greater than 100% equities — though that is by no means assured — this no longer feels so extreme. Crossing the 100% barrier by acquiring leverage involves many real-world complications, but in the platonic realm there is nothing special about the divide.
[This has been cross-posted to LessWrong (comments).]
Countries around the world have been developing mobile phone apps to alert people to potential exposure to COVID-19. There are two main mechanisms used: (1) notifying users when they have visited a location recently occupied by an infected person, and (2) notifying users when they have come into close physical proximity with an infected person.
The first mechanism generally uses the phone’s location data, which is largely inferred from GPS.^{a } The second method can also be accomplished with GPS, by simply measuring the distance between users, but it can instead be accomplished with phone-to-phone bluetooth connections^{b } (described in more detail below).
Private Kit: Safe Paths is the most well-known COVID-tracking app in the US (Science coverage). It is a GPS-based app, currently at a preliminary prototype stage, under development by MIT’s Ramesh Raskar and his colleagues.^{c } Their description of their future abilities:
The Private Kit: SafePaths solution, in its first iteration, enables individuals to log their own location on their own phones. With consent, diagnosed carriers can share an accurate location trail with health officials once they are diagnosed positive…In its second iteration, Private Kit: Safe Paths provides users with information on whether they have crossed paths with a diagnosed carrier. Governments are equipped to redact location trails and thus broadcast location information with privacy protection for diagnosed carriers and local businesses. In its third iteration, Private Kit: Safe Paths enables privacy-protected participatory sharing of location trails of diagnosed carriers and direct notification to users who have been in close proximity to a diagnosed carrier, without allowing a third party, particularly a government, to access individual location trails.
The app is now at “Phase 2” functionality. Even when it is operational, Safe Paths will have two significant drawbacks associated with GPS: The location accuracy can be mediocre in urban areas, and it is necessary to upload the infected user’s minute-by-minute location data to a central server to get the full benefits, which appears to be a privacy vulnerability even if anonymized. (Even when GPS data is stripped of explicit personal information, identity can often be inferred.)
Singapore’s COVID-tracking app, TraceTogether (CNBC coverage), fixes the first problem and partially fixes the second. TraceTogether uses bluetooth signals between mobile phones to determine when app users come into contact with each other so that, in the event that a user contracts the virus, Singapore’s health ministry can find those who had close contact with the infected user. The data logged is stored on the phone in encrypted form. Information regarding potential close contact is stored with “cryptographically generated temporary IDs,” but the information can be decrypted, and the users identified by Singapore’s health ministry. It has been praised for its efficacy.
While TraceTogether is mainly focused on identifying those with whom infected users have made contact, the Singapore government also sends citizens WhatsApp updates twice a day regarding the total number of cases, the suspected locations of outbreaks, and advice for avoiding them. TraceTogether is also pursuing greater privacy through additional decentralization, as well as a bluetooth tracing standard “BlueTrace”, but these have not been released and the current code is not open source.
COVID Watch is an upcoming open-source mobile app from Tina White, James Petrie, and Rhys Fenwick, in coordination with Stanford University. (AFP coverage.) It also uses bluetooth, but in a decentralized way: When two users come in close proximity, they exchange randomly generated codes. If a user is later diagnosed with the coronavirus, they can obtain a passcode from a central health authority allowing them to anonymously add their proximity code history to a central database. All other users can then cross-reference the proximity history they have received with this database. In this way, they plan to get the accuracy of bluetooth-based apps like TraceTogether with the privacy features promised by future versions of Safe Paths.
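A toy sketch of this kind of decentralized scheme (my own simplification for intuition, not COVID Watch's actual protocol or code) might look like:

```python
import secrets

class Phone:
    """Toy model of decentralized bluetooth contact tracing: phones exchange
    random codes on contact; diagnosed users publish only the codes they
    broadcast, never identities or locations."""
    def __init__(self):
        self.sent = []       # random codes this phone broadcast
        self.heard = set()   # codes heard from nearby phones

    def encounter(self, other):
        # Each phone generates a fresh random code per contact event.
        code = secrets.token_hex(16)
        self.sent.append(code)
        other.heard.add(code)
        code2 = secrets.token_hex(16)
        other.sent.append(code2)
        self.heard.add(code2)

    def report_positive(self, public_db):
        # Anonymously publish the codes we broadcast to the central database.
        public_db.update(self.sent)

    def check_exposure(self, public_db):
        # Cross-reference codes we heard against the published database.
        return not self.heard.isdisjoint(public_db)

db = set()
alice, bob, carol = Phone(), Phone(), Phone()
alice.encounter(bob)        # Alice and Bob were in close proximity
alice.report_positive(db)   # Alice is later diagnosed
print(bob.check_exposure(db))    # True
print(carol.check_exposure(db))  # False
```

The central database only ever sees random codes, so a third party cannot reconstruct who met whom or where; only the phone that heard a code can recognize it.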
Here is their graphical representation of the privacy model:
In addition to displaying CDC general COVID-19 advice, symptoms and resources, COVID Watch will also offer personalized advice. If a contact is infected, the app will display the number of the local public health department and advise users to call for information about next steps. Depending on progress with GPS anonymization, in the future COVID Watch may also allow infected users to upload their location history in order to create infection density heat maps of areas where there may be a risk of people or inanimate objects transmitting the disease. Further details are available in their online white paper. You can also see Tina White give a short talk here. If you’re interested in helping out, COVID Watch is looking for contributors.
Below is a necessarily incomplete list of existing and forthcoming apps from other countries:
Non-governmental efforts include CoEpi, HealthLynked, and Bandemic. For a more extensive but under-construction list, see the section “Relevant projects & Circulation list” in this public Google Doc from Mitra Ardron and Peter Eckersley.
The point is to consider a few thought experiments that share many of the same important features, but for which we have very different intuitions, and to identify whether there are any substantive differences that can be used to justify these intuitions.
I will use the term “shocked” (in the sense of “I was shocked to see Bob levitate off the ground”) to refer to the situation where we have made observations that are extremely unlikely to be generated by our implicit background model of the world, such that good reasoners would likely reject the model and start entertaining previously disfavored alternative models like “we’re all brains in a vat”, the Matrix, etc. In particular, to be shocked is not supposed to be merely a description of human psychology, but rather is a normative claim about how good scientific reasoners should behave.
Here are the three scenarios:
Scenario 2: Terrorists release a carefully engineered virus that wouldn’t naturally arise and has a vastly higher fatality rate than natural pathogens. In fact, it is designed to incubate for several years, infect every human on the planet, and then kill each host unless the host has a randomly chosen precise selection of genetic variants, which is expected to occur only once per 4 billion people. (There are 7 billion people on the planet at the time.) One of the following takes place:
In all versions of this scenario, having one or two people worldwide survive is a probabilistically reasonable outcome.
Scenario 3: You live 50 years in the future when there are lots of highly agile robots; however, like today, they are very dumb with no signs of anything we might call intelligence or consciousness. One day you are abducted by evil robots to an underground lair and subjected to a cruel punishment: You are given a fair coin^{b }, which you inspect very carefully, and you are forced to flip it 33 times. If you flip 33 heads in a row (a chance of 1 in 2^{33} = 8,589,934,592 ≈ 8.6 billion), you will be released unharmed. But if any of the flips land tails up, then on the following day you will be…
You proceed to flip 33 heads in a row, and the robots release you alive and with a comfortable scalp. You later find out that Earth was taken over by an insane dictator with robot armies who rounded up everyone and subjected them to the same cruel experiment, and that you were the lone winner.
In all cases, we are reasoning from a situation where our past contains an extremely unlikely event that was necessary (except for Scenario 3(b)) for us to exist in the present when we are doing our reasoning. Furthermore, the event, though extremely unlikely considered in isolation, was likely to occur somewhere because it was attempted a very large number of times.
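The "likely to occur somewhere" claim is easy to check numerically for Scenarios 2 and 3 (taking roughly 8.6 billion coin-flippers in the future world, an assumption on my part):

```python
# Scenario 2: per-person survival odds of 1 in 4 billion, 7 billion people.
p_virus = 1 / 4e9
n_people = 7e9
expected_survivors = n_people * p_virus  # = 1.75, so one or two survivors
                                         # is a probabilistically reasonable
                                         # outcome

# Scenario 3: 33 heads in a row, attempted by everyone on future Earth.
p_coin = 0.5 ** 33            # about 1.16e-10
n_flippers = 8.6e9            # assumed future population
p_at_least_one = 1 - (1 - p_coin) ** n_flippers  # roughly 63%

print(expected_survivors, p_at_least_one)
```

So in both scenarios the individually astronomical event has order-one probability of happening to *someone*, which is exactly what makes the clash of intuitions puzzling.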
In Scenario 1, it seems we should not be shocked. Abiogenesis may be extremely rare in any given galaxy, but we are not bothered by finding evidence for such an event in our past because it is necessary (given the laws of physics in our universe) for intelligent observers to trace their origins to such an event, and the universe is big enough that this should have happened at least once.
In Scenario 3, it seems you should be shocked, and should conclude that the coin must have been rigged, or something. For Scenario 3(b), it’s pretty obvious that you personally have been singled out beyond all reasonable odds. And to see that Scenario 3(a) is equivalent, just consider how you should react immediately after flipping the coins, before the time you were scheduled to be executed had you not succeeded. At that point in time, the unlikely event is not necessary for you to exist.^{c } And if flipping all those heads in a row is shocking immediately after it happens, how could it suddenly become un-shocking later, after the execution would have taken place?
In Scenario 2, I am confused. In all of these cases, the unlikely event can be said to take place before you’re born^{d }, and you wouldn’t be around to do the reasoning unless the unlikely event had happened. Scenario 2(a) is especially close to Scenario 1 on the key factors. Yet, there also don’t seem to be any substantive differences between Scenarios 2(c) and Scenario 3(a).
Here are some quotes. First, the phase-space formulation should be placed on equal footing with the Hilbert-space and path-integral formulations:
When Feynman first unlocked the secrets of the path integral formalism and presented them to the world, he was publicly rebuked: “It was obvious”, Bohr said, “that such trajectories violated the uncertainty principle”.
However, in this case, Bohr was wrong. Today path integrals are universally recognized and widely used as an alternative framework to describe quantum behavior, equivalent to although conceptually distinct from the usual Hilbert space framework, and therefore completely in accord with Heisenberg’s uncertainty principle…
Similarly, many physicists hold the conviction that classical-valued position and momentum variables should not be simultaneously employed in any meaningful formula expressing quantum behavior, simply because this would also seem to violate the uncertainty principle…However, they too are wrong. Quantum mechanics (QM) can be consistently and autonomously formulated in phase space, with c-number position and momentum variables simultaneously placed on an equal footing, in a way that fully respects Heisenberg’s principle. This other quantum framework is equivalent to both the Hilbert space approach and the path integral formulation. Quantum mechanics in phase space (QMPS) thereby gives a third point of view which provides still more insight and understanding.
What does it get you?
[Quantum mechanics in phase space] can obviously shed light on subtle quantization problems as the comparison with classical theories is more systematic and natural. Since the variables involved are the same in both classical and quantum cases, the connection to the classical limit as ħ → 0 is more readily apparent. But beyond this and self-evident pedagogical intuition, what is this alternate formulation of QM and its panoply of satisfying mathematical structures good for?
It is the natural language to describe quantum transport, and to monitor decoherence of macroscopic quantum states in interaction with the environment, a pressing central concern of quantum computing. It can also serve to analyze and quantize physics phenomena unfolding in an hypothesized noncommutative spacetime with various noncommutative geometries. Such phenomena are most naturally described in Groenewold’s and Moyal’s language.
However, it may be fair to say that, as was true for the path integral formulation during the first few decades of its existence, the best QMPS “killer apps” are yet to come.
Dirac was not a fan^{a }:
A representative, indeed authoritative, opinion, dismissing even the suggestion that quantum mechanics can be expressed in terms of classical-valued phase space variables, was expressed by Paul Dirac in a letter to Joe Moyal on 20 April 1945…Dirac said, “I think it is obvious that there cannot be any distribution function F(p, q) which would give correctly the mean value of any f(p,q) …” He then tried to carefully explain why he thought as he did, by discussing the underpinnings of the uncertainty relation.
On the trials of Hip Groenewold:
Ever since his return from England in 1935 until his permanent appointment at theoretical physics in Groningen in 1951, Groenewold experienced difficulties finding a paid job in physics. He was an assistant to Zernike in Groningen for a few years, then he went to the Kamerlingh Onnes Laboratory in Leiden, and taught at a grammar school in the Hague from 1940 to 1942. There, he met the woman whom he married in 1942. He spent the remaining war years at several locations in the north of the Netherlands. In July 1945, he began work for another two years as an assistant to Zernike. Finally, he worked for four years at the KNMI (Royal Dutch Meteorological Institute) in De Bilt.
During all these years, Groenewold never lost sight of his research
The authors later give a long list of results where the phase-space formulation led to key insights. I can’t evaluate them, but I am familiar with Diosi and Kiefer’s elegant results on the finite-time positivity of phase-space representations, which have broad implications for the fragility of entanglement under even weak decoherence.
Ground-state cooling of nanoparticles in laser traps is a very important milestone on the way to producing large spatial superpositions of matter, and I have a long-standing obsession with the possibility of using such superpositions to probe for the existence of new particles and forces like dark matter. In this post, I put this milestone in a bit of context and then toss up a speculative plot for the estimated dark-matter sensitivity of a follow-up to Delić et al.’s device.
One way to organize the quantum states of a single continuous degree of freedom, like the center-of-mass position of a nanoparticle, is by their sensitivity to displacements in phase space. This can be formalized as the fidelity between a state $\rho$ and its displacement $D_\alpha \rho D_\alpha^\dagger$,

$$F\big(\rho,\; D_\alpha \rho D_\alpha^\dagger\big),$$

where $\alpha$ has displacement components $\Delta x$ and $\Delta p$ in space and momentum, and where $D_\alpha = \exp[i(\Delta p\,\hat{x} - \Delta x\,\hat{p})/\hbar]$ is the displacement operator. (The fidelity reduces to the squared overlap $|\langle\psi|D_\alpha|\psi\rangle|^2$ when the states are pure.) If the displaced state is highly distinguishable from (has low fidelity with) the undisplaced state, then there are no quantum limitations^{a } on distinguishing the two potential outcomes. This might mean, e.g., detecting the momentum transfer from a scattering dark-matter particle. States that are hot (large mixedness, smeared over phase space) have low sensitivity to displacements, and sensitivity goes up as the state is cooled, localizing it toward a known location and momentum. However, the sensitivity saturates at a fixed finite value at zero temperature, when the Wigner function has irreducible area in phase space.
To increase sensitivity beyond this limit (the standard quantum limit, SQL), we need to move to non-classical states. One possibility is squeezing, producing increased sensitivity in one direction (e.g., position) at the expense of decreased sensitivity in the other (e.g., momentum). Another class of possibilities are “cat states”, i.e., a coherent superposition of two states which are individually roughly classical (localized in phase space) but are distant from each other in phase space. Squeezing or superposing states lets one keep increasing the displacement sensitivity as far as one’s equipment can manage. In a restricted sense, squeezed and superposed states have a “negative effective temperature” with regards to displacement sensitivity. Ground state cooling is a crucial step on the road from a hot messy state to an exquisitely sensitive quantum superposition. Here’s a cartoon I’ve posted previously:
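For intuition, the standard Gaussian-state result for the fidelity between a thermal oscillator state and its displaced copy, $F = \exp(-|\alpha|^2/(2\bar{n}+1))$, captures both effects: hot states barely register a displacement, and classical sensitivity saturates at the ground state ($\bar{n} = 0$). A minimal sketch of my own (not from Delić et al.):

```python
import math

def displaced_thermal_fidelity(nbar, alpha_sq):
    """Fidelity between a thermal oscillator state with mean occupation
    `nbar` and the same state displaced by |alpha|^2 in phase space
    (ground-state-width units), using the standard Gaussian-state formula
    F = exp(-|alpha|^2 / (2*nbar + 1)). A sketch for intuition only."""
    return math.exp(-alpha_sq / (2 * nbar + 1))

# A hot state barely notices a displacement that a ground-state particle
# (nbar = 0) would resolve easily:
print(displaced_thermal_fidelity(0, 1.0))    # ~0.37: quite distinguishable
print(displaced_thermal_fidelity(100, 1.0))  # ~0.995: almost no signal
```

Cooling pushes $\bar{n}$ toward zero and the fidelity toward its minimum $e^{-|\alpha|^2}$, but no amount of cooling takes a classical-like state below that floor; only squeezed or cat states do.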
Joyously, Delić et al. not only report cooling a nanoparticle to its ground state, they also ambitiously claim that producing and verifying a spatial superposition of the nanoparticle over length scales similar to its radius may be achieved with some relatively straightforward modifications.^{b } First, here are parameters from the (super-impressive) completed experiment:
Now their delicious speculation:
We expect that a combination of ultra-high vacuum with free-fall dynamics will allow to further expand the spatio-temporal coherence of such nanoparticles by several orders of magnitude, thereby opening up new opportunities for macroscopic quantum experiments….What conditions are required to achieve an expansion of the wavepacket until it reaches the size of the nanosphere itself?…Given the expansion of the undisturbed wavepacket…we require an expansion time of …12 ms, demanding a decoherence rate below 84 Hz. This is achievable by a reduction of the pressure by at least a factor of to below mbar. However, at these pressures blackbody radiation of the internally hot particle becomes relevant. To further reduce decoherence to the desired level (below the gas scattering contribution) requires cryogenic temperatures (below 130K) for both the internal particle temperature and the environment. This could be achieved either by combining a cryogenic (ultra-high) vacuum environment with laser refrigeration of the nanoparticle or with low-absorption materials.
Assuming they can do this, we are looking at a spatial superposition with something in the neighborhood of these properties:
This would be a truly mammoth amount of matter to superpose, beating the current world record — also in Vienna! — by some five orders of magnitude.
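To sanity-check the “five orders of magnitude” figure, here is a back-of-envelope script with assumed numbers of my own: a fused-silica nanosphere of radius ~70 nm and density 2200 kg/m³ (typical for these levitated-optomechanics experiments), compared against a molecular matter-wave record of roughly 2.5 × 10⁴ amu:

```python
import math

amu = 1.66053906660e-27   # kg, atomic mass unit
radius = 70e-9            # m, assumed nanosphere radius (my estimate)
rho = 2200.0              # kg/m^3, typical fused-silica density
m_sphere = rho * (4.0 / 3.0) * math.pi * radius**3   # nanosphere mass, kg
m_record = 2.5e4 * amu    # rough current molecular matter-wave record

print(m_sphere / amu)                    # ~2e9 amu for the nanosphere
print(math.log10(m_sphere / m_record))   # ~5 orders of magnitude bigger
```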
It’s known that, in a way that can be made precise, big superpositions are sensitive to very small momentum transfers that are otherwise undetectable. In our PRD, Itay Yavin and I looked at some simple models of dark matter to see if any would be identifiable by recently proposed experiments pushing the bounds of superposition size. There were two experiments that would be highly sensitive to a range of parameter space, but which would require many years of technical advances to achieve. (One of them was to operate in space, at a cost of hundreds of millions of euros.) The other, nearer-term experiments could not be sensitive to dark matter except under quite optimistic assumptions in a narrow region of parameter space. In particular, the improved limits on new light scalar mediators from estimated plasma mixing effects in stellar cores by Hardy and Lasenby probably rule out all models that these more tractable experiments might have been sensitive to.
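For intuition on why big superpositions are such sensitive momentum detectors: a kick of momentum q imprints a relative phase q·Δx/ħ between branches separated by Δx, so kicks down to roughly ħ/Δx are in principle detectable. A one-line sketch (illustrative numbers only):

```python
hbar = 1.054571817e-34   # J*s, reduced Planck constant

def min_detectable_kick(dx):
    """Momentum kick q producing an O(1) relative phase q*dx/hbar
    across a superposition of spatial extent dx."""
    return hbar / dx

print(min_detectable_kick(100e-9))   # ~1e-27 kg*m/s for a 100 nm superposition
```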
The hypothetical interaction between the dark matter and matter we considered looks like this:
Emboldened by Delić et al., let’s rashly modify one of the sensitivity plots from our paper to get a sense for what we could do with the huge superpositions they suggest are achievable. The solid green curve in the figure below delineates the region of allowed parameter space where dark matter would induce detectable decoherence in such an experiment.
Pretty rad. Producing superpositions of the kind suggested by Delić et al. is likely the most tractable path to begin probing dark matter through decoherence. Of course, there are many caveats:
The primary reason to be optimistic about future experiments is the strong scaling of the sensitivity with nanoparticle radius due to the coherent scattering enhancement.^{e } In contrast, the primary sources of decoherence (collisions with ambient gas molecules and emission of blackbody radiation) scale much more slowly with the radius ( and , respectively).
[I thank Robert Lasenby for discussion.](↵ returns to text)
Summary: The correct measure of bit-erasure capacity $N$ for an isolated system is the negentropy, the difference between the system’s current entropy and the entropy it would have if allowed to thermalize with its current internal energy. The correct measure of erasure capacity for a constant-volume system with free access to a bath at constant temperature $T$ is the Helmholtz free energy (divided by $k_B T \ln 2$, per Landauer’s principle), provided that the additive constant of the free energy is set such that the free energy vanishes when the system thermalizes to temperature $T$. That is,

$F = (U - U_T) - T(S - S_T),$

where $U_T$ and $S_T$ are the internal energy and entropy of the system if it were at temperature $T$. The system’s negentropy lower bounds this capacity, and this bound is saturated when $U = U_T$.
Traditionally, the Helmholtz free energy of a system is defined as $F = U - TS$, where $U$ and $S$ are the internal energy and entropy of the system and $T$ is the constant temperature of an external infinite bath with which the system can exchange energy.^{a } (I will suppress the “Helmholtz” modifier henceforth; when the system’s pressure rather than volume is constant, my conclusion below holds for the Gibbs free energy if the obvious modifications are made.)
However, even in the case of fixed bath temperature, we cannot naively use Landauer’s principle to divide the free energy by $k_B T \ln 2$ to get the erasure capacity. Indeed, the definition of $F$ above is only meaningful up to an additive constant, and most traditional results are about free-energy differences as the properties of the system change. More specifically, when the system evolves from one state to the next, the free-energy difference bounds how much work can have been done by the system^{b }: $W \le -\Delta F$.
In order to fix the absolute value of the free energy, we want to set it to zero when the system has equilibrated to the same temperature as the bath, i.e., when the system has ceased to be useful for powering erasures by exploiting either internal system resources or the system-bath differential. (Recall that generically the free energy is well defined even when the system isn’t internally thermalized and therefore doesn’t have a well-defined temperature.) Thus we want to define

$F \equiv (U - U_T) - T(S - S_T),$

where $U_T$ and $S_T$ are the energy and entropy of the system if it were thermalized to temperature $T$. (Note that $U_T$ and $S_T$ cannot be trivially inferred from macroscopic variables of the system when it’s in an arbitrary state since they depend on material properties like the heat capacity.) In particular, this free energy does not vanish when the system is in its ground state, which you might have thought if you took the common definition $F = U - TS$ to be meaningful in absolute value rather than just for differences. This makes sense because we can extract work (and hence perform bit erasures) when the system is hotter or colder than the bath.
With our new definition it’s easy to check directly from the 1st and 2nd law that Landauer’s principle can be used to determine the erasure capacity of the system-bath combination. When brought from some initial state to a final (bath-thermalized) state, the system’s entropy increases by $S_T - S$. If $N$ bit erasures are made on a memory tape, the tape’s entropy changes by $-N$ bits, i.e., by $-N k_B \ln 2$. So the total amount of entropy pushed into the bath is (at least) $N k_B \ln 2 - (S_T - S)$ by the 2nd law, requiring minimum energy $T[N k_B \ln 2 - (S_T - S)]$. The latter quantity lower bounds $U - U_T$, the decrease in the system’s internal energy, by the 1st law. Re-arranging that bound defines the erasure capacity: $N \le [(U - U_T) - T(S - S_T)]/(k_B T \ln 2) = F/(k_B T \ln 2)$.
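As a toy numerical illustration (my own, not from any cited work), take a system with constant heat capacity C, so U(T′) = CT′ and S(T′) = C ln T′ (additive constants drop out of the differences). The capacity vanishes when the system starts bath-thermalized and is positive whether it starts hotter or colder than the bath:

```python
import math

kB = 1.380649e-23   # J/K, Boltzmann constant

def erasure_capacity_bits(T0, T, C=1e-21):
    """Erasure capacity N = F/(kB*T*ln 2), with the offset free energy
    F = (U - U_T) - T*(S - S_T), for a toy system of constant heat
    capacity C internally thermalized at T0, with bath temperature T."""
    F = C * (T0 - T) - T * C * math.log(T0 / T)
    return F / (kB * T * math.log(2))

T_bath = 300.0
print(erasure_capacity_bits(300.0, T_bath))   # 0: no capacity once bath-thermalized
print(erasure_capacity_bits(600.0, T_bath))   # positive: hotter than bath
print(erasure_capacity_bits(150.0, T_bath))   # positive: colder than bath
```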
To see this achieved explicitly in the case of a system initially thermalized to some different temperature, just
(↵ returns to text)
I think this claim is “probably approximately” false, but no one knows for sure. Our critique shows that, contra Sandberg et al., its validity can’t be assessed by appealing solely to basic facts about cosmology and the principles of the thermodynamics of computation. Rather, the claim depends in detail on whether the optimal devices that can be created in the universe have certain physical parameters (e.g., of all the ways of assembling atoms in a cubic light-year, the minimal thermal conductivity achievable is 10^{-15} Watts per meter-Kelvin). These are engineering/chemistry questions which their paper doesn’t seriously address. We demonstrated this dependence in Section 2 of our critique by constructing a toy model where an aestivation incentive exists when making a (contrived) assumption about the physical devices that are possible, but where that incentive disappears when the assumption is relaxed. The assumption is logically independent of the laws of thermodynamics.
The crux is then whether that assumption (or one with equivalent implications) is physically reasonable in our universe, which we address in Sections 3 and 4. We argue that
In personal correspondence responding to our critique, Sandberg et al. have advanced physical arguments for the assumption based on things like black holes and large asymmetries in effective insulation strengths. These arguments don’t smell right to me at all, but they are almost impossible to assess because I haven’t yet seen them presented thoroughly.