| experiment | ref. | object composition | object radius (nm) | nucleon count | superposition size (nm) | lifetime (ms) | repetition rate (Hz) |
|---|---|---|---|---|---|---|---|
| KDTL | [1-3] | Oligoporphyrin^{a } | ∼1 | 2.7 × 10^{4} | 266 | 1.24 | 10,000 |
| OTIMA | [4-6] | Gold (Au) | 5 | 6 × 10^{6} | 79 | 94 | 600 |
| Bateman et al. | [7] | Silicon (Si) | 5.5 | 1.1 × 10^{6} | 150 | 140 | 0.5 |
| Geraci et al. | [8] | Silica (SiO_{2}) | 6.5 | 1.6 × 10^{6} | 250 | 250 | 0.5 |
| Wan et al. | [9] | Diamond (C) | 95 | 7.5 × 10^{9} | 100 | 0.05 | 1 |
| MAQRO | [10-13] | Silica (SiO_{2}) | 120 | 1 × 10^{10} | 100 | 100,000 | 0.01 |
| Pino et al. | [14] | Niobium (Nb) | 1,000 | 2.2 × 10^{13} | 290 | 450 | 0.1 |
| Stickler et al.^{b } | [15-17] | Silicon (Si) | 5 | 5 × 10^{5} | 20 | 20 | 5 |
| Delić et al. | [18] | Silica (SiO_{2}) | 71 | 2 × 10^{9} | 100 | 10 | 10 |
| Marshman et al. | [19] | Diamond (C) | 100 | 1 × 10^{10} | 20,000 | 1,000 | 0.1 |
| Wood et al. | [20] | Diamond (C) | 125 | 1.7 × 10^{10} | 250 | 100 | 1 |
The proposals^{c } above the black line are an updated version of those appearing in Table 1 of my 2017 paper with Itay Yavin. The proposals below the black line were made more recently. Delić et al. was discussed by me here. Above the line, the repetition rate is either taken directly from the relevant proposal or was estimated based on private correspondence with the authors. Below the line, I have just inverted the superposition lifetime and added a factor of ten of overhead.
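The nucleon-count column can be sanity checked from the object radius and the bulk mass density of the material. A minimal sketch (the densities are standard handbook values, and the function name is mine):

```python
import math

# Approximate bulk mass densities (kg/m^3); standard handbook figures.
DENSITY = {"Au": 19300.0, "Si": 2330.0, "SiO2": 2200.0, "C (diamond)": 3500.0, "Nb": 8570.0}
NUCLEON_MASS = 1.66e-27  # kg (approximately one atomic mass unit)

def nucleon_count(radius_nm: float, material: str) -> float:
    """Estimate the nucleon count of a sphere from its radius and bulk density."""
    r = radius_nm * 1e-9
    volume = (4.0 / 3.0) * math.pi * r**3
    return volume * DENSITY[material] / NUCLEON_MASS

# OTIMA gold sphere, 5 nm radius: should land near 6 x 10^6
print(f"{nucleon_count(5, 'Au'):.1e}")             # 6.1e+06
# Wood et al. diamond, 125 nm radius: should land near 1.7 x 10^10
print(f"{nucleon_count(125, 'C (diamond)'):.1e}")  # 1.7e+10
```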
Here is Google’s new language model PaLM having a think:
Alex Tabarrok writes
It seems obvious that the computer is reasoning. It certainly isn’t simply remembering. It is reasoning and at a pretty high level! To say that the computer doesn’t “understand” seems little better than a statement of religious faith or speciesism…
It’s true that AI is just a set of electronic neurons none of which “understand” but my neurons don’t understand anything either. It’s the system that understands. The Chinese room understands in any objective evaluation and the fact that it fails on some subjective impression of what it is or isn’t like to be an AI or a person is a failure of imagination not an argument…
These arguments aren’t new but Searle’s thought experiment was first posed at a time when the output from AI looked stilted, limited, mechanical. It was easy to imagine that there was a difference in kind. Now the output from AI looks fluid, general, human. It’s harder to imagine there is a difference in kind.
Tabarrok uses an illustration of Searle’s Chinese room featuring a giant look-up table:
But as Scott Aaronson has emphasized [PDF], a machine that simply maps inputs to outputs by consulting a giant look-up table should not be considered “thinking” (although it could be considered to “know”). First, such a look-up table would be beyond astronomically large for any interesting AI task and hence physically infeasible to implement in the real universe. But more importantly, the fact that something is being looked up rather than computed undermines the idea that the system understands or is reasoning.
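Aaronson's infeasibility point is easy to make quantitative with a toy count (the prompt length and alphabet size below are illustrative assumptions of mine, not figures from his note):

```python
import math

# Toy estimate: a look-up table mapping every possible short text prompt to a reply.
alphabet = 64        # conservative character count (letters, digits, punctuation)
prompt_length = 500  # a modest prompt, in characters
table_entries = alphabet ** prompt_length

# Atoms in the observable universe, for scale (~10^80).
atoms = 10 ** 80
print(f"table entries: ~10^{math.log10(table_entries):.0f}")  # ~10^903
# More entries than atoms in the universe, by roughly 800 orders of magnitude:
print(table_entries > atoms)  # True
```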
Of course, GPT-3 and PaLM are not consulting a look-up table, but they are less flexible and arguably much less compressed than a human brain. They may do a large amount of nominal computation, but I suspect their computation is very inefficient and lies somewhere on the (logarithmically scaled) “spectrum of understanding” between a look-up table and the human brain. In this case, I think it’s fair to say they “only partially understand” or something like that.
Alex Tabarrok says that “it’s harder to imagine there is a difference in kind”, but I’d counter with the popular AI aphorism that “sufficiently large quantitative differences are essentially qualitative”. Indeed, I’d say the recent results are still very consistent with — though by no means demonstrate — these closely related claims
But who knows. Gwern is bullish.
While reading Hayden & Sorce’s nice recent paper [arXiv:2108.08316] motivating the choice of traceless Lindblad operators, I noticed for the first time that the trace-ful parts of Lindblad operators are just the contributions to the Hamiltonian part of the reduced dynamics that arise at first order in the system-environment interaction. In contrast, the so-called “Lamb shift” Hamiltonian is second order.
Consider a system-environment decomposition S ⊗ E of Hilbert space with a global Hamiltonian H = H_{S} + H_{E} + αH_{I}, where H_{S}, H_{E}, and H_{I} are the system’s self-Hamiltonian, the environment’s self-Hamiltonian, and the interaction, respectively. Here, we have (without loss of generality) decomposed the interaction Hamiltonian into a sum of tensor products, H_{I} = Σ_{a} A_{a} ⊗ B_{a}, of Hilbert-Schmidt-orthogonal sets of operators {A_{a}} on the system and {B_{a}} on the environment, with α a real parameter that controls the strength of the interaction.
This Hamiltonian decomposition is not unique in the sense that we can always^{b } send H_{S} → H_{S} + H′ and H_{I} → H_{I} − α^{−1} H′ ⊗ I_{E}, where H′ is any Hermitian operator acting only on the system. When reading popular derivations of the Lindblad equation
∂_{t} ρ = −i[H, ρ] + Σ_{a} ( L_{a} ρ L_{a}^{†} − ½{L_{a}^{†}L_{a}, ρ} )   (1)
like in the textbook by Breuer & Petruccione, one could be forgiven^{c } for thinking that this freedom is eliminated by the necessity of satisfying the assumption that Tr_{E}[H_{I}(t), ρ(0)] = 0, which is crucially deployed in the “microscopic” derivation of the Lindblad equation operators H and L_{a} from the global dynamics generated by H. (Here, H_{I}(t) is the interaction-picture version of the interaction Hamiltonian and ρ(0) is the global initial state.) However, a careful reading will show that this is in fact only necessary if you want to satisfy the stronger condition that Tr_{E}[H_{I}(t) ρ(0)] = 0 (without the commutator comma), i.e., that ⟨B_{a}(t)⟩ = 0 for all t. Several authors, including Breuer & Petruccione, suggest that we might as well do so since we can always choose to satisfy this stronger condition by making the above transformation with the choice H′ = α Σ_{a} ⟨B_{a}⟩ A_{a}.
However, this choice can have the consequence of introducing (or eliminating or modifying) the trace of the Lindblad operators. Obviously, at the end you will get the same reduced dynamics, but you need to be mindful that this is happening, since the Hamiltonian part of the dynamics you might want to understand can be “hiding” inside the Lindblad operators. This choice can also have a big impact on the feasibility of analytically deriving the Lindblad operators, since it generically changes the energy eigenbasis of H_{S}, which plays a very important role in the derivation.
So how do we interpret the trace-ful part of the Lindblad operators? One can check that they merely make a Hamiltonian contribution to the dynamics. Let’s uniquely expand the Lindblad operators as L_{a} = L̃_{a} + ℓ_{a} I, where L̃_{a} is the traceless part of L_{a} and ℓ_{a} = Tr[L_{a}]/d is a complex number proportional to the trace (with d the system dimension). Then using the traditional definition of the Hamiltonian superoperator ℋ[H](ρ) = −i[H, ρ] and the dissipator superoperator 𝒟[L](ρ) = LρL^{†} − ½{L^{†}L, ρ}, we see
ℋ[H](ρ) + Σ_{a} 𝒟[L_{a}](ρ) = ℋ[H + H^{(1)}](ρ) + Σ_{a} 𝒟[L̃_{a}](ρ)   (2)
with^{d } H^{(1)} = (i/2) Σ_{a} ( ℓ_{a}^{*} L̃_{a} − ℓ_{a} L̃_{a}^{†} ) the new contribution to the Hamiltonian part of the reduced dynamics. Note that all the terms above that are quadratic in the ℓ_{a} have canceled.
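The identity in Eq. (2) is easy to verify numerically. The sketch below uses random matrices (the dimension and all operator names are my own illustrative choices) to check that moving the trace of a Lindblad operator into the Hamiltonian leaves the generator unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

def rand_c():
    return rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

H = rand_c(); H = H + H.conj().T          # random Hermitian Hamiltonian
L = rand_c()                               # random Lindblad operator (trace-ful)
rho = rand_c(); rho = rho @ rho.conj().T   # random positive "state" (normalization irrelevant)

def ham(Hm, r):  # Hamiltonian superoperator: -i [Hm, r]
    return -1j * (Hm @ r - r @ Hm)

def dis(Lm, r):  # dissipator: Lm r Lm† - (1/2){Lm† Lm, r}
    Ld = Lm.conj().T
    return Lm @ r @ Ld - 0.5 * (Ld @ Lm @ r + r @ Ld @ Lm)

ell = np.trace(L) / d                      # trace part:  L = Ltil + ell * I
Ltil = L - ell * np.eye(d)                 # traceless part
H1 = 0.5j * (np.conj(ell) * Ltil - ell * Ltil.conj().T)  # Hamiltonian shift from the trace

lhs = ham(H, rho) + dis(L, rho)
rhs = ham(H + H1, rho) + dis(Ltil, rho)
print(np.allclose(lhs, rhs))  # True: the trace-ful part is purely Hamiltonian
```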
Importantly, this Hamiltonian contribution from the trace-ful part of the Lindblad operators is first order in the coupling constant α (hence the “(1)” superscript) and therefore is distinct from the strictly second-order Hamiltonian contribution H^{(2)}, also arising during this microscopic derivation, that is commonly known as the (generalized) Lamb shift. To see this, one merely notes in the derivation that ℓ_{a} = O(1) while L̃_{a} = O(α), where the L̃_{a} are operators on S constructed linearly during the derivation from the operators A_{a} in the interaction Hamiltonian H_{I}. We end up with a Lindblad equation, (1), but now with the replacements H → H_{S} + H^{(1)} + H^{(2)} and L_{a} → L̃_{a}.
Hayden & Sorce, generalizing^{e } the work of Gorini, Kossakowski, & Sudarshan^{f }, show that subtracting off the Lindblad-operator traces in this way minimizes a certain norm of the dissipative superoperator associated with the Lindblad operators, and so in this sense this choice is preferred.
[I thank Dan Ranard for discussion and for bringing Hayden & Sorce to my attention.]
Here’s the abstract:
In this post I will give a non-rigorous bird’s-eye view of the main ideas that seem most important to me. In particular, I will mostly concentrate on the non-relativistic setting and comment only briefly on the relativistic case (which Weingarten handles in detail). The following table of contents provides an outline of this blog post.
My one-sentence summary of Weingarten’s proposal: The correct decomposition of the wavefunction into branches is given by the sum of orthogonal components that minimizes a linear combination of branch-norm entropy and (norm-weighted) mean squared branch complexity.
Quantum circuit complexity comes in several flavors. Weingarten uses Nielsen’s geometric version of complexity [quant-ph/0502070, quant-ph/0603161] with nearest-neighbor generators only^{b } and a zero-complexity reference class of product states, as I now define. Given the tensor-product Hilbert space ⊗_{x} H_{x} over lattice sites x, the relative complexity between two states is the minimum amount of “time” necessary to evolve one state to the other using a “Hamiltonian” H(t) with unit Hilbert-Schmidt norm constructed as the sum of nearest-neighbor terms. This means we consider
H(t) = Σ_{⟨x,y⟩} h_{x,y}(t)   (1)
where the sum is over all nearest-neighbor pairs ⟨x,y⟩ and where each h_{x,y}(t) is a Hermitian operator on H_{x} ⊗ H_{y}. More specifically, the relative complexity C(ψ, φ) between two pure states |ψ⟩ and |φ⟩ is defined to be the minimum value of T such that there exists at least one such schedule H(t) satisfying
|ψ⟩ = T̂ exp( −i ∫_{0}^{T} H(t) dt ) |φ⟩   (2)
where the exponentiated integral is time ordered by T̂ in the usual way. Then, picking the zero-complexity reference class to be the set of all first-quantized^{c } product states, the (absolute) complexity of a single state is its minimum relative complexity with a product state:
C(ψ) = min_{φ product} C(ψ, φ)   (3)
Very importantly, the squares of the complexity add over uncorrelated spatial regions: C(ψ_{A} ⊗ ψ_{B})^{2} = C(ψ_{A})^{2} + C(ψ_{B})^{2}, where A and B are disjoint spatial regions.
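The constrained evolution entering definitions (1)–(3) can be sketched numerically: a piecewise-constant nearest-neighbor “Hamiltonian” of unit Hilbert-Schmidt norm driving a product state on a small qubit chain. The parameters and function names are mine and purely illustrative; actually computing the minimal time T is a hard optimization that this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # qubits on a line

def nn_hamiltonian():
    """Random sum of nearest-neighbor terms, rescaled to unit Hilbert-Schmidt norm."""
    H = np.zeros((2**n, 2**n), dtype=complex)
    for x in range(n - 1):
        h = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
        h = h + h.conj().T  # Hermitian two-site term on neighboring sites (x, x+1)
        H += np.kron(np.kron(np.eye(2**x), h), np.eye(2**(n - x - 2)))
    return H / np.linalg.norm(H)  # enforce unit HS (Frobenius) norm

def evolve(psi, H, t):
    """exp(-i t H) psi via exact diagonalization (H is Hermitian)."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ (np.exp(-1j * t * evals) * (evecs.conj().T @ psi))

# A "schedule": piecewise-constant H(t), each segment of duration dt.
# The complexity budget spent is the total time, here T = steps * dt = 2.0.
psi = np.zeros(2**n, dtype=complex); psi[0] = 1.0  # product reference state |0000>
dt, steps = 0.1, 20
for _ in range(steps):
    psi = evolve(psi, nn_hamiltonian(), dt)
print(round(float(np.linalg.norm(psi)), 6))  # 1.0 (unitarity check)
```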
For any candidate decomposition |ψ⟩ = Σ_{i} |ψ_{i}⟩ of the state into orthogonal (unnormalized) components |ψ_{i}⟩, Weingarten defines the net complexity of the decomposition to be
Q({ψ_{i}}) = Σ_{i} p_{i} C(ψ̂_{i})^{2} − b Σ_{i} p_{i} ln p_{i}   (4)
where p_{i} = ⟨ψ_{i}|ψ_{i}⟩ are the norms of the candidate branches (and |ψ̂_{i}⟩ = |ψ_{i}⟩/√p_{i} their normalized versions), Σ_{i} p_{i} C(ψ̂_{i})^{2} is the norm-weighted mean-squared-complexity of the decomposition, −Σ_{i} p_{i} ln p_{i} is the Shannon entropy of the branch norms, and b is a free parameter with units of volume^{d }.
We can now state the heart of Weingarten’s proposal: At any given time, the correct decomposition of the wavefunction into branches is given by the orthogonal decomposition that minimizes the net complexity.
The trade-off between the amount of branching and the per-branch complexity is quantified by the as-yet unspecified parameter b. The interpretation and determination of b is very subtle, which I will address in a forthcoming blog post, but for now let me point out two things.
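To see how b arbitrates the trade-off, here is a toy calculation assuming the net complexity takes the form Q = Σ_{i} p_{i} C_{i}^{2} − b Σ_{i} p_{i} ln p_{i} suggested by the prose (that reading, and all the numbers below, are my assumptions):

```python
import math

def net_complexity(branches, b):
    """branches: list of (norm p_i, complexity C_i). Q = sum p C^2 + b * entropy."""
    mean_sq = sum(p * c**2 for p, c in branches)
    entropy = -sum(p * math.log(p) for p, c in branches if p > 0)
    return mean_sq + b * entropy

unbranched = [(1.0, 10.0)]            # one branch, high complexity
branched = [(0.5, 2.0), (0.5, 2.0)]   # two branches, low complexity each

for b in (1.0, 1000.0):
    winner = "branch" if net_complexity(branched, b) < net_complexity(unbranched, b) else "don't"
    print(b, winner)
# Small b favors branching; a huge b penalizes the entropy term and suppresses branching.
```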
The viability of this proposal for wavefunction branches depends in large part on its satisfying physically sensible properties discussed in the next subsection. Weingarten gives arguments that these properties are almost always satisfied, relying on a reasonable but imprecise conjecture from Brown, Susskind, and Zhao^{e } called the “Second Law of Quantum Complexity” [1507.02287, 1608.02612, 1701.01107], which I’ll briefly state.
The claim is that, for almost all nearest-neighbor Hamiltonians with unit Hilbert-Schmidt norm, and for almost all initial states with complexity much less than the maximum possible quantum circuit complexity, the quantum-circuit complexity of a state grows nearly as fast as it can until it reaches maximum complexity, i.e., it nearly saturates the “speed limit”
dC(ψ(t))/dt ≤ 1   (5)
One argument for this is based on the distribution of complexity over the space of states: the vast majority of states in Hilbert space have near-maximal complexity because μ(S_{c}) ≪ μ(S) unless c is nearly maximal, where μ is the Haar measure, S is the set of all states, and S_{c} ⊂ S is the set of states with complexity no larger than c. Thus, of all the directions in Hilbert space that Hamiltonian evolution might drive a state, the overwhelming majority lead to higher complexity.
Weingarten argues for the following “binary-tree properties” of the branch decomposition of an evolving state, which are necessary for the time-dependent decomposition to form a binary tree that can be interpreted as wavefunction branches.
These are claimed to hold generically (i.e., for almost all states) on the sub-exponential (in system size) timescales for which the Brown-Susskind quantum-circuit complexity conjecture holds. Although Weingarten offers detailed arguments for the above conjectured properties, I believe some of them may fail in unusual or not-so-unusual situations. However, they, or a slightly weakened version of them, must hold for this circuit-complexity definition of branches to be viable as it currently stands.
I believe many potential readers will be sufficiently wary of this overall project to not commit to an initial time investment of 49 dense pages. For them I suggest initially reading only Secs. I – XI (18 pages), passing over the proofs in Appendices A – C and in Secs. VII – X, but making sure not to miss the details in Secs. III – VII. Here’s why.
First, the bulk of the paper is divided cleanly between the non-relativistic case (Secs. III – XI and Appendices A – C) and the relativistic case (Secs. XII – XXIV and Appendices D – G). The basic ideas of complexity-based branches are well captured in the non-relativistic discussion, whereas the relativistic discussion is (in my opinion) relevant mostly if you are already convinced that this branch decomposition is very promising and you want to see the extent to which it can be reconciled with Lorentz covariance.
Second, the detailed proofs of the complexity bounds found in Sec. IV (proved in Appendices B and C) and in Secs. VII – X, impressive as they are, are not necessary for understanding the core of the paper. The proofs are a bit brute force (e.g., carefully addressing the Schmidt spectra of a state) and are probably most interesting for thinking about how things would change if the complexity function were modified, e.g., if non-nearest-neighbor Hamiltonians or different norms besides Hilbert-Schmidt were used.^{f } But if you’re trying to figure out quickly whether you should care about Weingarten’s main ideas, I would not get hung up on these details on an initial reading.
On the other hand, I found the (non-rigorous) arguments in Sec. VI for how the branching develops under time evolution to be super interesting beyond the bottom-line result. Although it could be cleaned up a bit, this discussion is really novel, bridging the two key ideas, complexity and wavefunction branches, that Weingarten has tied together for the first time.
Here I’ll emphasize some key ideas that I think could easily be under-appreciated.
Readers of Weingarten’s paper should know that the multi-fermion state discussed in Sec. IV (“Complexity of entangled multi-fermion state”), which initially appeared quite mysterious to me, is just a particular lattice instantiation of an N-partite GHZ state with K components (branches):
|ψ⟩ = K^{−1/2} Σ_{k=1}^{K} η_{k} |χ_{k}^{(1)}⟩ ⊗ |χ_{k}^{(2)}⟩ ⊗ ⋯ ⊗ |χ_{k}^{(N)}⟩ ⊗ |Ω⟩   (6)
with ⟨χ_{k}^{(n)}|χ_{k′}^{(n)}⟩ = δ_{kk′}. Here, η_{k} is the unit norm (|η_{k}| = 1) complex phase of the k-th branch, |χ_{k}^{(n)}⟩ is the normalized state of subsystem R_{n} (a subset of the lattice) conditional on the k-th branch, and |Ω⟩ is the vacuum state on the rest of the universe. Weingarten takes this basic multipartite entanglement structure and encodes it into the lattice in different spatial ways, which are parameterized by choices of spatial regions (R_{n}), spin choices, and surfaces. In particular, the subsystems (parts) of the GHZ state are the regions R_{n}, and, conditional on branch k, the state of this region is constructed from the vacuum by placing fermions of a branch-dependent spin at each site in a subregion of R_{n} (while leaving the rest of R_{n} as a vacuum)^{g }; this ensures that the |χ_{k}^{(n)}⟩ are orthogonal as required. The surfaces are used by Weingarten to specify spatial gaps between the regions which, as he shows, have implications for the complexity of the state.
When we model the macroscopic world with quantum states, we generally assume that widely separated spatial regions do not initially share much entanglement. (For example, when we model a measurement, we generally assume the measuring device begins completely uncorrelated with the measured system.) Roughly speaking, the macroscopic world looks like a tensor network of bounded bond dimension or, at least, such states are consistent with our observations to good accuracy. Yet, if the wavefunction of the world is just evolving unitarily, the amount of entanglement should be increasing steadily, as we evolve from an out-of-equilibrium area-law toward a heat-death volume-law.
The putative explanation for this is that, although the full wavefunction of the world is quickly accruing entanglement on larger scales, the bond dimension of each branch remains bounded. This requires a sufficiently high rate of branching to “absorb” all the entanglement being generated and, of course, that the branches are chosen so that each individually has low entanglement even though their sum does not.
There are of course very many ways to quantify many-body entanglement, but if you already think that complexity is a particularly elegant way, and in particular if you think the proposed second law of complexity is likely to capture important features of the irreversibility necessary for a sensible definition of branches, then Weingarten’s decomposition based on net complexity is a fairly natural guess; it optimizes for the smallest amount of branching that achieves the lowest per-branch complexity.
As an aside, Weingarten’s proposal also suggests an alternative version: Perhaps, in one spatial dimension, branches should be given by the decomposition that minimizes “net MPS entanglement”, a quantity
Q_{MPS}({ψ_{i}}) = Σ_{i} p_{i} E(ψ̂_{i}) − b Σ_{i} p_{i} ln p_{i}   (7)
obtained from net complexity by replacing the squared complexity function C(ψ̂_{i})^{2} on candidate branches with some function E(ψ̂_{i}) that measures the degree of entanglement in the MPS representation of |ψ̂_{i}⟩. For instance, one might consider the spatial mean bond entropy
E(ψ̂_{i}) = (N−1)^{−1} Σ_{n=1}^{N−1} Σ_{λ ∈ Λ_{i}^{(n)}} ( −λ^{2} ln λ^{2} )   (8)
where the first sum is over the MPS bonds for a spatial lattice of size N and where Λ_{i}^{(n)} is the set of unnormalized Schmidt values for branch i at the bond between sites n and n+1. Besides it not being immediately clear whether this is well behaved mathematically, it seems somewhat less likely to me that this would correctly describe the all-important irreversibility of branches. But maybe this general direction could be worth thinking about.
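For small chains the mean bond entropy can be computed directly by sweeping singular-value decompositions over the bipartite cuts (equivalent to the MPS bond spectra in canonical form). A sketch with my own naming:

```python
import numpy as np

def mean_bond_entropy(psi, n_sites, d=2):
    """Average von Neumann entropy over the n_sites - 1 bipartite cuts of a chain."""
    entropies = []
    for cut in range(1, n_sites):
        mat = psi.reshape(d**cut, d**(n_sites - cut))
        s = np.linalg.svd(mat, compute_uv=False)  # Schmidt values at this cut
        p = s**2
        p = p[p > 1e-12]
        entropies.append(-np.sum(p * np.log(p)) + 0.0)  # +0.0 avoids printing "-0.0"
    return np.mean(entropies)

n = 4
# Product state: zero entanglement at every cut.
prod = np.zeros(2**n); prod[0] = 1.0
# GHZ state: entropy ln 2 at every cut.
ghz = np.zeros(2**n); ghz[0] = ghz[-1] = 1 / np.sqrt(2)
print(mean_bond_entropy(prod, n))           # 0.0
print(round(mean_bond_entropy(ghz, n), 4))  # 0.6931
```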
One notable deficiency of the purely records-based definition [1608.05377] of branches I wrote about is that it cannot absorb sufficient entanglement to ensure the bond dimension necessary to represent branches remains bounded, as collaborators^{h } and I later discovered. Some form of record-based branches could still be consistent with observations and basically the correct choice, but they would have a bunch of hidden entanglement that was infeasible to detect and would not admit a tensor-network representation. (We will be writing about this in more detail in the future.)
In contrast, Weingarten’s complexity-based definition looks like it may identify much finer-grained branches that have bounded bond dimension and consequently obey an area law. See Weingarten’s discussion, using different language, in Sec. X. Insofar as complexity-based branches and records-based branches are roughly compatible (very speculative), the complexity-based decomposition would be a fine-graining of the records-based one. This would mean that each coarse-grained branch (distinguished from other such branches by records) would be a sum of fine-grained branches that are macroscopically indistinguishable yet essentially independently evolving in the sense that it would be infeasible to detect coherence between the different fine-grained branches. This has important implications for using branch decompositions to run faster numerical simulations.
In both the relativistic and non-relativistic cases, one could ask that a reasonable definition of branching obey:
For the Weingarten decomposition, the spatial independence follows from the additivity of Shannon^{i } entropy and the aforementioned additivity of the complexity. Temporal independence would follow from the claimed binary-tree properties.
In the second half of the paper, Weingarten proposes a method for defining branches in a (flat-space) relativistic setting that enjoys a certain flavor of Lorentz invariance. He makes use of a random lattice (on spacetime) with a Lorentz-invariant density and, loosely following Adrian Kent’s ideas [0708.3710, 1311.0249, 1608.04805], defines a branch decomposition at asymptotically late time that plausibly is boost- and translation-covariant in the limit of small lattice density. A frame-dependent branch decomposition at finite time can then be obtained by evolving the decomposition backward. As Weingarten correctly emphasizes, this frame dependence is to be expected: spacelike separated branching events will occur in different time order in different frames.
Although I don’t know enough about relativistic random lattices to say anything with authority, Weingarten’s approach to defining complexity in the relativistic setting looks like a novel and very interesting set of ideas quite independent of its application to branching.
I’ll close by noting that space and time will not be treated on completely equal footing by any sensible notion of branching, and this is to be expected: branching is a thermodynamic process, characterized by effective irreversibility, and so is intimately connected to the specialness of the low-entropy state on a spacelike hypersurface in the distant past (with no similar assumption about timelike hypersurfaces at large spatial distances).
(1)
where and are coherent states, is the mean phase space position of the two states, “∗” denotes the convolution, and is the (Gaussian) quasicharacteristic function of the ground state of the harmonic oscillator.
The quasicharacteristic function for a quantum state ρ of a single degree of freedom is defined as

χ_{ρ}(ξ) ≡ ⟨D_{ξ}, ρ⟩ = Tr[D_{ξ}^{†} ρ]

where D_{ξ} = e^{iξ∧R} is the Weyl phase-space displacement operator, ξ = (ξ_{x}, ξ_{p}) are coordinates on “reciprocal” (i.e., Fourier transformed) phase space, R = (X, P) is the phase-space location operator, X and P are the position and momentum operators, “⟨·,·⟩” denotes the Hilbert-Schmidt inner product on operators, ⟨A, B⟩ = Tr[A^{†}B], and “∧” denotes the symplectic form, ξ∧α = ξ_{x}α_{p} − ξ_{p}α_{x}. (Throughout this post I use the notation established in Sec. 2 of my recent paper with Felipe Hernández.) It has variously been called the quantum characteristic function, the chord function, the Wigner characteristic function, the Weyl function, and the moment-generating function. It is the quantum analog of the classical characteristic function.
Importantly, the quasicharacteristic function obeys χ_{ρ}(0) = Tr[ρ] = 1 and |χ_{ρ}(ξ)| ≤ 1, just like the classical characteristic function, and provides a definition of the Wigner function W_{ρ} where the linear symplectic symmetry of phase space is manifest:
W_{ρ}(α) = ∫ (d^{2}ξ/(2π)^{2}) e^{iα∧ξ} χ_{ρ}(ξ) = (2π)^{−1} ∫ dΔx e^{−ipΔx} ρ(x + Δx/2, x − Δx/2)   (2)
where α = (x, p) is the phase-space coordinate and ρ(x, x′) = ⟨x|ρ|x′⟩ is the position-space representation of the quantum state. The first line says that W_{ρ} and χ_{ρ} are related by the symplectic Fourier transform. (This just means the inner product α·ξ in the regular Fourier transform is replaced with the symplectic form α∧ξ, and has the simple effect of exchanging the reciprocal variables ξ_{x} and ξ_{p} up to a sign, simplifying many expressions.) The second line is often taken as the definition of the Wigner function, but it suffers from explicitly breaking symmetry in phase space, unnecessarily privileging position over momentum. The above relations make it clear that χ_{ρ} is yet another 1-to-1 representation of a quantum state.
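These properties are easy to check numerically for the harmonic-oscillator ground state. The sketch below assumes a particular sign convention for the displacement operator, (D_{(a,b)}ψ)(x) = e^{ibx − iab/2} ψ(x − a); other conventions change only phases, not the magnitudes being tested:

```python
import numpy as np

# Discretized harmonic-oscillator ground state (hbar = sigma = 1) on a grid.
x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]
psi = np.pi**-0.25 * np.exp(-x**2 / 2)

def chi(a, b):
    """<psi| D_(a,b) |psi>, with (D psi)(x) = exp(i b x - i a b / 2) psi(x - a).
    The sign convention here is my assumption; magnitudes are convention-independent."""
    shifted = np.pi**-0.25 * np.exp(-(x - a)**2 / 2)  # psi(x - a), evaluated analytically
    integrand = np.conj(psi) * np.exp(1j * b * x - 1j * a * b / 2) * shifted
    return np.sum(integrand) * dx

print(round(abs(chi(0, 0)), 6))  # 1.0   (chi(0) = Tr[rho] = 1)
# For the Gaussian ground state, |chi| = exp(-(a^2 + b^2)/4); both prints agree:
print(round(abs(chi(1, 1)), 6), round(np.exp(-0.5), 6))  # 0.606531 0.606531
```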
First, we will need these checkable properties of the displacement operator,
D_{ξ}^{†} = D_{−ξ},   D_{ξ}D_{α} = e^{−(i/2)ξ∧α} D_{ξ+α},   Tr[D_{ξ}] = 2π δ^{2}(ξ)   (3)
from which we can invert the definition of the quasicharacteristic function:
ρ = ∫ (d^{2}ξ/(2π)) χ_{ρ}(ξ) D_{ξ}   (4)
Next, take |ϕ⟩ to be an arbitrary normalized pure wavefunction (i.e., ⟨ϕ|ϕ⟩ = 1) that will serve as a “reference wavepacket”. This is typically taken to be a wavepacket with minimal momentum that is well localized around the origin in configuration space, that is, a state whose Wigner function is mostly concentrated around the origin in phase space. Then we define |ϕ_{α}⟩ = D_{α}|ϕ⟩ to be the reference wavepacket displaced in phase space by the vector α. We call the set {|ϕ_{α}⟩} the “wavepacket basis”; it forms an overcomplete basis (formally, a frame) of the Hilbert space, in particular providing a resolution of the identity I = ∫ (d^{2}α/(2π)) |ϕ_{α}⟩⟨ϕ_{α}|. For concreteness, you can if you like take |ϕ⟩ to be the ground state of the harmonic oscillator, i.e., a Gaussian with zero expectation of position and momentum: ϕ(x) = (πσ^{2})^{−1/4} e^{−x^{2}/(2σ^{2})} for some characteristic spatial scale σ; this makes {|ϕ_{α}⟩} the set of coherent states.
Now we consider the matrix elements ⟨ϕ_{α₁}|ρ|ϕ_{α₂}⟩ in the wavepacket basis. Unlike for an orthonormal basis, there is no sharp distinction between off-diagonal and on-diagonal matrix elements. Rather, ⟨ϕ_{α₁}|ρ|ϕ_{α₂}⟩ can be considered roughly off-diagonal whenever α₁ and α₂ are sufficiently far apart^{a } that ⟨ϕ_{α₁}|ϕ_{α₂}⟩ ≈ 0. Large off-diagonal terms are indicative of long-range coherence in phase space, where “large” is relative to how closely the element saturates the Cauchy-Schwarz inequality
|⟨ϕ_{α₁}|ρ|ϕ_{α₂}⟩|^{2} ≤ ⟨ϕ_{α₁}|ρ|ϕ_{α₁}⟩ ⟨ϕ_{α₂}|ρ|ϕ_{α₂}⟩   (5)
For instance, if |ψ⟩ ∝ |ϕ_{α₁}⟩ + |ϕ_{α₂}⟩ is the coherent superposition^{b } of two widely separated wavepackets ϕ_{α₁} and ϕ_{α₂}, with ρ_{ψ} = |ψ⟩⟨ψ|, and if ρ_{mix} ∝ |ϕ_{α₁}⟩⟨ϕ_{α₁}| + |ϕ_{α₂}⟩⟨ϕ_{α₂}| is the corresponding incoherent mixture, then |⟨ϕ_{α₁}|ρ_{ψ}|ϕ_{α₂}⟩| ≈ 1/2 but |⟨ϕ_{α₁}|ρ_{mix}|ϕ_{α₂}⟩| ≈ 0, with all on-diagonal elements the same: ⟨ϕ_{α_{i}}|ρ_{ψ}|ϕ_{α_{i}}⟩ ≈ 1/2 ≈ ⟨ϕ_{α_{i}}|ρ_{mix}|ϕ_{α_{i}}⟩.
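This contrast is easy to reproduce numerically with unit-width Gaussian wavepackets (the separation and grid below are arbitrary choices of mine):

```python
import numpy as np

x = np.linspace(-30, 30, 6001)
dx = x[1] - x[0]

def packet(center):
    return np.pi**-0.25 * np.exp(-(x - center)**2 / 2)

phi1, phi2 = packet(-8.0), packet(8.0)  # widely separated: <phi1|phi2> ~ e^{-64} ~ 0
psi = (phi1 + phi2) / np.sqrt(2)        # coherent superposition (norm ~ 1 since overlap ~ 0)

inner = lambda f, g: np.sum(np.conj(f) * g) * dx
# rho_pure = |psi><psi| ;  rho_mix = (|phi1><phi1| + |phi2><phi2|)/2, applied as maps:
rho_pure = lambda v: psi * inner(psi, v)
rho_mix = lambda v: 0.5 * (phi1 * inner(phi1, v) + phi2 * inner(phi2, v))

print(round(abs(inner(phi1, rho_pure(phi2))), 3))  # 0.5 -- long-range coherence
print(round(abs(inner(phi1, rho_mix(phi2))), 3))   # 0.0 -- no coherence
print(round(abs(inner(phi1, rho_pure(phi1))), 3))  # 0.5 -- on-diagonal elements agree
```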
We can then compute
(6)
and using the shorthands ᾱ = (α₁ + α₂)/2 and Δα = α₂ − α₁ for the phase-space mean and separation between the two wavepackets ϕ_{α₁} and ϕ_{α₂}, we have
(7)
where χ_{ϕ} is the quasicharacteristic function of the reference wavepacket |ϕ⟩. Several things could be said about this expression, especially if we introduced the twisted convolution, but let’s just observe that if χ_{ϕ}(ξ) ≈ 0 for ξ outside some region in reciprocal phase space, then ⟨ϕ_{α₁}|ρ|ϕ_{α₂}⟩ only “knows” about χ_{ρ}(ξ) when ξ is in that region translated so it’s centered around Δα. Furthermore, it only knows about the part proportional to the local Fourier component e^{iᾱ∧ξ}. In particular, if our reference wavepacket is Gaussian, then χ_{ϕ} is also a Gaussian concentrated near the origin, so that ⟨ϕ_{α₁}|ρ|ϕ_{α₂}⟩ is essentially determined by the values χ_{ρ} takes in a wavepacket-sized region centered around Δα.
From this one can also quickly check that if we take the squared norm of this off-diagonal matrix element and integrate over the entire phase space with a fixed value of the separation Δα between the two points α₁ and α₂, we get
(8)
where “∗” denotes the convolution. So we find that the “total amount of coherence” over the phase-space distance Δα (i.e., the summed amount of coherence between all pairs of wavepackets separated by Δα) is encoded in the values of χ_{ρ} in a small wavepacket-sized region around Δα. In the aforementioned Gaussian case, we have .
Although I’m not sure they would phrase it this way, the key idea for me was that merely protecting massive superpositions from decoherence is actually not that hard; sufficient isolation can be achieved in lots of systems. Rather, much like quantum computing, the challenge is to achieve this level of protection while simultaneously having sufficient control to create and measure superpositions.
Carney et al. observe that you do not need to be able to implement a Hadamard-like gate (i.e., a gate that takes a state in the preferred quasi-classical basis^{a } to superpositions thereof) on the massive system in order to demonstrate that it’s storing quantum information. You just need to be able to implement a controlled unitary on it, in any basis, that is controlled by a second (smaller) quantum system that you do have more complete control over. More specifically, they suggest starting with the control system in a superposition of two near “gravitational eigenstates” and , allowing this state to become entangled with an initial state of a massive oscillator (decohering both the oscillator and the control system), and then witnessing recoherence (revival) of the control system as the oscillator disentangles into a final state . (By “gravitational eigenstates”, I just mean states of the oscillator that, to a good approximation, source a quasiclassical state of gravitational field; in this case, it’s something like a wavepacket that’s well localized in space, rather than being in a superposition of widely separated positions that would have distinctly different corresponding gravitational fields.) For this, all that needs to be achieved is evolution of the form
(1)
where (or at least is less than for partial decoherence). Importantly, at no time does the massive oscillator need to be brought into a coherent superposition like of gravitational eigenstates. Furthermore, it doesn’t even matter whether and are the same. If you can implement this evolution and witness the revival, and you can convince yourself that the control system couldn’t have been entangling with anything else, then you have shown that the gravitational field is transmitting quantum information.
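A toy version of this collapse-and-revival logic can be captured with coherent states alone, using the overlap formula |⟨α|β⟩| = exp(−|α−β|²/2). This is my own stand-in model (a qubit-conditioned displaced oscillator with made-up parameters), not the authors’ Hamiltonian:

```python
import numpy as np

# Toy revival: qubit-conditioned displacement of an oscillator coherent state.
# Under H_pm = w a†a ± g (a + a†), the conditional coherent amplitude orbits a
# circle around the displaced center ∓ g/w. Parameters below are illustrative.
w, g, alpha0 = 1.0, 2.0, 1.5

def alpha_branch(t, sign):
    center = -sign * g / w
    return (alpha0 - center) * np.exp(-1j * w * t) + center

def visibility(t):
    """Control-qubit coherence ~ overlap of the two conditional oscillator states."""
    a, b = alpha_branch(t, +1), alpha_branch(t, -1)
    return np.exp(-abs(a - b)**2 / 2)  # |<a|b>| for coherent states

print(round(visibility(0.0), 3))            # 1.0  -- control starts fully coherent
print(visibility(np.pi / w) < 1e-10)        # True -- deep decoherence at half period
print(round(visibility(2 * np.pi / w), 3))  # 1.0  -- full revival; oscillator never measured
```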
The general idea of leveraging quantum control of a small system to gain partial quantum control of a large system isn’t itself a novel idea, but the authors go on to show that
The first part is unintuitive, but you can basically read it off from Eq. (1): the decoherence and recoherence can still happen even if the massive oscillator starts and ends in mixtures of states and , just as long as the disentangling happens at the same moment in time (to very high accuracy) for all members of the ensemble. (That’s where the strong harmonicity assumption comes in.) Furthermore, the initial and final ensembles don’t have to be the same. In particular, the contraction of the state of the oscillator by, say, one quantum over the course of a single period (e.g., from to ) doesn’t prevent you from having a pretty clean revival. (The visibility will only go down insofar as the contraction is so strong that the oscillator states are piling up on top of each other near the ground state, preventing full disentanglement.)
Instead, the only thing you’re really worrying about is isolation, i.e., the extent to which you can prevent the two paths of the oscillator (each conditional on different states and of the control system) from getting decohered by the larger environment.
Here are some concerns that I haven’t fleshed out yet:
Naturally, I was very interested to know whether I could shoe-horn this idea into my hobby horse: decoherence detection. Unfortunately, it looks on first blush like the ideas can’t be combined. Here’s why.
The standard QBM parameters used by Carney et al. for the open-system dynamics of the oscillator are the mean excitation number (were the oscillator allowed to thermalize with the bath) and the dissipation coefficient . My preferred parameters are the decoherence-and-diffusion matrix and (described in detail here), and they are related by (where my parameters are more general in the sense that they allow for the decoherence-and-diffusion matrix to be not proportional to the identity ).
Anomalous pure^{d } decoherence (e.g., from collisional decoherence like dark matter (DM), or from objective collapse models like Diosi-Penrose) is the case of the simultaneous limits , while holding constant. This is the sense in which idealized collisional decoherence looks like an infinite temperature bath, and it’s a natural model to consider for DM because 1 MeV virialized DM is at ~6000 Kelvin. (Once the DM mass is below 10 keV, then the infinite-temp approximation breaks down. Also, for usual collisional decoherence, is not actually proportional to the identity, but I don’t think it matters much for what I’m going to say…) To get the complete reduced dynamics for the oscillator, you would basically just add this pure decoherence to the other conventional sources of noise (which generally are dissipative, ).
This means when you have an oscillator with conventional sources of noise, and you add anomalous decoherence, you expect to raise the equilibrium temperature of the oscillator, and hence raise the thermalized occupation number , but you do not change or . Generally the experimentally measurable quantities are and , and the bare diffusion matrix for conventional sources alone is inaccessible.
Unfortunately, the protocol of Carney et al. doesn’t really change this. All it can detect is the total strength of . You could try and distinguish anomalous decoherence from conventional sources during the protocol by using the various tricks I’ve talked about (shielding the experiment from DM, looking for sidereal variations, etc.), but it would be a hell of a lot easier to just use those tricks while simply measuring the equilibrium temperature of the oscillator — no quantum mechanics required.^{e }
This also fits with my interpretation of superpositions as “negative temperature detectors” (see the first figure in this blog post). Superpositions are a useful way to get increased sensitivity when you’ve already maxed out the sensitivity you can get from cooling your target (because you’ve hit the ground state). But the whole point of the Carney et al. protocol is that it doesn’t care what the temperature of the massive oscillator is.
A bit disappointing, but I will keep thinking about variations on this…
After the intro, the authors give self-contained background information on the two key prerequisites: quantum Darwinism and generalized probabilistic theories (GPTs). The former is an admirably brief summary of what are, to me, the core and extremely simple features of quantum Darwinism. This and the summary of GPTs can be skipped by familiar readers, but I recommend reading definitions 1-4 in the latter part of the GPT subsection, plus the “Summary of Assumptions”.
The main results seem to be
..one needs to consider the possibility of [classical-information-spreading dynamic] that preserves the statistics of [measurements on the system] S but still changes the state of S, even if S is prepared in one of the [(generalized) pointer states]. This is impossible in quantum theory…However, many GPT systems (such as gbits [18]) violate the analogous operational condition…Thus, definition 5 captures the essential features for ideal Darwinism on the operational level, while definition 6 further requests classical features from the frame states themselves.
In [quantum theory], the fan-out gate [classical-information-spreading dynamic] can create entanglement whenever the system is not initialized to a pointer state….entanglement–creation is a necessary property of any generalized ideal Darwinism process.
The reversible qualifier is key here, as this statement excludes the possibility of Darwinism in classical models, whereas we of course know it’s possible to copy classical information in classical models with irreversible Markovian classical dynamics. Thus
In particular, this rules out Darwinism in boxworld [18] (a theory containing the aforementioned gbits) or any dichotomic maximally nonlocal theory. For these specific examples, one could also infer this from Refs. [40, 44], but here we have shown it without having to determine the complete structure of the reversible transformations.
Philosophically and aesthetically I like the idea of GPTs as an operational framework for thinking about the foundations of quantum mechanics, although we should all be quite skeptical that GPTs besides quantum theory — either more or less expansive — will be found to describe any fundamental physics (even though I think it’s quite plausible that quantum theory will eventually be superseded by something). This is because, among other things, GPTs treat space and time very differently, and especially because they take time asymmetry as fundamental rather than emergent or a consequence of initial conditions.
The practical downside of GPTs is that there’s been a whole industry of papers exploring non-classical-or-quantum GPTs that don’t maintain contact with anything that could describe the real world; it’s too much fun to play with the math. This paper was a welcome exception, as it helps clarify which theories could lead to the appearance of classicality, at least insofar as the latter is identified with quantum Darwinism.
It seems clear enough to me that, within the field of journalism, the distinction between opinion pieces and “straight reporting” is both meaningful and valuable to draw. Both sorts of work should be pursued vigorously, even by the same journalists at the same time, but they should be distinguished (e.g., by being placed in different sections of a newspaper, or being explicitly labeled “opinion”, etc.) and held to different standards.^{a } This is true even though there is of course a continuum between these categories, and it’s infeasible to precisely quantify the axis. (That said, I’d like to see more serious philosophical attempts to identify actionable principles for drawing this distinction more reliably and transparently.)
It’s easy for idealistic outsiders to get the impression that all respectable scientific research is analogous to straight reporting rather than opinion, but just about any researcher will tell you that some articles are closer than others to the opinion category. That’s not to say such work is bad or unscientific, just that it goes further in the direction of speculative interpretation and selective highlighting of certain pieces of evidence, and is often motivated by normative claims (“this area is a more fruitful research avenue than my colleagues believe”, “this evidence implies the government should adopt a certain policy”, etc.). Let’s call this “scientific opinion”, as opposed to “straight research”, even though we again concede that the distinction is fuzzy.
For the most part, there isn’t a distinction in scientific journals between these types of articles (with some notable exceptions, especially in certain splashy scientific magazines that have articles resembling newspaper columns). Rather, editors and referees seem to enforce a certain level of objectivity for all articles, and this usually takes the form of (1) an objective-sounding voice and (2) outright rejection of articles that are too opinion-y. I mostly think this works fine, although the clear need for some scientific opinion writing means that (a) editors and referees allow some opinion-y stuff without attempting to explicitly distinguish it from straight research and (b) some quite influential scientific opinion discussion unfortunately ends up relegated to blogs and in-person discussion.
In recent news, a prominent research scientist got fired from an industry lab for writing a scientific article that made her employer look bad. Lots of folks reasonably (though not necessarily persuasively) argue either that (i) corporations should always support their internal research even when it makes them look bad or that (ii) it’s naive to think that corporations could allow this, so attempts to force them to will just induce them to do less research in the future.
I wonder if constructive progress could be made by relying on the distinction between straight research and scientific opinion. My second-hand impression of the article prompting the current dispute is that it was less a follow-your-nose, just-the-facts-ma’am investigation, and more a scientific opinion piece. I have not read the actual work, but for the purposes of this post it actually doesn’t matter because I just want to consider the general possibility of using this distinction. So for the sake of argument, let’s consider a hypothetical situation where an industry researcher wants to publish an article that makes her employer look bad and the article is well characterized as scientific opinion, i.e., it is scientifically sound (no falsehoods, rigorous, etc.) but also clearly displays aspects of opinion (it is obviously motivated by normative claims rather than just scientific curiosity, emphasizes evidence on one side, and is somewhat speculative).
In journalism, most people agree that opinion pieces are very useful and should exist, but also that employees can’t write anti-employer editorials in the New York Times while reasonably demanding that their employer keep paying them. (A few folks will reject this latter claim, but I think they can only reasonably do so by endorsing a very strong version of the principle of freedom of speech, going beyond even John Stuart Mill.) It seems then that you might be able to protect an important subset of research freedom in industrial labs by establishing a norm that straight research should be shielded from being vetoed from above, while corporate leaders are permitted to exercise control on scientific opinion. If this could be achieved, outsiders could naturally put a bit more faith in straight research from industry while remaining appropriately skeptical of scientific opinion pieces that they generate.
Of course, if the decision about whether any given article was an opinion piece were made in-house by the company, it would be very hard to trust, and also very hard to make transparent without destroying the company’s ability to meaningfully protect itself from bad PR. Therefore, you could consider an outside arbiter (e.g., certain independent scientific journals) who would certify articles as straight research through a review process that uses publicly stated principles but that, necessarily, did not release the research article publicly until after it had been certified in this way. (This is not unusually secretive; journals already reject articles without publicly explaining why.) If companies were convinced that this process was at least somewhat reliable and objective, they might publicly commit to never vetoing straight research, as certified by the outside arbiter, both as a way to establish trust with the public and as a way to attract research talent by credibly committing to a limited form of academic freedom.
If you think that industry research currently enjoys, or could plausibly achieve, university levels of freedom when it comes to results that make the employer look bad, then you would of course see this proposal as a step backwards. However, it seems clear to me that corporate leaders have always had a veto over sufficiently unpalatable research results, and that if they were somehow forced to relinquish this power (due to public opinion), the net result would simply be their choosing not to fund research that had a chance of turning out badly for them. Rather, the idea considered here is just to somewhat constrain the leadership veto and, importantly, to make it more principled and transparent.
[Added:] Of course, this proposal doesn’t fix all conflicts between industry research and academic freedom. Some straight research will make a company look clearly bad without any need for speculation or unusual emphasis. But I think this proposal is plausibly an improvement over the status quo because (1) opinion is more likely to generate bad PR but (2) straight research is “less replaceable” in the sense that outsiders will have an easier time writing critical opinion pieces if they have access to straight research from insiders. [I thank Dylan Hadfield-Menell and Graeme Smith for conversation that informed this post.]
(1)   $\displaystyle \frac{\partial W}{\partial t} = \frac{2}{\hbar}\, H \sin\!\Big(\frac{\hbar}{2}\big(\overleftarrow{\partial}_x \overrightarrow{\partial}_p - \overleftarrow{\partial}_p \overrightarrow{\partial}_x\big)\Big)\, W$

where the arrows over partial derivatives tell you which way they act, i.e., $A\, \overleftarrow{\partial}_x \overrightarrow{\partial}_p\, B = (\partial_x A)(\partial_p B)$. This only becomes slightly less weird when you use the equivalent formula $\partial W/\partial t = (H \star W - W \star H)/(i\hbar)$, where “$\star$” is the Moyal star product given by
(2)   $\displaystyle \star = \exp\!\Big(\frac{i\hbar}{2}\big(\overleftarrow{\partial}_x \overrightarrow{\partial}_p - \overleftarrow{\partial}_p \overrightarrow{\partial}_x\big)\Big)$
The star product has the crucial feature that $\widehat{A \star B} = \hat{A}\hat{B}$, where we use a hat to denote the Weyl transform (i.e., the inverse of the Wigner transform taking density matrices to Wigner functions), which takes a scalar function over phase space to an operator over our Hilbert space. The star product also has some nice integral representations, which can be found in books like Curtright, Fairlie, & Zachos^{a }, but none of them help me understand the Moyal equation.
A key problem is that both of these expressions are neglecting the (affine) symplectic symmetry of phase space and the dynamical equations. Although I wouldn’t call it beautiful, we can re-write the star product as
(3)   $\displaystyle \star = \exp\!\Big(\frac{i\hbar}{2}\, \overleftarrow{\partial}_a \overrightarrow{\partial}^a\Big)$

where $a \in \{x, p\}$ is a symplectic index using the Einstein summation convention, and where symplectic indices are raised and lowered using the symplectic form just as for Weyl spinors: $v^a = \omega^{ab} v_b$ and $v_a = v^b \omega_{ba}$, where $\omega_{ab}$ is the antisymmetric symplectic form with $\omega_{xp} = 1$, and where upper (lower) indices denote symplectic vectors (co-vectors).
With this, we can expand the Moyal equation as

$\displaystyle \frac{\partial W}{\partial t} = H\, \overleftarrow{\partial}_a \overrightarrow{\partial}^a\, W - \frac{\hbar^2}{24}\, H\, \big(\overleftarrow{\partial}_a \overrightarrow{\partial}^a\big)^3\, W + O(\hbar^4)$

where we can see in hideous explicitness that it’s a series in the even powers of $\hbar$ and the odd derivatives of the Hamiltonian $H$ and the Wigner function $W$. Furthermore, we see that as $\hbar \to 0$ it quickly reduces to the Poisson bracket: $\partial W/\partial t = \{H, W\}$.
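If it helps, this series structure can be checked mechanically. Below is a sketch using sympy (the quartic Hamiltonian is just an arbitrary test case, not anything from the text): it builds the Moyal bracket as the truncated series $\Lambda - (\hbar^2/24)\Lambda^3 + \dots$, where $\Lambda$ is the arrowed “Janus” operator, and verifies that the $\hbar \to 0$ limit is the Poisson bracket while the $\hbar^2$ correction is genuinely nonzero.

```python
import sympy as sp

x, p, hbar = sp.symbols("x p hbar")

def janus_power(f, g, n):
    # f (<-d_x ->d_p - <-d_p ->d_x)^n g, expanded with the binomial theorem
    return sum(
        sp.binomial(n, k) * (-1) ** k
        * sp.diff(f, *([x] * (n - k) + [p] * k))
        * sp.diff(g, *([p] * (n - k) + [x] * k))
        for k in range(n + 1)
    )

def moyal_bracket(H, W, terms=2):
    # (2/hbar) H sin((hbar/2) L) W as a truncated series:
    # L - (hbar**2/24) L**3 + ..., with L the 'Janus' operator above
    total = sp.Integer(0)
    for j in range(terms):
        n = 2 * j + 1
        coeff = (-1) ** j * (hbar / 2) ** (n - 1) / sp.factorial(n)
        total += coeff * janus_power(H, W, n)
    return sp.expand(total)

H = p**2 / 2 + x**4          # arbitrary anharmonic Hamiltonian
W = sp.exp(-(x**2) - p**2)   # arbitrary smooth phase-space function

poisson = sp.expand(sp.diff(H, x) * sp.diff(W, p)
                    - sp.diff(H, p) * sp.diff(W, x))
mb = moyal_bracket(H, W)

# hbar -> 0 recovers the Poisson bracket {H, W} ...
assert sp.simplify(mb.subs(hbar, 0) - poisson) == 0
# ... and the hbar**2 correction is genuinely nonzero for a quartic H
assert sp.simplify(sp.diff(mb, hbar, 2).subs(hbar, 0)) != 0
```

For a harmonic Hamiltonian (quadratic in $x$ and $p$) the third derivatives vanish and the second assertion would fail, which is just the familiar statement that harmonic dynamics are exactly classical at the level of the Wigner function.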
Advanced quantum computing comes with some new applications as well as a few risks, most notably threatening the foundations of modern online security.
In light of the recent experimental crossing of the “quantum supremacy” milestone, it is of great interest to estimate when devices capable of attacking typical encrypted communication will be constructed, and whether the development of communication protocols that are secure against quantum computers is progressing at an adequate pace.
Beyond its intrinsic interest, quantum computing is also fertile ground for quantified forecasting. Exercises in forecasting technological progress have generally been sparse — with some notable exceptions — but such forecasting is of great importance: technological progress dictates a large part of human progress.
To date, most systematic predictions about development timelines for quantum computing have been based on expert surveys, in part because quantitative data about realistic architectures has been limited to a small number of idiosyncratic prototypes. However, in the last few years the number of devices has been increasing rapidly, and it is now possible to squint through the fog of research and make some tentative extrapolations. We emphasize that our quantitative model should be considered to at most augment, not replace, expert predictions. Indeed, as we discuss in our preprint, this early data is noisy, and we necessarily must make strong assumptions to say anything concrete.
Our first step was to compile an imperfect dataset of quantum computing devices developed so far, spanning 2003-2020. This dataset is freely available, and we encourage others to build on it in the future as the data continues to roll in.
To quantify progress, we developed our own index – the generalized logical qubit – that combines two important metrics of performance for quantum computers: the number of physical qubits and the error rate for two-qubit gates. Roughly speaking, the generalized logical qubit is the number of noiseless qubits that could be simulated with quantum error correction using a given number of physical qubits with a given gate error. Importantly, the metric can be extended to fractional values, allowing us to consider contemporary systems that are unable to simulate even a single qubit noiselessly.
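To make the idea concrete, here is a toy version of such an index. This is my own illustrative sketch, not the definition from the preprint: it assumes a surface-code-style heuristic $p_L \sim (p/p_{\mathrm{th}})^{(d+1)/2}$ for the logical error rate at code distance $d$, an assumed threshold of 1%, a target logical error rate of $10^{-15}$, and roughly $2d^2$ physical qubits per logical qubit.

```python
import math

def logical_qubits(n_physical, gate_error, p_threshold=1e-2, target=1e-15):
    """Toy 'generalized logical qubit' count (illustrative only).

    Assumes the surface-code heuristic p_L ~ (p/p_threshold)**((d+1)/2)
    for the logical error rate at code distance d, and ~2*d**2 physical
    qubits per logical qubit. Returns a fractional count, so a device too
    small to host one full logical qubit still scores a nonzero fraction.
    """
    if gate_error >= p_threshold:
        return 0.0  # above threshold, error correction doesn't help
    # smallest distance d achieving p_L <= target
    d = 2 * math.log(target) / math.log(gate_error / p_threshold) - 1
    d = max(3, math.ceil(d))
    return n_physical / (2 * d**2)

# e.g. a hypothetical 1000-qubit device with 0.1% two-qubit gate error
print(logical_qubits(1000, 1e-3))
```

Note how the fractional regime falls out naturally: a thousand physical qubits at this error rate buy you only part of one noiseless qubit under these assumptions.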
To extrapolate historical progress into the future, we focus on superconducting-qubit devices. We make our key assumption of exponential progress on the two main metrics and use statistical bootstrapping to build confidence intervals around the date when our index metric will cross the threshold at which it can threaten the popular cryptographic protocol RSA-2048.
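As a cartoon of the bootstrap step (with made-up data points and an arbitrary threshold, purely to illustrate the procedure rather than reproduce our results):

```python
import random

# Made-up (year, log10 of the index metric) points standing in for the data.
data = [(2014, -3.0), (2016, -2.4), (2017, -2.1), (2019, -1.4), (2020, -1.0)]
THRESHOLD = 6.0  # assumed log10 value needed to threaten RSA-2048

def fit_crossing(points):
    # Least-squares line through (year, log-metric); exponential growth in
    # the metric is linear growth in its log. Return the crossing year.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return mx + (THRESHOLD - my) / slope

# Bootstrap: resample devices with replacement, refit, collect crossing years.
random.seed(0)
years = []
for _ in range(2000):
    sample = [random.choice(data) for _ in data]
    if len({x for x, _ in sample}) > 1:  # need two distinct years to fit
        years.append(fit_crossing(sample))
years.sort()
lo, hi = years[int(0.05 * len(years))], years[int(0.95 * len(years))]
print(f"90% bootstrap interval for the crossing year: {lo:.0f}-{hi:.0f}")
```

The spread of the resampled crossing years is what turns a single extrapolated date into a confidence interval.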
Note that since we are modelling progress on each metric separately, we are ignoring the interplay between the two metrics. But a simple statistical check shows that the metrics are likely negatively correlated within each system – that is, quantum computer designers face a trade-off between increasing the number of qubits and the gate quality. Ignoring this coupling between the metrics results in an optimistic model.
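The kind of check meant here can be sketched as follows (the device list below is hypothetical, not our dataset): correlate log qubit count with log gate error across successive devices on one platform. A positive correlation between count and error rate is exactly what the trade-off claim amounts to, since more qubits then tend to come with worse gates.

```python
import math

# Hypothetical (physical qubits, two-qubit gate error) pairs for successive
# devices on one hardware platform; the real numbers live in our dataset.
devices = [(5, 5e-3), (9, 6e-3), (20, 8e-3), (27, 7e-3), (53, 9e-3)]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Positive r: qubit count and gate *error* rise together, i.e. the two
# quality metrics are negatively correlated.
r = pearson([math.log(q) for q, _ in devices],
            [math.log(e) for _, e in devices])
print(f"r = {r:.2f}")
```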
As a point of comparison, last year Piani and Mosca surveyed quantum computing experts and found that 22.7% think it is likely or highly likely that quantum computers will be able to crack RSA-2048 keys by 2030, and 50% think that it is likely or highly likely that we will be able to crack RSA-2048 keys by 2035.
Will this be enough time to deploy adequate countermeasures? I discuss this in depth in this other article about quantum cryptanalysis. Given the current rate of progress on the standardization of quantum-resistant cryptography, there seems to be little reason for concern (though one should keep in mind that the yearly base rate for discontinuous breakthroughs in any given technology is about 0.1%).
If you are interested in better understanding our model and assumptions, I encourage you to check out our preprint on the arXiv.
Here is the abstract:
There is a vast number of people who will live in the centuries and millennia to come. In all probability, future generations will outnumber us by thousands or millions to one; of all the people who we might affect with our actions, the overwhelming majority are yet to come. In the aggregate, their interests matter enormously. So anything we can do to steer the future of civilization onto a better trajectory, making the world a better place for those generations who are still to come, is of tremendous moral importance. Political science tells us that the practices of most governments are at stark odds with longtermism. In addition to the ordinary causes of human short-termism, which are substantial, politics brings unique challenges of coordination, polarization, short-term institutional incentives, and more. Despite the relatively grim picture of political time horizons offered by political science, the problems of political short-termism are neither necessary nor inevitable. In principle, the State could serve as a powerful tool for positively shaping the long-term future. In this chapter, we make some suggestions about how we should best undertake this project. We begin by explaining the root causes of political short-termism. Then, we propose and defend four institutional reforms that we think would be promising ways to increase the time horizons of governments: 1) government research institutions and archivists; 2) posterity impact assessments; 3) futures assemblies; and 4) legislative houses for future generations. We conclude with five additional reforms that are promising but require further research. To fully resolve the problem of political short-termism we must develop a comprehensive research program on effective longtermist political institutions.
In the rest of the post, I am going to ask a few pointed questions and make comments. Fair warning: I am trying to get back into frequent low-overhead blogging, so this post is less polished by design, and won’t be very useful if you don’t read the paper (since I don’t summarize it). My comments are largely critical, but needless to say I usually only bother to comment on the tiny minority of papers that I think are important and interesting, which this certainly is.
I know this is just an early attempt at formalizing these ideas, but I would want to see substantially more discussion of the public choice problems that will arise with all these proposals, not just the legislative house. I think such problems are immediate and large (i.e., not just a perturbation that can be handled later), and would strongly drive the best solution. In particular:
Lastly, some tangents:
Edit 2022-Mar-9: See also this recent criticism of UK’s Wellbeing of Future Generations Bill.