I’ve only used it for a week, but it’s the best PDF reader I’ve experienced for reading academic articles. It’s snappy and reminds me of Chrome when it first came out. Draggable tabs. Split view. Plays well with Zotero. Can easily add native PDF annotations and search through the existing ones. (And it saves annotations *fast* when you close the file. Competitors either do this slowly or, like Skim, use a non-native annotation format that can’t be read by other PDF readers.) The UI for “find” displays a lot of info intuitively. Everything is just nicely designed. I haven’t yet run into a limitation on the free version, but it’s worth upgrading to Pro to support the developer (only $20).

Beware that this is the first version following a big re-write of PDF Reader X, and it’s not completely stable. I’ve gotten it to crash a few times, but the developer has been very responsive to feedback and I’d wager on the stability improving soon.

I’m advertising Guru because I think the current selection of PDF readers for academic reading is pretty bad. (I have no connection to the developer.) I strongly prefer Guru (instability and all) over these other PDF readers on macOS that I have tried: Foxit, Preview, Adobe Acrobat Reader, and Skim. I would love to see the in-PDF commenting functionality provided by the Librarian Chrome plugin extended to desktop PDF readers, and PDF Guru strikes me as a great place to start if it ever grabs market share among researchers.

Alas, it’s not available for Windows or Linux.


Just heard about this story showing that the AZ governor means business:

Three weeks into his new job as Arizona’s governor, Doug Ducey made a move that won over Silicon Valley and paved the way for his state to become a driverless car utopia.

It was January 2015 and the Phoenix area was about to host the Super Bowl. Mr. Ducey learned that a local regulator was planning a sting on Lyft and Uber drivers to shut down the ride-hailing services for operating illegally. Mr. Ducey, a Republican who was the former chief executive of the ice cream chain Cold Stone Creamery, was furious.

“It was the exact opposite message we should have been sending,” Mr. Ducey said in an interview. “We needed our message to Uber, Lyft and other entrepreneurs in Silicon Valley to be that Arizona was open to new ideas.” If the state had a slogan, he added, it would include the words “open for business.”

Mr. Ducey fired the regulator who hatched the idea of going after ride-hailing drivers and shut down the entire agency, the Department of Weights and Measures. By April 2015, Arizona had legalized ride-sharing.

This (violent) scene from the movie *Unbroken* depicts a B-24 Liberator defending itself using mounted machine guns during a bombing mission.

“If you have any kind of prejudice or concern that somehow their system and their educational system is not going to produce the kind of people that I’m talking about, you’re wrong.”


“We’re not stuck,” he said. “It doesn’t feel like we’re on the verge of getting it all sorted, but I know more each day than I did the day before – and so presumably we’re getting somewhere.”

If you’re reading this, you will probably be dead in 6 decades or less, maybe much less. And if you’re over 25, your brain function has probably peaked. *Time is running out*.

Let’s call the game “relay programming”. To run the game, the game master picks a challenging programming problem. For concreteness, let’s say Project Euler problem 62:

“…41063625 is the smallest cube which has exactly three permutations of its digits which are also cube. Find the smallest cube for which exactly five permutations of its digits are cube.”

(I chose this problem because I don’t know how to solve it, and I don’t know what strategy I’d use to solve it, and I don’t think I’d make that much progress in ten minutes, but I’m 75% confident I’d solve it in five hours of work, based on its difficulty as rated on the Project Euler website.)

This game is called a relay because it is played with a team of people, each of whom is only allowed ten minutes to contribute. The first player is sent an email which contains the problem; before ten minutes is up they must reply with the email that they want to be sent to the second player. The game continues with fresh players until the task is solved.

So every player has ten minutes to make as much progress as possible. One possible thing that the first player could do over those ten minutes is think about how to decompose the problem…
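For instance, one decomposition the first player might hit on is to group cubes by their sorted digit string, since digit permutations of a number all share that signature. Here is my own rough sketch of that idea (not a relay transcript, just one plausible endpoint of the game):

```python
from collections import defaultdict

def smallest_cube_with_permutations(target):
    """Smallest cube having exactly `target` digit-permutations that are cubes."""
    groups = defaultdict(list)  # sorted-digit signature -> cubes with that signature
    n = 1
    while True:
        cube = n ** 3
        # Digit permutations preserve digit count, so once cubes grow past a
        # given length, every signature group of that length is complete.
        if n > 1 and len(str(cube)) > len(str((n - 1) ** 3)):
            hits = [min(c) for c in groups.values() if len(c) == target]
            if hits:
                return min(hits)
            groups.clear()
        groups["".join(sorted(str(cube)))].append(cube)
        n += 1

print(smallest_cube_with_permutations(3))  # 41063625, matching the problem statement
```

The digit-length boundary check matters: a group only counts as having “exactly” `target` members once all cubes of that length have been seen.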

Relatedly, it is interesting that the concept of a Political Officer is associated strongly with the Soviet Union even though the general idea — an integration of high-ranking officers designed to maintain civilian control of the military — could arguably be useful to democracies.

Today? The DARPA Spectrum Collaboration Challenge:

The DARPA Spectrum Collaboration Challenge (SC2) is the first-of-its-kind collaborative machine-learning competition to overcome scarcity in the radio frequency (RF) spectrum. Today, spectrum is managed by dividing it into rigid, exclusively licensed bands. This human-driven process is not adaptive to the dynamics of supply and demand, and thus cannot exploit the full potential capacity of the spectrum. In SC2, competitors will reimagine a new, more efficient wireless paradigm in which radio networks autonomously collaborate to dynamically determine how the spectrum should be used moment to moment.

The team whose radio design most reliably achieves successful communication in the presence of other competing radios could win as much as $3,500,000.

(H/t aresant.)

The Wirecutter makes money with referral links through *retailers* (mostly Amazon) rather than manufacturers. Arguably, it is evidence that advertisement-based websites have fundamental equilibrium limitations, and that retailer oligopolies can have surprising benefits. My chief complaint is that the Wirecutter is geared toward folks less stingy than myself, likely induced by their incentive to get commission on large purchases.

The processor specifically designed for machine learning is, of course, Google’s “tensor processing unit”.

Primary care doctors (e.g., family, internal, and emergency medicine) will benefit most from affordable ultrasound. We are learning that it’s a powerful diagnostic tool when used alongside the physical exam. Some zealots have called bedside ultrasound the biggest advance in medicine since antibiotics. I feel this notion is exaggerated, but it taps into the underlying excitement in the medical community for bedside ultrasound…

Patient came in with all the symptoms and findings of a stroke: altered mental status, inability to move their left arm. Before giving the treatment for a stroke, a potent blood thinner called tPA, the doctor decided to do an informal ultrasound of the patient’s heart. He found the patient had a massive dissection of their aorta. The patient wasn’t getting adequate blood flow to their arm or brain. Had the patient been given tPA, they most likely would have died. A quick bedside ultrasound revealed a difficult diagnosis and saved the patient’s life.

these procedures resulted in nearly 73,000 babies — 1.6 percent of all U.S. births. The rate is even higher in some countries, including Japan (5 percent) and Denmark (10 percent).

(H/t Gwern.)

Why is drug resistance common and vaccine resistance rare? Drugs and vaccines both impose substantial pressure on pathogen populations to evolve resistance and indeed, drug resistance typically emerges soon after the introduction of a drug. But vaccine resistance has only rarely emerged. Using well-established principles of population genetics and evolutionary ecology, we argue that two key differences between vaccines and drugs explain why vaccines have so far proved more robust against evolution than drugs. First, vaccines tend to work prophylactically while drugs tend to work therapeutically. Second, vaccines tend to induce immune responses against multiple targets on a pathogen while drugs tend to target very few. Consequently, pathogen populations generate less variation for vaccine resistance than they do for drug resistance, and selection has fewer opportunities to act on that variation. When vaccine resistance has evolved, these generalities have been violated. With careful forethought, it may be possible to identify vaccines at risk of failure even before they are introduced.
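The multiple-targets point lends itself to a toy back-of-envelope calculation. This is my own sketch with made-up numbers, not from the paper: if full escape requires an independent mutation at each of k targeted sites, each arising with probability mu per replication, the per-replication escape probability falls as mu**k.

```python
# Toy numbers, assumed for illustration only.
mu = 1e-5    # per-site escape-mutation probability per replication
pop = 1e11   # pathogen genomes replicated within one infected host

for k in (1, 2, 3):  # k = number of independently targeted sites
    expected_resistant = pop * mu ** k
    print(f"{k} target(s): ~{expected_resistant:.0e} fully resistant genomes expected")
```

With one drug-like target, a resistant mutant is essentially guaranteed to be present somewhere in the host; with a few vaccine-like epitopes, it essentially never is.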

To oversimplify, property rights are often said to be good against the world, while contract rights are good against specific others.

A share of stock is itself a set of contractual rights, but the record owner has a property right in the share (and often a physical certificate). It doesn’t matter if someone takes your share or inadvertently/accidentally sells it to an innocent buyer. It’s still yours and you have a better claim to it than any buyer or later holder.

But a beneficial owner of a share held in “street name” has only a contractual right to his shares–essentially a promise from his broker that the broker will have at least [x] shares for him (note this means he does not have a claim to any specific or identifiable shares, and his broker surely holds many times more since they’ll have many other clients). And on top of that, his broker has an account with DTC that involves a second layer of contractual rights to the stock–essentially a promise from DTC to the broker that DTC will have at least [x] shares for the broker (again not specific or identifiable shares, and DTC certainly has many times more shares since DTC holds nearly all shares held in “street name”).

If your broker or DTC accidentally or inadvertently disposes of too many shares (and this can happen surprisingly often) you only have recourse against your broker or DTC. The agreements between [you and your broker] and [your broker and DTC] do not bind the new owner of the shares, who has no obligation under those agreements and, as a bona fide buyer + current holder, also has a better claim to the shares than you do.
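The layered-promise structure is easy to caricature in code. This is my own toy illustration, not a legal model: the client’s claim is a number in the custodian’s books, not title to specific shares, so a shortfall in the pool becomes the custodian’s liability rather than any one client’s loss of property.

```python
class Custodian:
    """Toy 'street name' custodian: clients hold contractual claims,
    while the custodian holds one fungible pool of shares."""

    def __init__(self):
        self.claims = {}   # client -> shares promised
        self.pool = 0      # shares actually held (not client-specific)

    def credit(self, client, n):
        self.claims[client] = self.claims.get(client, 0) + n
        self.pool += n

    def accidental_disposal(self, n):
        # Shrinks the pool but touches no particular client's claim;
        # recourse for the shortfall is against the custodian alone.
        self.pool -= n

    def shortfall(self):
        return sum(self.claims.values()) - self.pool

broker = Custodian()
broker.credit("you", 100)
broker.credit("another client", 900)
broker.accidental_disposal(50)

print(broker.claims["you"], broker.shortfall())  # 100 50
```

Note that nothing in the structure identifies *which* 50 promised shares went missing; that is exactly the point.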

If that wasn’t specific enough, here’s a very detailed summary and analysis of the current stock ownership structure and mechanics:

http://scholarship.law.upenn.edu/cgi/viewcontent.cgi?article=1163&context=faculty_scholarship

[Download MP4] [Other options]



In particular, he sketched the essential equivalence between matrix product states (MPS) and restricted Boltzmann machines^{b } (RBM) before showing how he and collaborators could train efficient RBM representations of the states of the transverse-field Ising and XXZ models with a small number of local measurements from the true state.

As you’ve heard me belabor ad nauseam, I think identifying and defining branches is the key outstanding task inhibiting progress in resolving the measurement problem. I had already been thinking of branches as a sort of “global” tensor in an MPS, i.e., there would be a single index (bond) that would label the branches and serve to efficiently encode a pure state with long-range entanglement due to the amplification that defines a physical measurement process. (More generally, you can imagine branching events with effects that haven’t propagated outside of some region, such as the light-cone or Lieb-Robinson bound, and you might even make a hand-wavy connection to entanglement renormalization.) But I had little experience with constructing MPSs, and finding efficient representations always seemed like an ad-hoc process yielding non-unique results. Demonstrating uniqueness is crucial since it is equivalent to the set-selection problem in the language of consistent histories, and it avoids problems (like Wigner’s friend or modern refinements) that otherwise just have to be ignored by fiat. My earlier result showed that the principle of redundant records in quantum Darwinism goes a long way toward establishing uniqueness and, importantly, observed that “branch pruning” might enable the efficient simulation of non-equilibrium systems. However, getting an (even approximate) definition of irreversibility remains a crucial challenge; without a notion of irreversibility, record-defined branches would be too fine-grained and transient.

Even with uniqueness and irreversibility, there is not much reason to think that a records-based definition of branches would enable branches to be found efficiently in a numerical simulation. This is what makes the connection to artificial neural nets so exciting. There is already a huge literature on training these nets to find key underlying structure in probability distributions (which are mathematically highly similar to wavefunctions). We won’t have to reinvent the wheel.

Now, for various practical experimental and theoretical reasons, condensed matter theorists focus mostly on equilibrium states. Even outside this, they mostly care about *approach* to equilibrium and how quantum systems evolve when they are perturbed slightly away from equilibrium. And even when they care about persistent non-equilibrium, they focus on *permanent* non-equilibrium due to conserved quantities (either simply global conserved quantities like particle number or, more interestingly, the local conserved quantities that lead to Anderson localization). Branches, on the other hand, are all about persistent but non-trivial and non-permanent out-of-equilibrium evolution; we expect branch structure to eventually break down during thermalization. Therefore, rather than using restricted Boltzmann machines, which find equilibrium probability distributions with Boltzmann weights, we’ll probably need a deep learning buzzword with “temporal” or “recurrent” in the name.

So, here are my predictions as concretely as I can make them right now: Efficient simulation of many-body systems that are far from equilibrium but not permanently out of equilibrium will use some sort of neural net structure where the value of key hidden nodes label the branches. The branches will correspond to orthogonal states of the system that are macroscopically distinct in the sense that they can be distinguished by local measurements at many spatially disjoint regions. Different hidden nodes will correspond to different branching events so that the set of all branches is labeled by the joint value of all such nodes. The branches will monotonically become more fine-grained (when compared at a fixed time a la the Heisenberg picture), and in particular the number of branches will increase exponentially with time because the number of branching events will increase linearly with time (and space). Local, experimentally accessible observables will be estimated with high precision by simply sampling from all branches. On the timescale of thermalization, the branch structure “dissolves” by becoming progressively more inefficient to find numerically, and eventually non-unique.

**Edit** 2017-11-1: Branch identification will only be *asymptotically* crucial for simulation when (1) the number of branching events (i.e., hidden branch nodes) is extensive in system size at any given time (so that the total number of branches is exponential in space as well as time, as mentioned above), *and* (2) the effects of branching events have had time to propagate globally.^{c } Thus, for an N-site lattice simulated for T time steps, this is the large-N limit taken while holding vT/N constant, where v is the speed of propagation and vT/N is the number of trips around the lattice a perturbation can travel. If we instead hold T constant and take N to infinity, then eventually the correlation length has to saturate (since perturbations simply don’t have time to propagate further than vT lattice sites away) and the resources required no longer scale exponentially with N. But obviously, fixed modest values of T (or fixed modest values of vT/N for sufficiently large N) may already easily require infeasible computational resources.


- There is a title card about “resurgence” from Francesco Di Renzo’s talk at the beginning of the talk you can ignore. This is just a mistake in KITP’s video system.
- This is discussed in detail by Chen et al. See also good intuition and a helpful physicist-statistician dictionary from Lin and Tegmark.
- I thank Martin Ganahl for discussion on this point.

is the most acutely lethal toxin known, with an estimated human median lethal dose (LD50) of 1.3–2.1 ng/kg intravenously or intramuscularly and 10–13 ng/kg when inhaled.

The observational data that a planet nine would explain:

“There are now five different lines of observational evidence pointing to the existence of Planet Nine,” Konstantin Batygin, a planetary astrophysicist at the California Institute of Technology (Caltech) in Pasadena, said….

…a study that examined the elliptical orbits of six known objects in the Kuiper Belt…all of those Kuiper Belt objects have elliptical orbits that point in the same direction and are tilted about 30 degrees “downward” compared to the plane in which the eight official planets circle the sun…

Using computer simulations of the solar system with a Planet Nine…there should be even more objects tilted a whopping 90 degrees with respect to the solar plane. Further investigation revealed that five such objects were already known to fit these parameters…

…Planet Nine’s influence might have tilted the planets of our solar system, which would explain why the zone in which the eight major planets orbit the sun is tilted by about 6 degrees compared to the sun’s equator….”Over long periods of time, Planet Nine will make the entire solar-system plane precess, or wobble, just like a top on a table,”…

Finally, the researchers demonstrate how Planet Nine’s presence could explain why Kuiper Belt objects orbit in the opposite direction from everything else in the solar system.

That’s only four, but other evidence is discussed here.


We applied novel machine learning methods (“compressed sensing”) to ~500k genomes from UK Biobank, resulting in an accurate predictor for human height which uses information from thousands of SNPs.

1. The actual heights of most individuals in our replication tests are within a few cm of their predicted height.

2. The variance captured by the predictor is similar to the estimated GCTA-GREML SNP heritability. Thus, our results resolve the missing heritability problem for common SNPs.

3. Out-of-sample validation on ARIC individuals (a US cohort) shows the predictor works on that population as well. The SNPs activated in the predictor overlap with previous GWAS hits from GIANT.
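“Compressed sensing” here essentially means sparse (L1-penalized) linear regression, exploiting the fact that only a small fraction of SNPs carry signal. Here is a minimal sketch on synthetic genotypes. It is my own toy with made-up sizes (the actual study used ~500k genomes and vastly more SNPs), using iterative soft-thresholding rather than whatever solver the authors used:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 1000, 2000, 50            # individuals, SNPs, truly causal SNPs
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
X = (X - X.mean(axis=0)) / X.std(axis=0)             # standardize each SNP
beta = np.zeros(p)
beta[rng.choice(p, k, replace=False)] = rng.normal(size=k)
y = X @ beta + rng.normal(size=n)   # phenotype = sparse signal + noise

# L1-penalized least squares via iterative soft-thresholding (ISTA):
# minimize (1/2n)||y - Xb||^2 + lam * ||b||_1
lam = 0.1
step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
b = np.zeros(p)
for _ in range(500):
    z = b - step * (X.T @ (X @ b - y)) / n
    b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

print(np.count_nonzero(b), np.corrcoef(X @ b, y)[0, 1])
```

Even with fewer individuals than SNPs (n < p), the L1 penalty zeroes out most coefficients and the fitted predictor tracks the phenotype closely, which is the compressed-sensing regime the abstract is gesturing at.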

GM right now is the most vertically integrated of all the companies making meaningful progress on robotaxis. They have a dedicated assembly line set up building off the Chevy Bolt platform, and intend to have ‘thousands’ of them on the road before the end of 2018. They’re building their own ride-hailing app, called Cruise Anywhere, currently only available to GM employees. They’ve got OnStar, which provides in-house expertise with connected-car and vehicle-diagnostic services. GM’s Maven subsidiary offers car-sharing services.

Consider two cars facing each other at a two-way stop (the cross traffic does not stop). The first car to arrive is turning left, the second to arrive is going straight. Who gets to go first: The car turning left or the car who arrived first?

Answer: the driver going straight has the right-of-way, even if he arrived later.

In particular, he proposes simultaneous use for transcontinental travel through suborbital hops (2 min):

and lunar surface missions through Earth-orbit rendezvous. (/r/SpaceX coverage.)

Unlike ordinary classified documents, which have 1 cover sheet, SCI [Sensitive Compartmented Information] documents have 2 cover sheets. The top one has the base classification (SECRET, TOP SECRET) and a prominent notice that the document is SCI material. If the person has access to SCI at that level, they can lift that first cover sheet and see what the codeword is. If they recognise the codeword, they can lift the second cover sheet. If they don’t recognise it, lifting the second cover sheet constitutes unauthorised access, for which the person can be fined and/or imprisoned.

- The overall likelihood of approval (LOA) from Phase I for all developmental candidates was 9.6%, and 11.9% for all indications outside of Oncology.
- Rare disease programs and programs that utilized selection biomarkers had higher success rates at each phase of development vs. the overall dataset.
- Chronic diseases with high populations had lower LOA from Phase I vs. the overall dataset.
- Of the 14 major disease areas, Hematology had the highest LOA from Phase I (26.1%) and Oncology had the lowest (5.1%).
- Sub-indication analysis within Oncology revealed hematological cancers had a 2x higher LOA from Phase I than solid tumors.
- Oncology drugs had a 2x higher rate of first-cycle approval than Psychiatric drugs, which had the lowest percent of first-cycle review approvals. Oncology drugs were also approved the fastest of all 14 disease areas.
- Phase II clinical programs continue to experience the lowest success rate of the four development phases, with only 30.7% of developmental candidates advancing to Phase III.

It remains difficult to observe genome variation in transposon content. The situation is improving as we get longer single-molecule reads, as these let us reach through these sequences into bits of DNA that let us anchor the position of transposons against genomes which we’ve already sequenced.

I think some people may have the idea that we can observe whole genomes easily, but consider the case of repeats like transposons. Half of the human genome is made up of these, but we still have trouble seeing when and where they are active. A new insertion of a big piece of DNA can be much more phenotypically effective than a little SNP, and yet our observational methods make the latter much easier to see than the first. It seems that structural variation in genomes is a likely place to find at least a partial solution to the missing heritability problem posed by the GWAS community.

Insertions and deletions are usually harder to spot with the short reads that make up the data that you’ll get back from your typical “sub-$1000 whole genome”. The reason is simple. There are vastly more possible insertions and deletions than SNPs, and these all must be considered by algorithms in order to detect them. Worse, as the length of the insertion or deletion increases to a reasonable fraction of the length of your reads, it becomes impossible to hope to resolve the event without considering an untenable space of possible indels and opening yourself to spurious matching.
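The size of that candidate space is easy to make concrete with a toy count (my own illustration): at any given position there are only 3 alternative bases for a SNP, but 4**L possible inserted sequences of length L.

```python
# Candidate events per genome position (toy count): a SNP can only be one of
# the 3 other bases, while an insertion of length L can be any of 4**L sequences.
snp_candidates = 3
for L in (1, 5, 10, 20):
    print(f"length-{L} insertion: {4 ** L:,} candidate sequences vs {snp_candidates} SNPs")
```

By length 20 the insertion space is already over a trillion sequences per position, which is why aligners working from short reads struggle with anything but the smallest indels.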

Those cheap genomes have a serious blind spot: they don’t easily yield information about the large-scale structural variation (indels, copy number, inversions, translocations) that is apparently very important to evolution. I believe the field has blinded itself to the importance of large variation simply because it is hard to observe. Recent papers based on long-read data have started to challenge this assumption in a serious way (https://www.biorxiv.org/content/early/2016/09/24/076562)….

In the context of humans there is ample evidence that the things we are missing with short reads are not minute, but are rather an enormous elephant in the room, see https://bioscibatzerlab.biology.lsu.edu/Publications/Sudmant_et_al_2010_Science.pdf and http://science.sciencemag.org/content/349/6253/aab3761. They report that some genomic regions are expanding by up to 50-fold between individuals. Some whole human populations feature quarter-megabase duplications not present in other groups. The scope of the studies are actually very narrow, with hundreds of individuals being considered. I would be surprised if this is anything less than the tip of the iceberg, and incredibly surprised if this turns out to be a minute detail.

Basically, if I help myself to the common (but certainly debatable) assumption that “the industrial revolution” is the primary cause of the dramatic trajectory change in human welfare around 1820-1870, then my one-sentence summary of recorded human history is this:

Everything was awful for a very long time, and then the industrial revolution happened.

Interestingly, this is not the impression of history I got from the world history books I read in school. Those books tended to go on at length about the transformative impact of the wheel or writing or money or cavalry, or the conquering of this society by that other society, or the rise of this or that religion, or the disintegration of the Western Roman Empire, or the Black Death, or the Protestant Reformation, or the Scientific Revolution. But they could have ended each of those chapters by saying “Despite these developments, global human well-being remained roughly the same as it had been for millennia, by every measure we have access to.” And then when you got to the chapter on the industrial revolution, these books could’ve said: “Finally, for the first time in recorded history, the trajectory of human well-being changed completely, and this change dwarfed the magnitude of all previous fluctuations in human well-being.”

Also from Luke: Hillary Clinton on AI risk.

*[This is akin to a living review, which will hopefully improve from time to time. Last edited 2017-11-26.]*

This post will collect some models of decoherence and branching. We don’t have a rigorous definition of branches yet but I crudely define models of branching to be models of decoherence^{a } which additionally feature some combination of amplification, irreversibility, redundant records, and/or outcomes with an intuitive macroscopic interpretation. I have the following desiderata for models, which tend to be in tension with computational tractability:

- physically realistic
- symmetric (e.g., translationally)
- no ad-hoc system-environment distinction
- Ehrenfest evolution along classical phase-space trajectories (at least on Lyapunov timescales)

Regarding that last one: we would like to recover “classical behavior” in the sense of classical Hamiltonian flow, which (presumably) means continuous degrees of freedom.^{b } Branching only becomes unambiguous in some large-*N* limit, so it seems satisfying models are necessarily messy and difficult to numerically simulate. At the minimum, a good model needs time asymmetry (in the initial state, not the dynamics), sensitive dependence on initial conditions, and a large bath. Most branching will (presumably) be continuous both in time and in number of branches, like a decaying atom where neither the direction nor time of decay are discrete.

Below are some models that have one or more of the above features. For many of these, the historical progression was to first analyze decoherence (tracing out the environment) and then the creation of redundant records (by looking at the correlations within the environment). There are too many cites for me to be comprehensive or historically fair in this brief post, so just email me if you want a more comprehensive bibliography for any of these.

A “spin” refers generically to a two-state quantum system, regardless of whether it has an interpretation in terms of particle spin. An “oscillator” refers generically to a quantum system with a continuous degree of freedom (e.g., position of a particle), regardless of whether there is a harmonic confining potential.

**Dirac equation in an inhomogeneous magnetic field**. This is the Stern-Gerlach experiment. It’s nice because the interactions are physical and completely analytic. You can start with a spin uncorrelated with the spatial degrees of freedom, and then see how the inhomogeneous field splits the wavepacket into the two parts. This can be done as a first-year graduate QM homework problem. However, once you have the two parts, you’d need to add amplification/irreversibility to get a proper model of branching, e.g., a model of the phosphorescent screen.
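In the non-relativistic (Pauli) limit, the splitting mechanism can be sketched with the standard textbook Hamiltonian (a simplification of the full Dirac treatment, with the field linearized near the beam axis):

```latex
H = \frac{\mathbf{p}^2}{2m} - \mu\,\boldsymbol{\sigma}\cdot\mathbf{B}(\mathbf{r})
  \approx \frac{\mathbf{p}^2}{2m} - \mu\,(B_0 + b z)\,\sigma_z ,
```

so the two spin components feel opposite forces $F_z = \pm \mu b$, and an initially uncorrelated state evolves into the two spatially separating parts $|\!\uparrow\rangle|\phi_+(t)\rangle + |\!\downarrow\rangle|\phi_-(t)\rangle$.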

**Two-particle scattering in one dimension**. This is a good example of discrete branching of a continuous variable. (But only in 1D; in higher dimensions, the branches are continuous, being indexed by the scattering angles.) If the two particles interact through a potential that’s a function only of their relative distance, then the center of mass coordinate decouples and this is isomorphic to a single particle scattering from a central potential. The two outcomes are either to tunnel through the barrier or reflect off it. Also a good homework problem, but also has no amplification on its own.
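The decoupling works just as in classical mechanics: in center-of-mass and relative coordinates the Hamiltonian separates, so the relative coordinate sees an effective central potential:

```latex
H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V(x_1 - x_2)
  = \frac{P^2}{2M} + \frac{p^2}{2\mu} + V(r),
\qquad
R = \frac{m_1 x_1 + m_2 x_2}{M},\quad r = x_1 - x_2,\quad
M = m_1 + m_2,\quad \mu = \frac{m_1 m_2}{M}.
```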

**Spin decohered by multi-spin apparatus**. The simplest possible model that includes amplification is a central spin measured sequentially by a set of other spins through ad-hoc interactions. This is unrealistic, but is deployed judiciously by Zurek in his well-known review article [1] with simple CNOT interactions. An artificial spatial degree of freedom is added to get the Coleman-Hepp model [1, 2] wherein an electron moves along a chain (1D lattice) of atoms and the electron spin is measured and recorded by each atom in turn. Rather than add a spatial degree of freedom or ad-hoc time dependence, one can also just assert that the apparatus spins are coupled much more strongly to the central spin than to each other (while still assuming the Hamiltonian is diagonalized in the basis corresponding to the *z*-axis) [3]. Records have been studied more extensively in this scenario [4, 5].
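As a concrete toy version of this kind of model (an illustrative sketch of the idea, not any of the cited constructions; the controlled-rotation coupling and the function name are my own choices), a central qubit in an equal superposition is sequentially recorded by environment spins, and its coherence decays as $\tfrac{1}{2}\cos(\theta/2)^N$:

```python
import numpy as np

def coherence_after_measurement(n_env, theta):
    """Toy Zurek-style model: a central qubit starting in (|0> + |1>)/sqrt(2)
    is sequentially 'measured' by n_env environment qubits (each starting
    in |0>) via controlled-Ry(theta) couplings. Returns |rho_01|, the
    off-diagonal element of the central qubit's reduced density matrix."""
    dim = 2 ** (n_env + 1)
    psi = np.zeros(dim)
    psi[0] = 1 / np.sqrt(2)         # |0> (x) |00...0>
    psi[dim // 2] = 1 / np.sqrt(2)  # |1> (x) |00...0>
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    ry = np.array([[c, -s], [s, c]])
    for k in range(n_env):
        # Rotate environment qubit k, conditioned on the central qubit.
        op = np.eye(1)
        for q in range(n_env):
            op = np.kron(op, ry if q == k else np.eye(2))
        zero = np.zeros((dim // 2, dim // 2))
        psi = np.block([[np.eye(dim // 2), zero], [zero, op]]) @ psi
    # Trace out the environment: rows index the central qubit.
    psi = psi.reshape(2, dim // 2)
    rho = psi @ psi.conj().T
    return abs(rho[0, 1])
```

Each environment spin acquires partial which-path information; with $\theta = \pi$ each coupling acts as a perfect CNOT-style measurement, so a single record already destroys the coherence.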

**Spin decohered by multi-spin apparatus in oscillator bath**. Achieves effective measurement dynamics (as in the previous type of model) using slightly more realistic interactions for a purpose-built measuring apparatus. The apparatus (or “readout”) spins are initialized in a metastable state and coupled to a dissipative bath of oscillators. This is sometimes known as the Curie–Weiss model, and is connected to spontaneous symmetry breaking [4, 5].

**Spin decohered by oscillator bath**. The Jaynes-Cummings model generalized to have multiple modes (oscillators), sometimes known as the spin-boson model. As a model for dissipative reduced dynamics it is well studied [1], but I’m not aware of any work on amplification or records.

**Oscillator decohered by oscillator bath**. Generally chosen to be harmonic oscillators coupled linearly to each other. Most people just use the reduced dynamics of the central oscillator to study decoherence and diffusion [1], e.g., Caldeira-Leggett [2]. For sufficiently fast monitoring by a large bath of (individually weakly coupled) oscillators, the reduced dynamics limit to Markovian quantum Brownian motion, which is characterized by a Lindblad equation. One can also show how sufficiently different paths (histories) are recorded in the environment [3], which can serve as an idealized form of branching if you analyze the environment.
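For orientation, the standard high-temperature Caldeira-Leggett master equation for the reduced state of the central oscillator (quoted from memory, so check signs against your favorite reference) is

```latex
\frac{d\rho}{dt} = -\frac{i}{\hbar}\,[H,\rho]
  - \frac{i\gamma}{\hbar}\,[x,\{p,\rho\}]
  - \frac{2 m \gamma k_B T}{\hbar^2}\,[x,[x,\rho]] ,
```

where the last (double-commutator) term localizes in position, suppressing the off-diagonal elements $\rho(x,x')$ at a rate growing like $(x-x')^2$; a small additional term proportional to $[p,[p,\rho]]$ is needed to put this in exact Lindblad form.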

**Decoherence by scattering**. The system is a particle decohered in the position basis, momentum basis, or (most realistically) an over-complete basis of wavepackets.^{c } Often, the particle is taken to be a charged or dielectric particle decohered by scattering radiation. Usually studied in the idealized limit of Markovian quantum Brownian motion, as described by a Lindblad equation, where the scattering interaction time is taken to zero. Lots of different regimes; too many cites to list. For a neutral particle with a dielectric constant, see chapter 3 of Schlosshauer’s textbook and references therein. Records are discussed in [1][2][3]. One can generalize from a single monitored particle to a large population of them that is monitored by a different (lighter) species [4]. For a charged accelerating particle decohering through bremsstrahlung see the textbooks by Joos et al. and Breuer & Petruccione, and [5]. For a charged non-accelerating particle decohering in the momentum basis, see [6] and, for records, see [7].^{d }

**Decoherence of a field**. Anglin & Zurek treat an electromagnetic field decohered in the basis of coherent states by a dielectric medium [1].

**Brun-Halliwell model**. A 1D interacting spin chain. This model is important because the emergent classical variables — local average hydrodynamic variables — are very general/universal. [1,2,3,4]

**Avalanche photodiode**. I’ve never seen a simple tractable model for this, but I know that a good amount of theory does exist somewhere. It would be nice because it’s very physical and common in labs.

*[I thank Daniel Ranard and interstice for conversation that prompted this post, and Curt von Keyserlingk for suggesting the paper by Gaveau & Schulman.]*

(↵ returns to text)

- I take *decoherence* to mean a model with dynamics taking the form $|s_i\rangle_{\mathcal{S}}|\psi\rangle_{\mathcal{E}} \to |s_i\rangle_{\mathcal{S}}|\psi_i(t)\rangle_{\mathcal{E}}$ for some tensor decomposition $\mathcal{H} = \mathcal{H}_{\mathcal{S}} \otimes \mathcal{H}_{\mathcal{E}}$, where $\{|s_i\rangle\}$ is an (approximately) stable orthonormal basis independent of initial state, and where $\langle \psi_i(t)|\psi_j(t)\rangle \approx 0$ for times $t \gg t_D$ and $i \neq j$, where $|\psi\rangle$ is the initial state of $\mathcal{E}$ and $t_D$ is some characteristic time scale.↵
- In principle you could have discrete degrees of freedom that limit, as $N \to \infty$, to some sort of discrete classical systems, but most people find this unsatisfying.↵
- Wavepackets are approximate eigenstates of both position and momentum. For a discussion of decoherence with respect to an overcomplete basis of wavepackets, see the introduction of this and references therein.↵
- Charged particles, especially accelerating ones, are the most difficult to handle because one must work with their “dressed” states.↵

**Tl;dr**: I did not find an important new idea in it, but this paper nicely illustrates the appeal of Finkelstein’s partial-trace decoherence and the ambiguity inherent in connecting a many-worlds wavefunction to our direct observations.

We propose a method for finding an initial state vector which by ordinary Hamiltonian time evolution follows a single branch of many-worlds quantum mechanics. The resulting deterministic system appears to exhibit random behavior as a result of the successive emergence over time of information present in the initial state but not previously observed.

We start by assuming that a precise wavefunction branch structure has been specified. The idea, basically, is to randomly draw a branch at late times according to the Born probability, then to evolve it backwards in time to the beginning of the universe and take *that* as your initial condition. The main motivating observation is that, if we assume that all branch splittings are defined by a projective decomposition of some subsystem (‘the system’) which is recorded faithfully elsewhere (‘the environment’), then the lone preferred branch — time-evolving by itself — is an eigenstate of each of the projectors defining the splits. In a sense, Weingarten lays claim to *ordered consistency* [arxiv:gr-qc/9607073] by assuming partial-trace decoherence^{a } [arXiv:gr-qc/9301004]. In this way, the macrostates stay the same as in normal quantum mechanics but the microstates secretly conspire to confine the universe to a single branch.

I put proposals like this in the same category as Bohmian mechanics. They take as assumptions the initial state and unitary evolution of the universe, along with the conventional decoherence/amplification story that argues for (but never fully specifies from first principles) a fuzzy, time-dependent decomposition of the wavefunction into branches. To this they add something that “picks out” one of the branches as preferred. By Bell’s theorem, the added thing has to have at least one very unattractive quality (e.g., non-locality, superdeterminism, etc.), and then the game is to try to convince oneself that something else about it makes it attractive enough to choose on aesthetic grounds over normal quantum mechanics.^{b }

Weingarten is refreshingly clear about this and correctly characterizes his proposal as a hidden variable theory. I’d say its virtue is that, on its face, it doesn’t introduce new mathematical objects like the Bohm particle. However, if we try and quantify theory elegance with something like algorithmic complexity, then the bit string *b* used to specify the preferred branch (which is necessary to write down the complete theory, unlike normal quantum mechanics) is an equivalently inelegant structure.

Weingarten argues (in the paragraph beginning “Each |Ψ(h_{j},t)> may be viewed…”) that this proposal solves most of the fuzziness problems associated with the decoherence story, but I’d say it just repackages them. You need to help yourself to a precise choice of splitting events (when exactly they happen, etc.) to even define the ensemble of branches |Ψ(h_{j},t)⟩, but if you assume you already have that precision, then what’s the problem? Why not just declare that the set of branches is nothing but an ensemble of potential outcomes, exactly one of which is chosen at random (according to the Born probability), thereby reducing quantum mechanics to a classical non-local stochastic theory?

Perhaps Weingarten’s issue is that Many-Worlders like David Wallace often embrace a fuzzy/emergent nature of branches, likening them to the fuzzy/emergent nature of a tiger, and refuse to specify an arbitrary precise definition. But then it seems Weingarten would be happy with a consistent histories interpretation, where the branches are specified precisely with projectors…whose precision, insofar as it exceeds the fuzziness inherent to the decoherence story, is just picked arbitrarily.

Indeed, despite the marked inelegance of Bohmian mechanics, it has at least one advantage over Weingarten’s proposal: the Bohm particle path is automatically precise and this requires only an initial random sample from a well-defined probability distribution (to be compared with an arbitrary and still unspecified choice of branch structure). This means that, if we accept the Bohm story, branches can be fuzzy for the same reason that we’re OK with tigers being fuzzy in a universe where we understand atomic physics precisely.

Finally, note that, in the far future, this single-branch theory shares a problem with all theories that take branches as fundamental: eventually, the universe will thermalize and branch structure must break down.

(↵ returns to text)

- Note on terminology: What Finkelstein called “partial-trace decoherence” is really a specialized form of *consistency* (i.e., a mathematical criterion for sets of consistent histories) that captures some, but not all, of the properties of the physical and dynamical process of decoherence. That’s why I’ve called it “partial-trace consistency” here and here.↵
- In addition to Bohmian mechanics, see important examples like Kent’s late-time photodetection [arXiv:1608.04805] and the “Many Interacting Worlds” [PRX 4, 041013 (2014)].↵

(1)

There are many subtleties obscured in this cartoon presentation, like the fact that a symmetry, being a tangent direction *on* the manifold of trajectories, can vary with the tangent point it is attached to (as for rotational symmetries). If you’ve never spent a long afternoon with a good book on the calculus of variations, I recommend it.

(↵ returns to text)

- You can find this presentation in “A short review on Noether’s theorems, gauge symmetries and boundary terms” by Máximo Bañados and Ignacio A. Reyes (H/t Godfrey Miller).↵


For nuclear power plants governed by the United States Nuclear Regulatory Commission, SAFSTOR (SAFe STORage) is one of the options for nuclear decommissioning of a shut down plant. During SAFSTOR the de-fuelled plant is monitored for up to sixty years before complete decontamination and dismantling of the site, to a condition where nuclear licensing is no longer required. During the storage interval, some of the radioactive contaminants of the reactor and power plant will decay, which will reduce the quantity of radioactive material to be removed during the final decontamination phase.

The other options set by the NRC are nuclear decommissioning, which is immediate dismantling of the plant and remediation of the site, and nuclear entombment, which is the enclosure of contaminated parts of the plant in a permanent layer of concrete. Mixtures of options may be used, for example, immediate removal of steam turbine components and condensers, and SAFSTOR for the more heavily radioactive containment vessel. Since the NRC requires decommissioning to be completed within 60 years, ENTOMB is not usually chosen since not all activity will have decayed to an unregulated background level in that time.

See also his discussion of the decreased flow of information out of GiveWell, with reply by Catherine Hollander.

Researchers think that this ‘turbulence cascade’ explains how even fluids with low viscosity — such as gases in the atmosphere, where there is little resistance between moving layers — still quickly convert their kinetic energy into heat and slow down when turbulence kicks in. Turbulence spreads energy into increasingly tiny eddies, which, at their smaller scale, increase local viscosity. Like friction between solid objects, this viscosity acts to increase resistance to movement between layers of fluid, and thereby dissipates kinetic energy as heat.

Mathematicians are pushing exploration of low-viscosity fluids to their ultimate limit. The physicist, chemist and mathematician Lars Onsager suggested in 1949 that, in theory, a fluid could still dissipate energy even if its viscosity were to become vanishingly small, or zero (a situation that is never seen in the real world). In this hypothetical scenario, the fluid’s motion will just keep dispersing into infinitesimally small eddies, where it still will die out eventually. “That was kind of a shocking idea,” says Philip Isett, a mathematician at the University of Texas at Austin.

My purpose is to explain the ‘essential content’ of ‘the’ Threshold Theorem. By ‘essential content’, I mean those aspects of the theorem which justify a belief that quantum computing is indeed possible in practice. I divide this essential content into two categories: promises about the noise affecting a device and the guarantee that a given quantum circuit can be altered so as to be robust against the promised noise model.

And how the steel reinforcement of modern concrete, not the loss of ancient wisdom, is responsible for it aging faster than some Roman concrete. And this recent proof-of-concept software attack on DNA sequencers:

In new research they plan to present at the USENIX Security conference on Thursday, a group of researchers from the University of Washington has shown for the first time that it’s possible to encode malicious software into physical strands of DNA, so that when a gene sequencer analyzes it the resulting data becomes a program that corrupts gene-sequencing software and takes control of the underlying computer….

the natural stability of DNA depends on a regular proportion of A-T and G-C pairs. And while a buffer overflow often involves using the same strings of data repeatedly, doing so in this case caused the DNA strand to fold in on itself.
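The constraint the researchers ran into can be made concrete with a toy screen (illustrative only; `gc_fraction`, `looks_synthesizable`, and the 0.4-0.6 window are my own choices, not from the paper):

```python
def gc_fraction(seq):
    """Fraction of G/C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def looks_synthesizable(seq, lo=0.4, hi=0.6):
    """Crude stability screen: reject payloads whose G/C content strays
    far from the balanced proportion the article describes."""
    return lo <= gc_fraction(seq) <= hi
```

A buffer-overflow payload built from one repeated base, like `"AAAA" * 10`, fails such a screen, which is the sort of obstacle the researchers had to engineer around.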

The oldest carbon-14-dated seed that has grown into a viable plant was Silene stenophylla (narrow-leafed campion), an Arctic flower native to Siberia. Radiocarbon dating has confirmed an age of 31,800 ±300 years for the seeds…Scientists extracted the embryos and successfully germinated plants in vitro which grew, flowered and created viable seeds of their own.

Here are more bets that Peter Woit listed three years ago.

…is a phenomenon observed in some tree species, in which the crowns of fully stocked trees do not touch each other, forming a canopy with channel-like gaps….There exist many hypotheses as to why crown shyness is an adaptive behavior, though research suggest that it might inhibit spread of leaf-eating insect larvae.

The N1…was a super heavy-lift launch vehicle intended to deliver payloads beyond low Earth orbit…. Its first stage is the most powerful rocket stage ever built.

The N1-L3 version was developed to compete with the United States Apollo-Saturn V to land a man on the Moon, using the same lunar orbit rendezvous method.

N1-L3 was underfunded and rushed, starting development in October 1965, almost four years after the Saturn V. The project was badly derailed by the death of its chief designer Sergei Korolev in 1966. Each of the four attempts to launch an N1 failed; during the second launch attempt the N1 rocket crashed back onto its launch pad shortly after liftoff and exploded, resulting in one of the largest artificial non-nuclear explosions in human history. The N1 program was suspended in 1974, and in 1976 was officially canceled. Along with the rest of the Soviet manned lunar programs, the N1 was kept secret almost until the collapse of the Soviet Union in December 1991; information about the N1 was first published in 1989.

Why doesn’t everyone use such a system of tubes? lstamour:

As to cost [this] suggests it’s 10-25% cheaper to operate, but when you factor in capital costs, it’s 40-90% more expensive, if I’m reading the abstract correctly.

Rubi dramatically out-performs Maple and Mathematica (the two major commercial computer algebra systems) on a grueling integration test suite. Consisting of over 55 thousand integrands and optimal antiderivatives, the entire test suite is also available for downloading.

The Rubi integration package for Mathematica has a verbose setting that lets you view the steps it took to perform integration.
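Rubi itself is a Mathematica rule base, but the flavor of rule-based integration with a step trace can be sketched in a few lines (a toy for single-variable power terms only; nothing here is Rubi's actual API):

```python
def integrate_poly(coeffs):
    """Toy rule-based integrator for sums of c*x^n terms, given as
    {exponent: coefficient}. Returns (antiderivative, steps), where
    steps records which rule fired for each term -- the analogue of
    Rubi's verbose step display."""
    result, steps = {}, []
    for n, c in sorted(coeffs.items()):
        if n == -1:
            # Log rule: integral of c/x dx = c*ln|x|
            steps.append(f"{c}*x^-1 -> {c}*ln|x| (log rule)")
            result["ln"] = result.get("ln", 0) + c
        else:
            # Power rule: integral of c*x^n dx = c/(n+1)*x^(n+1)
            steps.append(f"{c}*x^{n} -> {c/(n+1)}*x^{n+1} (power rule)")
            result[n + 1] = result.get(n + 1, 0) + c / (n + 1)
    return result, steps
```

Real systems like Rubi dispatch over tens of thousands of such rules, with pattern matching deciding which one applies; the step trace falls out of recording each dispatch.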

The Grand Canal (also known as the Beijing-Hangzhou Grand Canal), a UNESCO World Heritage Site, is the longest canal or artificial river in the world and a famous tourist destination.[1] Starting at Beijing, it passes through Tianjin and the provinces of Hebei, Shandong, Jiangsu and Zhejiang to the city of Hangzhou, linking the Yellow River and Yangtze River. The oldest parts of the canal date back to the 5th century BC, but the various sections were first connected during the Sui dynasty (581–618 AD). The Yuan and Ming dynasties significantly rebuilt the canal and altered its route to supply their capital Beijing…

The total length of the Grand Canal is 1,776 km (1,104 mi)…

The southern portion remains in heavy use to the present day.

(H/t Tyler Cowen.)

Of the 12 [journal] titles on Suber’s list that left to start up new titles, 10 have comparable impact factors between the old journal and the new journal….six have an impact factor [IF] greater than that of the title from which the editors split at the time of the split. Four of the new titles have impact factors that are less than the boycotted title’s IF at the time of the split. Compared to the current IFs of boycotted titles, the IFs look a bit better for the new journals; seven of the new titles were better rated than the title that was subject of the revolt. The average impact factor of the new title is more than 50% greater than the boycotted title, with one outlier that is more than five times better off than the boycotted title.

I’d be very interested in interviewing people who continued to submit to the old journal. Would they be more likely to endorse “I didn’t realize the editorial board changed” or “I think the publisher is more important to making a good journal than the editors” or “I think everyone *else* will continue to think this is the top-tier journal” (a la a Keynesian beauty contest)?

Other termination shocks can be seen in terrestrial systems; perhaps the easiest may be seen by simply running a water tap into a sink creating a hydraulic jump. Upon hitting the floor of the sink, the flowing water spreads out at a speed that is higher than the local wave speed, forming a disk of shallow, rapidly diverging flow (analogous to the tenuous, supersonic solar wind). Around the periphery of the disk, a shock front or wall of water forms; outside the shock front, the water moves slower than the local wave speed (analogous to the subsonic interstellar medium).

Female users can also opt for a chaperoning feature whereby all of their in-app chats are emailed to a wali/guardian, should they wish to observe this type of Islamic etiquette.

And:

They tell a funny story about how they were emailed by a man from Uganda thanking them for helping him meet his wife via the app — and when they went to check exactly how many users they had in Uganda it was, well, just those two. “When it’s meant to be, it is meant to be!” says Younas.

The fact that they claim it should be near the most distant part of its hypothetical orbit makes me a bit suspicious. This sounds like the natural astronomical way to square a misunderstood signal in the data (the orbits of the visible planets) with the apparent invisibility of Planet Nine (put it far away).

More caveats to Planet Nine’s theorized existence come from the Cassini probe, which has orbited Saturn since 2004. From minute changes in the spacecraft’s speed and other telemetry, the Cassini team calculates the distance from Earth to Saturn to within 3 meters. Those range measurements could reveal even small deviations in Saturn’s orbit due to the pull from Planet Nine, but only if it is close or large enough. William Folkner, a principal engineer at JPL, says he and coworkers examined the data and saw no perceptible distortion of Saturn’s orbit. So, if Planet Nine exists and is 10 times Earth’s mass, it must be within 25 degrees of the farthest point in its hypothetical orbit, he says.

Still, the apparent axial alignment of the orbits of 12 trans-Neptunian objects is striking.

Interestingly, when discussing the symplectic integration numerical technique, the article links to a paper by Jack Wisdom, whose textbook I’ve praised before.

(NB: targeted at 3D graphics programmers, not physicists.) It also introduces the “meet” product of k-blades, a type of product that I hadn’t heard of before but which gives a pleasing symmetry to an exterior algebra. Indeed, per the video, when Grassmann originally did his work he called the exterior and meet products the progressive and regressive products, respectively.

Sperm counts in men from America, Europe, Australia and New Zealand have dropped by more than 50 percent in less than 40 years, researchers said on Tuesday.

They also said the rate of decline is not slowing…

Levine screened and brought together the findings of 185 sperm count studies from 1973 to 2011 and then conducted a so-called meta-regression analysis.

The results, published in the journal Human Reproduction Update, showed a 52.4 percent decline in sperm concentration and a 59.3 percent decline in total sperm count among North American, European, Australian and New Zealand men…

In contrast, no significant decline was seen in South America, Asia and Africa. The researchers noted, however, that far fewer studies have been conducted in these regions.

Experts asked to comment on the work said it was a comprehensive and well-conducted analysis and did a good job of adjusting for confounders that could have skewed its findings.

Incidentally, there’s now a PsyArXiv which looks reasonably reputable, currently hosting more than 600 papers, including this one (H/t Carl Shulman) advocating for a more stringent cut-off before something is declared “statistically significant”. I’ll link again to this very good post by Holden Karnofsky arguing that, because of various real-world inefficiencies, we should fund a smaller number of high-quality studies, for fixed resources, insofar as we actually want those studies to be useful to outsiders.

Also, here’s them on space junk.

There’s a theory [0] [1] that it’s mostly down to aesthetics. It looks visually more pleasing that way when split into three groups of four numbers:

I, II, III, IIII (consisting of I only)

V, VI, VII, VIII (consisting of I and V)

IX, X, XI, XII (consisting of I and X)
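The grouping is easy to see if you generate the clock-face numerals with the additive “IIII” convention. Here is a small sketch (the function is my own; clock faces conventionally keep the subtractive IX for 9):

```python
def clockface_numeral(n):
    """Roman numeral for 1-12, using additive IIII instead of IV."""
    tens, rest = divmod(n, 10)
    numeral = "X" * tens
    if rest >= 9:          # 9 -> IX (clock faces keep the subtractive form here)
        numeral += "IX"
        rest -= 9
    if rest >= 5:
        numeral += "V"
        rest -= 5
    numeral += "I" * rest  # 4 -> IIII, not IV
    return numeral

numerals = [clockface_numeral(n) for n in range(1, 13)]
for group in (numerals[0:4], numerals[4:8], numerals[8:12]):
    print(group, "symbols used:", sorted(set("".join(group))))
```

Printed out this way, the three groups of four use exactly the symbol sets {I}, {I, V}, and {I, X}, which is the aesthetic balance the theory points to.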

Apparently, it rotates with reaction wheels and translates with microfans. Seems like an obvious and great way to outsource astronaut workload to people who aren’t fed with food that costs $50,000/kg.

Here we report a reconstructed ancient genome of *Yersinia pestis* at 30-fold average coverage from Black Death victims securely dated to episodes of pestilence-associated mortality in London, England, 1348–1350. Genetic architecture and phylogenetic analysis indicate that the ancient organism is ancestral to most extant strains and sits very close to the ancestral node of all *Y. pestis* commonly associated with human infection…. Comparisons against modern genomes reveal no unique derived positions in the medieval organism, indicating that the perceived increased virulence of the disease during the Black Death may not have been due to bacterial phenotype. These findings support the notion that factors other than microbial genetics, such as environment, vector dynamics and host susceptibility, should be at the forefront of epidemiological discussions regarding emerging *Y. pestis* infections.

*[Other parts in this series: 1, 2, 3, 4, 5, 6, 7.]*

You’re taking a vacation to Granada to enjoy a Spanish ski resort in the Sierra Nevada mountains. But as your plane is coming in for a landing, you look out the window and realize the airport is on a small tropical island. Confused, you ask the flight attendant what’s wrong. “Oh”, she says, looking at your ticket, “you’re trying to get to Gr**a**nada, but you’re on the plane to Gr**e**nada in the Caribbean Sea.” A wave of distress comes over your face, but she reassures you: “Don’t worry, Granada isn’t that far from here. The Hamming distance is only 1!”.

After you’ve recovered from that side-splitting humor, let’s dissect the frog. What’s the basis of the joke? The flight attendant is conflating two different metrics: the geographic distance and the Hamming distance. The distances are completely distinct, as two named locations can be very nearby in one and very far apart in the other.
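For concreteness, the flight attendant’s metric is a one-liner (a sketch; the geographic metric would of course need actual coordinates):

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(s, t))

print(hamming_distance("Granada", "Grenada"))  # -> 1, despite the thousands of kilometers between them
```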

Now let’s hear another joke from renowned physicist Chris Jarzynski:

The linear Schrödinger equation, however, does not give rise to the sort of nonlinear, chaotic dynamics responsible for ergodicity and mixing in classical many-body systems. This suggests that new concepts are needed to understand thermalization in isolated quantum systems. – C. Jarzynski, “Diverse phenomena, common themes” [PDF]

Ha! Get it? This joke is so good it’s been told by S. Wimberger^{a }, H.-J. Stöckmann^{b }, Casetti et al.^{c }, Ullmo & Tomsovic^{d }, and of course the great Wikipedia^{e }, who all point to the linearity of the Schrödinger equation to motivate the need for new tools to study quantum chaos. Heck, even people who *know* better love to tell this knee-slapper^{f }.

The “humor” is grounded in the fact that these jokesters are conflating two completely different metrics:

- distance in phase space, and
- distinguishability of states.

Classical chaos is associated with the exponential divergence *in phase space* of nearby states, but two classical probability distributions *remain just as distinguishable* under Hamiltonian flow, even when it’s chaotic. Likewise, quantum chaos is associated with the exponential divergence, in phase space, of nearby states *when such a distance is well-defined* (i.e., when those states are roughly localized in phase space). But two quantum states, pure or mixed, always remain *just as distinguishable* under unitary evolution.

The two colored regions represent alternative distributions in phase space, either two classical probability distributions or two Wigner functions. Classically, the probability density at any point is preserved under phase-space flow. In quantum mechanics, the analog is only guaranteed for quadratic Hamiltonians. In both cases, distinguishability of the states is always preserved by Hamiltonian evolution and roughly corresponds to the amount of overlap of the two distributions (purple area). This is distinct from the distances between points in phase space, which may diverge exponentially (dotted lines) in chaotic systems.
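To make the “exponential divergence in phase space” half of this distinction concrete, here is a minimal numerical sketch. (For brevity it uses the chaotic logistic map rather than an actual Hamiltonian system, and the initial offset of 10⁻¹² is an arbitrary choice.)

```python
# Sensitive dependence on initial conditions: two trajectories of the
# chaotic logistic map x -> 4x(1-x), started a distance 1e-12 apart,
# separate roughly like exp(t * ln 2) until the separation saturates
# at the size of the unit interval.
def logistic(x):
    return 4.0 * x * (1.0 - x)

x, y = 0.3, 0.3 + 1e-12
for t in range(1, 41):
    x, y = logistic(x), logistic(y)
    if t % 10 == 0:
        print(f"t={t:2d}  separation = {abs(x - y):.3e}")
```

The separation grows by many orders of magnitude in a few dozen steps, which is exactly the behavior that the overlap of two distributions, classical or quantum, never exhibits.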

One way to quantify the (in)distinguishability^{g } is with quantum relative entropy

(1)   \(S(\rho \,\|\, \sigma) = \mathrm{Tr}\!\left[\rho \left(\ln \rho - \ln \sigma\right)\right]\)

which reduces to the KL divergence

(2)   \(D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \ln \frac{p_i}{q_i}\)

classically, i.e., when \(\rho\) and \(\sigma\) are diagonal in the same basis. (An individual point in phase space corresponds to the special case of a delta-function probability distribution, and so the KL divergence of two classical states vanishes when they coincide and diverges when they don’t.) Both of these distances are manifestly preserved by Hamiltonian evolution. Indeed, the linearity of the Schrödinger equation, \(\partial_t |\psi\rangle = -\frac{i}{\hbar} H |\psi\rangle\), is just the quantum analog of the linearity of the Liouville equation in classical mechanics,

(3)   \(\partial_t \rho = \{H, \rho\}\)

where \(\rho\) denotes the classical PDF and \(\{\cdot,\cdot\}\) denotes the Poisson bracket.

(The von Neumann equation, \(\partial_t \rho = -\frac{i}{\hbar}[H, \rho]\), which simply generalizes the Schrödinger equation to allow for mixed states, is a more direct quantum analog to the classical Liouville equation. This is another manifestation of the idea that quantum states — both pure and mixed — are much more analogous to classical probability distributions than to classical points in phase space.^{h })
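The claim that distinguishability is preserved is easy to check numerically. Here is a quick NumPy sketch (the dimension, random seed, and evolution time are arbitrary choices of mine) verifying that the quantum relative entropy is unchanged by unitary evolution:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_density_matrix(d):
    """A random full-rank density matrix (for illustration only)."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T                      # Hermitian, positive definite
    return rho / np.trace(rho).real

def herm_log(m):
    """Matrix logarithm of a positive-definite Hermitian matrix."""
    w, v = np.linalg.eigh(m)
    return v @ np.diag(np.log(w)) @ v.conj().T

def relative_entropy(rho, sigma):
    """Quantum relative entropy S(rho||sigma) = Tr[rho (ln rho - ln sigma)]."""
    return np.trace(rho @ (herm_log(rho) - herm_log(sigma))).real

d = 4
rho, sigma = random_density_matrix(d), random_density_matrix(d)

# Unitary evolution U = exp(-i H t) generated by a random Hermitian "Hamiltonian"
h = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (h + h.conj().T) / 2
w, v = np.linalg.eigh(H)
U = v @ np.diag(np.exp(-1j * w * 0.7)) @ v.conj().T

before = relative_entropy(rho, sigma)
after = relative_entropy(U @ rho @ U.conj().T, U @ sigma @ U.conj().T)
print(before, after)   # equal up to numerical roundoff
```

The same invariance holds for the classical KL divergence under Liouville flow, since the flow just relabels phase-space points without changing probability densities along trajectories.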

Now, there are plenty of reasons one might need a different definition of chaos in quantum systems. It is *ambiguous* how we are to extend the phase-space distance to arbitrary quantum states for the same reason that it’s ambiguous how we would define a phase-space distance between probability distributions that are widely dispersed over classical phase space. The key difference is that *no* precisely localized states exist in quantum mechanics, and it’s not surprising that a certain phenomenon in a limiting theory (classical mechanics) could be just a special case of a more general and complicated phenomenon in the fundamental theory (quantum mechanics); more powerful and abstract tools may be required for the latter.^{i } But the linearity of the Schrödinger equation has **nothing whatever** to do with this.

Indeed, saying “the exponential-sensitivity criterion for identifying chaos doesn’t work in quantum systems because the Schrödinger equation is linear” is just as silly as saying “spatial separation between cities is not useful in China because the Hamming distance is almost always maximum for city names in logographic languages”. On the other hand, it is both analogous and correct to observe that spatial separation between cultures is sometimes poorly defined because cultures don’t need to be well localized in space, but that when they are (e.g., the cultures of Florence, Italy and Hawaii in the 15th century), the distance between them is perfectly sensible.

So if linearity of evolution *in phase space* implies no chaos, but this is not the same thing as linearity of evolution *in Hilbert space* (which always applies), what is an example of the former? The harmonic oscillator. Abstractly, linearity of an equation means that solutions form a vector space; if \(x_1\) and \(x_2\) are solutions of the equation, then \(a x_1 + b x_2\) is too, for \(a, b \in \mathbb{R}\) (or \(a, b \in \mathbb{C}\) in the case of a complex vector space). If the classical evolution has two solutions \(x_1(t)\) and \(x_2(t)\), then \(a x_1(t) + b x_2(t)\) is also a solution. This is guaranteed by the equation of motion \(\ddot{x} = -\omega^2 x\) for the harmonic oscillator but does not apply for an anharmonic oscillator like \(\ddot{x} = -\omega^2 x - \lambda x^3\). The linearity of the classical equation of motion follows from the quadratic form of the Hamiltonian,

(4)   \(H = \frac{1}{2} z^{\mathrm{T}} M z, \qquad z = (x_1, \ldots, x_N, p_1, \ldots, p_N)\)

since this can always be put in the form \(H = \sum_i \frac{1}{2}\left(p_i^2 + \omega_i^2 x_i^2\right)\) through a linear change of variables, and from this we get \(\ddot{x}_i = -\omega_i^2 x_i\). This carries over immediately to the analogous statement about the Hamiltonian operator of quantum mechanics.
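Here is a small numerical sketch of this contrast (the leapfrog integrator, the step size, and the choice of unit frequency and anharmonicity are my own): the superposition of two harmonic-oscillator solutions evolves to the superposition of the evolved solutions, while the same check fails badly for the anharmonic oscillator.

```python
# Superposition check: evolve two initial conditions and their sum, and
# compare. For x'' = -x (harmonic) the evolved sum equals the sum of the
# evolved solutions; for x'' = -x - x^3 (anharmonic/Duffing) it does not.
def integrate(accel, x0, v0, dt=1e-3, steps=5000):
    """Leapfrog integration of x'' = accel(x); returns the final position."""
    x, v = x0, v0
    for _ in range(steps):
        v += 0.5 * dt * accel(x)
        x += dt * v
        v += 0.5 * dt * accel(x)
    return x

harmonic = lambda x: -x
anharmonic = lambda x: -x - x**3

results = {}
for name, accel in [("harmonic", harmonic), ("anharmonic", anharmonic)]:
    x1 = integrate(accel, 1.0, 0.0)   # solution with x(0)=1, v(0)=0
    x2 = integrate(accel, 0.0, 1.0)   # solution with x(0)=0, v(0)=1
    x12 = integrate(accel, 1.0, 1.0)  # evolve the summed initial condition
    results[name] = abs(x12 - (x1 + x2))
    print(f"{name:10s} superposition error: {results[name]:.2e}")
```

When the force is linear, each leapfrog step is itself a linear map of (x, v), so superposition survives to machine precision; the cubic term breaks this at order one.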

One could argue that some of these writers do understand all this but that they just don’t explain it well in their writing. This may be true for some authors. However, my anecdotal experience from speaking in person to physicists is that many of them *don’t* get it, a testament to the disservice done by authors repeating this misleading idea.

Consider what this says about how physics knowledge is stored in the literature and encoded in human brains, and the implications for how physicists decide what to work on. This particular step in the motivation for studying quantum chaos is mindlessly repeated in dozens of papers and books on quantum chaos, and it is *totally bogus*. What makes you think most folks understand the *other* steps better? I think we have to conclude that many physicists do not quite know why they are doing what they are doing; of those who do know, many apparently cannot explain it even to their peers, much less students.

And yet: there are good reasons to study different quantifiers of quantum chaos! But if many people can’t actually articulate those reasons, *what mechanism draws them to the field?* It’s some combination of them being interested in the object-level work (as opposed to the big-picture motivation), in the usefulness of the results, in the impressiveness of the other practitioners, and probably many other things I can’t think of. Nonetheless, the fact remains: many people apparently can’t clearly explain why they do what they do. More contentiously, I think we should also conclude that this is an impediment to them *carefully reasoning* about it as well, and hence that they are less likely to abandon the field when new relevant information comes to light.

There is nothing special about quantum chaos; there are many of these sorts of bad explanations that propagate by being uncritically repeated from one physicist to another. This is just an example where it’s particularly unambiguous.

(↵ returns to text)

- “Since quantum mechanics is the more fundamental theory we can ask ourselves if there is chaotic motion in quantum systems as well. A key ingredient of the chaotic phenomenology is the sensitive dependence of the time evolution upon the initial conditions. The Schrödinger equation is a linear wave equation, implying also a linear time evolution. The consequence of linearity is that a small distortion of the initial conditions leads only to a small and constant change in the wave function at all times (see Sect. 4.1). Certainly, this is not what we mean when talking about ‘quantum chaos’.” -S. Wimberger, “Nonlinear Dynamics and Quantum Chaos: An Introduction” (2014).↵
- “The Schrödinger equation is a linear equation leaving no room for chaos.” – H.-J. Stöckmann, “Quantum Chaos: An Introduction” (1999).↵
- “Chaos does not exist in the linear evolution of the quantum state vector, hence different approaches to identify quantum features that correspond to classical chaos have been developed…” – Casetti et al., “Chaos in effective classical and quantum dynamics”, arXiv:hep-th/9707054.↵
- “One possible sense of the term “chaos” here could be that two slightly different initial wave functions and diverge “exponentially” rapidly from one another with time. It turns out however that one can answer this question under relatively general conditions, and the answer is negative. Indeed, the simple fact that the Schrödinger equation is linear (i.e. that a linear combination of two solutions of Eq. (6) is also a solution of this equation) makes it impossible that chaos, in any sense similar to classical mechanics, develops in quantum mechanics.” – Ullmo & Tomsovic, “Introduction to Quantum Chaos” [PDF].↵
- “However, this mechanism of dynamical chaos is absent in Quantum Mechanics, due to the strictly linear time evolution of the Schrödinger equation…This time evolution is manifestly linear, and any notion of dynamical chaos is absent. Thus, it becomes an open question as to whether an isolated quantum mechanical system, prepared in an arbitrary initial state, will approach a state which resembles thermal equilibrium, in which a handful of observables are adequate to make successful predictions about the system.” – Wikipedia: “Eigenstate Thermalization Hypothesis” | “Motivation” (January 2017). Luckily, at least *this* one I was able to fix myself…↵
- “The relation between classical and quantum chaos has been always somewhat unclear and, at times, even strained. The cause of the difficulties can be traced to the fact that the defining characteristic of classical chaos — sensitive dependence on initial conditions — has no quantum counterpart: It is defined through the behavior of neighboring trajectories, a concept which is essentially alien to quantum mechanics. Moreover, when the natural language of quantum mechanics of closed systems is adopted, an analog of the exponential divergence cannot be found.” – Zurek & Paz, “Decoherence, Chaos, and the Second Law”, arXiv:gr-qc/9402006. “In an idealized case, the distance between the two points in phase space grows as \(e^{\lambda t}\), where \(\lambda\) is the largest *Lyapunov exponent* of the system. This does not happen in quantum mechanics. The (oversimplified) reason is that quantum mechanics is linear; thus two “nearly identical” states (i.e., states with a large initial overlap) remain nearly identical—their overlap is constant under unitary evolution—for all time.” – Blume-Kohout & Zurek, ‘Decoherence from a Chaotic Environment: An Upside Down “Oscillator” as a Model’, arXiv:quant-ph/0212153.↵
- Of course, here by “distinguishable” we mean in the idealized sense of perfectly precise measurements. Chaos is certainly associated with initial distributions that progressively in time become more difficult to distinguish for *practical* measurements of finite accuracy. But this applies just as well to both classical and quantum mechanics!↵
- Another key aspect of this analogy is the *size* of the spaces: classical probability distributions and quantum states (pure or mixed) live in spaces that are exponentially large in the number of degrees of freedom (and hence take an exponential amount of information to specify), whereas classical points live in phase space itself, which scales linearly.↵
- For an example of someone explaining this point well, see M. V. Berry, in New Trends in Nuclear Collective Dynamics, edited by Y. Abe, H. Horiuchi, and K. Matsuyanagi (Springer, Berlin, 1992), p. 183.↵

On the historical visibility of the Southern Cross:

Crux was known to the Ancient Greeks due to the fact that it can be seen from southern Egypt; Ptolemy regarded it as part of the constellation Centaurus. It was entirely visible as far north as Britain in the fourth millennium BC. However, the precession of the equinoxes gradually lowered its stars below the European horizon, and they were eventually forgotten by the inhabitants of northern latitudes. By AD 400, most of the constellation never rose above the horizon for Athenians.


…was a signal sent by Vice Admiral Horatio Nelson, 1st Viscount Nelson, from his flagship HMS Victory as the Battle of Trafalgar was about to commence on 21 October 1805. Trafalgar was a decisive naval engagement of the Napoleonic Wars. It gave the United Kingdom control of the seas, removing all possibility of a French invasion and conquest of Britain.

It originally had “confides” (i.e. is confident) in place of “expects”, but the latter was used because it was part of the signal-flag vocabulary and therefore would not need to be spelled out letter by letter.

It’s funny that this is targeted at “the 99% of people who don’t know how to program”; I know the basics of how to program, but

Ironically, one way I might have solved this problem in the past is by “coding by example”, i.e., grabbing code snippets or boilerplate off the internet and learning enough syntax by viewing usages (rather than having anything defined).

…is a mode of attack using a freefall nuclear weapon: the bomb’s descent to the target is slowed by ribbon parachute so that it actually lands on the ground before detonating.[1] Laydown delivery requires the weapon’s case to be reinforced so that it can survive the force of impact and generally involves a time-delay fuze to trigger detonation e.g. 45 seconds after hitting the ground. Laydown mode can be used to increase the effect of the weapon’s blast on built-up targets such as submarine pens or to transmit a shock wave through the ground to attack deeply-buried targets.

Behavioural individuality is thought to be caused by differences in genes and/or environmental conditions. Therefore, if these sources of variation are removed, individuals are predicted to develop similar phenotypes lacking repeatable individual variation. Moreover, even among genetically identical individuals, direct social interactions are predicted to be a powerful factor shaping the development of individuality. We use tightly controlled ontogenetic experiments with clonal fish, the Amazon molly (*Poecilia formosa*), to test whether near-identical rearing conditions and lack of social contact dampen individuality. In sharp contrast to our predictions, we find that (i) substantial individual variation in behaviour emerges among genetically identical individuals isolated directly after birth into highly standardized environments and (ii) increasing levels of social experience during ontogeny do not affect levels of individual behavioural variation. In contrast to the current research paradigm, which focuses on genes and/or environmental drivers, our findings suggest that individuality might be an inevitable and potentially unpredictable outcome of development.

A Simple Proof That a Power of an Irrational Number to an Irrational Exponent May Be Rational.

\(\sqrt{2}^{\sqrt{2}}\) is either rational or irrational. If it is rational, our statement is proved. If it is irrational, \(\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{2} = 2\) proves our statement.
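A floating-point sanity check of the arithmetic (illustrative only; no amount of numerics can settle whether \(\sqrt{2}^{\sqrt{2}}\) is rational, which is the whole charm of the proof):

```python
import math

s = math.sqrt(2)
x = s ** s        # sqrt(2)^sqrt(2); rational or irrational, we don't know which
print(x)          # ~1.6325...
print(x ** s)     # (sqrt(2)^sqrt(2))^sqrt(2) = sqrt(2)^2 = 2, up to roundoff
```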

(H/t The Wirecutter.)

After Alex’s last post about the topic, I became very enthusiastic about dominant assurance contracts. However, as long as the idea is not empirically tested, I can’t put too much trust in them. Would people really contribute to DACs, either as entrepreneurs implementing the contracts or as “customers” (or whatever they would be called) paying for the project? They might think that there is something fishy in the contract (“why would someone give me free money?”); they might think that it’s a problem that some people can free-ride (even if it’s not a problem from the economic point of view) and not contribute because of that (people can sometimes reject favorable propositions if they think that someone else benefits “unfairly” from them).

One more practical obstacle I thought of is this: suppose a DAC is implemented on some platform that is perhaps similar to Kickstarter. If it looks like the contract is not going to be accepted by enough people (but is somewhat close to it), the entrepreneur can himself contribute the rest of the money in order to avoid paying the failure fee. The escrow service might try to prevent this by not accepting money from the entrepreneur himself, but it wouldn’t be too difficult to find friends, family, etc. to do this for the entrepreneur. If the public knows that the entrepreneur always has the option of contributing the rest of the money himself and thus avoiding failure, a DAC becomes just a normal assurance contract.

This is all sensible, although I was disappointed that they confirmed they may indeed be recommending the support of political candidates.

Getting food from the mouth to the stomach without any getting into your lungs is an amazing feat. Here are videos of a person swallowing in MRI and X-ray.

Stench gas warning systems give miners the signal to head to shelter during serious emergencies.

Deep underground in a mine, it can be very difficult to quickly raise an emergency alarm. It is often very noisy. It may be very dark or there may be many bright lights, or both in turn. There may be no source of electric power. Workers may be isolated from each other, spread over large distances, and separated by thick rock. But there will always be air (mines work very hard to make sure of this), and that air allows mines to use a very interesting way to communicate: stench gas.

Ethyl mercaptan has a very distinct, highly unpleasant odor that humans can easily detect and recognize, even when it is present in small amounts. Mine alarm systems (remote or manual) can release the gas, along with a non-flammable propellant, into the ventilation system. It spreads rapidly through a large volume of air. When the smell is detected, miners immediately report to designated shelter areas. They remain there until an all-clear is given, which may include a distinct all-clear scent such as wintergreen.

(H/t Will Riedel.)

Depending on the size of their household, people were given a number of points each month that they could “spend” on food. Everything was given a price in these points. For example a can of corn might be 1 point, whereas a gallon of milk might be 3. The only thing not given a points price was bread. Bread was free and you could always take as much bread as you wanted…We always had bread left over after we got all the people through the building.

Obviously bread alone cannot make a well-balanced or enjoyable diet, and food banks are valuable for more than just providing bare calories.

Note that this is a partial rendering. This earlier version makes it more clear what’s going on. Every flicker is a new image. The trajectory is an interpolation of these photos taken at various points in the orbit. Not as honest as I was hoping, but still spectacular and reasonably grounded in true images. (H/t Jana Grcevich.)

Advances in artificial intelligence (AI) will transform modern life by reshaping transportation, health, science, finance, and the military. To adapt public policy, we need to better anticipate these advances. Here we report the results from a large survey of machine learning researchers on their beliefs about progress in AI. Researchers predict AI will outperform humans in many activities in the next ten years, such as translating languages (by 2024), writing high-school essays (by 2026), driving a truck (by 2027), working in retail (by 2031), writing a bestselling book (by 2049), and working as a surgeon (by 2053). Researchers believe there is a 50% chance of AI outperforming humans in all tasks in 45 years and of automating all human jobs in 120 years, with Asian respondents expecting these dates much sooner than North Americans. These results will inform discussion amongst researchers and policymakers about anticipating and managing trends in AI.

Grace on the apparent discrepancy between predictions for human-level AI and AI automation of all jobs:

It is an obvious case of massive framing bias I think: the definitions mean that HLMI should be strictly later than labor being fully automatable. It might be less clear whether anyone in particular made errors, because the questions weren’t necessarily answered by the same people.

(H/t Alyssa Vance.)

The distinction that really matters is not between violence and non-violence, but between having and not having the appetite for power. There are people who are convinced of the wickedness both of armies and of police forces, but who are nevertheless much more intolerant and inquisitorial in outlook than the normal person who believes that it is necessary to use violence in certain circumstances. They will not say to somebody else, ‘Do this, that and the other or you will go to prison’, but they will, if they can, get inside his brain and dictate his thoughts for him in the minutest particulars. Creeds like pacifism and anarchism, which seem on the surface to imply a complete renunciation of power, rather encouraged this habit of mind. For if you have embraced a creed which appears to be free from the ordinary dirtiness of politics — a creed from which you yourself cannot expect to draw any material advantage — surely that proves that you are in the right? And the more you are in the right, the more natural that everyone else should be bullied into thinking likewise.

As I said, the mystery of the industrial revolution is that the majority of the innovation was made by people who had no formal connection to science and no formal scientific education, and who were just instead these kind of tinkerers and mechanics, and really had no access to that kind of high level scientific knowledge. Again, it is very hard to disprove that this wasn’t connected to high level scientific facts, but it is very hard to show that there really was any strong connection.

How can it be that something this enormous taking place in relatively recent history is not understood at even a basic level? I wonder to what extent the curse of knowledge is at play, where people learn things but can’t even really notice it.

No idea if it actually works, but the

(It started as a Kickstarter in 2013, is just now shipping mugs, and is still filling backorders.)

NIR vision appears to be entirely absent from vertebrates….See “The Verriest Lecture 2009: Recent progress in understanding mammalian color vision”

https://pdfs.semanticscholar.org/c9df/0b61e4a45e10577c001513a4fc1432696b3c.pdf…

…Vertebrate photopigment sensitivity drops off rapidly toward the near IR (~700 nm).

I found this fascinating related (but not peer reviewed) document on arXiv, “Did Evolution get it right? An evaluation of near-infrared imaging in semantic scene segmentation”

The author performs semantic segmentation with convnets using conventional visual-spectrum images and compares performance with images spectrally extended into the near-infrared. He finds that adding the additional NIR band did not improve task performance — hence that evolution “got it right” by not expanding vision into a wavelength region that fails to improve semantic segmentation. I’m not sure how good this work is from a professional biologist’s perspective, but I found it damn interesting.

“I’ve had the strangest questions about the waste water system,” he says. “Even things like: ‘do you have to wait for the house to spin so that the pipes line up underneath before you can flush the toilet!’ It’s funny how people think. All the plumbing services run under the flooring back to a 100mm swivel joint from a broad acre farm walking irrigation boom arm.

“This fitting houses all the waste water pipes and provides a water tight seal. Because the house rotates, it was important to find something that was designed to spin and not leak. This is perfect.

“It wasn’t that hard to find one, once I asked the right people what they thought would do the trick. Being in the bush you can often find people who are good at finding solutions.”

You can find discussion on HackerNews. The lead author was kind enough to answer some questions about this work.

**Q:** Is the correctness specification usually a fairly singular statement? Or will it often be of the form “The program satisfied properties A, B, C, D, and E”? (And then maybe you add “F” later.)

**Daniel Selsam:** There are a few related issues: how singular is a specification, how much of the functionality of the system is certified (coverage), and how close the specification comes to proving that the system actually does what you want (validation).

*Singular vs plural*. If you want to certify a red-black tree, then you will probably want to prove many different properties about how the core methods (e.g. finding, inserting, deleting) interact, and so the specification will be rather plural. But a different system may *use* the certified red-black tree to do one very specific thing and may have a singular specification. Thus how singular or plural a specification is depends heavily on where we draw the boundaries between systems and is somewhat arbitrary. One way or another, the internals of any proof of correctness will need to make use of many different lemmas; sometimes you can tie them all up in a bow for a particular project and sometimes you cannot.

*Coverage*. In Certigrad, we prove that the sampled gradients are unbiased estimates of the true gradients, which arguably constitutes total functional correctness for the stochastic backpropagation algorithm. We also prove a few other properties (e.g. that two program optimizations are sound). However, there are still some parts of the system about which we prove nothing. For example, a user could enter their model and compute the correct gradients, but then do gradient ascent instead of gradient descent and get nonsense when trying to train it. In principle we could prevent this kind of error as well by also proving that the learning algorithms converge to local optima on models that satisfy certain properties, but we do not do this yet in Certigrad.

*Validation*. In Certigrad, even if we approach full coverage of the system by proving convergence of the learning algorithms, we still would not have any theorems that even come close to saying that Certigrad does what we really want it to, which might be something like “after training a model with Certigrad on a dataset, the model will be extremely accurate on unseen data”. Traditionally, *verification* refers to proving that you implemented a system correctly (e.g. that your gradients are correct), whereas *validation* refers to proving that the system you chose to implement really does what you want (e.g. that your model will be accurate on unseen data). The line is blurry between verification and validation … Continue reading ]]>
^{a } funded through the Future of Life Institute.

Noisy data, non-convex objectives, model misspecification, and numerical instability can all cause undesired behaviors in machine learning systems. As a result, detecting actual implementation errors can be extremely difficult. We demonstrate a methodology in which developers use an interactive proof assistant to both implement their system and to state a formal theorem defining what it means for their system to be correct. The process of proving this theorem interactively in the proof assistant exposes all implementation errors since any error in the program would cause the proof to fail. As a case study, we implement a new system, Certigrad, for optimizing over stochastic computation graphs, and we generate a formal (i.e. machine-checkable) proof that the gradients sampled by the system are unbiased estimates of the true mathematical gradients. We train a variational autoencoder using Certigrad and find the performance comparable to training the same model in TensorFlow.

You can find a discussion on Hacker News. The lead author was kind enough to answer some questions about this work.

**Q:** Is the correctness specification usually a fairly singular statement? Or will it often be of the form “The program satisfies properties A, B, C, D, and E”? (And then maybe you add “F” later.)

**Daniel Selsam:** There are a few related issues: how singular is a specification, how much of the functionality of the system is certified (coverage), and how close the specification comes to proving that the system actually does what you want (validation).

*Singular vs plural*. If you want to certify a red-black tree, then you will probably want to prove many different properties about how the core methods (e.g. finding, inserting, deleting) interact, and so the specification will be rather plural. But a different system may *use* the certified red-black tree to do one very specific thing and may have a singular specification. Thus how singular or plural a specification is depends heavily on where we draw the boundaries between systems and is somewhat arbitrary. One way or another, the internals of any proof of correctness will need to make use of many different lemmas; sometimes you can tie them all up in a bow for a particular project and sometimes you cannot.
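As a toy illustration of a “plural” specification (my own example, not from the interview): even a trivial certified operation like sorted insertion naturally comes with several properties that together form its spec. In a proof assistant these would be theorems; here they are just runtime checks.

```python
import bisect

def insert_sorted(xs, v):
    """Insert v into an already-sorted list xs, returning a new sorted list."""
    ys = list(xs)
    bisect.insort(ys, v)
    return ys

# A "plural" specification: several properties about how the operation behaves.
def check_spec(xs, v):
    ys = insert_sorted(xs, v)
    assert ys == sorted(ys)           # property A: output stays sorted
    assert ys == sorted(xs + [v])     # property B: same elements plus v
    assert len(ys) == len(xs) + 1     # property C: size grows by exactly one
    return ys

check_spec([1, 3, 5], 4)
```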

*Coverage*. In Certigrad, we prove that the sampled gradients are unbiased estimates of the true gradients, which arguably constitutes total functional correctness for the stochastic backpropagation algorithm. We also prove a few other properties (e.g. that two program optimizations are sound). However, there are still some parts of the system about which we prove nothing. For example, a user could enter their model and compute the correct gradients, but then do gradient ascent instead of gradient descent and get nonsense when trying to train it. In principle we could prevent this kind of error as well by also proving that the learning algorithms converge to local optima on models that satisfy certain properties, but we do not do this yet in Certigrad.
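The headline property, that sampled gradients are unbiased estimates of the true gradients, can be illustrated numerically. The sketch below is only a Monte Carlo check of the property for a simple stochastic objective (using the standard score-function estimator), not Certigrad's Lean proof.

```python
import random

# Illustration of unbiasedness (not Certigrad's formal proof): for the
# objective f(theta) = E_{x ~ N(theta, 1)}[x^2], the true gradient is 2*theta.
# The score-function estimator g(x) = x^2 * (x - theta) satisfies E[g] = 2*theta,
# so averaging many sampled gradients should recover the true gradient.
random.seed(0)
theta = 1.5
n = 1_000_000
total = 0.0
for _ in range(n):
    x = random.gauss(theta, 1.0)
    total += x * x * (x - theta)   # (x - theta) is the score of N(theta, 1)
estimate = total / n

true_grad = 2 * theta
assert abs(estimate - true_grad) < 0.05   # many-sigma margin for this n
```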

*Validation*. In Certigrad, even if we approach full coverage of the system by proving convergence of the learning algorithms, we still would not have any theorems that even come close to saying that Certigrad does what we really want it to, which might be something like “after training a model with Certigrad on a dataset, the model will be extremely accurate on unseen data”. Traditionally, *verification* refers to proving that you implemented a system correctly (e.g. that your gradients are correct), whereas *validation* refers to proving that the system you chose to implement really does what you want (e.g. that your model will be accurate on unseen data). The line is blurry between verification and validation but I find the distinction useful. The limiting factor in proving validation properties for Certigrad is that nobody knows that much yet about the assumptions under which most useful models (such as neural networks) are guaranteed to perform well. Also, the set of properties we might want to prove is open-ended. There may even be multiple interesting properties we could prove about individual Certigrad models.

**Q:** Why concentrate on machine learning?

**D.S.:** Formal verification may be especially useful in machine learning (ML) for several reasons. First, ML systems are notoriously difficult to debug because developers don’t know what the system is supposed to do on a given input (we argue this in the introduction). Second, ML systems involve advanced mathematics that most developers will not be masters of, and formal math lets the ML experts communicate to the developers what the system needs to do in a precise and self-contained way that does not require years of training to be able to make sense of. Third, (math-aware) developers often need to do tedious and error-prone calculations by hand, and many of these calculations can be automated using tools from computer algebra. Fourth, because performance is dominated by matrix multiplication, machine learning algorithms do not require low-level optimizations to be competitive and so it is easier to formally verify real systems.

**Q:** It would seem that theorem provers are only useful insofar as the mathematical statement that specifies what it means to be correct is easier to understand than the code itself. What sort of “compression ratios” does Certigrad achieve? Will this depend on machine learning vs. another application?

**D.S.:** Theorem provers are also useful when the dumbest possible implementation is easier to understand than the most optimized one, which is almost always the case. The specification can simply be “it does the same thing as the naive implementation”. Similarly, a compiler may have millions of lines of code implementing program transformations, and yet the specification for each transformation may only be “it preserves semantics”. Compression ratios can vary though, and we did not get much actual compression in Certigrad. The full specification for backpropagation (including the definitions of all the preconditions) is about as many characters as the code that implements it. But it is much easier to provide than the implementation, and much easier to inspect and confirm that it is correct. I think a better metric may be how much shorter the specification is than the code *plus* the proof that the code is correct, since you would need to understand the proof to really understand the code in the first place.
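Selsam's “it does the same thing as the naive implementation” style of specification can be sketched in a few lines (my own toy example, not from Certigrad): an optimized routine checked against a dumb-but-obviously-correct reference.

```python
# Toy instance of a "same as the naive implementation" specification:
# a one-pass streaming mean checked against the obvious two-line reference.
def mean_naive(xs):
    return sum(xs) / len(xs)

def mean_streaming(xs):
    # Welford-style incremental update; less obvious, more useful in practice
    # (constant memory, works on a stream).
    m, n = 0.0, 0
    for x in xs:
        n += 1
        m += (x - m) / n
    return m

data = [1.0, 2.0, 4.0, 8.0]
assert abs(mean_streaming(data) - mean_naive(data)) < 1e-12
```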

(↵ returns to text)


The Reeh–Schlieder theorem states that the vacuum $|\Omega\rangle$ is *cyclic* with respect to the algebra $\mathcal{A}(O)$ of observables localized in some subset $O$ of Minkowski space. (For a single field $\phi$, the algebra is defined to be generated by all finite smearings $\phi_f = \int \mathrm{d}x\, f(x)\,\phi(x)$ for $f$ with support in $O$.) Here, “cyclic” means that the subspace $\mathcal{A}(O)|\Omega\rangle$ is dense in the full Hilbert space $\mathcal{H}$, i.e., any state $|\psi\rangle$ can be arbitrarily well approximated by a state of the form $A|\Omega\rangle$ with $A \in \mathcal{A}(O)$. This is initially surprising because $|\psi\rangle$ could be a state with particle excitations localized (essentially) to a region far from $O$ that looks (essentially) like the vacuum everywhere else. The resolution derives from the fact that the vacuum is highly entangled, such that every region is entangled with every other region, if only by an exponentially small amount.

One mistake that’s easy to make is to be fooled into thinking that this property can only be found in systems with an infinite number of degrees of freedom, like a field theory. So let me exhibit^{a } a quantum state with the Reeh–Schlieder property that lives in the tensor product of a *finite* number of *separable* Hilbert spaces:

As emphasized above, a separable Hilbert space is one that has a countable orthonormal basis, and is therefore isomorphic to $L^2(\mathbb{R})$, the space of square-normalizable functions. Thus each tensor factor could be, for instance, the Hilbert space of a finite number of oscillators in a chain.

First, consider the normalized, Bell-ish state

in . It’s easy to see that any basis state can be obtained by projecting onto the corresponding state in like so: . More generally, we can get any state by acting^{b } on with the operator where .
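A finite-dimensional toy version of this projection trick can be checked directly. Since the weights of the author's Bell-ish state are not reproduced here, the sketch below uses the maximally entangled analogue: projecting one site onto a suitable state leaves an arbitrary chosen state on the other site.

```python
import numpy as np

# Toy, finite-dimensional version of the projection trick, using the maximally
# entangled state |Phi> = d^{-1/2} sum_n |n>_A |n>_B (the author's Bell-ish
# state has nontrivial weights that are elided in the post).
d = 4
phi = np.eye(d) / np.sqrt(d)            # coefficient matrix phi[n_A, n_B]

psi = np.array([0.5, 0.5j, 0.5, -0.5])  # arbitrary normalized target state for B

# Project site A onto conj(psi): the component (<chi|_A (x) I_B)|Phi> left on B
# has components sum_n conj(chi)_n * phi[n, :], proportional to psi.
chi = np.conj(psi)
out = np.conj(chi) @ phi                # equals psi / sqrt(d), since phi is diagonal
out = out / np.linalg.norm(out)         # renormalize the projected component

assert np.allclose(out, psi)
```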

Our objective is to extend this to a state with more than two parts, so let’s start with three: , , and . If the Hilbert space of each part — henceforth “site” — were one-dimensional, then the Reeh–Schlieder property would be enjoyed trivially by the (only) state . But if each site has at least a 2-dimensional subspace, then each pair of sites like has at least a 4-dimensional subspace, and we’d want our “vacuum” state to contain components like

(1)

so that we can pick them out with projectors on . To do this, we have expanded the dimensionality of to 6 in order to ensure there’s a state we can project on to get any state of the 4-dimensional subspace of .

We’d want this to be symmetric with respect to permuting the three systems, so we’d like to also include the states

(2)

where we have expanded the dimensionality of and similarly. If our vacuum were proportional to a sum of these 12 components, Eqs. (1–2), then it would be possible to project on any one of the three sites and get (up to normalization) any of the 8 states where each site is either or . For instance, to get state by acting only on , we would apply the operator to the vacuum , which picks out the state and flips the “5” to a “1” at .

But of course, now that each site has a 6-dimensional Hilbert space, we need to include another level of states

(3)

to allow us to select, by acting only on , any of the states formed from the two 6-dimensional subspaces of and . This would be complemented by the other two sets obtained by swapping or for , making each site 42-dimensional, because $6 + 6^2 = 42$.
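On my reading, the site dimension grows level by level by the recursion $d \to d + d^2$: each new level appends one fresh state for every joint state of the other two sites' current $d$-dimensional subspaces. A few lines reproduce the dimensions quoted in the text:

```python
# Level dimensions implied by the construction (my reading of the text):
# each site's dimension grows as d -> d + d*d, since we append one new state
# per joint state of the other two sites' current d-dimensional subspaces.
dims = [2]
for _ in range(3):
    d = dims[-1]
    dims.append(d + d * d)

assert dims == [2, 6, 42, 1806]   # matches 2 -> 6 -> 42 in the text
```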

Continuing this process ad infinitum, and generalizing from sites to an arbitrary number , we would construct^{c } our vacuum state as a sum of the above collections of states, appropriately weighted:

(4)

where is the (recursively defined) dimensionality of each “level”^{d }, are normalization coefficients, and is the function that cyclically shifts an -tuple places to the left. The state is normalized and manifestly invariant under permutation of its subsystems.

The careful reader will notice that if contains the entire tower of levels labeled by , then we can no longer act on one site to cleanly project out, from level , an arbitrary joint state of the other sites because we will necessarily pick up some contributions from higher levels (i.e., larger ). However, the coefficients necessarily fall off exponentially. Therefore, for any desired degree of accuracy, we can always project onto a component from some level large enough that the higher levels have negligible relative norm. A little thought also shows we can use the same trick employed with the Bell-ish state to act on with a single-site operator to get any state in the full Hilbert space (up to arbitrarily small error). In this sense we have constructed a state that has the Reeh–Schlieder property on a finite number of degrees of freedom.

Of course, to create a *normalized* state on the other sites from a projected component of , we’d need to multiply by roughly . That means the norm of our local operator generically needs to be exponentially large to get . Physically, this would correspond to performing a local measurement and obtaining, with exponentially small probability, an outcome that assures you that an arbitrary state has been prepared elsewhere. (See previous post.)

Note that a state satisfying the Reeh–Schlieder property has all possible forms of multi-partite entanglement “in its belly”, e.g., EPR pairs, GHZ states, W states, etc. That is, it’s possible to distill any sort of entanglement with an exponential number of copies of the state. Presumably, it’s also straightforward to modify such a state to exhibit the so-called “split structure”.

**Edit:** Thank you to Zoltan Zimboras for pointing me to Clifton et al., which includes a very similar construction of an -partite state with the Reeh-Schlieder property. (The arXiv version calls it “hyperentanglement”, but the PRA calls it “superentanglement”; I guess the editors made them tone it down :). I think my construction is a lot easier to read!

*[I thank Peter Morgan for discussion.]*

(↵ returns to text)

- Most likely a state with this property already exists in the quantum info literature, but I’ve got a habit of re-inventing the wheel. For my last paper, I spent the better part of a month rediscovering the Shor code…↵
- Note that this sort of “acting” on is possible mathematically but not physically. That is, it’s tempting to think that a local agent could “act” on the first system with a local operator like to ensure the creation of any (normalized) state on the second system . But this operator simply does not correspond to a physically realizable action that can be taken by a local agent! Indeed, such an ability would allow superluminal signaling. Rather, the agent on the first system can only perform unitaries or make (POVM/PVM) measurements. For instance, the agent could apply a local unitary that evolves the state to , but now if you got outcome from a measurement of in the first system (which you would with reasonably high probability 1/4) this would not imply the second system is in the state . In fact, it would still just be in the state .↵
- For compactness, we’ve switched to the (CS) convention where the indices start from 0 rather than 1.↵
- Our explicit construction above got to level .↵


We address the decomposition of a multimode pure Gaussian state with respect to a bipartite division of the modes. For any such division the state can always be expressed as a product state involving entangled two-mode squeezed states and single-mode local states at each side. The character of entanglement of the state can therefore be understood modewise; that is, a given mode on one side is entangled with only one corresponding mode of the other, and therefore the total bipartite entanglement is the sum of the modewise entanglement. This decomposition is generally not applicable to all mixed Gaussian states. However, the result can be extended to a special family of “isotropic” states, characterized by a phase space covariance matrix with a completely degenerate symplectic spectrum.

It is well known that, despite the misleading imagery conjured by the name, entanglement in a multipartite system cannot be understood in terms of pair-wise entanglement of the parts. Indeed, there are only $\binom{N}{2} \sim N^2$ pairs of systems, but the number of qualitatively distinct types of entanglement scales exponentially in $N$. A good way to think about this is to recognize that a quantum state of a multipartite system is, in terms of parameters, much more akin to a classical probability distribution than a classical state. When we ask about the information stored in a probability distribution, there are lots and lots of “types” of information, and correlations can be much more complex than just knowing all the pairwise correlations. (“It’s not just that A knows something about B, it’s that A knows something about B *conditional* on a state of C, and that information can only be unlocked by knowing information from either D or E, depending on the state of F…”).

However, Gaussian distributions (both quantum and classical) are described by a number of parameters that grows only quadratically with the number of variables. The pairwise correlations really do tell you everything there is to know about the quantum state or classical distribution. The above paper makes me wonder to what extent we can understand multipartite Gaussian entanglement in terms of pairs of modes. They have shown that this works at a single level: entanglement across a bipartition can be decomposed into modewise entangled pairs. But since this doesn’t work for mixed states, it’s not clear how to proceed in understanding the remaining entanglement within a partition. My intuition is that there is a canonical decomposition of the Gaussian state that, in some sense, lays bare all the multipartite entanglement it has in any possible partitioning, in much the same way that the eigendecomposition of a matrix exposes its inner workings.
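The parameter counting behind the quadratic-vs-exponential contrast can be made explicit (a sketch; here “parameters” means first moments plus independent entries of the symmetric phase-space covariance matrix):

```python
# An N-mode Gaussian state is completely specified by its first moments
# (2N numbers) and its symmetric 2N x 2N covariance matrix -- O(N^2) parameters
# in total -- whereas a generic pure state of N qubits already needs ~2^N
# complex amplitudes.
def gaussian_params(n_modes):
    dim = 2 * n_modes                     # phase-space dimension (all q's and p's)
    return dim + dim * (dim + 1) // 2     # means + independent covariance entries

assert gaussian_params(10) == 230         # quadratic growth...
assert gaussian_params(10) < 2 ** 10      # ...versus exponential state space
```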

Some reflections are presented on the state of the search for a quantum theory of gravity. I discuss diverse regimes of possible quantum gravitational phenomena, some well explored, some novel.

During the past century, there has been considerable discussion and analysis of the motion of a point charge, taking into account "self-force" effects due to the particle's own electromagnetic field. We analyze the issue of "particle motion" in classical electromagnetism in a rigorous and systematic way by considering a one-parameter family of solutions to the coupled Maxwell and matter equations corresponding to having a body whose charge-current density and stress-energy tensor scale to zero size in an asymptotically self-similar manner about a worldline $\gamma$ as $\lambda \to 0$. In this limit, the charge, $q$, and total mass, $m$, of the body go to zero, and $q/m$ goes to a well defined limit. The Maxwell field is assumed to be the retarded solution associated with the body's charge-current plus a homogeneous solution (the "external field") that varies smoothly with $\lambda$. We prove that the worldline $\gamma$ must be a solution to the Lorentz force equations of motion in the external field. We then obtain self-force, dipole forces, and spin force as first order perturbative corrections to the center of mass motion of the body. We believe that this is the first rigorous derivation of the complete first order correction to Lorentz force motion. We also address the issue of obtaining a self-consistent perturbative equation of motion associated with our perturbative result, and argue that the self-force equations of motion that have previously been written down in conjunction with the "reduction of order" procedure should provide accurate equations of motion for a sufficiently small charged body with negligible dipole moments and spin. There is no corresponding justification for the non-reduced-order equations.

In other words:

…we consider a modified point particle limit, wherein not only the size of the body goes to zero, but its charge and mass also go to zero. More precisely, we will consider a limit where, asymptotically, only the overall scale of the body changes, so, in particular, all quantities scale by their naive dimension. In this limit, the body itself completely disappears, and its electromagnetic self-energy goes to zero.
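The scaling described in the quoted passage can be written out explicitly (my notation and sketch, not the authors'): all quantities scale by their naive dimension under a single parameter $\lambda \to 0$, so the charge-to-mass ratio survives while the self-energy vanishes.

```latex
% Sketch of the scaling limit (my notation, not the authors'): as \lambda \to 0,
\begin{align*}
  R(\lambda) &\sim \lambda R_0, &
  q(\lambda) &\sim \lambda q_0, &
  m(\lambda) &\sim \lambda m_0,
\end{align*}
% so q/m \to q_0/m_0 remains finite, while the electromagnetic self-energy
% vanishes linearly:
\[
  E_{\text{self}} \sim \frac{q(\lambda)^2}{R(\lambda)}
    \sim \lambda\, \frac{q_0^2}{R_0}
    \;\xrightarrow{\;\lambda \to 0\;}\; 0 .
\]
```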

I contrast two possible attitudes towards a given branch of physics: as inferential (i.e., as concerned with an agent’s ability to make predictions given finite information), and as dynamical (i.e., as concerned with the dynamical equations governing particular degrees of freedom). I contrast these attitudes in classical statistical mechanics, in quantum mechanics, and in quantum statistical mechanics; in this last case, I argue that the quantum-mechanical and statistical-mechanical aspects of the question become inseparable. Along the way various foundational issues in statistical and quantum physics are (hopefully!) illuminated.

David Wallace is “a physicist’s philosopher” whose work I have recommended highly for a while. In addition to the above article, see “Decoherence and its role in the modern measurement problem” and “What is orthodox quantum mechanics?” for an approach to understanding the measurement problem that strongly accords with, and shaped, my own. On the other hand, I disagree with his defeatist “eh, it’s good enough” assessment of our (currently) messy and hand-wavy way of understanding wavefunction branches, and I think the Set Selection problem is vitally important. I also find Wallace’s justification of the Born rule from decision-theoretic axioms to be fully undermined by Kent’s thorough criticisms.

Separately from quantum mechanics, Wallace’s “The Logic of the Past Hypothesis” is a great discussion of how to think about the thermodynamic justification for, and explanatory content of, asserting a low-entropy state in the past. It greatly changed my opinion on Jaynes’s maximum entropy principle in this context.

Large-scale quantum effects have always played an important role in the foundations of quantum theory. With recent experimental progress and the aspiration for quantum enhanced applications, the interest in macroscopic quantum effects has been reinforced. In this review, we critically analyze and discuss measures aiming to quantify various aspects of macroscopic quantumness. We survey recent results on the difficulties and prospects to create, maintain and detect macroscopic quantum states. The role of macroscopic quantum states in foundational questions as well as practical applications is outlined. Finally, we present past and on-going experimental advances aiming to generate and observe macroscopic quantum states.

Observing a breakdown in quantum mechanics is, of course, highly unlikely, but pushing the bounds on the still-only-intuitively-defined notion of “macroscopic” quantum states is among the most promising approaches.^{a } To design such experiments and make sure you’re actually doing something new, you want a *measure* of macroscopicity. This forthcoming RMP contains the most comprehensive summary of measures of macroscopicity that I’ve seen.

(↵ returns to text)

*Structure and Interpretation of Classical Mechanics* is a good starting point for thinking about these issues. As a pointed example, in this blog post I’ll look at how badly the Legendre transform is taught in standard textbooks,^{a } and compare it to how it *could* be taught. In a subsequent post, I’ll use this as a springboard for complaining about the way we record and transmit physics knowledge.

Before we begin: turn away from the screen and see if you can remember what the Legendre transform accomplishes *mathematically* in classical mechanics.^{b } I don’t just mean that the Legendre transform converts the Lagrangian into the Hamiltonian and vice versa, but rather: what key mathematical/geometric property does the Legendre transform have, compared to the cornucopia of other function transforms, that allows it to connect these two conceptually distinct formulations of mechanics?

(Analogously, the question “What is useful about the Fourier transform for understanding translationally invariant systems?” can be answered by something like “Translationally invariant operations in the spatial domain correspond to multiplication in the Fourier domain” or “The Fourier transform is a change of basis, within the vector space of functions, using translationally invariant basis elements, i.e., the Fourier modes”.)
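The convolution-theorem answer can be verified in a few lines (a generic numerical check, not from the post): translationally invariant (circular) convolution in the spatial domain equals pointwise multiplication in the Fourier domain.

```python
import numpy as np

# Check: circular convolution in the spatial domain equals pointwise
# multiplication in the Fourier domain.
N = 8
rng = np.random.default_rng(0)
x, y = rng.standard_normal(N), rng.standard_normal(N)

# circular convolution computed directly in the spatial domain
conv = np.array([sum(x[m] * y[(n - m) % N] for m in range(N)) for n in range(N)])

# the same result via multiplication of the discrete Fourier transforms
conv_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(y)).real

assert np.allclose(conv, conv_fft)
```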

Let’s turn to the canonical text by Goldstein for an example of how the Legendre transform is usually introduced. After a passable explanation of why one might want to move from a second-order equation of variables to a first-order equation of variables, we hear this [3rd edition, page 335]:

Treated strictly as a mathematical problem, the transition from Lagrangian to Hamiltonian formulation corresponds to the changing of variables in our mechanical functions from $(q, \dot{q}, t)$ to $(q, p, t)$ by [$p_i = \partial L / \partial \dot{q}_i$]. The procedure for switching variables in this manner is provided by the Legendre transformation, which is tailored for just this type of change of variable.

Consider a function of only two variables $f(x, y)$, so that a differential of $f$ has the form

$$df = u\,dx + v\,dy \qquad\qquad (8.3)$$

where

$$u = \frac{\partial f}{\partial x}, \qquad v = \frac{\partial f}{\partial y}. \qquad\qquad (8.4)$$

We wish now to change the basis of description from $x, y$ to a new distinct set of variables $u, y$, so that differential quantities are expressed in terms of the differentials $du$ and $dy$. Let $g$ be a function of $u$ and $y$ defined by the equation

$$g = f - ux. \qquad\qquad (8.5)$$

A differential of $g$ is then given as

$$dg = df - u\,dx - x\,du,$$

or, by (8.3), as

$$dg = v\,dy - x\,du,$$

which is exactly in the form desired. The quantities $x$ and $v$ are now functions of the variables $u$ and $y$ given by the relations

$$x = -\frac{\partial g}{\partial u}, \qquad v = \frac{\partial g}{\partial y}, \qquad\qquad (8.6)$$

which are analogues of Eqs. (8.4).

Huh? Did you see a clean definition of a *function transform* in there? $g$ is supposed to be a function of $u$ and $y$, but the right-hand side of (8.5) has $x$ dependence. Can we always find a way to eliminate $x$ for arbitrary $f$? What does it mean when we can’t, or there are multiple solutions? And in what sense can $u$ become a variable *independent* of $x$ if its definition, $u = \partial f/\partial x$, depends on $x$? Contrast this to the Fourier, special conformal, or Laplace transforms, which are unambiguous ways to convert a function of one variable to a function of another.

If you can reconstruct a clean definition using this quote, it will be ugly and you will do so by implicitly drawing on your previously obtained knowledge of when one can and cannot treat variables as independent (knowledge that is not accessible to the student reader) and by making assumptions that are true for physical Lagrangians but not true generally (surprise! $f$ has to be convex in $x$). And the motivation for the definition — beyond merely “look at how pretty Hamilton’s equations turn out to be” — will still be opaque.

It would be bad enough if this was just Goldstein because that book is, to my knowledge, the most widely used mechanics textbook, presumably representing the level of clarity achieved by the modal physicist. But I sat down in the library where the classical mechanics books are kept and flipped through seven or eight more^{c } and they were as bad or worse. The venerable textbook by Landau, for instance, uses the same ambiguous differential notation and declines to explain what the Legendre transform is in general; rather, it just declares a formula for the Hamiltonian in terms of the Lagrangian [Vol. 1, 3rd edition, page 131]:

$H(p, q, t) = \sum_i p_i \dot{q}_i - L$ (40.2)

Notice how the functional parameters are written for the Hamiltonian but not the Lagrangian? It is an impressive sleight of hand designed to distract you from the weird fact that this definition implicitly requires inverting the equation $p_i = \partial L/\partial \dot{q}_i$ by solving for $\dot{q}$ in terms of $p$, $q$, and $t$, and then inserting $\dot{q}(p, q, t)$ back into Eq. (40.2).

Indeed, serious ambiguities arise when you start trying to literally interpret a quantity with differentials in terms of different variables, some of which are independent and some of which are not. (Remember, when starting from a Lagrangian defined on $(q, \dot{q})$ space, $p$ is generically a function of *both* $q$ and $\dot{q}$.^{d }) And why do we use (40.2) rather than, say, some other combination of $p$, $\dot{q}$, and $L$? Landau doesn’t say, but he does derive Hamilton’s equations a couple of lines later and it’s clear we wouldn’t get the proper cancellation of differentials with a different choice. In other words…look at how pretty Hamilton’s equations are and stop asking questions!

At this point, the typical advice given to the impudently inquisitive student is to look at V.I. Arnold’s *Mathematical Methods of Classical Mechanics*, often described as the definitive, mathematically rigorous treatment…that few bother to read carefully. Here, at least, we find an actual definition in generality [2nd edition, page 61]^{e }:

Let $y = f(x)$ be a convex function, $f''(x) > 0$.

The **Legendre transformation** of the function $f$ is a new function $g$ of a new variable $p$, which is constructed in the following way (Figure 43). We draw the graph of $f$ in the $x, y$ plane. Let $p$ be a given number. Consider the straight line $y = px$. We take the point $x(p)$ at which the curve is farthest from the straight line in the vertical direction: for each $p$ the function $F(p, x) = px - f(x)$ has a maximum with respect to $x$ at the point $x(p)$. Now we define $g(p) = F(p, x(p))$.

The point $x(p)$ is defined by the extremal condition $\partial F/\partial x = 0$, i.e., $f'(x) = p$. Since $f$ is convex, the point $x(p)$ is unique [if it exists].

If we were to condense this prescription down, we’d get this rather ugly definition for the Legendre transform $g$ of a convex function $f$:

$g(p) = p \, (f')^{-1}(p) - f\!\left( (f')^{-1}(p) \right)$ (1)

The convexity specified by Arnold guarantees that $(f')^{-1}$, and hence $g$, is well-defined, so at long last we have a clear definition. But the meaning of the transform, and especially the fact that it is its own inverse (i.e., an involution), are completely obscured.
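To make Eq. (1) concrete, here is a short Python sketch of Arnold’s prescription (my own illustration, not from any textbook): solve $f'(x) = p$ by bisection, which works because convexity makes $f'$ monotone, then evaluate $g(p) = px - f(x)$. The example $f(x) = x^4/4$ is an arbitrary convex choice; its transform has the closed form $g(p) = \tfrac{3}{4} p^{4/3}$.

```python
def legendre(f, fprime, p, lo=-100.0, hi=100.0, tol=1e-12):
    """Arnold's prescription: find the x with f'(x) = p (unique by
    convexity), then return g(p) = p*x - f(x)."""
    # Bisection works because f' is monotone increasing when f is convex.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fprime(mid) < p:
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2
    return p * x - f(x)

f = lambda x: x**4 / 4      # an illustrative convex function
fp = lambda x: x**3         # its derivative

g8 = legendre(f, fp, 8.0)          # (f')^{-1}(8) = 2, so g(8) = 8*2 - 4 = 12
closed_form = 0.75 * 8.0**(4/3)    # the closed form (3/4) p^{4/3} also gives 12
```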

Stare at that figure for a while. Remember: The Legendre transformation links Lagrangian and Hamiltonian mechanics, the two most important formulations of both classical *and* quantum physics. **This transformation binds together the fundamental operating system of the universe, on which all the other physical theories, like electromagnetism and gravity, run merely as programs.**^{f } Do you feel like you understand it? Have you reduced the transformation to its essence?

Let’s try one more definition, which I originally noticed buried near the bottom of a Wikipedia page. **Two convex functions $f$ and $g$ are Legendre transforms of each other when their first derivatives are inverse functions**:

$g' = (f')^{-1}$.

To confirm that this is equivalent to the above definitions, we solve for $g$ (up to an additive constant) by taking the inverse function and computing its anti-derivative: $g(p) = \int (f')^{-1}(p)\,\mathrm{d}p$.

The graph of an inverse function is just the graph of the original function flipped about the 45° line, so the integral of an inverse function (area below the curve) plus the integral of the original function (area left of the curve) must equal the bounding rectangle (see figure):

$\int_0^p (f')^{-1}(p')\,\mathrm{d}p' + \int_0^x f'(x')\,\mathrm{d}x' = xp, \qquad p = f'(x)$ (2)

(This is just an oblique use of integration by parts, which is discussed more here.^{g })

The above equation holds for each pair $(x, p)$ satisfying $p = f'(x)$, so we can explicitly confirm that

$g(p) = xp - f(x)\,\big|_{x = (f')^{-1}(p)}$ (3)

It is also clear from looking at a graph that a function inverse is symmetric (i.e., $(f^{-1})^{-1} = f$), so our boxed definition makes it manifest that the Legendre transform is an involution on the set of convex (and concave) functions. In Arnold’s Figure 43 this symmetry is, shall we say, less obvious.^{h }
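As a quick numerical check of the derivative-inverse definition (my own example, not from the post): take $f(x) = e^x$, whose transform is $g(p) = p\ln p - p$ up to a constant. The derivatives $f' = \exp$ and $g' = \ln$ are inverse functions, and involutivity is immediate since $(\ln)^{-1} = \exp$.

```python
import math

f  = math.exp                        # f(x) = e^x, convex
fp = math.exp                        # f'(x) = e^x
g  = lambda p: p * math.log(p) - p   # Legendre transform of f (up to a constant)
gp = math.log                        # g'(p) = ln p

# g' and f' are inverses: g'(f'(x)) = x for all x.
checks = [abs(gp(fp(x)) - x) for x in (-1.0, 0.0, 2.5)]

# Eq. (3): g(p) = x*p - f(x) evaluated at p = f'(x).
x = 1.7
p = fp(x)
eq3_gap = abs(g(p) - (x * p - f(x)))
```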

Hold on, you might object, the above boxed definition only fixes the Legendre transformation up to an additive constant, in contrast to Goldstein, Landau, Arnold, and Eq. (2). How will we determine the absolute values of the Hamiltonian and the Lagrangian? …oh wait, *those aren’t physically meaningful.* All of the dynamical laws are constructed from derivatives of $H$ and $L$, and we decline to specify an additive constant for the same reason we decline to do so with conservative potentials^{i } and, more generally, anti-derivatives.

Let’s apply our new characterization of the Legendre transform within mechanics to compare the different approaches. We switch between the Lagrangian and Hamiltonian formulations by changing just the “kinetic” variable (i.e., swapping velocity and momentum), while keeping the configuration coordinates fixed. So let’s introduce functional notation $\partial_i$ and $(\,\cdot\,)^{-1_i}$, with $i = 1, 2$, for taking the derivative and inverse with respect to just the first (configuration) or second (kinetic) variable, for fixed value of the other.^{j } Then the Legendre transform of the kinetic variable determines the gradient in that direction,

$\partial_2 H(q, \cdot) = \left[\partial_2 L(q, \cdot)\right]^{-1}$,

$\partial_1 H(q, p) = -\,\partial_1 L(q, \dot{q})\,\big|_{\dot{q} = [\partial_2 L(q, \cdot)]^{-1}(p)}$.^{k }

The two boxed equations above define the relationship between the Lagrangian and Hamiltonian through the Legendre transform of the kinetic coordinates. **The first box says the gradients of $H$ and $L$ in the kinetic direction are inverses of each other. The second box says the gradients in the configuration direction are negatives of each other once you account for the change of kinetic variables.**

Now, after you understand this, it’s of course perfectly fine to use the mnemonic $H = p\dot{q} - L$, along with $p = \partial L/\partial \dot{q}$ and $\dot{q} = \partial H/\partial p$, to remember how to quickly compute the Hamiltonian from the Lagrangian, or vice versa. But those cartoon equations suppress all the important structure that tells you what is actually going on, and the equal footing of $H$ and $L$ merely gestures at the symmetry of the transformation. In particular, the fact that you can uniquely solve for either function in terms of their respective pair of independent variables is guaranteed by the tacit assumption of convexity. More philosophically, only the gradients of these functions, not their actual values, are physically meaningful.
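For instance (a standard worked example of my choosing, assuming a unit-mass particle in a potential $V$): starting from $L(q, \dot{q}) = \dot{q}^2/2 - V(q)$, the recipe $p = \partial L/\partial \dot{q} = \dot{q}$ and $H = p\dot{q} - L$ gives $H(q, p) = p^2/2 + V(q)$. The sketch below checks this by performing the transform of the kinetic variable as a brute-force maximization.

```python
def V(q):                      # an illustrative potential (harmonic)
    return 0.5 * q * q

def L(q, qdot):                # Lagrangian of a unit-mass particle
    return 0.5 * qdot * qdot - V(q)

def H(q, p):
    """Hamiltonian via the Legendre transform of the kinetic variable:
    H(q, p) = max over qdot of [p*qdot - L(q, qdot)], on a crude grid."""
    qdots = [i * 1e-3 for i in range(-5000, 5001)]   # grid on [-5, 5]
    return max(p * qd - L(q, qd) for qd in qdots)

q, p = 0.5, 1.2
numeric = H(q, p)
closed  = p * p / 2 + V(q)     # expected closed form: H = p^2/2 + V(q)
```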

OK, so that’s basically all we need to know about the transform to discuss the terribleness of physics textbooks in the forthcoming blog post. In this next and final section, I’ll briefly generalize the transform a bit for added context, but consider it optional reading.

The Legendre transform can be extended to multivariate functions $f(\vec{x})$ and $g(\vec{p})$ without any surprises. One clear definition is

$g(\vec{p}) = \vec{p} \cdot \vec{x} - f(\vec{x})\,\big|_{\vec{x} = (\nabla f)^{-1}(\vec{p})}$ (4)

where the inverse function $(\nabla f)^{-1}$ exists because $\nabla f$ is a bijection on $\mathbb{R}^N$ when we assume (as we must) that $f$ is convex.^{l } In the approach advocated above, this is restated more elegantly as

$\nabla g = (\nabla f)^{-1}$,

where, again, the involutivity is manifest. For switching between Lagrangians and Hamiltonians, we use the defining conditions

$\nabla_2 H(q, \cdot) = \left[\nabla_2 L(q, \cdot)\right]^{-1}, \qquad \nabla_1 H(q, p) = -\,\nabla_1 L(q, \dot{q})\,\big|_{\dot{q} = [\nabla_2 L(q, \cdot)]^{-1}(p)}$ (5)

where the subscripts $1$ and $2$ now refer respectively to groups of configuration and kinetic variables in the natural way. The interpretation is the same: the kinetic gradients are inverses, and the configuration gradients are negatives.
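A clean multivariate check (my own example, with an arbitrarily chosen matrix): for a quadratic $f(\vec{x}) = \tfrac{1}{2}\vec{x}^{\mathsf{T}} A \vec{x}$ with $A$ positive-definite, we have $\nabla f(\vec{x}) = A\vec{x}$, so the transform is $g(\vec{p}) = \tfrac{1}{2}\vec{p}^{\mathsf{T}} A^{-1} \vec{p}$ and the gradients $\nabla f$ and $\nabla g$ are inverse maps.

```python
# f(x) = (1/2) x^T A x with A positive-definite (illustrative 2x2 example).
A    = [[2.0, 1.0], [1.0, 3.0]]
det  = A[0][0] * A[1][1] - A[0][1] * A[1][0]           # = 5
Ainv = [[ A[1][1] / det, -A[0][1] / det],
        [-A[1][0] / det,  A[0][0] / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

grad_f = lambda x: matvec(A, x)      # gradient of f:  A x
grad_g = lambda p: matvec(Ainv, p)   # gradient of g(p) = (1/2) p^T A^{-1} p

# Composing the two gradients returns the starting point, so ∇g = (∇f)^{-1}.
x0 = [0.7, -1.3]
roundtrip = grad_g(grad_f(x0))
```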

One can generalize (4) to non-convex functions $f$ as

$g(\vec{p}) = \sup_{\vec{x}} \left[ \vec{p} \cdot \vec{x} - f(\vec{x}) \right]$ (6)

where the supremum is of course just the formal way of talking about a maximum over all choices of $\vec{x}$. When $f$ is convex and smooth, the right-hand side is maximized under the condition that $\vec{p} = \nabla f(\vec{x})$, so we recover (4). But more generally, this formula ensures the transform is well defined for any $f$, and the output $g$ is convex regardless.^{m } Basically, we’ve defined a way of breaking the ambiguity that comes when $\nabla f$ doesn’t have a unique inverse, and it turns out that this judicious choice ensures $g$ is the Legendre transform of the convex hull of $f$! In other words, applying the Legendre transform *twice* acts as the identity on convex functions but it produces the convex hull on non-convex ones.^{n }
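A grid-based sketch of Eq. (6) (an illustration I’m adding, with a double-well $f$ chosen arbitrarily): applying the sup-based transform twice to the non-convex $f(x) = (x^2 - 1)^2$ flattens the hump, returning $0$ on $[-1, 1]$ (the convex hull) even though $f(0) = 1$.

```python
xs = [i * 0.01 for i in range(-200, 201)]   # x grid on [-2, 2]
ps = [i * 0.1 for i in range(-300, 301)]    # slope grid on [-30, 30]

f = lambda x: (x * x - 1.0) ** 2            # non-convex double well

# Legendre-Fenchel transform on the grid: g(p) = sup_x [p*x - f(x)].
g = {p: max(p * x - f(x) for x in xs) for p in ps}

# Transforming twice yields the convex hull of f.
f_hull = lambda x: max(p * x - g[p] for p in ps)

hull_at_0 = f_hull(0.0)   # convex hull value at the bottom of the hump: 0
f_at_0    = f(0.0)        # original non-convex value: 1
```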

The Legendre transform (green arrow) is an involution on the space of convex functions (blue). It leaves invariant the small set of functions (red) whose first derivatives are their own inverse, like $x^2/2$ and $-\ln x$. The generalized definition (6) acts on the larger space of non-convex functions (cylindrical volume), but still maps it down to the space of convex ones. The Legendre transform of a non-convex function is the Legendre transform of its convex hull (purple arrow), which can be obtained by just applying the transform twice.

*[I thank Robert Lasenby and Godfrey Miller for discussion.]*

(↵ returns to text)

- I was pleased to note as this essay went to press that my choice of Landau, Goldstein, and Arnold were confirmed as the “standard” suggestions by the top Google results.↵
- If not, can you remember the definition? I couldn’t, a month ago.↵
- Landau & Lifshitz, Hand & Finch, Rossberg, plus a bunch I hadn’t heard of before.↵
- To keep my blood pressure in check, I’m just going to skip completely over the fact that this Hamiltonian has $\dot{q}$ dependence. Almost all physics textbooks fail to clearly explain to the student why we can nevertheless indulge in the sin of pretending that $p$ is an independent variable from $q$, attributable perhaps to the mystery of faith. For clarity on this, I suggest picking up Gelfand and Fomin’s *Calculus of Variations*, which is well regarded and available on Amazon for $9.↵
- A similar precise but opaque definition is given in the text by Jose and Saletan, which was inflicted upon me in graduate school. Figures like the one below can be found in Hand & Finch, and in various notes floating around that promise an “easy introduction” to the Legendre transform.↵
- Scott Aaronson, PHYS771 Lecture 9: Quantum: ‘So, what is quantum mechanics? Even though it was discovered by physicists, it’s not a physical theory in the same sense as electromagnetism or general relativity. In the usual “hierarchy of sciences” — with biology at the top, then chemistry, then physics, then math — quantum mechanics sits at a level between math and physics that I don’t know a good name for. Basically, quantum mechanics is the operating system that other physical theories run on as application software (with the exception of general relativity, which hasn’t yet been successfully ported to this particular OS). There’s even a word for taking a physical theory and porting it to this OS: “to quantize.”’
Note, by “the level between physics and math”, Aaronson is talking about the basic information theoretic underpinning of quantum mechanics, whose classical analog is probability theory (which, classically, is generally assumed without discussion). In this post I am talking about the abstract formalism of Lagrangian/Hamiltonian mechanics, which sits above quantum/classical information theory but still below (i.e., more fundamental than) particular physical theories like electromagnetism or gravity.↵

- Thanks to Godfrey Miller for this link.↵
- See also Arnold’s Figure 45, where he tries (in my opinion, hopelessly) to make the involution property intuitive.↵
- By the way, you weren’t fooled by the Aharanov-Bohm effect into thinking that the potential is more “real” in quantum mechanics, were you?↵
- Here I’m adopting something close to the functional notation of Sussman and Wisdom.↵
- We actually only need this equation to hold on a single value of the kinetic variable since the previous boxed equation automatically extends it to all values. What’s happening here is this: although the additive offset we choose when transforming the kinetic variable ($\dot{q}$) is arbitrary, we need to choose the *same* offset for different values of the configuration coordinate ($q$) since the configuration gradient is physically important.↵
- At a more advanced level this is known to map between the tangent and cotangent bundles, so the domain and range are isomorphic but not strictly the same.↵
- Indeed, generalizing the Legendre transform to non-convex functions, where it is usually known as the convex conjugate, is a justification for using a definition like Eq. (1) rather than the boxed definition $g' = (f')^{-1}$. See here for more, as well as a nice description in terms of hyperplanes.↵
- I thank Kristan Temme for emphasizing this non-mechanical aspect of the Legendre transform to me.↵

Claims about what makes Amazon’s vertical integration different:

I remember reading about the common pitfalls of vertically integrated companies when I was in school. While there are usually some compelling cost savings to be had from vertical integration (either through insourcing services or acquiring suppliers/customers), the increased margins typically evaporate over time as the “supplier” gets complacent with a captive, internal “customer.”

There are great examples of this in the automotive industry, where automakers have gone through alternating periods of supplier acquisitions and subsequent divestitures as component costs skyrocketed. Divisions get fat and inefficient without external competition. Attempts to mitigate this through competitive/external bid comparison, detailed cost accountings and quotas usually just lead to increased bureaucracy with little effect on actual cost structure.

The most obvious example of Amazon’s SOA structure is Amazon Web Services (Steve Yegge wrote a great rant about the beginnings of this back in 2011). Because of the timing of Amazon’s unparalleled scaling — hypergrowth in the early 2000s, before enterprise-class SaaS was widely available — Amazon had to build their own technology infrastructure. The financial genius of turning this infrastructure into an external product (AWS) has been well-covered — the windfalls have been enormous, to the tune of a $14 billion annual run rate. But the revenue bonanza is a footnote compared to the overlooked organizational insight that Amazon discovered: By carving out an operational piece of the company as a platform, they could future-proof the company against inefficiency and technological stagnation.

In the 10+ years since AWS’s debut, Amazon has been systematically rebuilding each of its internal tools as an externally consumable service. A recent example is AWS’s Amazon Connect — a self-service, cloud-based contact center platform that is based on the same technology used in Amazon’s own call centers. Again, the “extra revenue” here is great — but the real value is in honing Amazon’s internal tools…

The key advantage that Amazon has over any other enterprise service provider — from UPS and FedEx to Rackspace — is that they are forced to use their own services. UPS is a step removed from backlash due to lost/destroyed packages, shipping delays, terrible software and poor holiday capacity planning. Angry customers blame the retailer, and the retailer screams at UPS in turn. When Amazon is the service provider, they’re permanently dogfooding. There is nowhere for poor performance to hide. Amazon has built a feedback loop as a moat, and it is incredible to watch the flywheel start to pick up speed.

So is it the end of an era? We may still use MP3s, but when the people who spent the better part of a decade creating it say the jig is up, we should probably start paying attention. AAC is indeed much better — it’s the default setting for bringing CDs into iTunes now — and other formats are even better than it, though they also take up mountains of space on our hard drives.

(That story also, incredibly, fails to mention that the patent expired.) The broader point, of course, is that the news is filled with these sorts of submarine stories, just generally a bit more concealed.

In mathematics, the Mertens conjecture is the false statement that the Mertens function M(n) is bounded in absolute value by √n, which implies the Riemann hypothesis.

It was later shown that the first counterexample appears below $e^{3.21 \times 10^{64}}$ (Pintz 1987) but above $10^{14}$ (100 trillion).
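For the curious, the conjecture is easy to state in code (a sketch of my own): the Mertens function is the running sum M(n) = Σ_{k≤n} μ(k) of the Möbius function, and |M(n)| < √n holds for every n you would ever check by hand, which is what made the conjecture so tempting.

```python
def mobius(n):
    """Mobius function mu(n): 0 if n has a squared prime factor,
    else (-1) to the number of distinct prime factors."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:     # squared prime factor
                return 0
            result = -result
        d += 1
    if n > 1:                  # leftover prime factor
        result = -result
    return result

# Running sums M(n) for n = 1..1000.
M_vals, total = [0], 0         # M_vals[n] = M(n); index 0 unused
for n in range(1, 1001):
    total += mobius(n)
    M_vals.append(total)

# The (false) conjecture: |M(n)| < sqrt(n) for all n > 1.
conjecture_ok = all(abs(M_vals[n]) < n ** 0.5 for n in range(2, 1001))
```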

For more than 7,000 hours over the past five years, Mitchell has slowly exposed the fossil’s skin and bone. The painstaking process is like freeing compressed talcum powder from concrete. “You almost have to fight for every millimeter,” he says.

Recently unveiled in an Alberta museum. Michael Greshko:

Hey, I’m the journalist who wrote the story. A few things:

- The organic-rich film preserving the outlines of scales (so, yes, fossilized skin) is only a few millimeters thick. So Mitchell had to prepare the fossil extremely slowly in order to follow the film through the matrix.
- The half-life of DNA is ~521 years at 13.1°C, as found in this 2012 paper: http://rspb.royalsocietypublishing.org/content/279/1748/4724 The team’s model predicts higher half-lives at truly freezing temperatures, but even at the extreme end, there’s no way DNA would survive 110 million years.
- The dating on the site is well constrained to ~110 million years old. The fossil was found in the Wabiskaw Member of the Clearwater Formation, a well-dated rock formation in Alberta. The underlying oil sands have been radiometrically dated to 112±5.3 million years old. (http://science.sciencemag.org/content/308/5726/1293)
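The journalist’s point can be checked with one line of arithmetic (my own back-of-envelope, using the 521-year half-life he cites): 110 million years is about 211,000 half-lives, so the surviving fraction of DNA would be 2^−211,000, i.e., zero for any practical purpose.

```python
half_life_years = 521     # DNA half-life at 13.1 C, per the 2012 paper cited above
age_years = 110e6         # approximate age of the fossil

n_half_lives = age_years / half_life_years    # about 211,000 half-lives
# 2**(-211000) underflows to exactly 0.0 in double precision:
surviving_fraction = 2.0 ** (-n_half_lives)
```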

This is apparently a wild lion and there’s no indication of what happened after this, but at least in humans “most seizures last from 30 seconds to 2 minutes and do not cause lasting harm”.

In one example of what can happen when prices are widely known, Germany required all gas stations to provide live fuel prices that it shared with consumer price-comparison apps. The effort appears to have boosted prices between 1.2 to 3.3 euro cents per liter, or about 5 to 13 U.S. cents per gallon, according to a discussion paper published in 2016 by the Düsseldorf Institute for Competition Economics.

Plausibly, the key fact is that the systems — “AI”s, although not really — gather lots of historical information about how their competitors react to price changes, opening a potential line of communication between the AIs. This makes it feasible, and perhaps likely, that the AIs may settle on a game-theoretic equilibrium where they both “agree” to keep prices high because of some earlier tit-for-tat behavior. It would be great to see a toy model of this explicitly. Here was another neat anecdote:

One client called to complain the software was malfunctioning. A competitor across the street had slashed prices in a promotion, but the algorithm responded by raising prices. There wasn’t a bug. Instead, the software was monitoring the real-time data and saw an influx of customers, presumably because of the long wait across the street.

“It could tell that no matter how it increased prices, people kept coming in,” said Mr. Derakhshan.
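Since the post above asks for a toy model, here is one possible minimal sketch (entirely my own construction, with made-up prices): two pricing algorithms play a repeated game, and each uses a tit-for-tat-like rule — keep the supra-competitive price as long as the rival did last round, otherwise punish with the competitive price. Facing each other, neither ever defects, so the high price is sustained indefinitely.

```python
HIGH, LOW = 10.0, 6.0   # illustrative supra-competitive and competitive prices

def tit_for_tat(rival_last_price):
    """Price high as long as the rival did last round;
    punish with the competitive price if undercut."""
    return HIGH if rival_last_price >= HIGH else LOW

# Two such algorithms facing each other, starting from the high price.
a_price, b_price = HIGH, HIGH
history = []
for _ in range(50):
    a_price, b_price = tit_for_tat(b_price), tit_for_tat(a_price)
    history.append((a_price, b_price))

# Neither side ever has an incentive-free moment to defect unpunished,
# so both sustain the high price in every round.
stuck_high = all(pa == HIGH and pb == HIGH for pa, pb in history)
```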

The things to note are (1) Self-sufficient colonies in Antarctica or under the ocean would be much easier than Mars. (2) Many of the key technologies of our civilization only make sense at scale (100’s of millions of humans) and with access to cheap resources (e.g., water). The economy is highly interdependent so that each resource you take away forces you to solve many surprising new problems. (Most of these things are heavy and cannot be continuously shipped from Earth to Mars.) (3) Humans do not go places because it’s a great story, they largely go there for economic rewards, and in particular, they rarely leave places they are wealthy to go to places where they become dirt poor.

The Vikings had the transportation technology to get to the New World many centuries before there were persistent European colonies, and this was in a time when the economy was much less dependent on the cooperation of large numbers of humans and many kinds of resources from diverse places. Our current era is much more analogous to 900 AD than 1492.

In 2005, Wikipedia co-founder and Wikimedia Foundation [WMF] founder Jimmy Wales told a TED audience:

So, we’re doing around 1.4 billion page views monthly…. And everything is managed by the volunteers and the total monthly cost for our bandwidth is about US$5,000, and that’s essentially our main cost. We could actually do without the employee … We actually hired Brion because he was working part-time for two years and full-time at Wikipedia so we actually hired him so he could get a life and go to the movies sometimes.

According to the WMF, Wikipedia (in all language editions) now receives 16 billion page views per month. The WMF spends roughly US$2 million a year on Internet hosting and employs some 300 staff.

The modern Wikipedia hosts 11–12 times as many pages as it did in 2005, but the WMF is spending 33 times as much on hosting, has about 300 times as many employees, and is spending 1,250 times as much overall. WMF’s spending has gone up by 85% over the past three years.
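The multiples quoted above follow directly from the figures in the two quotes (a quick check I’m adding; the 2005 baseline numbers are Wales’s, the current ones the WMF’s): 16B/1.4B ≈ 11.4× page views, and $2M/yr against $5k/month ≈ $60k/yr is a ~33× hosting increase.

```python
pageviews_2005 = 1.4e9     # per month, from Wales's 2005 TED talk
pageviews_now  = 16e9      # per month, per the WMF

hosting_2005 = 5_000 * 12  # ~$5k/month bandwidth in 2005 -> ~$60k/yr
hosting_now  = 2_000_000   # ~$2M/yr on Internet hosting now

pageview_ratio = pageviews_now / pageviews_2005   # about 11.4x
hosting_ratio  = hosting_now / hosting_2005       # about 33x
```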

Sounds a lot like cancer, doesn’t it? For those readers who were around three years ago, did you notice at the time any unmet needs that would have caused you to conclude that the WMF needed to increase spending by $30 million? I certainly didn’t.

The large majority of funds donated to WMF are not spent on hosting, overhead, or other things necessary for the normal function of the site. See the HN comments for rebuttals to the author’s claim that WMF’s software development team lacks a roadmap and goals/specifications (although pretty much everyone seems to agree the software output has been subpar).

As of 15 August 2016, the CAPS [Cirrus Airframe Parachute System] has been activated 83 times, 69 of which saw successful parachute deployment. In those successful deployments, there were 142 survivors and 1 fatality. No fatalities, unsuccessful deployments, or anomalies (with the exception of one that is still under investigation) have occurred when the parachute was deployed within the certified speed and altitude parameters. Some additional deployments have been reported by accident, as caused by ground impact or post-impact fires, and 14 of the aircraft involved in CAPS deployments have been repaired and put back into service.

Qualifies as a great landing.

Conventional geothermal is about 4-5 cents/kWh, on par with natural gas in the US. The dominant capital cost is drilling a large well (you need volume), and so geothermal plants are generally only built in areas that require shallow (1km) wells.

Well costs are ~quadratic in depth. Given how much money has already been spent optimizing drilling for the oil&gas industry, along with how cutthroat that market is, I don’t see the cost coming down significantly. As a result, deep geothermal will likely be limited to niche regions like Iceland. And you need deep geothermal to scale it past the existing locations.

I would love to be wrong, since geothermal checks all the boxes for renewables and is also suitable for base load power, but I don’t see an obvious path forward short of a drilling tech miracle.
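To see why the quadratic scaling bites (a toy calculation of mine, with a made-up cost coefficient): if a 1 km well costs some amount c, a 5 km well costs roughly 25c, so the dominant capital cost grows by more than an order of magnitude exactly where deep geothermal is needed.

```python
def well_cost(depth_km, cost_1km=2e6):
    """Toy model: well cost grows roughly quadratically with depth.
    cost_1km ($2M for a 1 km well) is an illustrative number, not data."""
    return cost_1km * depth_km ** 2

shallow = well_cost(1)    # conventional shallow geothermal well
deep    = well_cost(5)    # depth plausibly needed outside niche regions
ratio   = deep / shallow  # quadratic scaling: 25x
```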

“I think it’s fair to say…that our understanding of the worm has not been materially enhanced by having that connectome available to us. We don’t have a comprehensive model of how the worm’s nervous system actually produces the behaviors. What we have is a sort of a bed on which we can build experiments—and many people have built many elegant experiments on that bed. But that connectome by itself has not explained anything.”

It is, however, a very useful starting point.

In the 1980s, as a postdoctoral student in Brenner’s lab, Martin Chalfie—now at Columbia University—used the *C. elegans* wiring diagram to explain one of the worm’s behaviors: He identified the specific neural circuits responsible for the worm’s tendency to wriggle backward when poked on the head and to squirm forward when touched on the tail. “The connectome was absolutely critical,” Chalfie says. “Without it, we simply would not have known which cells were connected to which.” By combining the wiring diagram with evidence from previous research, Chalfie predicted that a particular set of interneurons mediated forward movement and that another was involved in backward movement. Annihilating those neurons with lasers confirmed his predictions.

Check out the worm atlas.

As part of our research on the history of philanthropy, I recently investigated several case studies of early field growth, especially those in which philanthropists purposely tried to grow the size and impact of a (typically) young and small field of research or advocacy.

The full report includes brief case studies of bioethics, cryonics, molecular nanotechnology, neoliberalism, the conservative legal movement, American geriatrics, American environmentalism, and animal advocacy. My key takeaways are:

- Most of the “obvious” methods for building up a young field have been tried, and those methods often work. For example, when trying to build up a young field of academic research, it often works to fund workshops, conferences, fellowships, courses, professorships, centers, requests for proposals, etc. Or when trying to build up a new advocacy community, it often works to fund student clubs, local gatherings, popular media, etc.
- Fields vary hugely along several dimensions, including (1) primary sources of funding (e.g. large philanthropists, many small donors, governments, companies), (2) whether engaged philanthropists were “active” or “passive” in their funding strategy, and (3) how much the growth of the field can be attributed to endogenous factors (e.g. explicit movement-building work) vs. exogenous factors (e.g. changing geopolitical conditions).

- A major institution (e.g. Vanguard) holds index funds, but the shareholder voting for each company in those funds is done by a different in-house team whose compensation is tied to the performance of that particular company. (More realistically, you’d just need a different team for each company in a given industry, but one team could vote for multiple companies that don’t compete with each other.) It’s “passive investing with active shareholder governance” in the sense that only voting, not dollars, is actively directed.
- Accelerate the trend of issuing two classes of stock, with and without voting rights. Only active investors and corporate raiders bother to pay a premium for influence.

Existing gasoline-powered vehicles may be converted to run on CNG [compressed natural gas] or LNG [liquid natural gas], and can be dedicated (running only on natural gas) or bi-fuel (running on either gasoline or natural gas). Diesel engines for heavy trucks and buses can also be converted and can be dedicated with the addition of new heads containing spark ignition systems, or can be run on a blend of diesel and natural gas, with the primary fuel being natural gas and a small amount of diesel fuel being used as an ignition source.

The share of hired crop farmworkers who were not legally authorized to work in the U.S. grew from roughly 15 percent in 1989-91 to almost 55 percent in 1999-2001. Since then it has fluctuated around 50 percent. Since 2001, the share who are citizens has increased from about 21 percent to about 33 percent, while the share who hold green cards or other forms of work authorization has fallen from about 25 percent to about 19 percent.

adulthood is emailing “sorry for the delayed response!” back and forth until one of you dies

(H/t Rob Wiblin.)

The speed check story is at 56:29, although I think the written version is better. I also didn’t know that Shul’s partner Walter L. Watson Jr., who is the hero of the story, was the only African American who flew the SR-71.

Delope…is the practice of throwing away one’s first fire in a pistol duel, in an attempt to abort the conflict. According to most traditions, the deloper must first allow his opponent the opportunity to fire after the command (“present”) is issued by the second, without hinting at his intentions…

The delope could be attempted for practical reasons, such as if one duelist thought their opponent was superior in skill, so as not to provoke a fatal return shot. Deloping could also be done for moral reasons if the duelist had objections to attempting to kill his opponent or if he were so skilled a marksman as to make the exchange unfair. Deloping in a duel, for whatever reason, could be a risky strategy whether or not the delope was obvious to all present. Deloping with a near miss, in order to save one’s honor without killing, could backfire if the opponent believed the effort to be genuine and responded with a fatal shot. Also, regardless of whether the delope was near or wide, the opponent might infer that he was being insulted as “not worth shooting” (an unworthy opponent) and either take care to aim his own shot to kill or insist on a second exchange.

However, for the opponent to insist upon a second shot after a delope was considered bloodthirsty and unbecoming. Often, it would fall to the seconds to end the duel immediately after a delope had been observed.

The term delope is specific to the use of firearms in a duel which, historically speaking, were typically flintlock pistols. These pistols were notorious for their lack of accuracy at long distances and a particularly skilled marksman might attempt to delope unnoticed with a well-placed “near-miss.” The distance between the two combatants had to be great enough that all others present would assume that any miss was due to this inherent inaccuracy and not intentional. This way the shooter could avoid killing his opponent and, if accused of deloping, claim he had made a genuine effort. Also, the opponent might recognize the “near-miss” as a delope but understand that it was meant for the benefit of any witnesses present and, if the opponent was not insulted, also delope. Both parties could then claim they had each tried to shoot the other and the duel would end without any fatalities.

And this is just one aspect of the actual duel *itself*. Just imagine the amount of additional social structure that would arise in societies with dueling.

as well as the unpowered rocket plummeting at 20:20 and the landing burn at 20:35. (Or go here or here for the GIFs, and here for musical accompaniment.)

I think my favorite book on management is “High Output Management” by Andy Grove, which is kind of the Bible of how do you scale an organization. I have a very long list of recommendations. “The Effective Executive” by Peter Drucker is really good. But basically what you’re talking about is, you need to have good judgment about a lot of different things. What that really is saying is, how do you gain wisdom really quickly? Last book recommendation, “Poor Charlie’s Almanack” by Charlie Munger. This is probably the best book on that that I’ve read, which is really about how do you build a bunch of mental models to break down the complexity of the world and know when to use each.

There is a ton of room for innovation around tunnel boring machines built by the major manufacturers (primarily Herrenknecht and Robbins), around reliability, durability, ease of serviceability etc. I cannot emphasize enough how modern TBM’s require an unbelievable amount of engineering attention, repair labor, spare parts infrastructure etc, similar to many super-early-stage fragile prototype technologies. Unfortunately TBM’s are no longer early stage, but for some reason the technology is frozen at just-good-enough-to-barely-work.”

However, I don’t know if the economics will work out to fix any of these. Here are a few of the big problems:

- Most TBM’s are semi- or fully-customized for a single job. This raises machine costs. It’d be better if there were only a small range of small-medium-large TBM’s that work ~everywhere.
- Most TBM’s are fully assembled in the factory, smoketested (aka turned on to make sure they work), disassembled and shipped to the job, then reassembled and used. This is not efficient; surely we can figure out a better way.
- Most TBM’s are entombed aka thrown away at the end of the job, because getting them out of the hole is expensive and difficult.
- Changing the cutterheads is labor intensive and dangerous and requires highly trained very expensive humans, and it’s slow. While you change the cutterheads, your billion dollar toy is sitting there doing nothing.
- TBM architecture is highly dependent on geology. A slurry faced TBM that works in mixed soils is a totally different beast from a hard-rock TBM. It would be cool to have one machine that works in many geologies, perhaps with minimal or automated modifications.
- TBM’s require lots of care and feeding from a small army of humans. This raises job costs.
- Topside support infrastructure such as slurry plants and ground freezing machinery comes from different vendors, often even from different countries. E.g. it’s common to buy your topside slurry plant from the MS company, in France, while your ground freeze vendor might be Tachibana from Japan. Often each subsystem’s engineers on site literally don’t even speak a common language. Hilarity predictably ensues. Vertical integration would pay huge dividends here.
Ideally Elon can mass-produce TBM’s that just work out of the box for most jobs, and that are easier to work on. Then we can laugh him out of town for his stupid “put cars in tunnels” ideas and use his miracle machines to build sensible train tunnels.

Lots of parallels with the rocket industry.

The plot of fusion triple product as a function of time at 43:45 is depressing.

With no new data coming in, what else can they do? He thinks the Maverick advice to “think wider, more unconventional” is fine but limited in possible impact.

My suggestion, on which I have much to write in the future, is that theorists should get their house in order: distill what is known in a single authoritative place. It should incorporate the best mechanisms from the Review of Particle Physics, Reviews of Modern Physics, the Stanford Encyclopedia of Philosophy, Wikipedia, and GitHub, and would require developing much more powerful tools for collaboration and dispute resolution than we have now, eventually with great value to other fields.

…how good is the social science in this area? I would say “not so great.” Try looking for good public choice treatments of how climate intentions end up translated into climate policy. That is a remarkably important question, and yet it is understood poorly.

Or “how many of the people who make proclamations in this area have a decent understanding of Chinese energy and climate policy?”, and the answer is hardly any, even though that may be the most important topic in the area. And I ask that question not only of the casual tweeters but also of the academics who work on climate change.

And a game theoretic (and idealistic) analysis of the North Korean nuclear situation.

The difference from suspension bridges:

In a suspension structure such as the Golden Gate Bridge primary cables are strung from tower to tower and secondary cables drop down from those to hold the roadbed in place. Cable-stayed bridges, by contrast, have cables that run directly from the tower to the road. They essentially eliminate the cables between towers.

But the longest spans still go for suspension:

Cable-stay is inadvisable for bridges with a main span longer than 915 meters because the towers would have to soar twice as high as the towers of a suspension bridge of the same length to string enough cables to hold the road deck in place. For very long bridges, traditional suspension wins out.

(H/t ghaff.)

More video. (H/t Howard Wiseman.)

The position of Huygens’ landing site on Titan was found with a precision of one km (one km on Titan measures 1.3′ of latitude or longitude at the equator) using the Doppler data at a distance from Earth of about 1.2 billion kilometers.
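The quoted angular figure is easy to sanity-check. A minimal sketch, assuming a mean radius for Titan of about 2,575 km (a value not stated in the excerpt):

```python
import math

# How many arcminutes of latitude does 1 km subtend on Titan's surface?
# Assumes Titan's mean radius is ~2575 km (not given in the quoted excerpt).
TITAN_RADIUS_KM = 2575.0

km_per_arcmin = 2 * math.pi * TITAN_RADIUS_KM / (360 * 60)  # circumference / arcminutes
arcmin_per_km = 1.0 / km_per_arcmin

print(f"1 km on Titan subtends about {arcmin_per_km:.2f} arcminutes")
```

The result comes out to roughly 1.3′ per kilometer, consistent with the figure in the quote.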

When people lament the demise of communities or multi-generation family units in the United States, this is the kind of mutual support they’re thinking of. The extent to which America was once comprised of warm, child-raising villages in its real-life past is, of course, greatly exaggerated, and we certainly shouldn’t romanticize local communities per se: they always have the capacity to be meddling, oppressive, and exclusionary. But all communities don’t have to be like that, and instead of abdicating community ideals as outdated, we could be working to realize them better in the particular places we live. As American lifestyles become increasingly mobile and rootless, close involvement in a community may not be foremost on people’s minds; to the extent that people these days talk about “settling down” somewhere, they usually seem to be thinking in terms of sending their kids to a local school, patronizing nearby restaurants, and attending summer concerts in the park, not trundling around to people’s homes and asking what they can do for them.

But even if we aren’t planning to live in the same town for the entire rest of our lives, we mustn’t allow ourselves to use this as a convenient excuse to distance ourselves from local problems we may have the power to ameliorate. People who come to the U.S. from other parts of the world often find our way of living perverse, in ways we simply take for granted as facts of human nature, rather than peculiar societal failings. I was recently talking to a Haitian-born U.S. citizen who works long hours as a nurse’s aid, and then comes home each night to care for her mentally disabled teenage son. She told me that if it were possible, she would go back to Haiti in a heartbeat. She was desperately poor in Haiti, but there, she said, her neighbors would have helped her: they would have invited her over for dinner, they would have offered to look after the children. “Here,” she said, “nobody helps you.” That’s one of the worst condemnations of American civil society I’ve heard in a while.

Finding a precise definition of branches in the wavefunction of closed many-body systems is crucial to conceptual clarity in the foundations of quantum mechanics. Toward this goal, we propose amplification, which can be quantified, as the key feature characterizing anthropocentric measurement; this immediately and naturally extends to non-anthropocentric amplification, such as the ubiquitous case of classically chaotic degrees of freedom decohering. Amplification can be formalized as the production of redundant records distributed over spatially disjoint regions, a certain form of multi-partite entanglement in the pure quantum state of a large closed system. If this definition can be made rigorous and shown to be unique, it is then possible to ask many compelling questions about how branches form and evolve.

A recent result shows that branch decompositions are highly constrained just by this requirement that they exhibit redundant local records. The set of all redundantly recorded observables induces a preferred decomposition into simultaneous eigenstates unless their records are highly extended and delicately overlapping, as exemplified by the Shor error-correcting code. A maximum length scale for records is enough to guarantee uniqueness. However, this result is grounded in a preferred tensor decomposition into independent microscopic subsystems associated with spatial locality. This structure breaks down in a relativistic setting on scales smaller than the Compton wavelength of the relevant field. Indeed, a key insight from algebraic quantum field theory is that finite-energy states are never exact eigenstates of local operators, and hence never have exact records that are spatially disjoint, although they can approximate this arbitrarily well on large scales. This technical challenge frustrates not just the concept of redundancy-based branches, but in fact the entire theory of decoherence as a way to precisely understand measurement in quantum field theories.

There are at least two possible resolutions: (1) Find a framework for identifying branches in fields using approximate records and/or approximate locality; or (2) find an alternative, more fundamental mathematical characterization of branching in the relativistic setting that reduces to (or otherwise supersedes) redundancy on scales much larger than the Compton wavelength. This investigation is closely related to currently open questions about the distribution and interpretation of entanglement in the vacuum. Speculatively, an objective, mathematically rigorous decomposition of a many-body state into branches may also speed up numerical simulations of nonstationary many-body states, illuminate the thermalization of closed systems, and demote measurement from a fundamental primitive in the quantum formalism. It also opens up the possibility of analyzing situations where branches may recombine — and the operational Copenhagen approach must fail — such as in the early universe, exotic materials, the distant future, or thermalizing systems.

“Elite Law Firms Cash in on Market Knowledge”:

…corporate transactions such as mergers and acquisitions or financings are characterized by several salient facts that lack a complete theoretical account. First, they are almost universally negotiated through agents. Transactional lawyers do not simply translate the parties’ bargain into legally enforceable language; rather, they are actively involved in proposing and bargaining over the transaction terms. Second, they are negotiated in stages, often with the price terms set first by the parties, followed by negotiations primarily among lawyers over the remaining non-price terms. Third, while the transaction terms tend to be tailored to the individual parties, in negotiations the parties frequently resort to claims that specific terms are (or are not) “market.” Fourth, the legal advisory market for such transactions is highly concentrated, with a half-dozen firms holding a majority of the market share.

[Our] claim is that, for complex transactions experiencing either sustained innovation in terms or rapidly changing market conditions, (1) the parties will maximize their expected surplus by investing in market information about transaction terms, even under relatively competitive conditions, and (2) such market information can effectively be purchased by hiring law firms that hold a significant market share for a particular type of transaction.

…The considerable complexity of corporate transaction terms creates an information problem: One or both parties may simply be unaware of the complete set of surplus-increasing terms for the transaction, and of their respective outside options should negotiations break down. This problem is distinct from the classic problem of valuation uncertainty. Rather than unawareness of facts that may affect the value of the capital asset to be transferred between the parties, the problem identified here is unawareness of the possibilities for contracting with respect to that asset.

The non-price terms of transactional agreements and their associated payoffs may change rapidly as a result of contractual innovation and market conditions, such that parties without current market information may have difficulty determining their expected surplus from transacting. This is particularly so for corporate transactions involving private companies or private securities offerings, because the transaction terms will remain private for at least some period of time

In an effort to keep the community as clean and orderly as possible, new users have very little rights from the get-go. On paper, this is a pretty nice idea. In practice, it makes it difficult for new users to gain any traction. I read through a number of questions today and had several comments for the original poster. Unfortunately, I couldn’t make my comments, since new users cannot post comments on articles they themselves didn’t write (you have to gain “reputation” in order to gain that privilege). Posting my comment as an “answer” to the original question seemed like bad form, so I didn’t do that. Looking elsewhere around the site, I found a few questions I felt I could answer. As soon as I went to answer said questions, someone else (in some cases, a number of other people) had jumped in and beaten me to the punch. I never had a chance to provide a helpful answer. Not only do you have to be very knowledgeable about a subject, you’ve also got to be very fast in providing said answer. I eventually did provide an answer for a question, then realized that my approach wouldn’t work. Before I could take action and modify the answer, my submission had already been modded down by several people, several of whom left snarky remarks. What a warm welcome for a new user! I subsequently deleted my answer.

Here are my (reposted) thoughts: These issues seem to mirror complaints about Wikipedia over the past ~5 years. Certainly, I have experienced aspects of this phenomenon at both sites. In both cases, many people claim this represents a decline in quality of the site, perhaps due to incumbent users entrenching themselves with wiki-lawyering (and SO’s equivalent). These explanations used to appeal to me too.

However, my new tentative interpretation is that this doesn’t represent a decline in these sites so much as them asymptoting to their maximum quality given their rules and site structure. The key evidence is that, in both cases, the apparent decline in quality has not caused these sites to become less useful to *read*. My understanding is that their traffic continues to grow and there are no serious competitors. In other words: it becomes harder and harder to contribute to these sites *given their rules*, frustrating users, but their quality is either still slowly improving or stagnant.

(I’ve seen isolated examples where the quality of a Wikipedia page has declined, but I think these are exceptions to the rule. Evidence that the decline is systematic would falsify my interpretation.)

Now, this is only a valid argument insofar as you assume the mission of the site is to produce good content for others rather than help the users. This seems a safe assumption for Wikipedia. It’s easy for SO users to think that the site is about answering their questions, but the founders make it pretty clear that this takes a back seat to producing useful content. Helping users is an instrumental goal, acting to incentivize contribution, for the primary goal of creating content. SO has become less helpful to users mostly because it has answered the questions that are best suited for the structure of the site, and now a large fraction of questions being asked are marginal questions, e.g., ones that are more subjective or localized and not a good fit for SO’s format.

This is *not* to say that these sites couldn’t improve with new structure/mechanisms/rules, and I think it’s completely true that people’s complaints are pointing exactly toward the issues that new mechanisms could help with. But I conjecture this mostly requires new insight about how to design collaborative websites, and new technical tools. For instance, I am an inclusionist, and I think Wikipedia should move in that direction. But deletionism exists for a very real reason: it’s a crude method of avoiding the workload required to police a long tail of non-notable but verifiable content which has a high surface area for abuse. The way to fix this, in my opinion, is to create new tools for monitoring and verifying content at scale, *not* to spend more time arguing against deletionism.

Aphrodite and Anvil were the World War II code names of United States Army Air Forces and United States Navy operations to use B-17 and PB4Y bombers as precision-guided munitions against bunkers and other hardened/reinforced enemy facilities, such as those targeted during Operation Crossbow….

Old Boeing B-17 Flying Fortress bombers were stripped of all normal combat armament and all other non-essential gear (armor, guns, bomb racks, transceiver, seats, etc.), relieving about 12,000 lb (5,400 kg) of weight. To allow easier exit when the pilot and co-pilot were to parachute out, the canopy was removed. Azon radio remote-control equipment was added, with two television cameras fitted in the cockpit to allow a view of both the ground and the main instrumentation panel to be transmitted back to an accompanying CQ-17 ‘mothership’.

The drone was loaded with explosives weighing more than twice that of a B-17’s normal bomb payload. The British Torpex used for the purpose was itself 50% more powerful than TNT.

A relatively remote location in Norfolk, RAF Fersfield, was the launch site. Initially, RAF Woodbridge had been selected for its long runway, but the possibility of a damaged aircraft that diverted to Woodbridge for landings colliding with a loaded drone caused concerns. The remote control system was insufficient for safe takeoff, so each drone was taken aloft by a volunteer pilot and a volunteer flight engineer to an altitude of 2,000 ft (600 m) for transfer of control to the CQ-17 operators. After successful turnover of control of the drone, the two-man crew would arm the payload and parachute out of the cockpit. The ‘mothership’ would then direct the missile to the target….

but

Of 14 missions flown, none resulted in the successful destruction of a target. Many aircraft lost control and crashed or were shot down by flak, and many pilots were killed.

The basic idea is that in order to pick stocks, you must *necessarily* reduce your diversification. Even if you pick stocks completely randomly, so that your expected return equals the index, a risk-averse investor is worse off due to the larger variance. Since there are many ways to interconvert between additional expected return and risk (e.g., insurance), inducing an exchange rate between the two, there is a quantifiable cost to active management even when it is random. (Indeed, I expect managers to take some steps to effectively buy insurance to reduce their variance.) In other words: the cost of active management does not go to zero in the limit where the manager picks randomly and fees are ignored. Rather, as soon as you start picking you accrue a cost through reduced diversification; you cannot come out ahead unless the size of your edge compensates for this non-zero cost.

(Ben Hoskin pointed out to me that this edge could be from having small insight into the expected returns of stocks *or* their risk and/or correlation with other stocks. Skilled managers could pick a small portfolio of stocks with equal returns to the index but reduced risk.)

The authors emphasize the skewness of the distribution of performance over individual stocks, but I suspect the more precise cause is the fact that index returns are dominated by a small number of outperforming stocks *even as the number of stocks in the index grows large*. (If I knew more about this I would probably say “power law” or “black swans” or something.) The reason this is important is that, were the opposite true, active managers could still approach the volatility of the index by simply having a large portfolio on an absolute scale; in contrast, for outlier-dominated indices, you must have an order-unity fraction of the index, i.e., a large portfolio on a *relative* scale. Now, such a distribution of stock performance implies skewness simply because there is a zero lower bound on performance, but there are skewed distributions where random active managers can do fine. I *think* my claim is different from the authors’ insofar as mine requires that the variance of stock performance is unbounded, whereas the authors assume an ideal lognormal distribution (which has finite variance) but obtain the effect simply because their simulation has a finite number of stocks. The number of hypothetical stocks necessary to demonstrate the difference might exceed the actual number of real-world stocks, though, so maybe the distinction is moot.
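The diversification cost of random picking is easy to demonstrate numerically. The sketch below is my own illustration, not the authors’ simulation: it draws one period of skewed (lognormal) stock returns, then compares an equal-weighted “index” of all stocks against many randomly chosen 10-stock portfolios. The random portfolios match the index return on average, but carry extra variance, which is exactly the quantifiable cost described above.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STOCKS, K_PICKED, N_TRIALS = 500, 10, 20_000

# One period of skewed single-stock returns; a few big winners dominate,
# as argued above for real indices. (Lognormal is an illustrative choice.)
returns = rng.lognormal(mean=0.0, sigma=0.8, size=N_STOCKS) - 1.0

index_return = returns.mean()  # equal-weighted "index" of all stocks

# Many random 10-stock portfolios drawn from the same universe: the expected
# pick return equals the index return, but the variance does not vanish.
picks = np.array([
    returns[rng.choice(N_STOCKS, size=K_PICKED, replace=False)].mean()
    for _ in range(N_TRIALS)
])

print(f"index return:           {index_return:+.3f}")
print(f"mean of random picks:   {picks.mean():+.3f}")  # ~ index return
print(f"std of random picks:    {picks.std():.3f}")    # extra variance from picking
print(f"picks below the index:  {(picks < index_return).mean():.1%}")  # usually > 50% for skewed returns
```

With skewed returns, most random portfolios land below the index even though their mean matches it, illustrating why a randomly picking manager is worse than the index for a risk-averse investor before any fees.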

Two parts of the brain are heavily involved in remembering our personal experiences. The hippocampus is the place for short-term memories while the cortex is home to long-term memories. This idea became famous after the case of Henry Molaison [“HM”] in the 1950s. His hippocampus was damaged during epilepsy surgery and he was no longer able to make new memories, but the ones from before the operation were still there. So the prevailing idea was that memories are formed in the hippocampus and then moved to the cortex where they are “banked”. The team at the Riken-MIT Center for Neural Circuit Genetics have done something mind-bogglingly advanced to show this is not the case. The experiments had to be performed on mice, but are thought to apply to human brains too.

They involved watching specific memories form as a cluster of connected brain cells in reaction to a shock. Researchers then used light beamed into the brain to control the activity of individual neurons – they could literally switch memories on or off. The results, published in the journal Science, showed that memories were formed simultaneously in the hippocampus and the cortex.

Most investors in search of an advisor are looking for someone they can trust. Yet, trust can be fragile. Typically, trust is established as part of the “courting” process, in which your clients are getting to know you and you are getting to know them. Once the relationship has been established, and the investment policy has been implemented, we believe the key to asset retention is keeping that trust.

So how best can you keep the trust? First and foremost, clients want to be treated as people, not just as portfolios. This is why beginning the client relationship with a financial plan is so essential. Yes, a financial plan promotes more complete disclosure about clients’ investments, but more important, it provides a perfect way for clients to share with the advisor what is of most concern to them: their goals, feelings about risk, their family, and charitable interests. All of these topics are emotionally based, and a client’s willingness to share this information is crucial in building trust and deepening the relationship.

A key problem, however, is that not all of the defective mitochondria can be eliminated. The boy, Zhang reports in the new paper, currently carries between 2.36 and 9.23 percent of potentially defective DNA, according to sampling of his urine, hair follicles and circumcised foreskin.

“That’s not surprising,” says Doug Wallace, head of the Center for Mitochondrial and Epigenomic Medicine at the Children’s Hospital of Philadelphia, who was not involved in the study. “As far as I know, very few cases have been found where there is absolutely no carryover of mitochondria from the donor nucleus.”

More here and here.

Within any field, there are certain theorems and certain techniques that are generally known and generally accepted. When you write a paper, you refer to these without proof. You look at other papers in the field, and you see what facts they quote without proof, and what they cite in their bibliography. You learn from other people some idea of the proofs. Then you’re free to quote the same theorem and cite the same citations. You don’t necessarily have to read the full papers or books that are in your bibliography. Many of the things that are generally known are things for which there may be no known written source. As long as people in the field are comfortable that the idea works, it doesn’t need to have a formal written source.

Why one might produce cryptic proofs:

I’d like to spell out more what I mean when I say I proved this theorem. It meant that I had a clear and complete flow of ideas, including details, that withstood a great deal of scrutiny by myself and by others. Mathematicians have many different styles of thought. My style is not one of making broad sweeping but careless generalities, which are merely hints or inspirations: I make clear mental models, and I think things through. My proofs have turned out to be quite reliable. I have not had trouble backing up claims or producing details for things I have proven. I am good in detecting flaws in my own reasoning as well as in the reasoning of others.

However, there is sometimes a huge expansion factor in translating from the encoding in my own thinking to something that can be conveyed to someone else. My mathematical education was rather independent and idiosyncratic, where for a number of years I learned things on my own, developing personal mental models for how to think about mathematics. This has often been a big advantage for me in thinking about mathematics, because it’s easy to pick up later the standard mental models shared by groups of mathematicians. This means that some concepts that I use freely and naturally in my personal thinking are foreign to most mathematicians I talk to. My personal mental models and structures are similar in character to the kinds of models groups of mathematicians share—but they are often different models.

The social phenomenon of trust in mathematics is *weird*, and this description fits with how I have heard it described by others:

Mathematicians were actually very quick to accept my proof, and to start quoting it and using it based on what documentation there was, based on their experience and belief in me, and based on acceptance by opinions of experts with whom I spent a lot of time communicating the proof. The theorem now is documented, through published sources authored by me and by others, so most people feel secure in quoting it; people in the field certainly have not challenged me about its validity, or expressed to me a need for details that are not available.

Thurston seems to accept all this even though he clearly understands that the point of academics is to put knowledge into human minds, not write it down on paper. I mostly just feel that all of this is highly suboptimal, and can only hope that new technical tools will improve the situation.

The U.S. Food and Drug Administration granted 23andMe authorization to offer ten genetic health risk reports including late-onset Alzheimer’s disease, Parkinson’s disease, celiac disease, and a condition associated with harmful blood clots.

…the altruism-heuristic model predicts that increasing people’s altruistic dispositions toward other people will lead to greater use of action constraints such as “do not kill,” but instead the reverse occurs. Kurzban, DeScioli, and Fein (2012) found that participants reported greater willingness to kill one brother to save five brothers than to kill one stranger to save five strangers. Altruism causes people to be less likely, not more likely, to use Kantian action constraints.

(H/t Diego Caleiro.)

(H/t Paul Blackburn.)

Julian Simon helped revolutionize the airline industry by popularizing the idea that carriers should stop randomly removing passengers from overbooked flights and instead auction off the right to be bumped by offering vouchers that go up in value until all the necessary seats have been reassigned. Simon came up with the idea for these auctions in the 1960s, but he wasn’t able to get regulators interested in allowing it until the 1970s. Up until that time, Litan writes, “airlines deliberately did not fill their planes and thus flew with less capacity than they do now, a circumstance that made customers more comfortable, but reduced profits for airlines.” And this, of course, meant they had to charge passengers more to compensate.
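Simon’s mechanism is simple enough to sketch in code. Here is a toy version (the numbers and the uniform-random passenger valuations are mine, purely illustrative): the airline raises the voucher offer until enough passengers volunteer.

```python
import random

def bump_auction(reservation_prices, seats_needed, start=100, step=50):
    """Raise the voucher offer until enough passengers volunteer to be bumped.

    reservation_prices: each passenger's minimum acceptable voucher value.
    Returns the clearing offer, i.e., the first offer with enough volunteers.
    """
    offer = start
    while sum(p <= offer for p in reservation_prices) < seats_needed:
        offer += step
    return offer

# Toy overbooked flight: 120 passengers, 5 too few seats.
random.seed(0)
valuations = [random.uniform(100, 2000) for _ in range(120)]
print(f"Voucher clears at ${bump_auction(valuations, seats_needed=5)}")
```

The seats are surrendered by exactly the passengers who value them least, which is where the efficiency gain over random bumping comes from.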

By auctioning off overbooked seats, economist James Heins estimates that $100 billion has been saved by the airline industry and its customers in the 30-plus years since the practice was introduced.

To me, the take home message is that human institutions *really can* have huge flaws that can be demonstrated with simple arguments, yet persist for decades. There’s more work to be done.

The key trick is that you can get from LEO to the cislunar orbits through electrical propulsion which is much slower but more efficient than normal chemical rockets. This isn’t a good way to move astronauts (since it’s so slow), but it’s a great way to economically get lots of mass out of Earth’s gravity well, e.g., spacious Mars transportation with heavy radiation shielding, return fuel, Mars habitats, etc. So spend a few years slowing raising this stuff, then send out the actual human in a small pod (with fast chemical rockets) to pick up the goodies and head off for Mars.
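The efficiency gap is the exhaust-velocity term in the Tsiolkovsky rocket equation, Δv = Isp · g₀ · ln(m₀/m_f). A quick sketch with typical textbook Isp values (my illustrative numbers, not from any mission study):

```python
import math

def propellant_fraction(delta_v, isp, g0=9.81):
    """Propellant share of initial mass needed for a given delta-v.
    Tsiolkovsky: delta_v = isp * g0 * ln(m0 / m_final)."""
    return 1 - math.exp(-delta_v / (isp * g0))

dv = 6000.0  # m/s; illustrative LEO-to-cislunar figure, not from the article
print(f"Chemical (Isp ~ 450 s):  {propellant_fraction(dv, 450):.0%} propellant")
print(f"Electric (Isp ~ 3000 s): {propellant_fraction(dv, 3000):.0%} propellant")
```

The electric tug needs a far smaller propellant fraction for the same delta-v; the price is thrust so low that the trip takes months, which is fine for cargo but not for crews.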

Pat Troutman put out the idea of using Stickney crater on Phobos as a low radiation environment. With Phobos behind, Mars above, and crater walls all around, it’s about as sheltered as can be for a surface. (Even Kim Stanley Robinson didn’t think of that in Red Mars.) That benign environment allows long stays, up to 900 days. Astronauts stationed in Stickney could teleoperate rovers on the surface without noticeable latency.

Also, here is a rad scatter plot of delta-V vs. trip duration comparing lunar, asteroid, and Mars missions.

- Checking extreme and near-extreme cases
- Reflexivity and self-reference — “Death by diagonalization”
- Self-undermining views
- Transform to nearby contexts/modalities — Interchange space with time, rationality with morality, chance with counterfactuals
- Argue for something’s possibility — Use conceivability, arbitrariness, continuity with interpolation or extrapolation, symmetry
- Trial and error — Break possibilities into enumerable subcomponents and check the combinations

If you have any experience reading philosophy papers, or are around thoughtful people, it’s easy to think some of these are obvious or too generic to write down. I think that’s a mistake, though; I can reach back and vaguely recall finding the initial uses of many of these heuristics revelatory and effective. And it’s a common mistake in academia to confuse novelty with depth/importance. I expect this to be most useful to students who are just being exposed to these, since it will help them keep their eyes open for their re-appearance.

My principal critique of this piece is that adding mathematical and scientific examples would have been more valuable than just philosophical ones. In the latter case we might often worry that some sleight of hand is being employed. (Have I really accomplished anything when I note that the statement “Unverifiable sentences are meaningless” cannot be verified?) Math and science provide more objective examples of when “real explanatory work” has been done.

A summary of the arguments for why multigenerational mobility is not as low as Clark thinks. I may be misunderstanding this field, but it seems to me that the randomized lottery-style experiments show there’s not much long-term transmission of wealth through non-genetic means (which makes sense since only one person can get an inheritance). But transmission of wealth through genetic means is heavily dependent on assortative mating, since three generations out your descendants only have an eighth of your genes anyway. I wonder if anyone has looked into whether the places that have been found to have unusually low intergenerational mobility (medieval Venice?) are the ones that have the most assortative mating.
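The one-eighth figure is just halving per generation; a toy calculation (the model and parameter values are entirely mine, for illustration) shows how assortative mating slows the dilution, since a genetically similar spouse “returns” some of the ancestor’s variants each generation:

```python
def expected_share(generations, spouse_similarity=0.0):
    """Expected fraction of a given ancestor's genes a descendant carries.

    Each generation the child takes half its genome from the lineage parent.
    In this toy model, assortative mating means the spouse's half also
    carries a fraction `spouse_similarity` of whatever the lineage carries.
    """
    share = 1.0
    for _ in range(generations):
        share = 0.5 * share + 0.5 * spouse_similarity * share
    return share

print(expected_share(3))       # random mating: 1/8 after three generations
print(expected_share(3, 0.2))  # assortative mating: noticeably more
```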

However, I am happy to note that someone launched PubCenter last year, which aims to build a searchable archive of RSS feeds since its date of inception (July 2016). The original, raw Google Reader archive data has apparently been saved, so these sources could in principle be combined, but I think the Google Reader data is an unusable mess right now. There are also archives of select blogs going back to circa 2013 at the Old Reader, and you can organize them by “oldest first” and read from that list.

(↵ returns to text)

- I thought this XKCD depiction would be more helpful, but it suggests the opposite conclusion. I’m not sure how to reconcile all this.↵

When the wave function of a large quantum system unitarily evolves away from a low-entropy initial state, there is strong circumstantial evidence it develops “branches”: a decomposition into orthogonal components that is indistinguishable from the corresponding incoherent mixture with feasible observations. Is this decomposition unique? Must the number of branches increase with time? These questions are hard to answer because there is no formal definition of branches, and most intuition is based on toy models with arbitrarily preferred degrees of freedom. Here, assuming only the tensor structure associated with spatial locality, I show that branch decompositions are highly constrained just by the requirement that they exhibit redundant local records. The set of all redundantly recorded observables induces a preferred decomposition into simultaneous eigenstates unless their records are highly extended and delicately overlapping, as exemplified by the Shor error-correcting code. A maximum length scale for records is enough to guarantee uniqueness. Speculatively, objective branch decompositions may speed up numerical simulations of nonstationary many-body states, illuminate the thermalization of closed systems, and demote measurement from fundamental primitive in the quantum formalism.

Here’s the figure^{a } and caption:

Spatially disjoint regions with the same coloring (e.g., the solid blue regions ) denote different records for the same observable (e.g., ). (a) The spatial record structure of the Shor-code family of states, which can exhibit arbitrary redundancy (in this case four-fold) for two incompatible observables. (b) The solid orange observable pair-covers the hashed blue observable because the top two orange records overlap all blue records. However, if one of the top two orange records is dropped, then neither observable pair-covers the other, and hence both are compatible, despite many overlaps of individual records. (c) Any spatially bounded set of records can be contained inside a single record of a sufficiently dilated but otherwise identical set of records for an incompatible observable; such a state is given in Eq. (9). (d) Any observable with records satisfying the hypothesis of the Corollary for some length cannot pair-cover, or be pair-covered by, any other such observable.

It is my highly unusual opinion that identifying a definition for the branches in the wavefunction is the most conceptually important problem in physics. The reasoning is straightforward: (1) quantum mechanics is the most profound thing we know about the universe, (2) the measurement process is at the heart of the weirdness, and (3) the critical roadblock to analysis is a definition of what we’re talking about. (Each step is of course highly disputed, and I won’t defend the reasoning here.) In my biased opinion, the paper represents the closest yet anyone has gotten to giving a mathematically precise definition.

On the last page of the paper, I speculate on the possibility that branch finding may have practical (!) applications for speeding up numerical simulations of quantum many-body systems using matrix-product states (MPS), or tensor networks in general. The rough idea is this: Generic quantum systems are exponentially hard to simulate, but classical systems (even stochastic ones) are not. A definition of branches would identify *which degrees of freedom* of a quantum system could be accurately simulated classically, and when. Although classical computational transitions are understood in certain special cases, our macroscopic observations of the real world strongly suggest that *all* systems we study admit classical descriptions on large enough scales.^{b } There is reason to think a general (abstract) description of the quantum-classical transition is possible, and would allow us to go beyond special cases.

In the rest of this blog post I’m going to construct a simple MPS state featuring branches with records that are arbitrarily redundant. The state’s key features will be that it is translationally invariant and has finite correlation length. Translational invariance makes the state much easier to study and compare with the literature, and is a property shared with the simple inflationary model I’m studying with Elliot Nelson. Finite correlation length eliminates the trivial solution of a generalized GHZ state, guarantees that there are an infinite number of branches (if there are any at all), and is in some sense more natural.

Our strategy will be to build the state up from a *classical* probability distribution that already has the key features. Let , be a classical state of a finite 1D lattice of bits (), and consider the canonical ensemble starting with a uniform distribution and adding an energetic penalty of for misaligned nearest-neighbors^{c }:

(1)

with . It’s not hard to check that the classical expectation value obeys and

(2)

If we define our correlation length by the asymptotic behavior , then for large (large ) we have

(3)

The idea is to define our quantum state as a superposition of configurations of a spin chain weighted by this distribution: . Contiguous regions of size will be highly correlated. The coarse-grained variables that are being recorded are something like “whether this local region is mostly or mostly ”, but we don’t necessarily want each qubit to have a perfect record of this information. Rather, we choose

(4)

for some fixed (typically small) angle . For one can’t reliably infer the classical variable from a measurement on a single spin, but one can make an arbitrarily reliable inference by measuring lots of them: . If we define our “recording length” by the asymptotic behavior , then for small we have

(5)

Thus our final (unnormalized) state is

(6)

for arbitrary positive parameters and . The state inherits from the classical probability distribution the properties of finite correlation length and translational invariance. Any contiguous set of spins is very likely to all be in the same state, and by measuring a subset of spins one can distinguish between the cases and with high reliability. Therefore in the limit it makes sense to define the redundancy ; this is the approximate number of disjoint records that are available about each branch, where there is a (binary) branch density of .
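Since the equations above were lost in formatting, here is a brute-force numerical sketch with stand-in definitions: the misalignment penalty, the square-root weighting, and the record-ket parametrization below are my guesses at the elided formulas, chosen to match the prose (θ = 0 would give perfect single-qubit records; larger θ blurs the two record states together). It exhibits the two advertised features: correlators that depend only on separation and decay with distance.

```python
import numpy as np
from itertools import product
from functools import reduce

# Stand-in parameters (mine): classical weight exp(-beta) per misaligned
# nearest-neighbor pair, periodic boundaries, imperfect record angle theta.
beta, theta, N = 1.5, 0.6, 8
ket = {0: np.array([np.cos(theta / 2), np.sin(theta / 2)]),
       1: np.array([np.sin(theta / 2), np.cos(theta / 2)])}

# Superpose all 2^N classical configurations, weighted by the square root
# of the classical weights.
psi = np.zeros(2 ** N)
for s in product([0, 1], repeat=N):
    misaligned = sum(s[i] != s[(i + 1) % N] for i in range(N))
    psi += np.exp(-beta * misaligned / 2) * reduce(np.kron, [ket[b] for b in s])
psi /= np.linalg.norm(psi)

# Two-point correlators decay with separation (finite correlation length)
# and depend only on the separation (translational invariance).
Z, I2 = np.diag([1.0, -1.0]), np.eye(2)
def op_at(ops):  # operator ops[i] on site i, identity elsewhere
    return reduce(np.kron, [ops.get(i, I2) for i in range(N)])

for d in (1, 2, 3, 4):
    print(f"<Z_0 Z_{d}> = {psi @ op_at({0: Z, d: Z}) @ psi:+.4f}")
```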

Now let’s turn this into a matrix-product state.^{d } That is, it should take the form^{e }^{f }

(7)

where the range over the local Hilbert spaces of the qubits and the are the “bonds” representing contraction of matrices of as-yet unspecified dimension (depicted in the figure as lines connecting the ‘s horizontally). The trick is to pretend that our classical probability distribution was achieved by starting with a quantum state of appropriately weighted bras,

(8)

and contracting (i.e., multiplying from the right to inner-product over the fictional bras) with a specially designed state that picks out the correct conditional states of the original Hilbert space:

(9)

The object

(10)

then takes the form of (7) and is equal to so long as we choose to satisfy

(11)

or, more cleanly,

(12)
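That the resulting state is an exact bond-dimension-2 MPS can be checked numerically: let the bond index carry the classical bit, pay the square root of the misalignment penalty on each hop, and emit the record-ket amplitude on the physical leg. (The specific weights and record kets below are my stand-ins for the post’s elided formulas, not necessarily its exact choices.)

```python
import numpy as np
from itertools import product
from functools import reduce

# Stand-in parameters (mine), matching the prose description.
beta, theta, N = 1.5, 0.6, 8
ket = {0: np.array([np.cos(theta / 2), np.sin(theta / 2)]),
       1: np.array([np.sin(theta / 2), np.cos(theta / 2)])}

# MPS tensor M[sigma][s_prev, s]: the bond carries the classical bit, each
# hop pays the square root of the classical penalty, and the physical leg
# emits the record-ket amplitude for the current bit.
M = np.zeros((2, 2, 2))
for s_prev, s in product([0, 1], repeat=2):
    hop = np.exp(-beta * (s_prev != s) / 2)
    for sigma in (0, 1):
        M[sigma, s_prev, s] = hop * ket[s][sigma]

def mps_amp(sigmas):
    """Amplitude from the MPS; periodic boundaries become a trace."""
    return np.trace(reduce(np.dot, [M[sig] for sig in sigmas]))

def direct_amp(sigmas):
    """Amplitude from the brute-force sum over classical configurations."""
    total = 0.0
    for s in product([0, 1], repeat=N):
        w = np.exp(-beta * sum(s[i] != s[(i + 1) % N] for i in range(N)) / 2)
        total += w * np.prod([ket[s[i]][sigmas[i]] for i in range(N)])
    return total

sig = (0, 0, 1, 1, 0, 1, 0, 0)
print(mps_amp(sig), direct_amp(sig))  # the two amplitudes agree
```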

*[I thank Martin Ganahl and Guifre Vidal for discussion.]*

(↵ returns to text)

- The editor tried to convince me that this figure appeared on the cover for purely aesthetic reasons and this does not mean my letter is the best thing in the issue…but I know better!↵
- Note that whether certain degrees of freedom admit a classical effective description is a *computational* question. The existence of such a description is incompatible with them exhibiting quantum supremacy, but it is compatible with them encoding circumstantial evidence (at the macro scale) for quantum mechanics (at the micro scale).↵
- We assume periodic boundary conditions: ↵
- This will not be a general algorithm for building MPSs since it exploits the special form of the state (6) we’re trying to build, which allows us to identify the ‘s with the ‘s.↵
- Again, periodic boundary conditions↵
- The tikzpicture code was obtained from Piotr Migdał.↵

*preferred basis problem* to the *preferred subsystem problem*; merely specifying the system of interest (by delineating it from its environment or measuring apparatus) is enough, in important special cases, to derive the measurement basis. But this immediately prompts the question: what are the preferred systems? I spent some time in grad school with my advisor trying to see if I could identify a preferred system just by looking at a large many-body Hamiltonian, but never got anything worth writing up.

I’m pleased to report that Cotler, Penington, and Ranard have tackled a closely related problem, and made a lot more progress:

Essential to the description of a quantum system are its local degrees of freedom, which enable the interpretation of subsystems and dynamics in the Hilbert space. While a choice of local tensor factorization of the Hilbert space is often implicit in the writing of a Hamiltonian or Lagrangian, the identification of local tensor factors is not intrinsic to the Hilbert space itself. Instead, the only basis-invariant data of a Hamiltonian is its spectrum, which does not manifestly determine the local structure. This ambiguity is highlighted by the existence of dualities, in which the same energy spectrum may describe two systems with very different local degrees of freedom. We argue that in fact, the energy spectrum alone almost always encodes a unique description of local degrees of freedom when such a description exists, allowing one to explicitly identify local subsystems and how they interact. In special cases, multiple dual local descriptions can be extracted from a given spectrum, but generically the local description is unique.

The paper has a nice, logical layout and is clearly written. It also has an illuminating discussion of the purpose of *nets* of observables (which appear often in the algebraic QFT literature) as a way to define “physical” states and “local” observables when you have no access to a tensor decomposition into local regions.

For me personally, a key implication is that if I’m right in suggesting that we can uniquely identify the branches (and subsystems) just from the notion of locality, then this paper means we can probably reconstruct the branches just from the spectrum of the Hamiltonian.

Below are a couple other comments.

The proper conclusion to draw from this paper is that if a quantum system can be interpreted in terms of spatially local interactions, this interpretation is probably unique. It is tempting, but I think mistaken, to also conclude that the spectrum of the Hamiltonian is more fundamental than notions of locality. Let me explain.

I am a big fan of trying to condense our physical knowledge down to the smallest number of elegant axioms, and I think that sort of work should be ongoing.^{a } Of course, axiom counting and elegance assessment are subjective. One way to try and formalize this is with algorithmic complexity (although there is still plenty of hand waving). We consider two physical theories *(observationally) equivalent* if they make the same experimental predictions and, when multiple equivalent theories are available, we consider the theory described by the shortest algorithm (as measured in bits) to be preferred.^{b }

Cotler et al. don’t address it, but it’s natural to wonder whether we should conclude from their work that the spectrum gives a more fundamental theoretical description of a quantum many-body system in the above sense of algorithmic complexity. But I think it probably does *not* offer improved compression for the simple reason that specifying a Hamiltonian by its spectrum, rather than a lattice with a notion of locality, will require more bits. The reason isn’t surprising, and basically follows from their discussion: under plausible quantification, the abstract space of possible local lattices is smaller than the space of possible spectra. Most Hamiltonian spectra do not correspond to local theories, which of course is closely related to the main idea that there generically aren’t multiple local theories corresponding to the same spectrum. This is especially true for a symmetric lattice, which, even if arbitrarily large, is specified by just a compact list of symmetries plus maybe a small number of additional parameters.
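A toy parameter count (my own, purely illustrative) makes the compression argument concrete: a generic nearest-neighbor chain Hamiltonian is specified by a number of real parameters linear in system size, while its spectrum contains exponentially many eigenvalues.

```python
def local_params(n_sites, d=2):
    """Real parameters for a generic (not translationally invariant)
    nearest-neighbor Hamiltonian on an open chain of d-level systems:
    one Hermitian d^2-by-d^2 coupling term per bond, d^4 real numbers each."""
    return (n_sites - 1) * d ** 4

def spectrum_params(n_sites, d=2):
    """Real numbers in the full spectrum: one eigenvalue per energy level."""
    return d ** n_sites

for n in (4, 8, 16, 32):
    print(n, local_params(n), spectrum_params(n))
```

With translational symmetry the local description compresses further, to O(1) parameters, while the spectrum still grows exponentially with system size.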

Now, one of the motivations of this work is that the spectrum of an operator does seem somehow more fundamental than its representation in any particular basis. Although this is true in a certain sense, and I’m not completely sure how to think about all this, it’s worth remembering that the notion of locality itself is more strongly grounded in observational evidence than the spectrum of the Hamiltonian. This is just the statement that we do not directly measure the spectrum, but rather infer it from a bunch of experimental observations that are all interpreted through locality. If we discovered that locality broke down, it would not be by measuring the spectrum directly.

So instead, the tentative conclusion is that, under the assumptions taken in the paper, the notion of locality is *objective*. That is, locality isn’t subjective/arbitrary like the inertial frame of a particular observer, or a choice of gauge. This is very relevant in the context of holographic approaches to quantum gravity, where there may be incompatible notions of locality. In that context, the appearance of conventional spatial locality is an approximation that breaks down in extreme regimes, evading the assumptions of this paper. (See Cao, Carroll, & Michalakis for a complementary approach.)

Many of Cotler et al.’s claims are about *generic* local Hamiltonians in the sense that they do not apply to some measure-zero sets in the space of local Hamiltonians. Such claims always need to be considered with care. Recall the following: if we take the space of all pure states on a lattice with a finite spacing, and then let the lattice spacing go to zero, we find that the *physical* states with bounded energy (according to any smooth field-theoretic Hamiltonian) form a (rather small) measure-zero subspace of all states. In other words, in this naive construction of a continuous field theory, the generic pure state has discontinuous spatial derivative and divergent/undefined energy, and all the physical states (and physical Hamiltonians) we care about are *not* generic.

Indeed, there really is no well-defined notion of a spatially local tensor-product structure for a continuous field theory.^{c } This isn’t just the statement that you can’t define the Hilbert space as “tensor-product integral” of Hilbert spaces attached to each infinitesimal point in space. You can’t even break the state space up as a tensor product of two spatially disjoint regions. This means that extending their analysis to field theory will be nontrivial.

Cotler et al. acknowledge some of this briefly at the end of section 6, but the issue is distinct from the problems posed by continuous spectra that they emphasize most. [The tensor structure breaks down as we remove the short-distance (UV) cutoff but, so long as there is still a long-distance (IR) cutoff, the spectrum remains discrete, though unbounded.] They discuss observable nets as a replacement for tensor-product structures, which have long been used to deal with these sorts of issues, but it’s not clear whether this will actually be successful for the task of defining locality from the spectrum of the Hamiltonian.

Of course, there are *always* hairy issues with taking the continuum limit, and one can make two retorts:

- If anything, the parameter-counting argument gets more severe in the continuum limit; so shouldn’t we suspect that the core intuitive idea — that locality is, more or less, uniquely defined by the Hamiltonian — will survive for a continuous QFT, modulo details?
- All we know is that the world is approximately described by an effective field theory, and there’s no overwhelming reason to think things are truly continuous at the most fundamental level; so isn’t it valuable to know that locality is unique in a spatially discretized theory?

I think the answer to both of these questions is “yes, probably”. But here is my best guess at how things could go wrong: It could be that, at any finite lattice spacing, generic Hamiltonians admit at most a single notion of locality, but that they admit multiple incompatible *approximate* notions of locality. Even if these are bad approximations for some coarse lattice spacing, it could be that in the continuum limit they become arbitrarily good approximations.

It goes without saying that these arm-chair worries don’t detract from the value of Cotler et al.’s result. One always handles the exact case before doing an epsilon-delta treatment.

*[I thank the authors for discussion that significantly clarified my thinking.]*

(↵ returns to text)

- For instance, I celebrate using *amplification* to identify observables with Hermitian (or normal!) operators rather than simply postulating this. The general constructive approach in the introductory chapter of Weinberg’s QFT text is also excellent on this front (although much could be improved).↵
- There are objections to this approach. It’s not at all obvious what language such an algorithm would be written in, leading to a constant-factor ambiguity in the size of the program. The choice of language is intertwined with questions about the form that fundamental physical axioms are allowed to take. The subjective assessment of elegance is at least partly driven by the extent to which physical axioms can be matched up to sensory experience and our intuition.↵
- They put it clearly: “…a subspace of a space with an explicit TPS [tensor-product structure] will not inherit the TPS in any natural way”.↵

Scott Alexander points to this blog post discussing how the venerable NYTimes massages plots to tell the story they want to tell. The NY Times is far from alone in doing this, of course. The depressing part is that if even *they* are doing it, who isn’t? Now’s as good a time as ever to reiterate that the Times explicitly condones narration and eschews neutrality. In the words of their ombudsman:

I often hear from readers that they would prefer a straight, neutral treatment — just the facts. But The Times has moved away from that, reflecting editors’ reasonable belief that the basics can be found in many news outlets, every minute of the day. They want to provide “value-added” coverage.

Also from Scott: Heritability and stability of intelligence vs. personality:

Results: Both cognition and personality are moderately heritable and exhibit large increases in stability with age; however, marked differences are evident. First, the heritability of cognition increases substantially with child age, while the heritability of personality decreases modestly with age. Second, increasing stability of cognition with age is overwhelmingly mediated by genetic factors, whereas increasing stability of personality with age is entirely mediated by environmental factors. Third, the maturational time-course of stability differs: Stability of cognition nears its asymptote by the end of the first decade of life, whereas stability of personality takes three decades to near its asymptote.

KOLHATKAR: …So if you’re a hedge fund trader or an analyst, you might come into work one morning and your boss, the powerful, you know, portfolio manager at your hedge fund might say, hey, I want you to look at, you know, XYZ’s stock. They make microchips. Figure out if we should buy some.

OK, well, you don’t know much about microchip manufacturing. This is a new subject for you. So you’ve got to figure it out really quickly. You’ve got to do a little bit of what a journalist might do. You’ve got to start calling people and reading things to educate yourself about this business and to try and figure out if this particular company is healthy, is going to do well, if you should risk some of your investors’ money buying shares of this company.

So expert network firms rose up to kind of meet this demand. They realized that these hedge fund traders had a lot of money to spend on research and they had a constant, rapacious need for information and intelligence about all sorts of different companies in different industries. A lot of them tended to be in the medical pharmaceutical field and also in technology because both of those areas’ stocks tend to move very dramatically based on the news and the earnings. The – you know, the businesses themselves are very complicated. The products are complicated. It’s hard to understand if you’re not an expert, if you’re not a medical doctor.

So these expert network firms rose up and they kind of peddled themselves to hedge funds and they said, listen, for $1,000 an hour, we will connect you with a middle manager at Caterpillar who can talk to you about, you know, how they manufacture their big farm machines and, you know, how things are going and what kind of parts they need and where they buy them. And so you could call these experts and, you know, quite legitimately just educate yourself about these different industries and companies.

DAVIES: You’re getting people who are players in the business world and they’re on the phone getting $1,000 an hour and, in theory, not divulging insider information. But as relationships develop given the amount of money they’re being paid, this seems to invite insider trading, doesn’t it? I couldn’t believe it when I read about this years ago.

KOLHATKAR: A number of people have said to me, how is that possibly legal? And, you know, from the outside, of course, it sounds like it’s going to lead to a lot of illegality. And in fact, that is what the FBI thought when they first learned about this as well. BJ Kang, who was one of the star FBI agents who worked on this case, you know, he kind of said to himself at one point, how is it possible that hedge funds are going to spend all this money on these experts and these consultants if they’re not getting anything valuable?

My calculation, admittedly very rough, is that the search by the elite for superior investment advice has caused it, in aggregate, to waste more than $100 billion over the past decade.