AI goalpost moving is not unreasonable

[Summary: The constantly evolving tests for what counts as worryingly powerful AI are mostly a consequence of how hard it is to design tests that will identify the real-world power of future automated systems. I argue that Alan Turing in 1950 could not reliably distinguish a typical human from an appropriately fine-tuned GPT-4, yet all our current automated systems cannot produce growth above historic trends.]

What does the phenomenon of “moving the goalposts” for what counts as AI tell us about AI?

It’s often said that people repeatedly revising their definition of AI, often in response to previous AI tests being passed, is evidence that people are denying/afraid of reality, and want to put their head in the sand or whatever. There’s some truth to that, but that’s a comment about humans and I think it’s overstated.

Closer to what I want to talk about is the idea that AI is continuously redefined to mean “whatever humans can do that hasn’t been automated yet”, often taken to be evidence that AI is not a “natural” kind out there in the world, but rather just a category relative to current tech. There’s also truth to this, but it’s not exactly what I’m interested in.

To me, it is startling that (I claim) we have systems today that would likely pass the Turing test if administered by Alan Turing, but that have negligible impact on a global scale. More specifically, consider fine-tuning GPT-4 to mimic a typical human who lacks encyclopedic knowledge of the contents of the internet. Suppose that it’s mimicking a human with average intelligence whose occupation has no overlap with Alan Turing’s expertise.… [continue reading]

GPT-3, PaLM, and look-up tables

[This topic is way outside my expertise. Just thinking out loud.]

Here is Google’s new language model PaLM having a think:

Alex Tabarrok writes

It seems obvious that the computer is reasoning. It certainly isn’t simply remembering. It is reasoning and at a pretty high level! To say that the computer doesn’t “understand” seems little better than a statement of religious faith or speciesism…

It’s true that AI is just a set of electronic neurons none of which “understand” but my neurons don’t understand anything either. It’s the system that understands. The Chinese room understands in any objective evaluation and the fact that it fails on some subjective impression of what it is or isn’t like to be an AI or a person is a failure of imagination not an argument…

These arguments aren’t new but Searle’s thought experiment was first posed at a time when the output from AI looked stilted, limited, mechanical. It was easy to imagine that there was a difference in kind. Now the output from AI looks fluid, general, human. It’s harder to imagine there is a difference in kind.

Tabarrok uses an illustration of Searle’s Chinese room featuring a giant look-up table:

But as Scott Aaronson has emphasized [PDF], a machine that simply maps inputs to outputs by consulting a giant look-up table should not be considered “thinking” (although it could be considered to “know”). First, such a look-up table would be beyond astronomically large for any interesting AI task and hence physically infeasible to implement in the real universe. But more importantly, the fact that something is being looked up rather than computed undermines the idea that the system understands or is reasoning.… [continue reading]
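To get a feel for "beyond astronomically large", here is a rough back-of-the-envelope sketch. The vocabulary size (10,000 words), the prompt length (100 words), and the commonly quoted order of magnitude of about 10^80 atoms in the observable universe are all illustrative assumptions, not figures from the post or from Aaronson. The function name is hypothetical.

```python
import math

# Rough, commonly quoted order of magnitude (an assumption for illustration).
LOG10_ATOMS_IN_OBSERVABLE_UNIVERSE = 80


def log10_table_entries(vocab_size: int, prompt_length: int) -> float:
    """Return log10 of the number of distinct prompts a look-up table must cover.

    A table that answers every possible prompt of `prompt_length` words drawn
    from `vocab_size` words needs vocab_size ** prompt_length entries, so we
    work with the base-10 exponent to keep the numbers manageable.
    """
    return prompt_length * math.log10(vocab_size)


if __name__ == "__main__":
    exponent = log10_table_entries(vocab_size=10_000, prompt_length=100)
    excess = exponent - LOG10_ATOMS_IN_OBSERVABLE_UNIVERSE
    print(f"Table entries: about 10^{exponent:.0f}")
    print(f"That exceeds the assumed atom count by a factor of about 10^{excess:.0f}")
```

Even with these conservative inputs the table would need vastly more entries than there are atoms to store them in, which is the sense in which such a machine is physically infeasible.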

Distinguish between straight research and scientific opinion?

Summary: Maybe we should start distinguishing “straight research” from more opinionated scientific work and encourage industrial research labs to commit to protecting the former as a realistic, limited version of academic freedom in the private for-profit sector.

It seems clear enough to me that, within the field of journalism, the distinction between opinion pieces and “straight reporting” is both meaningful and valuable to draw. Both sorts of works should be pursued vigorously, even by the same journalists at the same time, but they should be distinguished (e.g., by being placed in different sections of a newspaper, or being explicitly labeled “opinion”, etc.) and held to different standards. [Footnote: In my opinion it’s unfortunate that this distinction has been partially eroded in recent years and that some thoughtful people have even argued it’s meaningless and should be dropped. That’s not the subject of this blog post, though.] This is true even though there is of course a continuum between these categories, and it’s infeasible to precisely quantify the axis. (That said, I’d like to see more serious philosophical attempts to identify actionable principles for drawing this distinction more reliably and transparently.)

It’s easy for idealistic outsiders to get the impression that all of respectable scientific research is analogous to straight reporting rather than opinion, but just about any researcher will tell you that some articles are closer than other articles to the opinion category; that’s not to say it’s bad or unscientific, just that such articles go further in the direction of speculative interpretation and selective highlighting of certain pieces of evidence, and are often motivated by normative claims (“this area is a more fruitful research avenue than my colleagues believe”, “this evidence implies the government should adopt a certain policy”, etc.).… [continue reading]

Comments on “Longtermist Institutional Reform” by John & MacAskill

Tyler John & William MacAskill have recently released a preprint of their paper “Longtermist Institutional Reform” [PDF]. The paper is set to appear in an EA-motivated collection “The Long View” (working title), from Natalie Cargill and Effective Giving.

Here is the abstract:

There is a vast number of people who will live in the centuries and millennia to come. In all probability, future generations will outnumber us by thousands or millions to one; of all the people who we might affect with our actions, the overwhelming majority are yet to come. In the aggregate, their interests matter enormously. So anything we can do to steer the future of civilization onto a better trajectory, making the world a better place for those generations who are still to come, is of tremendous moral importance. Political science tells us that the practices of most governments are at stark odds with longtermism. In addition to the ordinary causes of human short-termism, which are substantial, politics brings unique challenges of coordination, polarization, short-term institutional incentives, and more. Despite the relatively grim picture of political time horizons offered by political science, the problems of political short-termism are neither necessary nor inevitable. In principle, the State could serve as a powerful tool for positively shaping the long-term future. In this chapter, we make some suggestions about how we should best undertake this project. We begin by explaining the root causes of political short-termism. Then, we propose and defend four institutional reforms that we think would be promising ways to increase the time horizons of governments: 1) government research institutions and archivists; 2) posterity impact assessments; 3) futures assemblies; and 4) legislative houses for future generations.

[continue reading]

How to think about Quantum Mechanics—Part 8: The quantum-classical limit as music

[Other parts in this series: 1,2,3,4,5,6,7,8.]

On microscopic scales, sound is air pressure f(t) fluctuating in time t. Taking the Fourier transform of f(t) gives the frequency distribution \hat{f}(\omega), but in an eternal way, applying to the entire time interval for t\in (-\infty,\infty).

Yet on macroscopic scales, sound is described as having a frequency distribution as a function of time, i.e., a note has both a pitch and a duration. There are many formalisms for describing this (e.g., wavelets), but a well-known limitation is that the frequency \omega of a note is only well-defined up to an uncertainty that is inversely proportional to its duration \Delta t.
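The inverse relationship between duration and frequency spread can be checked numerically. The sketch below is an illustration under stated assumptions, not the formalism used in this series: it models a note's envelope as a Gaussian of width sigma (a hypothetical choice), takes \Delta t to be the standard deviation of |f(t)|^2, and obtains \Delta\omega from the Parseval identity \int \omega^2 |\hat{f}|^2 d\omega / \int |\hat{f}|^2 d\omega = \int |f'|^2 dt / \int |f|^2 dt, which avoids computing a Fourier transform explicitly. For a Gaussian the product \Delta t \Delta\omega should come out close to 1/2.

```python
import math


def gaussian_note_spreads(sigma, n=4001):
    """Return (delta_t, delta_omega) for the envelope f(t) = exp(-t^2 / (4 sigma^2)).

    delta_t is the standard deviation of |f|^2 in time; delta_omega is the
    standard deviation of |f_hat|^2 in angular frequency, computed from the
    time derivative via Parseval's identity. Both means are zero by symmetry
    because f is real and even.
    """
    half_width = 10.0 * sigma  # integrate far enough out that the tails are negligible
    dt = 2.0 * half_width / (n - 1)
    ts = [-half_width + i * dt for i in range(n)]
    f = [math.exp(-t * t / (4.0 * sigma * sigma)) for t in ts]

    norm = sum(v * v for v in f) * dt
    var_t = sum(t * t * v * v for t, v in zip(ts, f)) * dt / norm

    # Central-difference approximation to f'(t) on the interior grid points.
    df = [(f[i + 1] - f[i - 1]) / (2.0 * dt) for i in range(1, n - 1)]
    var_omega = sum(d * d for d in df) * dt / norm

    return math.sqrt(var_t), math.sqrt(var_omega)


if __name__ == "__main__":
    for sigma in (0.05, 0.5):  # note durations in seconds (illustrative values)
        delta_t, delta_omega = gaussian_note_spreads(sigma)
        print(f"sigma={sigma}: delta_t={delta_t:.4f} s, "
              f"delta_omega={delta_omega:.4f} rad/s, "
              f"product={delta_t * delta_omega:.4f}")
```

Making the note ten times longer shrinks its frequency spread by the same factor, so the product stays fixed; this is the acoustic counterpart of the position-momentum trade-off for \psi(x).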

At the mathematical level, a given wavefunction \psi(x) is almost exactly analogous: macroscopically a particle seems to have a well-defined position and momentum, but microscopically there is only the wavefunction \psi. The mapping of the analogy is \{t,\omega,f\} \to \{x,p,\psi\}. [Footnote: I am of course not the first to emphasize this analogy. For instance, while writing this post I found “Uncertainty principles in Fourier analysis” by de Bruijn (via Folland’s book), who calls the Wigner function of an audio signal f(t) the “musical score” of f.] Wavefunctions can of course be complex, but we can restrict ourselves to a real-valued wavefunction without any trouble; we are not worrying about the dynamics of wavefunctions, so you can pretend the Hamiltonian vanishes if you like.

In order to get the acoustic analog of Planck’s constant \hbar, it helps to imagine going back to a time when the pitch of a note was measured with a unit that did not have a known connection to absolute frequency, i.e.,… [continue reading]

How shocking are rare past events?

This post describes variations on a thought experiment involving the anthropic principle. The variations were developed through discussion with Andreas Albrecht, Charles Bennett, Leonid Levin, and Andrew Arrasmith at a conference at the Niels Bohr Institute in Copenhagen in October of 2019. I have not yet finished reading Bostrom’s “Anthropic Bias”, so I don’t know where it fits into his framework. I expect it is subsumed into such existing discussion, and I would appreciate pointers.

The point is to consider a few thought experiments that share many of the same important features, but for which we have very different intuitions, and to identify whether there are any substantive differences that can be used to justify these intuitions.

I will use the term “shocked” (in the sense of “I was shocked to see Bob levitate off the ground”) to refer to the situation where we have made observations that are extremely unlikely to be generated by our implicit background model of the world, such that good reasoners would likely reject the model and start entertaining previously disfavored alternative models like “we’re all brains in a vat”, the Matrix, etc. In particular, to be shocked is not supposed to be merely a description of human psychology, but rather is a normative claim about how good scientific reasoners should behave.

Here are the three scenarios:

Scenario 1: Through advances in geology, paleontology, theoretical biology, and quantum computer simulation of chemistry, we get very strong theoretical evidence that intelligent life appears with high likelihood following abiogenesis events, but that abiogenesis itself is very rare: there is one expected abiogenesis event per 10^{22} stars per Hubble time.
[continue reading]

FAQ about experimental quantum Darwinism

I am briefly stirring from my blog-hibernation to present a collection of frequently asked questions about experiments seeking to investigate quantum Darwinism (QD). [Footnote: This blog will resume at full force sometime in the future, but not just yet.] Most of the questions were asked by (or evolved from questions asked by) Philip Ball while we corresponded regarding his recent article “Quantum Darwinism, an Idea to Explain Objective Reality, Passes First Tests” for Quanta magazine, which I recommend you check out.


Who is trying to see quantum Darwinism in experiments?

I am aware of two papers out of a group from Arizona State in 2010 (here and here) and three papers from separate groups last year (arXiv: 1803.01913, 1808.07388, 1809.10456). I haven’t looked at them all carefully so I can’t vouch for them, but I think the more recent papers would be the closest thing to a “test” of QD.

What are the experiments doing to put QD to the test?

These teams construct a kind of “synthetic environment” from just a few qubits, and then interrogate them to discover the information that they contain about the quantum system to which they are coupled.

What do you think of experimental tests of QD in general?

Considered as a strictly mathematical phenomenon, QD is the dynamical creation of certain kinds of correlations between certain systems and their environments under certain conditions. These experiments directly confirm that, if such conditions are created, the expected correlations are obtained.

The experiments are, unfortunately, not likely to offer many insights or opportunities for surprise; the result can be predicted with very high confidence long in advance.… [continue reading]

Tishby on physics and deep learning

Having heard Geoffrey Hinton’s somewhat dismissive account of the contribution by physicists to machine learning in his online MOOC, it was interesting to listen to one of those physicists, Naftali Tishby, here at PI:


The Information Theory of Deep Neural Networks: The statistical physics aspects
Naftali Tishby
Abstract:

The surprising success of learning with deep neural networks poses two fundamental challenges: understanding why these networks work so well and what this success tells us about the nature of intelligence and our biological brain. Our recent Information Theory of Deep Learning shows that large deep networks achieve the optimal tradeoff between training size and accuracy, and that this optimality is achieved through the noise in the learning process.

In this talk, I will focus on the statistical physics aspects of our theory and the interaction between the stochastic dynamics of the training algorithm (Stochastic Gradient Descent) and the phase structure of the Information Bottleneck problem. Specifically, I will describe the connections between the phase transition and the final location and representation of the hidden layers, and the role of these phase transitions in determining the weights of the network.

Based partly on joint works with Ravid Shwartz-Ziv, Noga Zaslavsky, and Shlomi Agmon.


(See also Steve Hsu’s discussion of a similar talk Tishby gave in Berlin, plus other notes on history.)

I was familiar with the general concept of over-fitting, but I hadn’t realized you could talk about it quantitatively by looking at the mutual information between the output of a network and all the information in the training data that isn’t the target label.… [continue reading]
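For readers who want the quantity itself pinned down, here is a minimal sketch of mutual information for two discrete variables, computed directly from a joint probability table. This is only the textbook definition I(X;Y) = \sum_{x,y} p(x,y) \log_2 [p(x,y) / (p(x) p(y))] applied to toy distributions; it is not the estimator used in Tishby's work, and the function name and example tables are hypothetical.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution given as {(x, y): probability}."""
    # Marginal distributions p(x) and p(y), obtained by summing the joint table.
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p

    # Zero-probability cells contribute nothing, so they are skipped.
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0.0
    )


if __name__ == "__main__":
    # Toy case 1: Y is an exact copy of a fair bit X, so Y carries one full bit about X.
    copied = {(0, 0): 0.5, (1, 1): 0.5}
    # Toy case 2: X and Y are independent fair bits, so Y carries no information about X.
    independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
    print("copied:", mutual_information(copied), "bits")
    print("independent:", mutual_information(independent), "bits")
```

In the over-fitting picture described above, the first variable would stand for the label-irrelevant part of the training data and the second for the network's output, so a value near zero would mean the network has not memorized that irrelevant detail.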