GPT-3, PaLM, and look-up tables

[This topic is way outside my expertise. Just thinking out loud.]

Here is Google’s new language model PaLM having a think:

Alex Tabarrok writes

It seems obvious that the computer is reasoning. It certainly isn’t simply remembering. It is reasoning and at a pretty high level! To say that the computer doesn’t “understand” seems little better than a statement of religious faith or speciesism…

It’s true that AI is just a set of electronic neurons none of which “understand” but my neurons don’t understand anything either. It’s the system that understands. The Chinese room understands in any objective evaluation and the fact that it fails on some subjective impression of what it is or isn’t like to be an AI or a person is a failure of imagination not an argument…

These arguments aren’t new but Searle’s thought experiment was first posed at a time when the output from AI looked stilted, limited, mechanical. It was easy to imagine that there was a difference in kind. Now the output from AI looks fluid, general, human. It’s harder to imagine there is a difference in kind.

Tabarrok uses an illustration of Searle’s Chinese room featuring a giant look-up table:

But as Scott Aaronson has emphasized [PDF], a machine that simply maps inputs to outputs by consulting a giant look-up table should not be considered “thinking” (although it could be considered to “know”). First, such a look-up table would be beyond astronomically large for any interesting AI task and hence physically infeasible to implement in the real universe. But more importantly, the fact that something is being looked up rather than computed undermines the idea that the system understands or is reasoning.… [continue reading]
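To make the infeasibility point concrete, here is a back-of-the-envelope calculation (my own illustration with made-up but conservative parameters, not taken from Aaronson’s essay):

```python
from math import log10

# Hypothetical parameters: prompts of up to 1000 characters drawn from
# a 64-symbol alphabet (both numbers are conservative for a chatbot).
ALPHABET_SIZE = 64
PROMPT_LENGTH = 1000

# A complete look-up table needs one entry per possible prompt.
entries_log10 = PROMPT_LENGTH * log10(ALPHABET_SIZE)

print(f"Table entries: about 10^{entries_log10:.0f}")  # about 10^1806
print("For comparison, atoms in the observable universe: about 10^80")
```

Even with these modest parameters, the table dwarfs anything physically realizable, which is why the look-up-table intuition pump says nothing about what feasible systems like PaLM are doing.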

Quantum computing timelines

[Jaime Sevilla is a Computer Science PhD Student at the University of Aberdeen. In this guest post, he describes our recent forecasting work on quantum computing. – Jess Riedel]

In Short: We attempt to forecast when quantum computers will be able to crack the common cryptographic scheme RSA2048, and develop a model that assigns less than 5% confidence to this capability being reached before 2039. A preprint is available at arXiv:2009.05045.

Advanced quantum computing comes with some new applications as well as a few risks, most notably threatening the foundations of modern online security.

In light of the recent experimental crossing of the “quantum supremacy” milestone, it is of great interest to estimate when devices capable of attacking typical encrypted communication will be constructed, and whether the development of communication protocols that are secure against quantum computers is progressing at an adequate pace.  

Beyond its intrinsic interest, quantum computing is also fertile ground for quantified forecasting. Exercises in forecasting technological progress have generally been sparse — with some notable exceptions — but such forecasting is of great importance: technological progress dictates a large part of human progress.

To date, most systematic predictions about development timelines for quantum computing have been based on expert surveys, in part because quantitative data about realistic architectures has been limited to a small number of idiosyncratic prototypes. However, in the last few years the number of devices has been growing rapidly, and it is now possible to squint through the fog of research and make some tentative extrapolations. We emphasize that our quantitative model should be considered to at most augment, not replace, expert predictions. Indeed, as we discuss in our preprint, this early data is noisy, and we must necessarily make strong assumptions to say anything concrete. [continue reading]
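To give a flavor of what such an extrapolation looks like, here is a deliberately crude caricature with made-up numbers; it is not the actual model in the preprint, which works with a curated dataset and jointly extrapolates qubit counts and gate fidelities:

```python
import numpy as np

# Hypothetical device sizes by year (illustrative only).
years = np.array([2016, 2017, 2018, 2019, 2020])
qubits = np.array([5, 16, 20, 53, 65])

# Fit an exponential trend: log2(qubit count) linear in time.
slope, intercept = np.polyfit(years, np.log2(qubits), 1)
print(f"Implied doubling time: {1 / slope:.1f} years")

# Order-of-magnitude physical-qubit requirement for attacking RSA2048,
# following rough estimates in the literature (~2e7 noisy qubits).
TARGET = 2e7
crossing = (np.log2(TARGET) - intercept) / slope
print(f"Naive crossing year for {TARGET:.0e} qubits: ~{crossing:.0f}")
```

The real analysis must also account for error-correction overhead and the noisiness of early data, which is exactly why the preprint hedges its conclusions.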

Review of “Lifecycle Investing”

Summary

In this post I review the 2010 book “Lifecycle Investing” by Ian Ayres and Barry Nalebuff. (Amazon link here; no commission received.) They argue that a large subset of investors should adopt a (currently) unconventional strategy: One’s future retirement contributions should effectively be treated as bonds in one’s retirement portfolio that cannot be efficiently sold; therefore, early in life one should balance these low-volatility assets by taking on equity exposure that will generically exceed 100% of one’s liquid retirement assets, necessitating some form of borrowing.
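Here is a stylized numerical sketch of the core idea, with hypothetical numbers and a simplified allocation rule of my own construction rather than the authors’ exact prescription (the fixed equity share and the 2:1 leverage cap are illustrative parameters):

```python
# Stylized illustration of the Lifecycle Investing idea; my own sketch,
# not the authors' exact prescription. All numbers are hypothetical.

def target_equity_exposure(liquid_savings, pv_future_contributions,
                           equity_share=0.5, max_leverage=2.0):
    """Desired equity holdings in dollars.

    Total retirement wealth = liquid savings + present value of future
    contributions (treated as an illiquid bond). Apply a fixed equity
    share to total wealth, then cap borrowing at max_leverage."""
    total_wealth = liquid_savings + pv_future_contributions
    desired_equity = equity_share * total_wealth
    # Equities can only be bought against liquid savings, on limited margin.
    return min(desired_equity, max_leverage * liquid_savings)

# A young saver: $50k saved, $950k (PV) of future contributions.
print(target_equity_exposure(50_000, 950_000))   # 100000.0 -> 200% of liquid savings
# Near retirement: $900k saved, $100k of future contributions.
print(target_equity_exposure(900_000, 100_000))  # 500000.0 -> ~56% of liquid savings
```

The same fixed share of total wealth implies leverage when young and a majority-bond portfolio when old, which is the glide path the next section discusses.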

“Lifecycle Investing” was recommended to me by a friend who said the book “is extremely worth reading…like learning about index funds for the first time…Like worth paying 1% of your lifetime income to read if that was needed to get access to the ideas…potentially a lot more”. Ayres and Nalebuff lived up to this recommendation. Eventually, I expect the basic ideas, which are simple, to become so widespread and obvious that it will be hard to remember that they required an insight.

In part, what makes the main argument so compelling is that (as shown in the next section) it is closely related to an elegant explanation for something we all knew to be true — that you should increase the bond-stock ratio of your portfolio as you get older — yet previously had only bad justifications for. It also gives new actionable, non-obvious, and potentially very important advice (buy equities on margin when young) that is appropriately tempered by real-world frictions. And, most importantly, it means I personally feel less bad about already being nearly 100% in stocks when I picked up the book.

My main concerns, which are shared by other reviewers and which are only partially addressed by the authors, are:

  • Future income streams might be more like stocks than bonds for the large majority of people.
[continue reading]

COVID Watch and privacy

[Tina White is a friend of mine and co-founder of COVID Watch, a promising app for improving contact tracing for the coronavirus while preserving privacy. I commissioned Tom Higgins to write this post in order to bring attention to this important project and put it in context of related efforts. -Jess Riedel]

Countries around the world have been developing mobile phone apps to alert people to potential exposure to COVID-19. There are two main mechanisms used:

  1. Monitoring a user’s location, comparing it to an external (typically, government) source of information about infections, and notifying the user if they are entering, or previously entered, a high-risk area.
  2. Detecting when two users come in close proximity to each other and then, if one user later reports to have been infected, notifying the second user and/or the government.

The first mechanism generally uses the phone’s location data, which is largely inferred from GPS. (In urban areas, GPS is rather inaccurate, and is importantly augmented with location information inferred from WiFi signal strength maps.) The second method can also be accomplished with GPS, by simply measuring the distance between users, but it can instead be accomplished with phone-to-phone bluetooth connections. (A precursor to smartphone-based contact tracing can be found in the FluPhone app, which was developed in the University of Cambridge Computer Laboratory in 2011; see the BBC coverage. Contact tracing was provided over bluetooth, and cases of the flu were voluntarily reported by users so that those with whom they had come into contact would be alerted. Despite media coverage, less than one percent of Cambridge residents downloaded the app, whether due to a lack of concern over the flu or concerns over privacy.) [continue reading]
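For a flavor of how the second mechanism can be made privacy-preserving, here is a highly simplified sketch in the spirit of the anonymous-token-exchange approach used by projects like COVID Watch. This is my own illustration, not the actual protocol, which involves rotating keys, timing windows, and signal-strength thresholds:

```python
# Highly simplified sketch of privacy-preserving bluetooth contact
# tracing; illustrative only, NOT an actual deployed protocol.

import secrets

class Phone:
    def __init__(self):
        self.my_tokens = []        # random tokens this phone has broadcast
        self.heard_tokens = set()  # tokens received from nearby phones

    def broadcast_token(self):
        # A fresh random token per time window reveals nothing about
        # the user's identity or location.
        token = secrets.token_hex(16)
        self.my_tokens.append(token)
        return token

    def hear(self, token):
        self.heard_tokens.add(token)

def report_infection(phone, public_registry):
    # An infected user voluntarily publishes only their own random tokens.
    public_registry.extend(phone.my_tokens)

def check_exposure(phone, public_registry):
    # Matching happens locally; neither identity nor location is uploaded.
    return bool(phone.heard_tokens & set(public_registry))

# Two phones come into proximity:
alice, bob = Phone(), Phone()
bob.hear(alice.broadcast_token())

registry = []
report_infection(alice, registry)     # Alice later tests positive
print(check_exposure(bob, registry))  # True: Bob is alerted
```

The key design choice is that the server (or government) only ever sees meaningless random tokens, never a social graph or location trail.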

Hennessey on Career Regret

I’ve been mulling for a long time whether to stay in physics, and a colleague pointed me toward this Master’s thesis on career regret by Hennessey.

The study examines the experiences of individuals who, if given their time back, would have chosen a different career path. Despite the fact that career has been consistently documented as a major life regret for many it is rarely mentioned, or only referred to tangentially, in career development literature. Five individual interviews, four female, one male, with people retired or transitioning to retirement are presented to explore the experience of regret as it persists throughout the adult lives of participants. Although the narratives shared by participants are unique and deeply personal, common themes emerged through qualitative analysis. Four themes relate to perceptions of the past: Early Influences, Why I Regret My Choice, The Passage of Time, and Balancing Work and Family. One theme relates to the present: If I Could Do It Over Again, and one to the future: What the Future will Be. Findings from the current study add to the limited research on the topic of career regret and implications for theory and practice are examined.

In this blog post I’ll mostly just pull out notable excerpts. I encourage you to read the thesis if this catches your interest. (See also Hanson on deathbed regrets.)

From the introduction:

If you work full time for thirty years the number of hours spent on the job would be approximately 60,000…

What if you never figured out what you want to do with your life? What if you spent your whole life searching and never found the work you wanted?

[continue reading]

Tishby on physics and deep learning

Having heard Geoffrey Hinton’s somewhat dismissive account of the contribution by physicists to machine learning in his online MOOC, it was interesting to listen to one of those physicists, Naftali Tishby, here at PI:


The Information Theory of Deep Neural Networks: The statistical physics aspects
Naftali Tishby
Abstract:

The surprising success of learning with deep neural networks poses two fundamental challenges: understanding why these networks work so well and what this success tells us about the nature of intelligence and our biological brain. Our recent Information Theory of Deep Learning shows that large deep networks achieve the optimal tradeoff between training size and accuracy, and that this optimality is achieved through the noise in the learning process.

In this talk, I will focus on the statistical physics aspects of our theory and the interaction between the stochastic dynamics of the training algorithm (Stochastic Gradient Descent) and the phase structure of the Information Bottleneck problem. Specifically, I will describe the connections between the phase transition and the final location and representation of the hidden layers, and the role of these phase transitions in determining the weights of the network.

Based partly on joint works with Ravid Shwartz-Ziv, Noga Zaslavsky, and Shlomi Agmon.


(See also Steve Hsu’s discussion of a similar talk Tishby gave in Berlin, plus other notes on history.)

I was familiar with the general concept of over-fitting, but I hadn’t realized you could talk about it quantitatively by looking at the mutual information between the output of a network and all the information in the training data that isn’t the target label.… [continue reading]
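To make that concrete, here is a minimal sketch of the sort of binning estimator used in the Shwartz-Ziv and Tishby experiments to compute information-plane quantities like I(T;Y). This is my own reconstruction under simplifying assumptions (equal-width bins, small layers), not their code; the same recipe applies to I(X;T):

```python
# Minimal sketch of a binning estimator for the mutual information
# I(T;Y) between a hidden layer's activations T and labels Y.
# My reconstruction for illustration, not the authors' code.

import numpy as np

def mutual_information(activations, labels, n_bins=3):
    """Estimate I(T;Y) in bits, with T the binned activation vector."""
    # Discretize activations into equal-width bins across the layer.
    scaled = (activations - activations.min()) / (np.ptp(activations) + 1e-12)
    binned = np.clip((scaled * n_bins).astype(int), 0, n_bins - 1)
    t_symbols = [tuple(row) for row in binned]

    def entropy(symbols):
        _, counts = np.unique(symbols, axis=0, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    joint = [t + (y,) for t, y in zip(t_symbols, labels)]
    y_symbols = [(y,) for y in labels]
    # I(T;Y) = H(T) + H(Y) - H(T,Y)
    return entropy(t_symbols) + entropy(y_symbols) - entropy(joint)

# Toy usage: 200 samples, a 2-neuron hidden layer whose activations
# carry some information about a binary label.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
activations = rng.normal(size=(200, 2)) + 1.5 * labels[:, None]
print(f"I(T;Y) is roughly {mutual_information(activations, labels):.2f} bits")
```

Binning estimators like this are known to be sensitive to the bin count and sample size, which is part of why Tishby’s quantitative claims remain debated.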

Meh deep fakes

A lot of people sound worried that new and improving techniques for creating very convincing videos of anyone saying and doing anything will lead to widespread misinformation and even a breakdown of trust in society.

I’m not very worried. Two hundred years ago, essentially all communication, other than in-person conversation, was done through the written word, which is easy to fake and impersonate. In particular, legal contracts were (and are) typeset, and so are trivially fakeable. But although there were (and are) cases of fraud and deception through forged documents, society has straightforward mechanisms for correctly attributing such communication to individuals. Note, for instance, that multi-billion-dollar contracts between companies are written in text, and we have never felt it necessary to record difficult-to-fake videos of the CEOs reciting them.

The 20th century was roughly a technological Goldilocks period in which technology existed to capture images and video but not to fake them. Images, of course, have been fakeable at modest cost for many years; even in 1992, Michael Crichton’s Rising Sun used high-tech fraudulent security footage as a realistic plot point set in the then-present day. Although we may see some transition costs as folks are tricked into believing fraudulent videos because the ease of faking them has not yet entered the conventional wisdom, eventually people will learn that video can’t be trusted much more than the written word. (Which is to say, most of the time you can trust both text and video because most people aren’t trying to defraud you, but extra confirmatory steps are taken for important cases.) This will not be catastrophic because our trust networks are not critically dependent on faithful videos and images.… [continue reading]

Reply to Hanson

I was privileged to receive a reply from Robin Hanson on my critique of his largely excellent book The Elephant in the Brain with co-author Kevin Simler. I think in several cases he rebutted something other than what I argued, but I encourage you to read it and judge for yourself.

Given the high-profile book reviews that are probably forthcoming from places like the Wall Street Journal, I thank Robin for taking the time to engage with the little guys!

Replies

I’ll follow Robin’s lead and switch to first names.

Some say we should have been more academic and detailed, while others say we should have been more accessible and less detailed…. Count Jess as someone who wanted a longer book.

It’s true that I’d have preferred a longer book with more details, but I think I gestured at ways Kevin and Robin could hold length constant while increasing convincingness. And there are ways of keeping the book accessible while augmenting the rigor (e.g., endnotes), although of course they are more work.

Yes for each motive one can distinguish both a degree of consciousness and also a degree of current vs past adaptation. But these topics were not essential for our main thesis, making credible claims on them takes a lot more evidence and argument, and we already had trouble with trying to cover too much material for one book.

I was mostly happy with how the authors handled the degree of consciousness. However, I think the current- vs past-adaptation distinction is very important for designing institutions, which Kevin and Robin correctly list as one of the main applications of the book’s material. For instance, should the arXiv host comments on papers, and how should they be implemented to avoid pissing contests?… [continue reading]