### Licence

For maximum flexibility, foreXiv by C. Jess Riedel is multi-licensed separately under CC BY-SA 4.0, CC BY-NC-SA 4.0, and GFDL 1.3.
## Selsam on formal verification of machine learning

Here is the first result out of the project Verifying Deep Mathematical Properties of AI Systems, funded through the Future of Life Institute. (A technical abstract is available here; note that David Dill has taken over as PI from Alex Aiken.)

> **Developing Bug-Free Machine Learning Systems With Formal Mathematics**
> Daniel Selsam, Percy Liang, David L. Dill

You can find discussion on HackerNews. The lead author was kind enough to answer some questions about this work.
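Certigrad's central correctness theorem, discussed below, is that the gradients it computes are correct (unbiased estimates of the true gradients), proved formally in Lean. For an informal point of comparison: ML practitioners usually check this kind of property empirically with finite differences rather than proving it. A minimal sketch of such a numerical gradient check (illustrative only; the functions here are hypothetical and this is not Certigrad's Lean formalization):

```python
import numpy as np

def loss(w, x, y):
    """Squared error of a linear model: L(w) = (w.x - y)^2."""
    return (np.dot(w, x) - y) ** 2

def grad_loss(w, x, y):
    """Hand-derived gradient: dL/dw = 2 (w.x - y) x."""
    return 2.0 * (np.dot(w, x) - y) * x

def gradient_check(w, x, y, eps=1e-6):
    """Max absolute gap between the analytic gradient and a
    central finite-difference estimate, coordinate by coordinate."""
    analytic = grad_loss(w, x, y)
    numeric = np.zeros_like(w)
    for i in range(len(w)):
        dw = np.zeros_like(w)
        dw[i] = eps
        numeric[i] = (loss(w + dw, x, y) - loss(w - dw, x, y)) / (2 * eps)
    return np.max(np.abs(analytic - numeric))

rng = np.random.default_rng(0)
w, x = rng.normal(size=3), rng.normal(size=3)
err = gradient_check(w, x, y=1.0)
print(err)  # small: central differences are exact for a quadratic, up to rounding
```

Such a check only samples a few points and can miss bugs; a formal proof of the kind Certigrad provides covers all inputs satisfying the stated preconditions.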

**Q:** Is the correctness specification usually a fairly singular statement? Or will it often be of the form "The program satisfies properties A, B, C, D, and E"? (And then maybe you add "F" later.)

**Daniel Selsam:** There are a few related issues: how singular a specification is, how much of the functionality of the system is certified (coverage), and how close the specification comes to proving that the system actually does what you want (validation).

*Singular vs. plural.* If you want to certify a red-black tree, then you will probably want to prove many different properties about how the core methods (e.g. finding, inserting, deleting) interact, and so the specification will be rather plural. But a different system may *use* the certified red-black tree to do one very specific thing and may have a singular specification. Thus how singular or plural a specification is depends heavily on where we draw the boundaries between systems and is somewhat arbitrary. One way or another, the internals of any proof of correctness will need to make use of many different lemmas; sometimes you can tie them all up in a bow for a particular project and sometimes you cannot.

*Coverage.* In Certigrad, we prove that the sampled gradients are unbiased estimates of the true gradients, which arguably constitutes total functional correctness for the stochastic backpropagation algorithm. We also prove a few other properties (e.g. that two program optimizations are sound). However, there are still some parts of the system about which we prove nothing. For example, a user could enter their model and compute the correct gradients, but then do gradient ascent instead of gradient descent and get nonsense when trying to train it. In principle we could prevent this kind of error as well by also proving that the learning algorithms converge to local optima on models that satisfy certain properties, but we do not do this yet in Certigrad.

*Validation.* In Certigrad, even if we approach full coverage of the system by proving convergence of the learning algorithms, we still would not have any theorems that even come close to saying that Certigrad does what we really want it to, which might be something like "after training a model with Certigrad on a dataset, the model will be extremely accurate on unseen data". Traditionally, *verification* refers to proving that you implemented a system correctly (e.g. that your gradients are correct), whereas *validation* refers to proving that the system you chose to implement really does what you want (e.g. that your model will be accurate on unseen data). The line is blurry between verification and validation, but I find the distinction useful. The limiting factor in proving validation properties for Certigrad is that nobody knows that much yet about the assumptions under which most useful models (such as neural networks) are guaranteed to perform well. Also, the set of properties we might want to prove is open-ended. There may even be multiple interesting properties we could prove about individual Certigrad models.

**Q:** Why concentrate on machine learning?

**D.S.:** Formal verification may be especially useful in machine learning (ML) for several reasons. First, ML systems are notoriously difficult to debug because developers don't know what the system is supposed to do on a given input (we argue this in the introduction). Second, ML systems involve advanced mathematics that most developers will not be masters of, and formal math lets the ML experts communicate to the developers what the system needs to do in a precise and self-contained way that does not require years of training to make sense of. Third, (math-aware) developers often need to do tedious and error-prone calculations by hand, and many of these calculations can be automated using tools from computer algebra. Fourth, because performance is dominated by matrix multiplication, machine learning algorithms do not require low-level optimizations to be competitive, and so it is easier to formally verify real systems.

**Q:** It would seem that theorem provers are only useful insofar as the mathematical statement that specifies what it means to be correct is easier to understand than the code itself. What sort of "compression ratios" does Certigrad achieve? Will this depend on machine learning vs. another application?

**D.S.:** Theorem provers are also useful when the dumbest possible implementation is easier to understand than the most optimized one, which is almost always the case. The specification can simply be "it does the same thing as the naive implementation". Similarly, a compiler may have millions of lines of code implementing program transformations, and yet the specification for each transformation may only be "it preserves semantics". Compression ratios can vary, though, and we did not get much actual compression in Certigrad. The full specification for backpropagation (including the definitions of all the preconditions) is about as many characters as the code that implements it. But it is much easier to provide than the implementation, and much easier to inspect and confirm that it is correct. I think a better metric may be how much shorter the specification is than the code *plus* the proof that the code is correct, since you would need to understand the proof to really understand the code in the first place.
