If anyone wants to have a voice chat with me about a topic that I'm interested in (see my recent post/comment history to get a sense), please contact me via PM.
My main "claims to fame":
Some evidence about this: Eliezer was deliberately holding off on publishing TDT to use it as a test of philosophical / FAI research competence. He dropped some hints on LW (I think mostly that it had to do with Newcomb's problem or cooperating in one-shot PD, and of course people knew that it had to do with AI) and also assigned MIRI (then SIAI) people to try to guess/reproduce his advance, and none of the then-SIAI people figured out what he had in mind or got very close until I posted about UDT (which combined my guess of Eliezer's idea with some of my own and other discussions on LW at the time, mainly from Vladimir Nesov).
Also, although I was separately interested in AI safety and decision theory, I didn't connect the dots between the two until I saw Eliezer's hints. I had investigated proto-updateless ideas to bypass difficulties in anthropic reasoning, and by the time Eliezer dropped his hints I had mostly given up on anyone being interested in my DT ideas. I also didn't think to question what I saw as the conventional/academic wisdom: that defecting in one-shot PD is rational, as is two-boxing in Newcomb's problem.
So my guess is that while some people might have eventually come up with something like UDT even without Eliezer, it probably would have been seen as just one DT idea among many (e.g. SIAI people were thinking in various different directions, and Gary Drescher, who was independently trying to invent a one-boxing/cooperating DT, had come up with a bunch of different ideas and remained unconvinced that UDT was the right approach), and decision theory itself probably wouldn't have been seen as central to AI safety for some time.
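(To make the one-shot PD point concrete: below is a minimal, purely illustrative Python sketch of the standard "play against an exact copy of yourself" version of the dilemma, where CDT-style reasoning defects and UDT/FDT-style reasoning cooperates. The payoff numbers are the usual textbook ones and the function names are my own; this isn't taken from anyone's actual formalization, and it only covers the trivial exact-copy case, not the weaker logical correlations that make the problem hard.)

```python
# Toy "twin" one-shot Prisoner's Dilemma (illustrative numbers only).
# Row player's payoff: (C,C)=3, (C,D)=0, (D,C)=5, (D,D)=1.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}

def cdt_choice(opponent_action):
    """CDT-style reasoning: hold the opponent's action fixed and best-respond.
    Defection dominates whatever the opponent does."""
    return max(["C", "D"], key=lambda a: PAYOFF[(a, opponent_action)])

def udt_choice():
    """UDT/FDT-style reasoning against an exact copy: both players are the
    same decision procedure, so only (C,C) and (D,D) are attainable;
    pick the action with the higher diagonal payoff."""
    return max(["C", "D"], key=lambda a: PAYOFF[(a, a)])

print(cdt_choice("C"), cdt_choice("D"))  # D D  (defect no matter what)
print(udt_choice())                      # C    (cooperate with your copy)
```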
Ok, this changes my mental picture a little (although it's not very surprising that there would be some LW-influenced people at the labs privately still thinking/talking about decision theory). Any idea (or can you ask next time) how they feel about decision theory seemingly being far from solved, their top bosses seemingly unaware of or unconcerned about this, and this concern being left out of all official communications?
Are people at the major AI companies talking about it privately? I don't think I've seen any official communications (e.g. papers, official blog posts, CEO essays) that mention it, so from afar it looks like decision theory has dropped off the radar of mainstream AI safety.
In retrospect it seems like such a fluke that decision theory in general and UDT in particular became a central concern in AI safety. In most possible worlds (with something like humans) there is probably no Eliezer-like figure, or the Eliezer-like figure isn't particularly interested in decision theory as a central part of AI safety, or doesn't like UDT in particular. I infer this from the fact that where Eliezer's influence is low (e.g. AI labs like Anthropic and OpenAI) there seems to be little interest in decision theory in connection with AI safety (cf. Dario Amodei's recent article, which triggered this reflection), and in other places interested in decision theory that aren't downstream of Eliezer popularizing it, like academic philosophy, there's little interest in UDT.
If this is right, it's another piece of inexplicable personal "luck" from my perspective, i.e., why am I experiencing a rare timeline where I got this recognition/status?
How does the Shareholder Value Revolution fit into your picture? From an AI overview:
1. The Intellectual Origins (The 1970s)
The revolution was born out of economic stagnation in the 1970s. As U.S. corporate profits dipped and competition from Japan and Germany rose, economists and theorists argued that American managers had become "fat and happy," running companies for their own comfort rather than efficiency.
Two key intellectual pillars drove the change:
- Milton Friedman (The Moral Argument): In a famous 1970 New York Times essay, Friedman argued, "The social responsibility of business is to increase its profits." He posited that executives spending money on "social causes" (like keeping inefficient plants open to save jobs) were essentially stealing from the owners (shareholders).
- Jensen and Meckling (The Economic Argument - "Agency Theory"): In 1976, these economists published a paper describing the "Principal-Agent Problem." They argued that managers (agents) were not aligned with shareholders (principals). Managers wanted perks (corporate jets, large empires), while shareholders wanted profit. The solution? Align their interests by paying executives in stock.
It seems to better fit my normative picture of human values: terminal values come from philosophy, and the subservience of instrumental values to terminal values improves over time as we get better at it, without needing to permanently raise instrumental values to terminal status or irreversibly commingle the two.
Possible root causes if we don't end up having a good long-term future (i.e., realize most of the potential value of the universe), with illustrative examples:
Is this missing anything, or is it perhaps not a good way to break down the root causes? The goals for this include:
A less-spooky solution might involve the principal simply asking the agent to write a comprehensive guide to building a truly friendly AGI which would be aligned with human values in a way that was robustly good, then following that guide (with the corrigible agent's help) to produce an aligned, superintelligent sovereign.
Please take a look at A Conflict Between AI Alignment and Philosophical Competence (especially the last paragraph, about corrigibility), which is in part a reaction to this.
I'm not surprised by this; my sense is that it's usually young people and outsiders who pioneer new fields. Older people are just so much more shaped by existing paradigms, and have so much more to lose, that this outweighs the benefits of their expertise and resources.
All of the fields that come to my mind (cryptography, theory of computation, algorithmic information theory, decision theory, game theory) were founded by much more established researchers. (But on reflection these all differ from AI safety by being fairly narrow and technical/mathematical, at least at their founding.) Which fields are you thinking of that were founded by younger people and outsiders?
Perplexity AI Pro (with GPT-5.1-Thinking)'s answer to "Who were the founders of academic cryptography research as a field and what were their jobs at the time?"
There isn’t a single universally agreed-on “founder” of academic cryptography. Instead, a small group of researchers in the 1940s–1970s are usually credited with turning cryptography into an open, university-based research field.
Histories of the subject generally describe a progression: Claude Shannon’s mathematical theory of secrecy in the 1940s, followed by the public‑key revolution of the 1970s and early 1980s that created today’s academic cryptography community. Shannon’s work was foundational, but it did not yet create an academic field in the modern sense; that came later with Whitfield Diffie, Martin Hellman, Ralph Merkle, and the inventors of RSA, whose work is often described as pioneering “modern” cryptography and has been recognized by ACM Turing Awards for cryptography pioneers.
Claude Shannon is widely regarded as the founder of mathematical cryptography; in the 1940s he worked at Bell Labs as a researcher, where he developed the information‑theoretic framework for secrecy systems that later influenced public‑key cryptography. At roughly the same time and into the 1960s, cryptography research also existed in industry—most notably at IBM, where Horst Feistel headed an internal cryptography research group that designed ciphers such as Lucifer, which evolved into the Data Encryption Standard (DES), but this work was largely not yet an open academic discipline.
Most accounts of “academic cryptography as a field” point first to the group around Stanford in the 1970s, whose work on public‑key ideas made cryptography a mainstream research topic in universities. In that period, the key people and their roles were approximately:
- Martin Hellman: a professor of electrical engineering at Stanford University.
- Whitfield Diffie: a researcher/graduate student working in Hellman's lab at Stanford.
- Ralph Merkle: a student at UC Berkeley when he developed his related public‑key ideas, later a PhD student at Stanford under Hellman.
These three are often singled out as the core founders of modern, open, university‑based cryptography research because their work shifted cryptography from a mostly classified or industrial activity into a widely studied academic topic.
Immediately after the Diffie–Hellman–Merkle work, Ronald Rivest, Adi Shamir, and Leonard Adleman introduced RSA while they were all at MIT, with Rivest and Shamir as faculty members and Adleman as a researcher/graduate student; this solidified public‑key cryptography as a central academic topic and led to sustained university research groups in cryptography. Their roles at MIT and the impact of RSA meant that, alongside the Stanford group, they are commonly treated as co‑founders of modern academic cryptography.
In the 1980s, researchers such as Manuel Blum, Shafi Goldwasser, and Silvio Micali (then faculty and graduate students at Berkeley and MIT) developed the complexity‑theoretic foundations of cryptography—zero‑knowledge proofs, formal security definitions, and provable security—which is why they are frequently listed among the leading theoreticians and foundational figures of the field in modern cryptographer lists.
"What about theory of computation, algorithmic information theory, decision theory, game theory? You can be more concise with these."
Each of these fields has several widely recognized “founders,” mostly mathematicians, logicians, and economists working in universities or research institutes. Below is a concise mapping from field → key founders → what they were professionally doing at the time.
Modern theory of computation is usually traced to Alonzo Church and Alan Turing in the 1930s, along with related work by Kurt Gödel. Church was a logician and professor of mathematics at Princeton University when he introduced the lambda calculus and the Church–Turing thesis–style view of effective calculability. Turing was a young mathematician at Cambridge University when he published his 1936 paper defining Turing machines and formalizing computation, later becoming widely regarded as a founder of theoretical computer science.
Algorithmic information theory was founded by Ray Solomonoff, an American mathematician who, working largely as an independent researcher on algorithmic probability and inductive inference, first presented the core ideas at a Caltech conference and in a 1960 report on a “General Theory of Inductive Inference.” It was then independently and almost simultaneously redeveloped by Andrey Kolmogorov, a leading Soviet mathematician, and Gregory Chaitin, a young mathematician working in industrial research, both of whom formalized similar notions of algorithmic complexity.
Modern game theory is generally credited to John von Neumann and Oskar Morgenstern. Von Neumann, a Hungarian mathematician working in pure and applied mathematics (including at Princeton and later the Institute for Advanced Study), introduced the foundational minimax theorem for zero‑sum games and then co‑authored Theory of Games and Economic Behavior in 1944 with Morgenstern, who was at the time an economist and professor (later at Princeton), thereby establishing game theory as a major branch of economics and social science.
Normative decision theory in economics is typically traced to the von Neumann–Morgenstern expected‑utility framework, developed in the same 1944 book that founded game theory. In that work, von Neumann (mathematician) and Morgenstern (economist) provided an axiomatic treatment of rational choice under uncertainty, which is widely treated as the foundational formulation of modern decision theory in economics and statistics.
Though I guess the thing I'm pointing at could also be summarized as "why hasn't someone created a new paradigm of AI safety in the last decade?"
Creating a new paradigm within an existing field seems different enough from creating a new field that the important factors might differ a lot. Also, by asking this question it seems like you're assuming that someone should have created a new paradigm of AI safety in the last decade, which a lot of people would presumably disagree with (because they either think the existing paradigms are good enough or that this is just too hard technically). (Basically I'm suggesting it may be hard to interest people in this question until someone has created such a paradigm, and then you can go back and ask "why didn't someone do this earlier?")
Given that Vernor Vinge wrote The Coming Technological Singularity: How to Survive in the Post-Human Era in 1993, which single-handedly established much of the memeplex, including the still-ongoing AI-first vs IA-first debate, another interesting question is why no one founded the AI safety field until around 2000.
For me, I'm not sure when I read this essay, but I did read Vinge's A Fire Upon the Deep in 1994 as a college freshman, which made me worried about a future AI takeover, but (as I wrote previously) I thought there would be plenty of smarter people working in AI safety, so I went into applied cryptography instead (as a form of d/acc). Eliezer, after reading Vinge as a teen, didn't immediately heed the implicit or explicit safety warnings and instead wanted to accelerate the arrival of the Singularity as much as possible; it took him until around 2000 to pivot to safety. Nick Bostrom I think was concerned from the beginning or very early, but he was a PhD student when he got interested, and I guess it took him a while to work through the academic system until he could found FHI in 2005.
Maybe the real question is why no one else, i.e., someone with established credentials and social capital, founded the field. Why did the task fall to a bunch of kids/students? The fact that nobody did it earlier does seem to suggest that it takes a very rare confluence of factors/circumstances for someone to do it.
(Another tangential puzzle is why Vinge himself didn't get involved, as he was a professor of computer science in addition to being a science fiction writer. AFAIK he stayed completely off the early mailing lists as well as OB/LW, and had no contact with anyone in AI safety.)
Thanks. This sounds like a more peripheral interest/concern, compared to Eliezer/LW's, which was more like: we have to fully solve DT before building AGI/ASI, otherwise it could be catastrophic due to something like the AI falling prey to an acausal threat or commitment races, or being unable to cooperate with other AIs.