Someone is well-calibrated if the things they predict with X% chance of happening in fact occur X% of the time. Importantly, calibration is not the same as accuracy: calibration is about accurately assessing how good your predictions are, not about making good predictions. Person A, whose predictions are marginally better than chance (60% of them come true when choosing from two options) and who is precisely 60% confident in their choices, is perfectly calibrated. In contrast, Person B, who is 99% confident in their predictions and right 90% of the time, is more accurate than Person A but less well-calibrated... (read more)
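A quick way to see the difference: bucket predictions by stated confidence and compare each bucket's observed hit rate to that confidence. Here is a minimal sketch in Python, with made-up data mirroring Person A and Person B (all numbers are illustrative):

```python
# Minimal calibration check (illustrative data): bucket predictions by
# stated confidence and compare each bucket's hit rate to it.
from collections import defaultdict

# (stated confidence, did the prediction come true?)
predictions = [(0.60, o) for o in (True, True, True, False, False)]  # Person A
predictions += [(0.99, o) for o in [True] * 9 + [False]]             # Person B

buckets = defaultdict(list)
for confidence, outcome in predictions:
    buckets[confidence].append(outcome)

for confidence, outcomes in sorted(buckets.items()):
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"stated {confidence:.0%}, observed {hit_rate:.0%}, "
          f"gap {abs(confidence - hit_rate):.0%}")
# stated 60%, observed 60%, gap 0%  -- well-calibrated
# stated 99%, observed 90%, gap 9%  -- accurate but overconfident
```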
AI Risk is the analysis of the risks associated with building powerful AI systems... (read more)
Rationality is the art of thinking in ways that result in accurate beliefs and good decisions. It is the primary topic of LessWrong.
Rationality is not only about avoiding the vices of self-deception and obfuscation (the failure to communicate clearly), but also about the virtues of curiosity, of seeing the world more clearly than before, and of achieving things previously unreachable to you. The study of rationality on LessWrong includes both a theoretical understanding of ideal cognitive algorithms and a practice that uses these idealized algorithms to inform the heuristics, habits, and techniques needed to reason and make decisions successfully in the real world... (read more)
Consent is a foundational concept in many practical systems of ethics (such as those found in medicine).
Self-immolation is a hypothetical act that could be carried out by a leading AI (capabilities) lab: deliberately destroying itself, including all of its resources relevant to furthering progress towards AGI (or, more broadly, towards extremely dangerous capabilities). It would also signal to the world that existential/catastrophic risk from AI is being taken seriously by one of the leading AI capabilities actors.
A weaker version would involve a credibly signaled and faithfully executed pivot away from AGI progress towards safer, narrower, bounded AI systems (see Tool AI).
The idea has been independently proposed at least twice:...
The agent-structure problem is the question of whether systems that behave like agents necessarily have an internal structure that makes them agentic. This structure usually involves some search procedure.
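One way to see why the answer isn't obvious: behavior alone underdetermines internal structure. Below is a toy sketch (all names and numbers are invented for illustration) of two policies with identical input-output behavior, only one of which contains an explicit search procedure inside:

```python
# Two policies with identical behavior; only one has "agentic" structure.

STATES = range(5)

def utility(state: int, action: int) -> int:
    """Utility peaks when the action matches the state."""
    return -(state - action) ** 2

def searching_agent(state: int) -> int:
    # Agentic internal structure: explicit search for the best action.
    return max(range(5), key=lambda a: utility(state, a))

# A lookup table with the same input-output behavior, but no search inside.
LOOKUP = {s: searching_agent(s) for s in STATES}

def lookup_agent(state: int) -> int:
    return LOOKUP[state]

# Behaviorally indistinguishable on this domain:
assert all(searching_agent(s) == lookup_agent(s) for s in STATES)
```

The agent-structure problem asks when (if ever) agent-like behavior forces something like the first implementation rather than the second.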
Kant's third formulation of the categorical imperative lets you build up most of the structure of the key moral ideas from a simple rule: "treat no person as purely a means to an end, but always also as an end in themselves". Many applications of the categorical imperative require baroque derivations to loop back and be justified from this premise (treated as a generative axiom), but "consent ethics" in general, and "slavery is forbidden" in particular, are both elementary proofs from this starting point. A slave is a person turned into a tool and piece of property of another person... a literal "means" to ANY end that the owning person (or "Master") deems desirable and feasible.
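As a toy illustration of how elementary the proof becomes once the rule is treated as an axiom, here is a hypothetical sketch in Lean 4 (the predicate names are invented for illustration, and this is not a serious formalization of ethics):

```lean
-- Hypothetical sketch: the categorical imperative as a generative axiom.
axiom Person : Type
axiom UsedPurelyAsMeans : Person → Prop

-- "Treat no person as purely a means to an end."
axiom categoricalImperative : ∀ p : Person, ¬ UsedPurelyAsMeans p

-- A slave is, by this definition, a person used purely as a means.
def Enslaved (p : Person) : Prop := UsedPurelyAsMeans p

-- "Slavery is forbidden" is then a one-step proof.
theorem slaveryForbidden : ∀ p : Person, ¬ Enslaved p :=
  fun p h => categoricalImperative p h
```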
When CDT makes its decisions, it only thinks it controls things causally downstream of its actions. UDT, by contrast, chooses as if it controls every part of reality that is logically downstream of its logical output. This allows it to determine a wide range of other facts across the universe that are logically correlated with itself, like what is or has been reliably predicted about its present decision, or what other agents sufficiently similar to itself will choose. Son of CDT is somewhere in the middle: it acts as if it controls only those things logically correlated with its actions that are causally downstream of its moment of original creation.
If a Son of CDT agent goes on to create further agents, all of those agents will have the same magic moment. They will all care about whether or not Omega's knowledge of them is causally downstream of the moment the CDT agent first wrote the Son-of-CDT code.
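To make the contrast concrete, consider a Newcomb-style predictor. The sketch below is only illustrative (the payoffs and the `prediction_tracks_choice` framing are assumptions, not a canonical formalization); it shows how the three decision theories diverge depending on whether the prediction counts as something the agent controls:

```python
# (one_box?, predictor_predicted_one_box?) -> payoff (standard Newcomb values)
PAYOFF = {
    (True, True): 1_000_000,
    (True, False): 0,
    (False, True): 1_001_000,
    (False, False): 1_000,
}

def best_action(prediction_tracks_choice: bool) -> str:
    if prediction_tracks_choice:
        # The prediction moves with the choice: compare the two
        # self-consistent worlds (one-box & predicted one-box vs
        # two-box & predicted two-box).
        return "one-box" if PAYOFF[(True, True)] > PAYOFF[(False, False)] else "two-box"
    # The prediction is fixed: pick the act that dominates either way.
    one_box_dominates = all(
        PAYOFF[(True, p)] > PAYOFF[(False, p)] for p in (True, False)
    )
    return "one-box" if one_box_dominates else "two-box"

# CDT: the prediction is never treated as controlled by the present act.
print("CDT:", best_action(False))    # two-box

# UDT: anything logically correlated with its output is treated as controlled.
print("UDT:", best_action(True))     # one-box

# Son of CDT: the prediction counts only if it was made causally downstream
# of the "magic moment" when the Son-of-CDT code was written.
prediction_made_after_creation = True  # illustrative assumption
print("Son of CDT:", best_action(prediction_made_after_creation))
```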
CDT agents don't consider the logical impacts of their decision algorithms' outputs when choosing actions, only the physical consequences of their physical act. Whenever a CDT agent is put in a situation where it has to make a decision, it considers multiple hypotheticals, one for each decision it could make. For a CDT agent, the only difference between these hypotheticals is the physical act in the moment of that act, and what happens physically/causally downstream from that. This means that when a CDT agent is faced with something trying to predict its actions, it imagines its decision having no effect on the prediction.
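A minimal sketch of that hypothetical-worlds computation against a Newcomb-style predictor (payoffs are illustrative assumptions): because the prediction is held fixed across the hypotheticals, the CDT agent two-boxes no matter what was predicted.

```python
def payoff(act: str, predicted: str) -> int:
    # The opaque box contains $1M only if one-boxing was predicted;
    # the transparent box always adds $1k for two-boxing.
    opaque = 1_000_000 if predicted == "one-box" else 0
    return opaque + (1_000 if act == "two-box" else 0)

def cdt_choose(fixed_prediction: str) -> str:
    # One hypothetical per act; only the act (and its causal downstream)
    # differs. The prediction does NOT change with the act.
    hypotheticals = {act: payoff(act, fixed_prediction)
                     for act in ("one-box", "two-box")}
    return max(hypotheticals, key=hypotheticals.get)

for prediction in ("one-box", "two-box"):
    print(prediction, "->", cdt_choose(prediction))  # always "two-box"
```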
The name was suggested by Ryan Greenblatt in “AI companies are unlikely to make high-assurance safety cases if timelines are short”.
ATOW (2026-04-03), Moore et al. (2026) is probably the best academic account of LLM-induced psychosis. They “analyze logs of conversations with LLM chatbots from 19 users who report having experienced psychological harms from chatbot use”, where the users mostly came from a “support group for such chatbot users.”
Although slavery is usually involuntary and involves coercion, there are also cases where people have voluntarily entered into slavery, for example to pay a debt or to earn money due to poverty.
The ML Alignment & Theory Scholars (MATS) Program is an independent research and educational seminar program that provides emerging researchers with mentorship, talks & workshops, research support, and connections with the SF Bay Area and London AI safety research communities.
Historically, slaves would be kept in bondage for life, or for a fixed period of time after which they would be granted freedom. Many historical cases of enslavement occurred as a result of breaking the law, becoming indebted, suffering a military defeat, or exploitation for cheap labor; other forms of slavery were instituted along demographic lines such as race or sex.
We used to have a feature for crossposting to the EA Forum. It caused a lot of bugs that were difficult to deal with and didn't feel like it was pulling its weight, so we removed it in the latest update.
Eliezer Yudkowsky is a research fellow of the Machine Intelligence Research Institute, which he co-founded in 2001. He is mainly concerned with the obstacles to, and the importance of, developing a Friendly AI, such as a reflective decision theory that would lay a foundation for describing fully recursive self-modifying agents that retain stable preferences while rewriting their source code. He also co-founded LessWrong, where he wrote the Sequences, long series of posts dealing with epistemology, AGI, metaethics, rationality, and so on... (read more)