Wiki-Tags in Need of Work

Axioms (together with definitions) form the basis of mathematical theorems. Every mathematical theorem is only proven inside its axiom system...

AI Control in the context of AI Alignment is a category of plans that aim to ensure safety and benefit from AI systems, even if they are goal-directed and are actively trying to subvert your control measures. From The case for ensuring that powerful AIs are controlled:...

The Open Agency Architecture ("OAA") is an AI alignment proposal by (among others) @davidad and @Eric Drexler...

Singular learning theory is a theory that applies algebraic geometry to statistical learning theory, developed by Sumio Watanabe. Reference textbooks are "the grey book", Algebraic Geometry and Statistical Learning Theory, and "the green book", Mathematical Theory of Bayesian Statistics.

Archetypal Transfer Learning (ATL) is a proposal by @whitehatStoic for what the author argues is a fine-tuning approach that "uses archetypal data" to "embed Synthetic Archetypes". These Synthetic Archetypes are derived from patterns that models assimilate from archetypal data, such as artificial stories. The method yielded a shutdown activation rate of 57.33% in the GPT-2-XL model after fine-tuning...

Open Threads are informal discussion areas, where users are welcome to post comments that didn't quite feel big enough to warrant a top-level post, nor fit in other posts...

A Black Marble is a technology that by default destroys the civilization that invents it. It's one type of Existential Risk. AGI may be such an invention, but isn't the only one...

AI Evaluations focus on experimentally assessing the capabilities, safety, and alignment of advanced AI systems. These evaluations can be divided into two main categories: behavioral and understanding-based...

Recent Tag & Wiki Activity

Research idea

Just a research idea, not a whole damn agenda

Addresses aligning human-level and moderately-above-human-level AI systems, not arbitrarily intelligent systems.

Bag-of-tricks alignment

AI alignment & monitoring & control techniques which consist of doing many sometimes-fail tricks rather than a single never-fails method. This tag applies both to posts proposing an entire bag and posts that offer a single trick.

Dungeons and Data Science, or D&D.Sci, is a series of analytical exercises played on Less Wrong, frequently framed as problems in a D&D style adventuring world.

Generally, D&D.Sci exercises give players a synthetic dataset but not the rules used to generate it: players must extrapolate patterns from the data in order to optimize the solution to some problem. After each exercise there is usually a follow-up post which explains the underlying rules of the scenario and gives a score to each player who participated at the time.

Conversations with AIs

A tag for conversations with, rather than about, digital minds. This tag needs expansion and checking for duplicate tags; I checked shallowly before creating it. Should it be merged with AI Evaluations?

Sparse Autoencoders (SAEs)

Sparse Autoencoders (SAEs) are an unsupervised technique for decomposing the activations of a neural network into a sum of interpretable components (often referred to as features). Sparse Autoencoders may be useful for interpretability and related alignment agendas.
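
As a rough illustration of what "a sum of interpretable components" can mean, here is a minimal sketch in plain Python. It assumes one common formulation (a ReLU encoder, a linear decoder, and an L1 sparsity penalty); the tag text does not specify an architecture, and all names, weights, and the `l1_coeff` value are hypothetical and hand-picked rather than trained.

```python
# Minimal sparse-autoencoder forward pass (illustrative assumption, not a
# reference implementation). Weights are hand-picked, not learned.

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def relu(vector):
    return [max(0.0, v) for v in vector]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode an activation vector into features, then reconstruct it."""
    pre_activation = [h + b for h, b in zip(matvec(W_enc, x), b_enc)]
    features = relu(pre_activation)  # most entries should be zero (sparse)
    recon = [r + b for r, b in zip(matvec(W_dec, features), b_dec)]
    return features, recon

def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparsity."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    sparsity = sum(abs(f) for f in features)
    return mse + l1_coeff * sparsity

# Toy example: a 2-dimensional activation and 3 candidate features.
x = [1.0, -2.0]
W_enc = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]   # 3 features x 2 inputs
b_enc = [0.0, 0.0, 0.0]
W_dec = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]     # 2 outputs x 3 features
b_dec = [0.0, 0.0]

features, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, features, recon)

# Each active feature contributes (feature value) * (its decoder column);
# the reconstruction is the sum of these contributions plus b_dec.
contributions = [
    [f * W_dec[row][i] for row in range(len(W_dec))]
    for i, f in enumerate(features)
]
```

In this toy setup only two of the three features are active, and their contributions add up to the original activation vector.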

For more information on SAEs see:

PauseAI

Pause AI is an AI advocacy organization with the aim of pausing AGI research due to concerns about future risks from AI. From their website:

"Who are you?

We are a community of volunteers and local communities coordinated by a non-profit that aims to mitigate the risks of AI (including the risk of human extinction). We aim to convince our governments to step in and pause the development of superhuman AI. We do this by informing the public, talking to decision-makers, and organizing protests. We are not affiliated with any company or organization."

  • 3 points – if you have 25,000 karma
  • 2 points – if you have 1,000 karma
  • 1 point  – if you have 0 karma
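
The tiers above can be read as a simple lookup. The sketch below assumes each karma figure is a minimum threshold and that the highest matching tier applies; the function name and the handling of negative karma are hypothetical, since the list does not cover them.

```python
def vote_points(karma: int) -> int:
    """Return the points for a given karma, using the tiers listed above.

    Assumption: each karma value is a lower bound ("at least this much").
    """
    tiers = [(25_000, 3), (1_000, 2), (0, 1)]  # (minimum karma, points)
    for minimum_karma, points in tiers:
        if karma >= minimum_karma:
            return points
    return 0  # negative karma is not covered by the list; 0 is an assumption
```

For example, under these assumptions `vote_points(1_500)` falls in the middle tier.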

And there are different sets of axiom systems: Euclidean geometry, the Zermelo-Fraenkel axioms for set theory, Kolmogorov's axioms for probability theory, and so on.

The seemingly "absolute truth" of mathematics is an illusion. Playing with "mathematical certainties" outside their field can end in more and more illusory certainties.

Cost-Benefit Analysis

A cost-benefit analysis is a method for quantitatively assigning values to different actions.

For each action, it weighs up the costs of taking that action against the benefit of that action, usually in expectation if uncertainty is involved.

Cost-benefit analyses can be done at varying levels of precision, starting with Fermi estimates and going up to Monte Carlo analyses with research into the distributions underlying the relevant factors...
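
To make the "in expectation" idea concrete, here is a minimal Monte Carlo sketch in Python. The two actions, their probabilities, and their payoffs are invented purely for illustration, and the helper name `expected_net_benefit` is hypothetical; a real analysis would research the underlying distributions.

```python
import random
import statistics

def expected_net_benefit(sample_benefit, sample_cost, n_samples=100_000, seed=0):
    """Estimate E[benefit - cost] for one action by repeated sampling."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    draws = [sample_benefit(rng) - sample_cost(rng) for _ in range(n_samples)]
    return statistics.fmean(draws)

# Hypothetical actions: each maps to (benefit sampler, cost sampler).
actions = {
    "do_nothing": (lambda rng: 0.0, lambda rng: 0.0),
    # Pays out 1000 with 5% probability, at a fixed cost of 40.
    "buy_insurance": (
        lambda rng: 1000.0 if rng.random() < 0.05 else 0.0,
        lambda rng: 40.0,
    ),
}

estimates = {
    name: expected_net_benefit(benefit, cost)
    for name, (benefit, cost) in actions.items()
}
best_action = max(estimates, key=estimates.get)
```

A Fermi estimate of the same made-up example would be 0.05 × 1000 − 40 = 10 for "buy_insurance"; the sampled estimate should land close to that figure.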
