Archetypal Transfer Learning (ATL) is a proposal by @whitehatStoic for what the author argues is a fine-tuning approach that "uses archetypal data" to "embed Synthetic Archetypes". These Synthetic Archetypes are derived from patterns that models assimilate from archetypal data, such as artificial stories. The method yielded a shutdown activation rate of 57.33% in the GPT-2-XL model after fine-tuning.
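For readers unfamiliar with the mechanics, here is a minimal sketch of what fine-tuning GPT-2-XL on a corpus of stories looks like in practice, assuming the Hugging Face transformers and datasets libraries; the file name is hypothetical and this is not the author's actual ATL pipeline, only the generic causal-LM fine-tuning pattern.

```python
# Minimal sketch of ordinary causal-LM fine-tuning of GPT-2-XL on a text corpus.
# "archetypal_stories.txt" is a hypothetical file; NOT @whitehatStoic's actual pipeline.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2-xl")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2-xl")

# Each line of the file is treated as one training example.
dataset = load_dataset("text", data_files={"train": "archetypal_stories.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="gpt2-xl-atl",
        num_train_epochs=1,
        per_device_train_batch_size=1,
    ),
    train_dataset=tokenized,
    # mlm=False -> standard next-token (causal) language modeling objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```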
Open Threads are informal discussion areas, where users are welcome to post comments that didn't quite feel big enough to warrant a top-level post, nor fit in other posts.
A technology that by default destroys the civilization that invents it. It's one type of Existential Risk. AGI may be such an invention.
AI Evaluations focus on experimentally assessing the capabilities, safety, and alignment of advanced AI systems. These evaluations can be divided into two main categories: behavioral and understanding-based.
Löb's Theorem is a theorem proved by Martin Hugo Löb which states that if a formal system such as Peano Arithmetic proves "if P is provable, then P", then it proves P itself.
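The standard formal statement, using the provability predicate of Peano Arithmetic (equivalently, the modal operator $\Box P$ of provability logic for "P is provable"), is:

$$\text{If } PA \vdash \mathrm{Prov}(\ulcorner P \urcorner) \rightarrow P, \text{ then } PA \vdash P,$$

or, in its formalized modal form, $\Box(\Box P \rightarrow P) \rightarrow \Box P$.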
ChatGPT is a language model created by OpenAI.
[T]he orangutan effect: If you sit down with an orangutan and carefully explain to it one of your cherished ideas, you may leave behind a puzzled primate, but will yourself exit thinking more clearly.
"Chain-of-thought" autonomous agentic wrappers such as AutoGPT around an LLM such as GPT-4, and similar Language Model Cognitive Architectures (LMCAs) [other(other commonly used terms are Language Model Autonomous Agents (LMAAs), or Scaffolded LLMs]), are a recent candidate approach to building an AGI.
Memory Reconsolidation is a process by which a memory can be returned to a labile (subject to change) state, after it has already entered long term memory.
For some time, the scientific consensus was that memory was only consolidated once, but there is now growing evidence for reconsolidation processes.
There is speculation that some therapies and change modalities work through the mechanism of memory reconsolidation, changing the emotional valence of memories by accessing them, bringing them to a labile state, then reconsolidating them.
Corrigibility is also used in a broader sense, to mean something like being a helpful agent. Paul Christiano has described a corrigible agent as one that will help me:
- Figure out whether I built the right AI and correct any mistakes I made
- Remain informed about the AI’s behavior and avoid unpleasant surprises
- Make better decisions and clarify my preferences
- Acquire resources and remain in effective control of them
- Ensure that my AI systems continue to do all of these nice things
- …and so on
Language model cognitive architectures (LMCAs) are also called language model agents or scaffold agents. AutoGPT is the best-known example. It and other LMCAs use a large language model and other components to create a goal-directed, agentic system. They use a large language model as a sort of "cognitive engine" performing much of the cognitive work, and incorporate other components to make the composite system goal-directed, and to fill in gaps in and extend the LLM's cognitive abilities.
The term language model agent better captures the inherent agency of such a system, while "language model cognitive architecture" better captures the extensive additions to the LLM, and the resulting change in function and capabilities.
LMCAs are arguably a likely candidate for the first agentic, self-aware and self-improving AGIs.
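A minimal sketch of this scaffolding pattern is below: the LLM acts as the "cognitive engine", while the surrounding code supplies memory, tools, and a goal-directed loop. The `llm` argument stands in for any prompt-to-completion function, and the tool set and prompt format are illustrative, not AutoGPT's actual implementation.

```python
# Minimal sketch of a language model cognitive architecture / LM agent loop.
from typing import Callable, Dict

def search_web(query: str) -> str:
    """Stand-in tool; a real agent would call a search API here."""
    return f"(search results for {query!r})"

TOOLS: Dict[str, Callable[[str], str]] = {"search": search_web}

def run_agent(goal: str, llm: Callable[[str], str], max_steps: int = 10) -> str:
    memory = [f"GOAL: {goal}"]  # external memory extends the LLM's limited context
    for _ in range(max_steps):
        prompt = "\n".join(memory) + (
            "\nDecide the next step. Reply 'TOOL <name> <input>' or 'DONE <answer>'."
        )
        reply = llm(prompt)                   # the LLM does the cognitive work
        if reply.startswith("DONE"):
            return reply[len("DONE"):].strip()
        _, name, arg = reply.split(" ", 2)    # parse 'TOOL <name> <input>'
        observation = TOOLS[name](arg)        # the scaffold executes the action
        memory.append(f"ACTION: {reply}\nOBSERVATION: {observation}")
    return "Step limit reached without completing the goal."
```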
The strength of your votes on LessWrong depends on your karma; if you have more karma, your upvotes will increase the score of things you upvote by more.
Normal votes (one click) are worth
Strong Votes (click and hold) are worth
Ursula von der Leyen
António Guterres
Rishi Sunak
Prince Albert II
Naftali Bennett
Ted Lieu (CAIS Signatory)
Audrey Tang (CAIS Signatory)
Posts about the concept of wisdom, which is the ability to act well in the world at large.
As an appreciable fraction of a neural network is composed of shards, large neural nets can possess quite intelligent constituent shards. These shards can be sophisticated enough to be well-modeled as playing negotiation games with each other, (potentially) explaining human psychological phenomena like akrasia and value changes from moral reflection. Shard theory also suggests an approach to explaining the shape of human values, and a scheme for RL alignment.
A prominent subfield of interpretability of neural networks is mechanistic interpretability, which attempts to understand how neural networks perform the tasks they perform, for example by finding circuits in transformer models. This can be contrasted with subfields of interpretability which seek to attribute some output to a part of a specific input, such as clarifying which pixels in an input image caused a computer vision model to output the classification "horse".
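As an illustration of the input-attribution flavour of interpretability mentioned above (as opposed to mechanistic interpretability), here is a minimal gradient-saliency sketch in PyTorch. The tiny untrained model, the random "image", and the class index are stand-ins purely for illustration.

```python
# Minimal gradient-saliency sketch: attribute a classifier's output to input pixels.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),  # 10 hypothetical classes; pretend index 7 is "horse"
)
model.eval()

image = torch.rand(1, 3, 32, 32, requires_grad=True)  # stand-in input image
logits = model(image)

# Gradient of the "horse" logit with respect to the input pixels: pixels with
# large gradient magnitude are those whose change most affects the prediction.
logits[0, 7].backward()
saliency = image.grad.abs().max(dim=1).values  # (1, 32, 32) attribution map
print(saliency.shape)
```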
Singular learning theory is a theory that applies algebraic geometry to statistical learning theory, developed by Sumio Watanabe. Reference textbooks are "the grey book", Algebraic Geometry and Statistical Learning Theory, and "the green book", Mathematical Theory of Bayesian Statistics.
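As a rough illustration of what the algebraic-geometric machinery provides (stated loosely here, from Watanabe's results): for a singular model the Bayesian free energy grows asymptotically like

$$ F_n \approx n L_n(w_0) + \lambda \log n, $$

where $\lambda$ is the real log canonical threshold (RLCT) of the model. For regular models $\lambda = d/2$ with $d$ the number of parameters, so the RLCT plays the role of an effective parameter count in singular models such as neural networks.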