AI ALIGNMENT FORUM
Anthropic (org)

Subscribe
Discussion0
1
Written by Dakara, Multicore, Ruben Bloom, and Ryan Greenblatt. Last updated 31st Dec 2024.

Anthropic is an AI company based in San Francisco, known for developing the Claude family of AI models and for publishing research on AI safety.

Not to be confused with anthropics.

Posts tagged Anthropic (org)
- 63 · Anthropic's Core Views on AI Safety · Zac Hatfield-Dodds · 2y · 13 comments
- 48 · Why I'm joining Anthropic · Evan Hubinger · 2y · 2 comments
- 33 · Toy Models of Superposition · Evan Hubinger · 3y · 2 comments
- 33 · Concrete Reasons for Hope about AI · Zac Hatfield-Dodds · 2y · 0 comments
- 67 · Transformer Circuits · Evan Hubinger · 3y · 3 comments
- 110 · Towards Monosemanticity: Decomposing Language Models With Dictionary Learning · Zac Hatfield-Dodds · 2y · 12 comments
- 95 · Introducing Alignment Stress-Testing at Anthropic · Evan Hubinger · 1y · 19 comments
- 69 · EIS XIII: Reflections on Anthropic's SAE Research Circa May 2024 · Stephen Casper · 1y · 7 comments
- 42 · Anthropic: Three Sketches of ASL-4 Safety Case Components · Zach Stein-Perlman · 6mo · 18 comments
- 46 · Dario Amodei's prepared remarks from the UK AI Safety Summit, on Anthropic's Responsible Scaling Policy · Zac Hatfield-Dodds · 2y · 0 comments
- 44 · Anthropic's Responsible Scaling Policy & Long-Term Benefit Trust · Zac Hatfield-Dodds · 2y · 6 comments
- 30 · Paper: The Capacity for Moral Self-Correction in Large Language Models (Anthropic) · Lawrence Chan · 2y · 0 comments
- 25 · Anthropic's updated Responsible Scaling Policy · Zac Hatfield-Dodds · 7mo · 0 comments
- 30 · Putting up Bumpers · Sam Bowman · 21d · 6 comments
- 18 · How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA? · Owain Evans · 3y · 1 comment