AI Alignment Forum: Tags

Adversarial Examples

Contributors: Multicore

Adversarial examples are inputs or situations with unusual features that cause an AI to make choices that seem obviously wrong to a human. For example, an image of a panda can be perturbed so slightly that a human still sees a panda, yet an image classifier labels it as a gibbon.
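
To make the idea concrete, here is a minimal sketch of how such a perturbation can be constructed. It uses the fast gradient sign method (FGSM), one well-known attack, on a toy logistic-regression classifier rather than a real image model. The weights, the input, the step size `eps`, and the helper name `fgsm_perturb` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Shift every feature of x by eps in the direction that increases the loss."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w  # gradient of the cross-entropy loss with respect to x
    return x + eps * np.sign(grad_x)

# Hypothetical toy classifier with 100 input features and fixed random weights.
rng = np.random.default_rng(0)
w = rng.normal(size=100)
b = 0.0

# An input the model confidently assigns to class 1 (assumed for illustration).
x = 0.05 * np.sign(w)
y = 1.0

x_adv = fgsm_perturb(x, y, w, b, eps=0.1)

clean_prob = sigmoid(w @ x + b)        # model's probability of class 1 on the original
adv_prob = sigmoid(w @ x_adv + b)      # same probability on the perturbed input
max_change = np.max(np.abs(x_adv - x)) # largest change to any single feature

print(f"clean: {clean_prob:.3f}, adversarial: {adv_prob:.3f}, max change: {max_change:.2f}")
```

No single feature moves by more than `eps`, but because every feature moves in the direction that hurts the model most, the small changes add up across all 100 dimensions and flip the prediction. The same accumulation effect is one proposed explanation for why high-dimensional image classifiers are vulnerable to perturbations that a human cannot see.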

Posts tagged Adversarial Examples
Sorted by: Most Relevant

If I were a well-intentioned AI... I: Image classifier (Stuart Armstrong, 2y; karma 15; 4 comments)
The Goodhart Game (John Maxwell, 3y; karma 2; 3 comments)
AXRP Episode 1 - Adversarial Policies with Adam Gleave (DanielFilan, 1y; karma 7; 3 comments)
High-stakes alignment via adversarial training [Redwood Research report] (DMZ, Lawrence Chan, Nate Thomas, 2mo; karma 60; 8 comments)
[AN #62] Are adversarial examples caused by real but imperceptible features? (Rohin Shah, 3y; karma 14; 8 comments)
Evidence Sets: Towards Inductive-Biases based Analysis of Prosaic AGI (bayesian_kitten, 6mo; karma 15; 1 comment)
Adversarial attacks and optimal control (Jan Hendrik Kirchner, 1mo; karma 7; 0 comments)