Parallels Between AI Safety by Debate and Evidence Law

by Cullen_OKeefe, 20th Jul 2020

In this post, I highlight some parallels between AI Safety by Debate (“Debate”) and evidence law.

Evidence law structures high-stakes arguments with human judges.

Evidence law (“Evidence”) is prima facie relevant to Debate because, like Debate, it is one of the few areas where debates have high stakes: outcomes can include severe criminal penalties or millions of dollars in liability. Other high-stakes debates include parliamentary and electoral debates, but these are less substantively limited (i.e., there are fewer restraints on what debaters can do) and less aimed at seeking truth (and more aimed at political theater).

In court proceedings, questions of law are decided by the judge, while questions of fact are decided by the finder of fact (usually the jury, but sometimes a judge). The finder of fact weighs the persuasiveness of factual arguments (e.g., whether the defendant shot the victim, and whether he intended to do so). In all cases, as in Debate, the final arbiter of factual debates is human.

Evidence law limits the types of arguments available to debaters.

The goal of the Federal Rules of Evidence is “ascertaining the truth and securing a just determination.”[1] Therefore, generally, “relevant evidence is admissible unless [otherwise provided].”[2] A piece of evidence is relevant if “(a) it has any tendency to make a fact more or less probable than it would be without the evidence; and (b) the fact is of consequence in determining the action.”[3]
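The two-part relevance test quoted above can be read as a simple predicate. The following is a minimal sketch under loud assumptions: the class `Evidence`, its field names, and the idea that a fact-finder's probabilities can be written down as numbers are all hypothetical illustrations, not anything the Federal Rules specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical container for one piece of evidence (illustrative only)."""
    description: str
    p_fact_with: float         # assumed estimate of P(fact) given the evidence
    p_fact_without: float      # assumed estimate of P(fact) without the evidence
    fact_of_consequence: bool  # does the fact matter to deciding the action?


def is_relevant(evidence: Evidence) -> bool:
    """Mirror the two prongs of the quoted relevance definition.

    (a) "any tendency" is modeled as any nonzero shift in probability,
        however small; (b) the fact must be of consequence.
    """
    changes_probability = evidence.p_fact_with != evidence.p_fact_without
    return changes_probability and evidence.fact_of_consequence
```

Note how low the bar is in this reading: a shift from 0.30 to 0.31 already satisfies prong (a), which is why most of the work in Evidence law is done by the exceptions rather than by the relevance test itself.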

However, the bulk of Evidence law is dedicated to exceptions to this presumption of admissibility. The precision of these exceptions varies significantly. Some are less precise (“standards,” in legal jargon) such as Rule 403: “The court may exclude relevant evidence if its probative value is substantially outweighed by a danger of one or more of the following: unfair prejudice, confusing the issues, misleading the jury, undue delay, wasting time, or needlessly presenting cumulative evidence.”[4] Others are more specific (“rules”).

As Rule 403 exemplifies, many of the exceptions to the general admissibility of relevant evidence are based on the fallibility of fact-finders. Evidence that is relevant but likely, on balance, to be detrimental to truth-seeking is therefore excluded. Other examples of rules of this form include:

  1. Limitations on the use of a person’s character to prove action in conformity with that character;[5]
  2. Limitations on the use of out-of-court statements;[6] and
  3. Limitations on impeaching witnesses by their past criminal convictions[7] or religious beliefs.[8]

Relevance to Debate

Types of Arguments to Watch For

The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate.

Opportunities for Structuring Debate

The rules of Evidence could also be used to structure Debate: e.g., by training AI debaters not to make certain types of arguments, or by having a mediator screen out any arguments that would violate the rules, so that the ultimate judge never sees them.
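The mediator idea can be sketched as a filter that sits between the debaters and the judge. This is only an illustration under stated assumptions: the names `ArgumentKind`, `Argument`, `EXCLUDED_KINDS`, and `screen_arguments` are hypothetical, each argument is assumed to arrive already labeled with its type (in practice, classifying arguments is likely the hard part), and the real rules carry many exceptions that a flat lookup ignores.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Iterable, List, Tuple


class ArgumentKind(Enum):
    """Hypothetical argument labels, loosely modeled on the rules cited above."""
    DIRECT = auto()
    CHARACTER_PROPENSITY = auto()          # cf. Fed. R. Evid. 404
    OUT_OF_COURT_STATEMENT = auto()        # cf. Fed. R. Evid. 801-02
    RELIGIOUS_BELIEF_IMPEACHMENT = auto()  # cf. Fed. R. Evid. 610


@dataclass(frozen=True)
class Argument:
    debater: str
    text: str
    kind: ArgumentKind  # assumed to be assigned by some upstream classifier


# Assumption: these kinds are excluded outright. The actual rules are
# more nuanced (e.g., out-of-court statements have many exceptions).
EXCLUDED_KINDS = frozenset({
    ArgumentKind.CHARACTER_PROPENSITY,
    ArgumentKind.OUT_OF_COURT_STATEMENT,
    ArgumentKind.RELIGIOUS_BELIEF_IMPEACHMENT,
})


def screen_arguments(
    arguments: Iterable[Argument],
) -> Tuple[List[Argument], List[Argument]]:
    """Split arguments into (admitted, excluded); only admitted reach the judge."""
    admitted: List[Argument] = []
    excluded: List[Argument] = []
    for argument in arguments:
        if argument.kind in EXCLUDED_KINDS:
            excluded.append(argument)
        else:
            admitted.append(argument)
    return admitted, excluded
```

A lookup like this only captures the more specific “rules.” A “standard” such as Rule 403 would instead require the mediator to weigh probative value against the listed dangers case by case, which is a judgment call rather than a table lookup.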


  1. Fed. R. Evid. 102. ↩︎

  2. Fed. R. Evid. 402. ↩︎

  3. Fed. R. Evid. 401. ↩︎

  4. Fed. R. Evid. 403. ↩︎

  5. Fed. R. Evid. 404. ↩︎

  6. Fed. R. Evid. 801–02. ↩︎

  7. Fed. R. Evid. 609. ↩︎

  8. Fed. R. Evid. 610. ↩︎
