Daniel Kokotajlo

Mar 31, 2021

150

Thanks, this is a great thing to be thinking about and a good list of ideas!

Do other subjects come to mind?

Public speaking skills, persuasion skills, debate skills, etc.

Practice no-cost-too-large productive periods

I like this idea. At AI Impacts we were discussing something similar: having "fire drills" where we spend a week (or even just a day) pretending that a certain scenario has happened, e.g. "DeepMind just announced they have a Turing-test-passing system and will demo it a week from now; we've got two journalists asking us for interviews and need to prep for the emergency meeting with the AI safety community tonight at 5." We never got around to testing out such a drill, but I think variants on this idea are worth exploring. Inspired by what you said, perhaps we could have "snap drills" where we suddenly take our goals for the next two months, imagine that they need to be accomplished in a week instead, and see how much we can do. (Additionally, ideas like this seem like they would have bonus effects on morale, teamwork, etc.)

I don’t know what is entailed in cultivating that virtue. Perhaps meditation? Maybe testing one’s self at literal risk to one’s life?

This virtue is extremely important to militaries. Does any military use meditation as part of its training? I would guess that the training given to medics and officers (soldiers for whom clear thinking is especially important) might have some relevant lessons. Then again, maybe the military deals with this primarily by selecting the right sort of people rather than taking arbitrary people and training them. If so, perhaps we should look into applying similar selection methods in our own organizations to identify people to put in charge when the time comes.

Any more ideas?

In this post I discuss some:

Perhaps it would be good to have an Official List of all the AI safety strategies, so that whatever rationale people give for why this AI is safe can be compared to the list. (See this prototype list.)

Perhaps it would be good to have an Official List of all the AI safety problems, so that whatever rationale people give for why this AI is safe can be compared to the list, e.g. "OK, so how does it solve outer alignment? What about mesa-optimizers? What about the malignity of the universal prior? I see here that your design involves X; according to the Official List, that puts it at risk of developing problems Y and Z..." (See this prototype list.)

Perhaps it would be good to have various important concepts and arguments re-written with an audience of skeptical and impatient AI researchers in mind, rather than the current audience of friends and LessWrong readers.

Thinking afresh, here's another idea: I have a sketch of a blog post titled "What Failure Feels Like." The idea is to portray a scenario of doom in general, abstract terms (like Paul's post does, as opposed to writing a specific, detailed story) but with a focus on how it feels to us AI-risk-reducers, rather than focusing on what the world looks like in general or what's going on inside the AIs. I decided it would be depressing and not valuable to write. However, maybe it would be valuable as a thing people could read to help emotionally prepare/steel themselves for the time when they "are confronted with the stark reality of how doomed we are." IDK.

I guess overall my favorite idea is to just periodically spend time thinking about what you'd do if you found out that takeoff was happening soon, e.g. "DeepMind announces Turing-test system" or "We learn of convincing roadmap to AGI involving only 3 OOMs more compute" or "China unveils project to spend +7 OOMs on a single training run by 2030, with lesser training runs along the way." I think that the exercise of thinking about near-term scenarios and then imagining what we'd do in response will be beneficial even on long timelines, but certainly super beneficial on short timelines (even if, as is likely, none of the scenarios we imagine come to pass).

Kaj_Sotala 5y 30

Does any military use meditation as part of its training?

Yes, e.g.:

This [2019] winter, Army infantry soldiers at Schofield Barracks in Hawaii began using mindfulness to improve shooting skills — for instance, focusing on when to pull the trigger amid chaos to avoid unnecessary civilian harm.
The British Royal Navy has given mindfulness training to officers, and military leaders are rolling it out in the Army and Royal Air Force for some officers and enlisted soldiers. The New Zealand Defence Force recently adopted the technique, and military forces o

...

4 Daniel Kokotajlo 5y

Hmmm, if this is the most it's been done, then that counts as a No in my book. I was thinking something like "Ah yes, the Viet Cong did this for most of the war, and it's now standard in both the Vietnamese and Chinese armies." Or at least "Some military somewhere has officially decided that this is a good idea and they've rolled it out across a large portion of their force."

TsviBT

Mar 31, 2021

I speculate (based on personal glimpses, not based on any stable thing I can point to) that there are many small sets of people (say of size 2-4) who could greatly increase their total output given some preconditions, unknown to me, that unlock a sort of hivemind. Some of the preconditions include various kinds of trust, common knowledge of shared goals, and person-specific interface skill (like speaking each other's languages, common knowledge of tactics for resolving ambiguity, etc.).
[ETA: which, if true, would be good to have already set up before crunch time.]

Chris_Leong

Apr 05, 2021

One of the biggest considerations would be the process for activating "crunch time". In what situations should crunch time be declared? Who decides? How far out would we want to activate it, and would there be different levels? Are there any downsides to such a process, such as unwanted attention?

If these aren't discussed in advance, then I imagine that far too much of the available time could be taken up by whether to activate crunch time protocols or not.

PS. I actually proposed here that we might be able to get a superintelligence to solve most of the problem of embedded agency by itself. I'll try to write it up into a proper post soon.

TurnTrout 5y 140

For this to matter, our alignment researchers need to be at the cutting edge of AI capabilities, and they need to be positioned such that their work can actually be incorporated into AI systems as they are deployed.

If we become aware that a lab will likely deploy TAI soon, other informed actors will probably become aware as well. This implies that many people would be trying to influence and gain access to this lab. Therefore, we should already have AI alignment researchers in positions of power within the lab before this happens.

Eliezer Yudkowsky 5y 130

Seems rather obvious to me that the sort of person who is like, "Oh, well, we can't possibly work on this until later" will, come Later, be like, "Oh, well, it's too late to start doing basic research now, we'll have to work with whatever basic strategies we came up with already."

Raemon 5y 120

Seems true, but also didn't seem to be what this post was about?

DanielFilan 5y 80

Most current AI alignment work is pretty abstract and theoretical, for two reasons.

FWIW, this is not obvious to me (or at least depends a lot on what you mean by 'AI alignment'). Work at places like OpenAI, CHAI, and DeepMind tends to be relatively concrete.

DanielFilan 5y 60

Also if you count work done by people not publicly identified as motivated by existential risk, I think the concrete:abstract ratio will increase.

Raemon 5y 50

Curated.

I found this a surprisingly obvious set of strategic considerations (and meta-considerations) that, for some reason, I'd never seen anyone actually attempt to tackle before.

I found the notion of practicing "no cost too large" periods quite interesting. I'm somewhat intimidated by the prospect of trying it out, but it does seem like a good idea.

jungofthewon 5y 30

Access

Alignment-focused policymakers / policy researchers should also be in positions of influence.

Knowledge

I'd add a bunch of human / social topics to your list e.g.

Policy
Every relevant historical precedent
Crisis management / global logistical coordination / negotiation
Psychology / media / marketing
Forecasting

Research methodology / Scientific “rationality,” Productivity, Tools

I'd be really excited to have people use Elicit with this motivation. (More context here and here.)

Re: competitive games of introducing new tools, we did an internal speed Elicit vs. Google test to see which tool was more efficient for finding answers or mapping out a new domain in 5 minutes. We're broadly excited to structure and support competitive knowledge work and optimize research this way.

johnswentworth 5y 30

Relevant topic of a future post: some of the ideas from Risks From Learned Optimization or the Improved Good Regulator Theorem offer insights into building effective institutions and developing flexible problem-solving capacity.

Rough intuitive idea: intelligence/agency are about generalizable problem-solving capability. How do you incentivize generalizable problem-solving capability? Ask the system to solve a wide variety of problems, or a problem general enough to encompass a wide variety.

If you want an organization to act agenty, then a useful technique is to constantly force the organization to solve new, qualitatively different problems. An organization in a highly volatile market subject to lots of shocks or distribution shifts will likely develop some degree of agency naturally.

Organizations with an adversary (e.g. traders in the financial markets) will likely develop some degree of agency naturally, as their adversary frequently adopts new methods to counter the organization's current strategy. Red teams are a good way to simulate this without a natural adversary.

Some organizations need to solve a sufficiently-broad range of problems as part of their original core business that they develop some degree of agency in the process. These organizations then find it relatively easy to expand into new lines of business. Amazon is a good example.

Conversely, businesses in stable industries facing little variability will end up with little agency. They can't solve new problems efficiently, and will likely be wiped out if there's a large shock or distribution shift in the market. They won't be good at expanding or pivoting into new lines of business. They'll tend to be adaptation-executors rather than profit-maximizers, to a much greater extent than agenty businesses.

This all also applies at a personal level: if you want to develop general problem-solving capability, then tackle a wide variety of problems. Try problems in many different fields. Try problems with an adversary. Try different kinds of problems, or problems with different levels of difficulty. Don't just try to guess which skills or tools generalize well; go out and find out which ones do.

If we don't know what to expect from future alignment problems, then developing problem-solving skills and organizations which generalize well is a natural strategy.

johnswentworth 5y 20

Re: picking up new tools, skills and practice in designing and building user interfaces, especially for complex or not-very-transparent systems, would be very high-leverage if the tool-adoption step is rate-limiting.

Donald Hobson 5y 10

I don't actually think "It is really hard to know what sorts of AI alignment work are good this far out from transformative AI." is very helpful.

It is currently fairly hard to tell what is good alignment work. A week from TAI, either good alignment work will be easier to recognise, because of alignment progress not strongly correlated with capabilities, or good alignment research will be just as hard to recognise (more likely the latter). I can't think of any safety research that can be done on GPT-3 that can't be done on GPT-1.

In my picture, research gets done and theorems get proved, and the researcher population grows as funding increases and talent matures. Toy models get produced. Once you can easily write down a description of a FAI with unbounded compute, that's when you start to look at algorithms that have good capabilities in practice.

AI ALIGNMENT FORUM

40

[ Question ]

How do we prepare for final crunch time?

Access

A different kind of work

Knowledge

Research methodology / Scientific “rationality”

Productivity

Picking up new tools

Staying grounded and stable in spite of the stakes
