Quintin Pope

Wiki Contributions


AI Tracker: monitoring current and near-future risks from superscale models

This seems like a great resource. I also like the way it’s presented. It’s very clean.

I’d appreciate more focus on the monetary return on investment large models provide their creators. I think that’s the key metric that will determine how far firms scale up these large models. Relatedly, I think it’s important to track advancements that improve model/training efficiency because they can change the expected ROI for further scaling models.

A positive case for how we might succeed at prosaic AI alignment

I agree that transformers vs other architectures is a better example of the field “following the leader” because there are lots of other strong architectures (perceiver, mlp mixer, etc). In comparison, using self supervised transfer learning is just an objectively good idea you can apply to any architecture and one the brain itself almost surely uses. The field would have converged to doing so regardless of the dominant architecture.

One hopeful sign is how little attention the ConvBERT language model has gotten. It mixes some convolution operations with self attention to allow self attention heads to focus on global patterns as opposed to local patterns better handled by convolution. ConvBERT is more compute efficient than a standard transformer, but hasn’t made much of a splash. It shows the field can ignore low profile advances made by smaller labs.

For your point about the value of alignment: I think there’s a pretty big range of capabilities where the marginal return on extra capabilities is higher than the marginal return on extra alignment. Also, you seem focused on avoiding deception/treacherous turns, which I think are a small part of alignment costs until near human capabilities.

I don’t know what sort of capabilities penalty you pay for using a myopic training objective, but I don’t think there’s much margin available before voluntary mass adoption becomes implausible.

A positive case for how we might succeed at prosaic AI alignment

The reason self supervised approaches took over NLP is because they delivered the best results. It would be convenient if the most alignable approach also gave the best results, but I don’t think that’s likely. If you convince the top lab to use an approach that delivered worse results, I doubt much of the field would follow their example.

Meta learning to gradient hack

Thanks for the feedback! I use batch norm regularisation, but not dropout.

I just tried retraining the 100,000 cycle meta learned model in a variety of ways, including for 10,000 steps with 10,000x higher lr, using resilient backprop (which multiplies weights by a factor to increase/decrease them), and using an L2 penalty to decrease weight magnitude. So far, nothing has gotten the network to model the base function. The L2 penalty did reduce weight values to ~the normal range, but the network still didn’t learn the base function.

I now think the increase in weight values is just incidental and that the meta learner found some other way of protecting the network from SGD.

Paths To High-Level Machine Intelligence

Thank you for this excellent post. Here are some thoughts I had while reading.

The hard paths hypothesis:

I think there's another side to the hard paths hypothesis. We are clearly the first technology-using species to evolve on Earth. However, it's entirely possible that we're not the first species with human-level intelligence. If a species with human level intelligence but no opposable thumbs evolved millions of years ago, they could have died out without leaving any artifacts we'd recognize as signs of intelligence.

Besides our intelligence, humans seem odd in many ways that could plausibly contribute to developing a technological civilization.

  • We are pretty long-lived.
  • We are fairly social.
    • Feral children raised outside of human culture experience serious and often permanent mental disabilities (Wikipedia).
    • A species with human-level intelligence, but whose members live mostly independently may not develop technological civilization.
  • We have very long childhoods.
  • We have ridiculously high manual dexterity (even compared to other primates).
  • We live on land.

Given how well-tuned our biology seems for developing civilization, I think it's plausible that multiple human-level intelligent species arose in Earth's history, but additional bottlenecks prevented them from developing technological civilization. However, most of these bottlenecks wouldn't be an issue for an intelligence generated by simulated evolution. E.g., we could intervene in such a simulation to give low-dexterity species other means of manipulating their environment. Perhaps Earth's evolutionary history actually contains n human-level intelligent species, only one of which developed technology. That implies the true compute required to evolve human-level intelligence is far lower.

Brain imitation learning:

I also think the discussion of neuromophic AI and whole brain emulation misses an important possibility that Gwern calls "brain imitation learning". In essence, you record a bunch of data about human brain activity (using EEG, implanted electrodes, etc.), then you train a deep neural network to model the recorded data (similar to how GPT-3 or BERT model text). The idea is that modeling brain activity will cause the deep network to learn some of the brain's neurological algorithms. Then, you train the deep network on some downstream task and hope its learned brain algorithms generalize to the task in question.

I think brain imitation learning is pretty likely to work. We've repeatedly seen in deep learning that knowledge distillation (training a smaller student model to imitate a larger teacher model) is FAR more computationally efficient than trying to train the student model from scratch, while also giving superior performance (Wikipedia, distilling BERT, distilling CLIP). Admittedly, brain activity data is pretty expensive. However, the project that finally builds human-level AI will plausibly cost billions of dollars in compute for training. If brain imitation learning can cut the price by even 10%, it will be worth hundreds of millions in terms of saved compute costs.

DeepMind: Generally capable agents emerge from open-ended play

What really impressed me were the generalized strategies the agent applied to multiple situations/goals. E.g., "randomly move things around until something works" sounds simple, but learning to contextually apply that strategy 

  1. to the appropriate objects, 
  2. in scenarios where you don't have a better idea of what to do, and 
  3. immediately stopping when you find something that works 

is fairly difficult for deep agents to learn. I think of this work as giving the RL agents a toolbox of strategies that can be flexibly applied to different scenarios. 

I suspect that finetuning agents trained in XLand in other physical environments will give good results because the XLand agents already know how to use relatively advanced strategies. Learning to apply the XLand strategies to the new physical environments will probably be easier than starting from scratch in the new environment.