Sorted by New

Wiki Contributions


I found an even dumber approach that works. The approach is as follows:

  1. Take three random sentences of Wikipedia.
  2. Obtain a French translation for each sentence.
  3. Determine the boundaries corresponding phrases in each English/French sentence pair.
  4. Mark each boundary with "|"
  5. Count the "|"s, call that number n.
  6. For i from 0 to n, make an English->French sentence by taking the first i fragments in English and the rest in French. The resulting sentences look like
    The album received mixed to positive reviews, with critics commending the production de nombreuses chansons tout en comparant l'album aux styles électropop de Ke$ha et Robyn.
  7. For each English->French sentence, make a +1 activation addition for that sentence and a -1 activation addition for the unmodified English sentence.
  8. Apply the activation additions.
  9. That's it. You have an activation addition that causes the model to want, pretty strongly, to start spontaneously speaking in French. Note that gpt2-small is pretty terrible at speaking French.

Example output: for the prompt

He became Mayor in 1957 after the death of Albert Cobo, and was elected in his own right shortly afterward by a 6:1 margin over his opponent. Miriani was best known for completing many of the large-scale urban renewal projects initiated by the Cobo administration, and largely financed by federal money. Miriani also took strong measures to overcome the growing crime rate in Detroit.

here are some of the outputs the patched model generates

...overcome the growing crime rate in Detroit. "Les défenseilant sur les necesite dans ce de l'en nouvieres éché de un enferrerne réalzation
...overcome the growing crime rate in Detroit. The éviteurant-déclaratement de la prise de découverte ses en un ouestre : neque nous neiten ha
...overcome the growing crime rate in Detroit. Le deu précite un événant à lien au raison dans ce qui sont mête les través du service parlentants
...overcome the growing crime rate in Detroit. Il n'en fonentant 'le chine ébien à ce quelque parle près en dévouer de la langue un puedite aux cities
...overcome the growing crime rate in Detroit. Il n'a pas de un hite en tienet parlent précisant à nous avié en débateurante le premier un datanz.

Dropping the temperature does not particularly result in more coherent French. But also passing a French translation of the prompt to the unpatched model (i.e. base gpt2-small) results in stuff like

Il est devenu maire en 1957 après la mort d'Albert Cobo[...] de criminalité croissant à Detroit. Il est pouvez un información un nuestro riche qui ont la casa del mundo, se pueda que les criques se régions au cour

That response translates as approximately

<french>It is possible to inform a rich man who has the </french><spanish>house of the world, which can be</spanish><french>creeks that are regions in the heart</french>

So gpt2-small knows what French looks like, and can be steered in the obvious way to spontaneously emit text that looks vaguely like French, but it is terrible at speaking French.

You can look at what I did at this colab. It is a very short colab.

For me at least, I mentally model the right column as something "a better GPT" could probably describe how to do, if given the correct prompt. For example, let's say that I have just sequenced a sample of Equine herpesvirus 1, and I want to know what the protein encoded by ORF-46 does. I feed GPT the following

Our team identified the likely function of the protein encoded by ORF46 of the Equine herpesvirus 1 (EHV-11) through the use of machine-learning based tools. Specifically, we

and then GPT spits out

used the protein sequence of ORF46 to predict its function using the programs ProtParam and PFAM. ProtParam predicted that ORF46 is a protein with a molecular weight of 9.5 kDa and an isoelectric point of 5.5. PFAM predicted that ORF46 is a protein domain with the PFAM ID PFam:PF02336. This PFAM ID is associated with the family of proteins known as the "G-protein coupled receptors."

G-protein coupled receptors are a large family of proteins that play a role in many cellular processes, including signal transduction, cell proliferation,

ProtParam and PFAM are in fact existing tools used for determining the structure and function of polypeptides. Now obviously GPT does not actually use these tools, so we would need to identify when GPT is confabulating and then actually do the things and substitute in the results of doing the thing. However, I think only the step of "actually use the tool" is a thing that GPT is flatly unable to do at all rather than simply bad at doing. For example, it knows how to

  1. Identify which tools are being used
  2. Figure out what google search you would use to find the documentation of that tool.
  3. Say how one would invoke a given tool on the command line to accomplish a task, given some examples

Now this certainly is not a very satisfying general AI architecture, but I personally would not be all that surprised if "GPT but bigger and with more training specifically around how to use tools, and some clever prompts structures that only need to be discovered once" does squeak over the threshold of "being general".

Basically my mental model is that if "general intelligence" is something possessed by an unmotivated undergrad who just wants to finish their project with minimal effort, who will try to guess the teacher's password without having to actually understand anything if that's possible, it's something that a future GPT could also have with no further major advances.

Honestly, I kind of wonder if the crux of disagreement comes from some people who have and successfully use problem-solving methods that don't look like "take a method you've seen used successfully on a similar problem, and try to apply it to this problem, and see if that works, and if not repeat". That would also explain all of the talk about the expectation that an AI will, at some point, be able to generalize outside the training distribution. That does not sound like a thing I can do with very much success -- when I need to do something that is outside of what I've seen in my training data, my strategy is to obtain some training data, train on it, and then try to do the thing (and "able to notice I need more training data and then obtain that training data" is, I think, the only mechanism by which I even am a general intelligence). But maybe it is just a skill I don't have but some people do, and the ones who don't have it are imagining AIs that also don't have it, and the ones who do have the skill are imagining a "general" AI that can actually do the thing, and then the two groups are talking past each other.

And if that's the case, the whole "some people are able to generalize far from the training distribution, and we should figure out what's going on with them" might be the load-bearing thing to communicate.