Capybasilisk
Capybasilisk has not written any posts yet.

I'd especially like to hear your thoughts on the above proposal of loss-minimizing a language model all the way to AGI.
I hope you won't mind me quoting your earlier self as I strongly agree with your previous take on the matter:
If you train GPT-3 on a bunch of medical textbooks and prompt it to tell you a cure for Alzheimer's, it won't tell you a cure, it will tell you what humans have said about curing Alzheimer's ... It would just tell you a plausible story about a situation related to the prompt about curing Alzheimer's, based on its training data. Rather than a logical Oracle, this image-captioning-esque scheme would be an
Near the beginning, Daniel is basically asking Jan how they plan on aligning the automated alignment researcher (AAR). If they can already do that, it seems there wouldn't be much left for the AAR to do.
Jan doesn't seem to comprehend the question, which is not an encouraging sign.