This idea is so simple that I'm sure it's been had by someone somewhere. Suppose we have some method to make really smart honest AIs that do not have goals. Let's say it's a yes/no oracle. Our aimless ace. But we want to accomplish stuff! AIcorp wants the printmoneynow.py. I'm...
Edit Apr 14: To be perfectly clear, this is another cheap thing you can add to your monitoring/control system; this is not a panacea or deep insight, folks. Just a Good Thing You Can Do™. * Central claim: If you can make a tool to prevent players from glitching games...
So you want to find that special thing that replicates best and lasts longest? Just vibrate a bunch of molecules for a long time! You might reasonably assume that molecules wouldn't practically ever randomly assemble themselves into anything worth looking at. It happens a bit differently: 1. Eventually, "solar systems"...
Edit February 2024: Now I think maybe we can't do better. Lately, "alignment" means "follow admin rules and do what users mean". Admins can put in rules like "don't give instructions for bombs, hacking, or bioweapons" and "don't take sides in politics". As AI gets more powerful, we can use...
OpenAI is currently charging 100,000 times less per line of code than professional US devs.[1] An LLM's code output is of course less reliable than a professional's. And it is hard to use a text-completion API effectively in large projects. What should you do if you've got a model on...
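The cost ratio above can be sanity-checked with a back-of-envelope calculation. A minimal sketch, where every figure (API price, tokens per line, dev rate, lines per hour) is an illustrative assumption chosen to show how such a ratio could arise, not the actual numbers behind footnote [1]:

```python
# Back-of-envelope: API cost per generated line vs. a human dev's cost per line.
# All inputs below are assumed for illustration, not the post's sourced figures.

def cost_per_line_usd(dollars_per_1k_tokens: float, tokens_per_line: float) -> float:
    """API cost to generate one line of code."""
    return dollars_per_1k_tokens / 1000 * tokens_per_line

# Assumed: $0.02 per 1K tokens, ~10 tokens per line of code.
llm_line = cost_per_line_usd(0.02, 10)   # $0.0002 per line

# Assumed: a $100/hr US dev who lands ~5 durable lines per hour.
dev_line = 100 / 5                        # $20 per line

ratio = dev_line / llm_line
print(f"LLM line: ${llm_line:.4f}, dev line: ${dev_line:.2f}, ratio ~{ratio:,.0f}x")
```

With those (assumed) inputs the ratio comes out to about 100,000x; swap in your own numbers to see how sensitive it is.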
Proof that a model is an optimizer says very little about the model. I do not know what a research group studying outer alignment is studying. Inner alignment seems to cover the entire problem at the limit. Whether an optimizer is mesa or not depends on your point of...
Background So I'm thinking that AI-assisted summarization, math, bug-finding in code, and logical-error finding in writing are at a point where they are quite useful, if we can improve the tooling/integration a little bit. In code I've found it helpful to comment out some lines and write // WRONG: above...
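The commenting trick reads roughly like this. A minimal sketch in Python (using `#` comments in place of the post's C-style `//`; the buggy function and the specific bug are invented for illustration):

```python
# Sketch of the "WRONG:" marker technique: comment out the buggy line,
# note why it was wrong, and write (or let the model write) the fix below.

def running_max(xs):
    """Return the running maximum of a sequence."""
    # WRONG: starting best at 0 breaks on all-negative inputs.
    # best = 0
    best = float("-inf")
    out = []
    for x in xs:
        best = max(best, x)
        out.append(best)
    return out
```

The marked-up wrong line plus its reason gives the model (or a human reviewer) exactly the context needed to regenerate a correct version in place.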