It is not always obvious whether your skills are good enough to work for one of the various AI safety and alignment organizations. There are many ways to calibrate and improve your skills, including simply applying to an org or talking with other people in the alignment community.
One additional option is to test your skills by working on projects that are closely related to, or a building block of, the work being done in alignment orgs. By now, there are multiple curricula out there, e.g. the one by Jacob Hilton or the one by Gabriel Mukobi.
One core building block of these curricula is understanding transformers in detail, and a common recommendation is to check whether you can build one from scratch. Thus, my girlfriend and I recently set ourselves the challenge of building various transformers from scratch in PyTorch. We think this was a useful exercise and want to present the challenge in more detail and share some tips and tricks. You can find our code here.
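To give a feel for what "from scratch" means at the smallest scale, here is a minimal sketch of single-head scaled dot-product attention with an optional causal mask. It is deliberately written in plain Python rather than PyTorch so that every arithmetic step is visible. It is not the code from the linked repo, and the names `softmax` and `attention` are illustrative choices, not anything the curricula prescribe.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Single-head scaled dot-product attention on lists of vectors.

    queries, keys, values: lists of equal-length lists of floats.
    Returns (outputs, weights), where weights[i][j] is how much
    position i attends to position j.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        if causal:
            # Position i may only look at positions j <= i.
            scores = [
                s if j <= i else float("-inf") for j, s in enumerate(scores)
            ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights
```

Calling `attention(x, x, x, causal=True)` on a short list of vectors `x` gives self-attention in which the first position can only attend to itself. A full transformer adds learned projections for queries, keys, and values, multiple heads, residual connections, layer normalization, and feed-forward layers on top of this core.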
The following is a suggestion on how to build a transformer from scratch and train it. There are, of course, many details we omit, but I think it covers the most important basics.
From the ground up we want to
For this calibration challenge, we used the following rules. Note that these are “soft rules” and nobody is going to enforce them, but it’s in your interest to set some rules before you start. We were
Here are some suggestions on what to look out for during the project
I think that the “does it feel right” indicators are more important than the exact timings. There can be lots of random sources of error during the coding or training of neural networks that can take some time to debug. If you felt very comfortable, this might be a sign that you should apply to a technical AI alignment job. If it felt pretty hard, this might be a sign that you should skill up for a bit and then apply.
In some cases, you might want to show the result of your work to someone else. I’d recommend creating a GitHub repository for the project and creating a Jupyter notebook or .py file for every major subpart. You can find our repo here. Don’t take our code as a benchmark to work towards: there might be errors, and we might have violated some basic guidelines of professional NLP coding due to our inexperience.
In my opinion, there are three important considerations.
I hope this is helpful. In case something is unclear, please let me know. In general, I’d be interested to see more “AI safety up-skilling challenges”, e.g. providing more detail to a subsection of Jacob’s or Gabriel’s post.