All of Dan Hendrycks's Comments + Replies

I am strongly in favor of our very best content going on arXiv. Both communities should engage more with each other.

Here are some suggestions for posting to arXiv. As a rule of thumb, if the content of a blogpost didn't take >300 hours of labor to create, then it probably should not go on arXiv. Maintaining a basic quality bar prevents arXiv from being overrun by people who like writing up many of their inchoate thoughts; publication standards are different for LW/AF than for arXiv. Even if a researcher spent many hours on the project, arXiv moderato...

[2] JanBrauner, 1mo
As an explanation, because this just took me 5 minutes of searching: This is the section "Computers and Society (cs.CY)".
[5] David Manheim, 1mo
Strongly agree. Here are three examples of work I've put on arXiv that originated from the forum, which might be helpful as a touchstone. The first was cited 7 times the first year, and 50 more times since. The latter two were posted last year, and have not been indexed by Google as having been cited yet. As an example of a technical but fairly conceptual paper, there is the Categorizing Goodhart's law [https://arxiv.org/abs/1803.04585] paper. I pushed for this to be a paper rather than just a post, and I think that the resulting exposure was very worthwhile. Scott wrote the original post, though we had discussed Goodhart's Law quite a bit in LA, and I had written about it on Ribbonfarm. I think the post took significantly less than 300 hours of specific work, but much more than that in earlier thinking and discussions. The comments and discussion around the post were probably fifty hours, but extending it to cover the items I disagreed with, writing it in LaTeX, making diagrams, and polishing the paper took about another hundred hours between myself, Scott, and others who helped with editing and proofreading. As an example of a large project with a final report, we commissioned an edited summary report / compilation [https://arxiv.org/abs/2206.09360] of our MTAIR sequence [https://www.alignmentforum.org/s/aERZoriyHfCqvWkzg]. This was at least a thousand hours of total work on the project, probably closer to 3,000, including all the work on the project and writing. The marginal work over the project and posts was a couple thousand dollars in editing, probably amounting to a few dozen hours of work. (We did not move it to LaTeX, and the diagrams were screenshots rather than being done nicely in LaTeX.) As an example of a conceptual paper that we put on cs.CY, here is a model of why people are working on agent foundations [https://arxiv.org/abs/2201.02950] which Issa initially posted on the Alignment Forum. I pushed for rewriting and posting it on arXiv. I guesstimate no more

Here's a continual stream of related arXiv papers, available through Reddit and Twitter.

https://www.reddit.com/r/mlsafety/

https://twitter.com/topofmlsafety

I should say formatting is likely a large contributing factor for this outcome. Tom Dietterich, an arXiv moderator, apparently had a positive impression of the content of your grokking analysis. However, research on arXiv will be more likely to go live if it conforms to standard (ICLR, NeurIPS, ICML) formatting and isn't a blogpost automatically exported into a TeX file.

I agree that formatting is the most likely issue. The content of Neel's grokking work is clearly suitable for arXiv (just very solid ML work). And the style of presentation of the blog post is already fairly similar to a standard paper (e.g. it has an Introduction section, lists contributions in bullet points, ...).

So yeah, I agree that formatting/layout probably will do the trick (including stuff like academic citation style).