Hmm, the inherent 1d nature of the visualization kinda makes it difficult to check for selection effects. I'm not convinced that's actually what's going on here. 1725 is special because the ridges of the splotch function are exactly orthogonal to x0. The odds of this happening probably go down exponentially with dimensionality. Furthermore, with more dakka, one sees that the optimization rate drops dramatically after ~15000 time steps, and may or may not do so again later. So I don't think this proves selection effects are in play. An alternative hypothesi
...Now this is one of the more interesting things I've come across.
I fiddled around with the code a bit and was able to reproduce the phenomenon with DIMS = 1, making visualisation possible:
Here's the code I used to make the plot:
import torch
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
DIMS = 1 # number of dimensions that xn has
WSUM = 5 # number of waves added together to make a splotch
EPSILON = 0.10 # rate at which xn controls splotch strength
TRAIN_TIME = 5000 # number of iterations to train for
LEARN
... My impression is that people working on self-driving cars are incredibly safety-conscious, because the risks are very salient.
Safety-conscious people working on self-driving cars don't program their cars to not take evasive action after detecting that a collision is imminent.
(It's notable to me that this doesn't already happen, given the insane hype around AI.)
I think it already has. (It was for extra care, not drugs, but it's a clear-cut case of a misspecified objective function leading to suboptimal decisions for a multitude of individuals.) I'll n
...My worry is less that we wouldn't survive AI-Chernobyl as much as it is that we won't get an AI-Chernobyl.
I think that this is where there's a difference in models. Even in a non-FOOM scenario I'm having a hard time envisioning a world where the gap in capabilities between AI-Chernobyl and global catastrophic UFAI is that large. I used Chernobyl as an example because it scared the public and the industry into making things very safe. It had a lot going for it to make that happen. Radiation is invisible and hurts you by either killing you instantly, making
...I agree that ML often does this, but only in situations where the results don't immediately matter. I'd find it much more compelling to see examples where the "random fix" caused actual bad consequences in the real world.
[...]
...Perhaps people are optimizing for "making pretty pictures" instead of "negative log likelihood". I wouldn't be surprised if for many applications of GANs, diversity of images is not actually that important, and what you really want is that the few images you do generate look really good. In that case, it makes complete sense to p
A likely crux is that I think that the ML community will actually solve the problems, as opposed to applying a bandaid fix that doesn't scale. I don't know why there are different underlying intuitions here.
I'd be interested to hear a bit more about your position on this.
I'm going to argue for the "applying bandaid fixes that don't scale" position for a second. To me, it seems that there's a strong culture in ML of "apply random fixes until something looks like it works" and then just rolling with whatever comes out of that algorithm.
I'll draw attention
...To me, it seems that there's a strong culture in ML of "apply random fixes until something looks like it works" and then just rolling with whatever comes out of that algorithm.
I agree that ML often does this, but only in situations where the results don't immediately matter. I'd find it much more compelling to see examples where the "random fix" caused actual bad consequences in the real world.
I'll draw attention to image modelling to illustrate what I'm pointing at. [...] It also became obvious that GANs were d...
Does anyone know if double descent happens when you look at the posterior predictive rather than just the output of SGD? I wouldn't be too surprised if it does, but before we start talking about the Bayesian perspective, I'd like to see evidence that this isn't just an artifact of using optimization instead of integration.
I wonder if this is a neural network thing, an SGD thing, or a both thing? I would love to see what happens when you swap out SGD for something like HMC, NUTS or ATMC if we're resource constrained. If we still see the same effects then that tells us that this is because of the distribution of functions that neural networks represent, since we're effectively drawing samples from an approximation to the posterior. Otherwise, it would mean that SGD plays a role.
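To make the "integration instead of optimization" contrast concrete, here's a minimal HMC sketch in NumPy. It's only an illustration on a toy standard-normal target, not a neural network posterior, and `hmc_sample`, `log_prob_grad`, `std_normal`, the step size and the leapfrog count are all names and values I picked for the example.

```python
import numpy as np

def hmc_sample(log_prob_grad, x0, n_samples=2000, step=0.2, n_leapfrog=10, seed=0):
    """Draw samples with basic Hamiltonian Monte Carlo.

    log_prob_grad(x) must return (log density up to a constant, its gradient).
    """
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples = []
    for _ in range(n_samples):
        p = rng.standard_normal(x.shape)           # resample momentum
        logp, grad = log_prob_grad(x)
        h_old = -logp + 0.5 * np.sum(p ** 2)       # Hamiltonian at the start
        x_new, p_new = x.copy(), p.copy()
        p_new += 0.5 * step * grad                 # initial half step in momentum
        for i in range(n_leapfrog):
            x_new += step * p_new                  # full step in position
            logp_new, grad_new = log_prob_grad(x_new)
            if i < n_leapfrog - 1:
                p_new += step * grad_new           # full step in momentum
        p_new += 0.5 * step * grad_new             # final half step in momentum
        h_new = -logp_new + 0.5 * np.sum(p_new ** 2)
        if np.log(rng.uniform()) < h_old - h_new:  # Metropolis accept/reject
            x = x_new
        samples.append(x.copy())
    return np.array(samples)

def std_normal(x):
    """Toy target: standard normal log density (up to a constant) and gradient."""
    return -0.5 * np.sum(x ** 2), -x

draws = hmc_sample(std_normal, np.zeros(1))
print(draws.mean(), draws.var())
```

For the actual experiment you'd replace `std_normal` with the log posterior over network weights and average predictions over the draws, which is where the compute cost comes in.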
...what exactly are the magical inductive biases of modern ML that make interpolation work so wel
Deriving bounds on the generalization error might seem pointless when it's easy to do this by just holding out a validation set. I think the main value is in providing a test of purported theories: your 'explanation' for why neural networks generalize ought to be able to produce non-trivial bounds on their generalization error.
I think there's more value to the exercise than just that. It may be less useful in the IID case with lots of data, where having a "validation set" makes sense, but there are many non-IID time series problems where effectively your
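For reference, the held-out-set bound from the quote is easy to write down in the IID case. A minimal sketch, assuming a loss bounded in [0, 1] and IID validation examples (exactly the assumption that breaks for time series); `hoeffding_gap` is a name I made up, and the bound is the standard two-sided Hoeffding inequality.

```python
import math

def hoeffding_gap(n_val, delta=0.05):
    """Width eps such that, with probability at least 1 - delta,
    |true error - validation error| <= eps for a loss in [0, 1],
    given n_val IID held-out examples."""
    if n_val <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("need n_val > 0 and 0 < delta < 1")
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_val))

# 1000 held-out points at 95% confidence give a gap of roughly 0.043.
print(hoeffding_gap(1000))
```

Note that the gap only shrinks like 1/sqrt(n_val), and nothing here says anything once the examples stop being independent.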
...I'll take a crack at this.
To a first order approximation, something is a "big deal" to an agent if it causes a "large" swing in its expected utility.
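As a toy version of that definition: treat the agent's beliefs as a list of (probability, utility) pairs and measure how far the expected utility moves. This is just a sketch of the first-order idea; `expected_utility`, `swing` and the numbers are made up for illustration.

```python
def expected_utility(outcomes):
    """outcomes: iterable of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

def swing(before, after):
    """Absolute change in expected utility caused by some event."""
    return abs(expected_utility(after) - expected_utility(before))

# An event that halves the chance of the good outcome.
before = [(0.5, 10.0), (0.5, 0.0)]
after = [(0.25, 10.0), (0.75, 0.0)]
print(swing(before, after))  # 2.5
```

Whether 2.5 counts as "large" depends on the scale of the agent's utility function, which is the part the first-order approximation leaves open.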
do you think any reasonable extension of these kinds of ideas could get what we want?
Conditional on avoiding Goodhart, I think you could probably get something that looks a lot like a diamond maximiser. It might not be perfect, and the situation with the "most diamond" might not be the maximum of its utility function, but I would expect the maximum of its utility function will still contain a very large amount of diamond. For instance, depending on the representation, and the way the programmers baked in the utility function, it might have a quirk in its
...Do you think we could build a diamond maximizer using those ideas, though?
They're almost certainly not sufficient. A full-fledged diamond maximizer would need far more machinery, if only to do the maximization and properly learn the representation.
The concern here is that the representation has to cleanly demarcate what we think of as diamonds.
I think this touches on a related concern, namely Goodharting. If we even slightly misspecify the utility function at the boundary and the AI optimizes in an unrestrained fashion, we'll end up wit
...I'm personally far more optimistic about ontology identification. Work in representation learning, blog posts such as OpenAI's sentiment neuron, and style transfer all indicate that it's at least possible to point at human-level concepts in a subset of world models. Figuring out how to refine these learned representations to further correspond with our intuitions, and figuring out how to rebind those concepts to representations in more advanced ontologies, are both areas that are neglected, but they're both problems that don't seem fundamentally intractabl
...Under this view, alignment isn’t a property of reward functions: it’s a property of a reward function in an environment. This problem is much, much harder: we now have the joint task of designing a reward function such that the best way of stringing together favorable observations lines up with what we want. This task requires thinking about how the world is structured, how the agent interacts with us, the agent’s possibilities at the beginning, how the agent’s learning algorithm affects things…
I think there are ways of doing this that don't involve exp
...I think this is a good sign: this paper goes over many of the ideas that the RatSphere has discussed for years, and DeepMind is giving those ideas publicity. It also brings up preliminary solutions, of which "Model Based Rewards" seems to go farthest in the right direction. (Although even the paper admits the idea's been around since 2011.)
However, the paper is still phrasing things in terms of additive reward functions, which don't really naturally capture many kinds of preferences (such as those over possible worlds). I also feel that the causal influence
...
Hypothesis: Unlike the language models before it and ignoring context length issues, GPT-3's primary limitation is that its output mirrors the distribution it was trained on. Without further intervention, it will write things that are no more coherent than the average person could put together. By conditioning it on output from smart people, GPT-3 can be switched into a mode where it outputs smart text.