Linda Linsefors

Hi, I am a Physicist, an Effective Altruist and AI Safety student/researcher.

Wiki Contributions


But I think orgs are more likely to be well-known to grant-makers on average given that they tend to have a higher research output,

I think your getting the causality backwards. You need money first, before there is an org. Unless you count informal multi people collaborations as orgs. 

I think people how are more well-known to grant-makers are more likely to start orgs. Where as people who are less known are more likely to get funding at all, if they aim for a smaller garant, i.e. as an independent researcher. 

Counter point. After the FTX collapse, OpenPhil said publicly (some EA Forum post)  that they where raising their bar for funding. I.e. there are things that would have been funded before that would now not be funded. The stated reason for this is that there are generally less money around, in total. To me this sounds like the thing you would do if money is the limitation. 

I don't know why OpenPhil don't spend more. Maybe they have long timelines and also don't expect any more big donors any time soon? And this is why they want to spend carefully?

From what I can tell, the field have been funding constrained since the FTX collapse.

What I think happened: 
FTX had lots of money and a low bar for funding, which meant they spread a lot of money around. This meant that more project got started, and probably even more people got generally encouraged to join. Probably some project got funded that should not have been, but probably also some really good projects got started that did not get money before because not clearing the bar before due to not having the right connections, or just bad att writing grant proposals. In short FTX money and the promise of FTX money made the field grow quickly. Also there where where also some normal field growth. AIS has been growing steadily for a while. 

Then FTX imploded. There where lots of chaos. Grants where promised but never paid out. Some orgs don't what to spend the money they did get from FTX because of risk of clawback risks. Other grant makers cover some of this but not all of this. It's still unclear what the new funding situation is.

Some months later, SFF, FTX and Nonlinear Network have their various grant rounds. Each of them get overwhelmed with applications. I think this is mainly from the FTX induced growth spurt, but also partly orgs still trying to recover from loss of FTX money, and just regular growth. Either way, the outcome of these grant rounds make it clear that the funding situation has changed. The bar for getting funding is higher than before. 

Todays thoughts: 

I suspect it's not possible to build autonomous aligned AIs (low confidence). The best we can do is some type of hybrid humans-in-the-loop system. Such a system will be powerful enough to eventually give us everything we want, but it will also be much slower and intellectually inferior to what is possible with out humans-in-the-loop. I.e. the alignment tax will be enormous. The only way the safe system can compete, is by not building the unsafe system. 

Therefore we need AI Governance. Fortunately, political action is getting a lot of attention right now, and the general public seems to be positively inclined to more cautious AI development. 

After getting an immediate stop/paus on larger models, I think next step might be to use current AI to cure aging. I don't want to miss the singularity because I died first, and I think I'm not the only one who feels this way. It's much easier to be patient and cautious in a world where aging is a solved problem. 

We probably need a strict ban on building autonomous superintelligent AI until we reached technological maturity. It's probably not a great idea to build them after that either, but they will probably not pose the same risk any longer. This last claim is not at all obvious. The hardest attack vector to defend against would be manipulation. I think reaching technological maturity will make us able to defend against any military/hard-power attack. This includes for example having our own nano-bot defence system, to defend against hostile nanobots. Manipulation is harder, but I think there are ways to solve that, with enough time to set up our defences.

An important crux for what there end goal is, including if there is some stable end where we're out of the danger, is to what extent technological maturity also leads to a stable cultural/political situation, or if that keeps evolving in ever new directions. 

Recently an AI safety researcher complained to me about some interaction they had with an AI Safety communicator. Very stylized, there interaction went something like this:

(X is some fact or topic related to AI Safety

Communicator: We don't know anything about X and there is currently no research on X.

Researcher: Actually, I'm working on X, and I do know some things about X.

Communicator: We don't know anything about X and there is currently no research on X.


I notice that I semi-frequently hear communicators saying things like the thing above. I think what they mean is that our our understanding of X is far from the understanding that is needed, and the amount of researchers working on this is much fewer than what would be needed, and this get rounded off to we don't know anything and no one is doing anything about it. If this is what is going on then I think this is bad. 

I think that is some cases when someone says "We don't know anything about X and there is currently no research on X." they probably literally mean it. There are some people who think that approximately no-one working on AI Safety is doing real AI Safety researchers. But I also think that most people who are saying "We don't know anything about X and there is currently no research on X." are doing some mixture of rounding off, some sort of unreflexively imitation learning, i.e. picking up the sentence structure from others, especially from high status people. 

I think using a language that hides the existence of the research that does exist is bad. Primarily because it's misinformative. Do we want all new researchers to start from scratch? Because that is what happens if you tell them there is no pre-existing research and they believe you. 

I also don't think this exaggeration will help with recruitment. Why do you think people would prefer to join a completely empty research field instead of a small one? From a personal success perspective (where success can mean either impact or career success) a small research field is great, lots if low-hanging fruit around. But a completely untrodden research direction is terrible, you will probably just get lost, not get anything done, and even if you fid something, there's nowhere to publish it.

Recording though in progress...

I notice that I don't expect FOOM like RSI, because I don't expect we'll get an mesa optimizer with coherent goals. It's not hard to give the outer optimiser (e.g. gradient decent) a coherent goal. For the outer optimiser to have a coherent goal is the default. But I don't expect that to translate to the inner optimiser. The inner optimiser will just have a bunch of heuristics and proxi-goals, and not be very coherent, just like humans. 

The outer optimiser can't FOOM, since it don't do planing, and don't have strategic self awareness. It's can only do some combination of hill climbing and random trial and error. If something is FOOMing it will be the inner optimiser, but I expect that one to be a mess.

I notice that this argument don't quite hold. More coherence is useful for RSI, but complete coherence is not necessary. 

I also notice that I expect AIs to make fragile plans, but on reflection, I expect them to gett better and better with this. By fragile I mean that the longer the plan is, the more likely it is to break. This is true for human too though. But we are self aware enough about this fact to mostly compensate, i.e. make plans that don't have too many complicated steps, even if the plan spans a long time.

There is no study material since this is not a course. If you are accepted to one of the project teams they you will work on that project. 

You can read about the previous research outputs here: Research Outputs – AI Safety Camp

The most famous research to come out of AISC is the coin-run experiment.
(95) We Were Right! Real Inner Misalignment - YouTube
[2105.14111] Goal Misgeneralization in Deep Reinforcement Learning (

But the projects are different each year, so the best way to get an idea for what it's like is just to read the project descriptions. 

Second reply. And this time I actually read the link.
I'm not suppressed by that result.

My original comment was a reaction to claims of the type [the best way to solve almost any task is to develop general intelligence, therefore there is a strong selection pressure to become generally intelligent]. I think this is wrong, but I have not yet figured out exactly what the correct view is. 

But to use an analogy, it's something like this: In the example you gave, the AI get's better at the sub tasks by learning on a more general training set. It seems like general capabilities was useful. But consider that we just trained on even more data for a singel sub task, then wouldn't it develop general capabilities, since we just noticed that general capabilities was useful for that sub task. I was planing to say "no" but I notice that I do expect some transfer learning. I.e. if you train on just one of the dataset, I expect it to be bad at the other ones, but I also expect it to learn them quicker than without any pre-training. 

I seem to expect that AI will develop general capabilities when training on rich enough data, i.e. almost any real world data. LLM is a central example of this. 

I think my disagreement with at least my self from some years ago and probably some other people too (but I've been away a bit form the discourse so I'm not sure), is that I don't expect as much agentic long term planing as I used to expect. 

I agree that eventually, at some level of trying to solve enough different types of tasks, GI will be efficient, in terms of how much machinery you need, but it will never be able to compete on speed. 

Also, it's an open question what is "enough different types of tasks". Obviously, for a sufficient broad class of problems GI will be more efficient (in the sense clarified above). Equally obviously, for a sufficient narrow class of problems narrow capabilities will be more efficient. 

Humans have GI to some extent, but we mostly don't use it. This is interesting. This means that a typical human environment is complex enough so that it's worth carrying around the hardware for GI. But even though we have it, it is evolutionary better to fall back at habits, or imitation, or instinkt, for most situations.

Looking back to exactly what I wrote, I said there will not be any selection pressure for GI as long as other options are available. I'm not super confident in this. But if I'm going to defend it here anyway by pointing out that "as long as other options are available", is doing a lot of the work here. Some problems are only solvable by noticing deep patterns in reality, and in this case a sufficiently deep NN with sufficient training will learn this, and that is GI.

I think we agreement.

I think the confusion is because it is not clear form that section of the post if you are saying 
1)"you don't need to do all of these things" 
2) "you don't need to do any of these things".

Because I think 1 goes without saying, I assumed you were saying 2. Also 2 probably is true in rare cases, but this is not backed up by your examples.

But if 1 don't go without saying, then this means that a lot of "doing science" is cargo-culting? Which is sort of what you are saying when you talk about cached methodologies.

So why would smart, curious, truth-seeking individuals use cached methodologies? Do I do this?

Some self-reflection: I did some of this as a PhD student, because I was new, and it was a way to hit the ground running. So, I did some science using the method my supervisor told me to use, while simultaneously working to understand the reason behind this method. I did spend less time that I would have wanted to understand all the assumptions of the sub-sub field of physics I was working in, because of the pressure to keep publishing and because I got carried away by various fun math I could do if i just accepted these assumptions. After my PhD I felt that if I was going to stay in Physics, I wanted to take year or two for just learning, to actually understand Loop Quantum Gravit, and all the other competing theories, but that's not how academia works unfortunately, which is one of the reasons I left.

I think that the fundament of good Epistemic is to not have competing incentives.

Load More