I guess this falls into the category of "Well, we’ll deal with that problem when it comes up", but I'd imagine when a human preference in a particular dilemma is undefined or even just highly uncertain, one can often defer to other rules like--rather than maximize an uncertain preference, default to maximizing the human's agency, in scenarios where preference is unclear, even if this predictably leads to less-than-optimal preference satisfaction.
Still working my way through reading this series--it is the best thing I have read in quite a while and I'm very grateful you wrote it!
I feel like I agree with your take on "little glimpses of empathy" 100%.
I think fear of strangers could be implemented without a steering subsystem circuit maybe? (Should say up front I don't know more about developmental psychology/neuroscience than you do, but here's my 2c anyway). Put aside whether there's another more basic steering subsystem circuit for agency detection; we know that pretty early on, through some combi... (read more)
Hey Steve, I am reading through this series now and am really enjoying it! Your work is incredibly original and wide-ranging as far as I can see--it's impressive how many different topics you have synthesized.
I have one question on this post--maybe doesn't rise above the level of 'nitpick', I'm not sure. You mention a "curiosity drive" and other Category A things that the "Steering Subsystem needs to do in order to get general intelligence". You've also identified the human Steering Subsystem as the hypothalamus and brain stem.
Is it possible things like a ... (read more)
Very late to the party here. I don't know how much of the thinking in this post you still endorse or are still interested in. But this was a nice read. I wanted to add a few things:
- since you wrote this piece back in 2021, I have learned there is a whole mini-field of computer science dealing with multi-objective reward learning, maybe centered around . Maybe a good place to start there is https://link.springer.com/article/10.1007/s10458-022-09552-y
- The shard theory folks have done a fairly good job sketching out broad principles but it seems... (read more)
That's right. What I mainly have in mind is a vector of Q-learned values V and a scalarization function that combines them in some (probably non-linear) way. Note that in our technical work, the combination occurs during action selection, not during reward assignment and learning.
I guess whether one calls this "multi-objective RL" is semantic. Because objectives are combined during action selection, not during learning itself, I would not call it "single objective RL with a complicated objective". If you combined objectives during reward, then I could call... (read more)
Interesting. Is it fair to say that Mollick's system is relatively more "serial" with fewer parallelisms at the subcortical level, whereas you're proposing a system that's much more "parallel" because there are separate systems doing analogous things at each level? I think that parallel arrangement is probably the thing I've learned most personally from reading your work. Maybe I just hadn't thought about it because I focus too much on valuation and PFC decision-making stuff and don't look broadly enough at movement and other systems.
Apropos of nothing, is... (read more)