I feel that the social instincts link to the learned-from-scratch world-model via a chain of guided development windows.
The individual links in the chain are stacks of affective mechanisms: a trigger that detects the environmental stimulus (for ducklings, a large moving object), a response (follow that object), and an affect (emotion) that links the instinct to the learned model via a reward signal that strengthens the association (a feeling of safety).
As it would be nearly impossible for the DNA to encode a concept like "Rita won a trophy" as a trigger, the system would first have to "teach" the model simpler concepts, and then tag onto those via the affect so that it can trigger correctly later. For example, "Rita" would be identified as a "member of the pack/competition", which would be derived from the concept of "agent". This in turn would first have to be learned via the associations that spring from the early instincts for "pheromones", "human voice", "eyes", etc.
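
To make the trigger / response / affect stack a bit more concrete, here is a rough Python sketch of a single link. Everything in it is an assumption for illustration only: the names `InstinctLink`, `follow_link` and `strengthen`, the observation keys, the thresholds and the learning rate are all made up, and the "learned model" is reduced to a single association strength.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Observation = Dict[str, float]  # hypothetical low-level sensory features


@dataclass
class InstinctLink:
    """One link in the chain: hard-coded trigger, innate response, affect reward."""
    name: str
    trigger: Callable[[Observation], bool]  # innate stimulus detector
    response: str                           # innate action emitted on trigger
    affect_reward: float                    # reward handed to the learned model

    def step(self, obs: Observation) -> Tuple[Optional[str], float]:
        # When the trigger fires, emit the response and the affect reward.
        if self.trigger(obs):
            return self.response, self.affect_reward
        return None, 0.0


def strengthen(strength: float, reward: float, lr: float = 0.1) -> float:
    """Move the learned association toward the reward (simple delta rule)."""
    return strength + lr * (reward - strength)


# Duckling example: a large moving object triggers "follow" plus a safety reward.
# The thresholds 0.5 and 0.1 are arbitrary illustration values.
follow_link = InstinctLink(
    name="filial_following",
    trigger=lambda obs: obs.get("size", 0.0) > 0.5 and obs.get("speed", 0.0) > 0.1,
    response="follow",
    affect_reward=1.0,
)

action, reward = follow_link.step({"size": 0.8, "speed": 0.3})
association = strengthen(0.0, reward)  # association between "that object" and safety
```

The point of the sketch is only that the instinct itself never needs to know the learned concept: it fires on crude features and lets the reward signal do the tagging.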

These simpler concepts from our early years are learned in development windows. For example, for the first 8 weeks babies don't focus their gaze on anything, as they are still learning the basics of seeing. Once they have a slightly better capacity to predict what they see, the next development window opens, which, among other things, has a filter to detect eyes. For a while the eyes are associated with an "agent" and "safety", hence babies smile instantly at their parents' faces; pretty soon this filial imprinting window closes, and they start to cry at the sight of new faces instead.
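
The window mechanics described above could be sketched roughly as follows. This is a toy under stated assumptions, not a model of real infant development: the class `DevelopmentWindow`, the function `face_affect`, the skill threshold, the window duration and the +1 / -1 affect values are all hypothetical. The window opens once the learned model's predictive skill passes a threshold, stays open for a fixed number of weeks, and afterwards only familiar faces keep the "safety" affect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DevelopmentWindow:
    name: str
    opens_at_skill: float       # predictive skill needed before the window opens
    duration_weeks: int         # how long the window stays open once opened
    opened_week: Optional[int] = None

    def is_open(self, week: int, skill: float) -> bool:
        # Open the window the first time the prerequisite skill is reached.
        if self.opened_week is None and skill >= self.opens_at_skill:
            self.opened_week = week
        if self.opened_week is None:
            return False
        return week < self.opened_week + self.duration_weeks


def face_affect(window: DevelopmentWindow, week: int, skill: float,
                familiar: bool) -> float:
    """Affect (reward) attached to seeing a pair of eyes / a face."""
    if window.is_open(week, skill):
        return 1.0              # window open: eyes are tagged with safety
    if window.opened_week is None:
        return 0.0              # vision too immature, eye filter not active yet
    # Window closed: imprinted faces stay safe, new faces become aversive.
    return 1.0 if familiar else -1.0


eyes = DevelopmentWindow(name="eye_filter", opens_at_skill=0.5, duration_weeks=10)
early = face_affect(eyes, week=2, skill=0.2, familiar=False)      # before opening
during = face_affect(eyes, week=8, skill=0.6, familiar=False)     # window opens here
stranger = face_affect(eyes, week=20, skill=0.9, familiar=False)  # after closing
parent = face_affect(eyes, week=20, skill=0.9, familiar=True)     # after closing
```

Gating the opening on skill rather than on age is a deliberate choice here, since the argument is that each window depends on what the model has already learned.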

I have some of these chains of instincts mapped out at an initial level, and will soon be trying out these theories in an environment closely resembling OpenAI's Gym (the architecture didn't lend itself easily to this new reward paradigm, unfortunately). Maybe they could be discussed further with some interested people?

Also, the "little glimpse of empathy" has some literature under the term "mirror neurons".