Utility Maximization = Description Length Minimization

Hypothesis: in a predictive coding model, the bottom-up processing is doing lossless compression and the top-down processing is doing lossy compression. I feel excited about viewing more cognitive architecture problems through the lens of separating these two steps.

What are the best precedents for industries failing to invest in valuable AI research?

There's a fairly straightforward optimization process that occurs in product development that I don't often see talked about in the abstract that goes something like this:

It seems like bigger firms should be able to produce higher quality goods. They can afford longer product development cycles, hire a broader variety of specialized labor, etc. In practice, it's smaller firms that compete on quality. Why is this?

One of the reasons is that the pressure to cut corners increases enormously at scale, along more than one dimension. As a product scales, even small efficiency gains are worth enough money that a single gain can justify an entire employee or team devoted to it. The incentive is to cut costs in all the ways that are illegible to the consumer. But the average consumer also changes as a product scales up in popularity. Early adopters and people with more specialized needs are more sensitive to quality. As the product scales to less sensitive buyers, the firm can cut corners that would have cost it sales earlier in the product cycle, but whose effect is now too small to show up while revenues and profits are going up. So this process continues up the curve as the product serves an ever larger and less sensitive market. Fewer things move the needle, and now the firm is milking its cash cow, which brings in a different sort of optimization (bean counters) that continues the process.
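A toy calculation makes the scale effect concrete. Every number below is made up for illustration (the per-unit saving, the employee cost, the margin, and the share of quality-sensitive buyers are all assumptions, as is the simplification that every quality-sensitive buyer walks away after the cut). It is only a sketch of the shape of the argument, not a model of any real firm.

```python
# Toy numbers, all hypothetical: they show the shape of the argument only.
SAVING_PER_UNIT = 0.01     # dollars saved per unit by one small corner cut
EMPLOYEE_COST = 150_000    # assumed fully loaded annual cost of one employee
MARGIN_PER_UNIT = 5.00     # assumed profit per unit sold


def annual_saving(units_per_year: int) -> float:
    """Total yearly saving from the corner cut at a given sales volume."""
    return SAVING_PER_UNIT * units_per_year


def pays_for_employee(units_per_year: int) -> bool:
    """Is the saving large enough to fund a person dedicated to finding it?"""
    return annual_saving(units_per_year) >= EMPLOYEE_COST


def net_gain(units_per_year: int, sensitive_share: float) -> float:
    """Saving minus lost margin, assuming every quality-sensitive buyer leaves."""
    lost_margin = units_per_year * sensitive_share * MARGIN_PER_UNIT
    return annual_saving(units_per_year) - lost_margin


# Early product: small volume, half the buyers notice quality.
early = net_gain(10_000, sensitive_share=0.5)
# Mass-market product: huge volume, one buyer in a thousand notices.
late = net_gain(100_000_000, sensitive_share=0.001)

print(pays_for_employee(10_000), pays_for_employee(100_000_000))
print(early < 0, late > 0)
```

Under these assumptions the same one-cent cut is a clear loss for the early product and a clear win for the mass-market one, and only at the larger volume does it pay for someone to go looking for it.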

Now, some firms, rather than allow their lunch to get eaten, do engage in market segmentation to capture more value. The most obvious case is when a brand has a sub-brand that is a luxury line, like basically all car makers. The luxury line will take advantage of some of the economies of scale from the more commoditized product lines but do things like manufacture key components in, say, Germany instead of China. But with the same management running the whole show, it's hard for a large firm to insulate the segmented line from exactly the same forces already described.

All of this is to answer the abstract question of why large firms don't generate the sort of culture that can do innovation, even when they seemingly throw a lot of money and time at it. The incentives flow down from the top. The 'top' of firms are answerable to the wrong set of metrics/incentives. This is 100% true of most of academia as well as private R&D.

So to answer the original question, I see micro examples of failing to invest in the right things everywhere. Large firms could be hotbeds of experimentation in large-scale project coordination, but in practice individuals within an org are forced to conform to internal APIs to maintain legibility to management, which explains why something like Slack didn't emerge as an internal tool at any big company.


This is clarifying, thanks.

WRT the last paragraph, I'm thinking in terms of convergent vs divergent processes. So, fixed points, I guess.


This is biting the bullet on the infinite regress horn of the Münchhausen trilemma, but given the finitude of human brain architecture I prefer biting the bullet on circular reasoning. We have a variety of overlays, like values, beliefs, goals, actions, etc. There is no canonical way they are wired together. We can hold some fixed as a basis while we modify others. We are a Ship of Neurath. Some parts of the ship feel more is-like (like the waterproofness of the hull) and some feel more ought-like (like the steering wheel).

Some AI research areas and their relevance to existential safety

I see CSC and SEM as highly linked via modularity of processes.

The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables

A pointer is sort of the ultimate in lossy compression: just an index to the uncompressed data, like a legible compression library. Wireheading is a Goodharting problem, which is a lossy compression problem, etc.

The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables

Over the last few posts the recurrent thought I have is "why aren't you talking about compression more explicitly?"

Extortion beats brinksmanship, but the audience matters

The other people of whom you have nude photos, who are now incentivised to pay up rather than kick up a fuss.

Releasing one photo from a set previously believed to be secure, where other photos in the same set are compromising, can suffice for the single-member audience case.

Confucianism in AI Alignment

That's the Legalist interpretation of Confucianism. Confucianism argues that the Legalists are just moving the problem one level up the stack a la public choice theory. The point of the Confucian is that the stack has to ground out somewhere, and asks the question of how to roll our virtue intuitions into the problem space explicitly since otherwise we are rolling them in tacitly and doing some hand waving.

Additive Operations on Cartesian Frames

The main intuition this sparks in me is that it gives us concrete data structures to look for when talking broadly about the brain doing 'compression' by rotating a high-dimensional object and carving off recognized chunks (simple distributions) in order to make the messy inputs more modular, composable, accessible, error-correctable, etc. It's sort of the way that predictive coding gives us a target to hunt for when looking for structures that might be doing something like the atomic predictive coding unit.
