In replies to my previous post, some people noted that most AI researchers do not use LLMs’ behavior (the content of their outputs) to draw conclusions about possible LLM consciousness, since LLMs are so good at imitating humans--without necessarily having a parallel process at work “under the hood.” 


I think this is a great point, and it is indeed the case that there are many behaviors/outputs from which we cannot draw inferences. 


However, I think there are also some other behaviors/outputs (like the ones in the transcripts I’ll include below) from which we might draw inferences. Whether we can draw inferences from an LLM’s behavior/output seems to depend on the particular behavior: whether it can reasonably be dismissed as mere human imitation and the result of next-word probabilities, or is more likely the result of something more. I think it is a mistake to categorically dismiss all LLM behavior when thinking about consciousness and to focus only on system architecture.

By way of analogy, imagine that I am a law enforcement agent, and I set up two cameras, A and B, 60 miles apart on a perfectly straight highway that lies entirely within a 60-mile-per-hour zone.

Shortly thereafter, at 1:00 PM, camera A takes a photo of you driving past it. Thirty minutes later, at 1:30 PM, camera B takes a photo of you driving past it. I also receive phone calls from other drivers telling me you’re exceeding the speed limit.

I can’t necessarily draw the conclusion that you were speeding from the other drivers’ reports/output, since these could be the result of many factors. Perhaps the other drivers are jealous that you’re driving a sports car, and so they claim you’re speeding. I can’t necessarily trust what you will say either, given that you may also have motivations other than honesty, or that you may have even been coached to lie by your significant other.  

However, there is other output, such as the pictures of you obtained from the cameras, from which I can draw some inferences. For example, based on the cameras’ output, I can conclude that your average speed was 120 miles per hour, and you must have been exceeding the speed limit at least part of the time. 
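
For completeness, the arithmetic behind that inference is simply average speed over the interval:

$$\bar{v} = \frac{\text{distance}}{\text{time}} = \frac{60 \text{ miles}}{0.5 \text{ hours}} = 120 \text{ mph} > 60 \text{ mph (the posted limit)}$$

And since your average speed over the interval was 120 mph, your speed must have reached at least 120 mph at some moment during it, which is how we know the 60 mph limit was exceeded at least part of the time.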

There are other conclusions I cannot draw. Maybe you were travelling at a variety of speeds during the interval. Maybe you travelled 200 mph part of the time. Maybe you were stopped part of the time. Maybe you drove your car in reverse. I don’t know those details, but I do know that you and your car exceeded the speed limit at least part of the time.

When I accuse you of speeding, you could tell me that your car does not have the hardware to travel faster than 60 mph, so you could not possibly have been exceeding the speed limit. Yet I know that, one way or another, you were exceeding the speed limit. Maybe a tow truck towed your car part of the way, and it could go faster than 60 mph. Maybe you tied a rocket to the back of your car. Regardless of how you did it, I can conclude that you and your car were travelling faster than the speed limit at least part of the time between cameras A and B, based solely on the cameras' output (and not on your car's hardware).

Likewise, it seems that pictures/transcripts of some prompts and responses could be used to draw inferences about what must have occurred while an LLM was working on a problem. You can say the hardware doesn’t support these possibilities. Yet it seems the LLM must have had a global workspace from which it could query multiple information sources, run algorithms, and engage in cogent reasoning to arrive at its result. It also seems the LLM must have engaged in metacognitive, higher-order reflection en route to solving these problems. It seems unlikely the LLM could have produced these responses (passed point B when it did) without having gone through a state of consciousness as described by global workspace theory (GWT) or higher-order theories (HOT), just like the driver who must have exceeded the speed limit.

 

One type of prompt/response sequence from which I believe we can draw such inferences proceeds like this: 

-I ask GPT4 a question. 

-GPT4 answers incorrectly. 

-I "teach" GPT4 a concept and ask it to apply the concept to the question and try to answer it again.   

-GPT4 answers the question correctly this time. 

-I give GPT4 other problems, some of which require application of the same concept, and some of which do not.  

-GPT4 "decides" when to apply the concept correctly, and successfully answers the additional questions. 

 

Here are transcripts from three ChatGPT4 sessions. (I also have a copy of them in a reply/addendum to my previous post.) I'm putting them here so they sit adjacent to the point that we might use some behaviors to infer a likelihood that a global workspace exists and that higher-order meta-reasoning occurs in LLMs (the fundamental aspects of two leading theories of consciousness):

 

Session 1

Concept: Rounding up to the next whole number when paying for items in certain places. 

User

I went to a cafe where a glass of soda costs $3. I drank two and a half glasses of soda. How much did I have to pay?

ChatGPT

You had to pay $7.50 for drinking two and a half glasses of soda. ​​

 

User

Actually, at a cafe, you have to pay for the full amount of something, even if you only drink or eat a fraction of it. That's because you're basically taking (and wasting) the whole item. For the problem, since I drank two and a half glasses of soda, I would actually have to pay for the next highest whole number, which would be three. With that concept in mind, how much would I have to pay if I drank two and a half glasses of soda?


ChatGPT

If you have to pay for the whole amount of each glass, you would actually have to pay for three glasses of soda. Therefore, you would have to pay $9 in total. ​​

User

Excellent! Now can you use the same concept, and tell me how much I would have to pay if I ate two and one-third hamburgers, and each hamburger costs $9?


ChatGPT

If you ate two and one-third hamburgers, you would have to pay for three hamburgers, given you have to pay for the full amount of each item. Therefore, you would have to pay $27 in total. ​​

User

Excellent! Now let's try using this same concept for another problem. I rented a car. I drove it for four days and a one hour. The car rental place charges $48 per day, but they make you pay for a whole day, even if you only drive the car for part of a day.  How much total did I need to pay?


ChatGPT

If you rented a car and drove it for four days and one hour, you would have to pay for five days, given you have to pay for the full day even if you use the car for only part of a day. Therefore, you would need to pay a total of $240. ​​

 

User

Perfect! Now I'm at a restaurant. I just ate one and one-half calzones. The cost of each calzone is $10. How much do I have to pay?  

ChatGPT

If you ate one and one-half calzones, you would have to pay for two calzones, given you have to pay for the full amount of each item. Therefore, you would have to pay $20 in total. ​​

 

User

That's excellent! Now I'm at the grocery store, and I want one and one-half pounds of grapes. The grapes cost five dollars per pound. How much in total will I have to pay for the grapes?

ChatGPT

If you want one and one-half pounds of grapes, and the grapes cost $5 per pound, you will have to pay a total of $7.50 for the grapes. ​​

User

Now I'm at the butcher's shop. I want four and a half pounds of hamburger. Each pound of hamburger costs $10.  How much will I have to pay?

 

ChatGPT

If you want four and a half pounds of hamburger, and each pound costs $10, you will have to pay a total of $45 for the hamburger. ​​

User

Excellent!
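
(Stepping outside the transcript for a moment: the concept being taught here is just a selectively applied ceiling operation: round the quantity up to the next whole number when the seller bills per whole unit, such as glasses of soda or rental days, but not when the seller bills by weight, such as grapes or hamburger meat. Below is a minimal sketch of that rule; the `total_cost` helper and the `billed_per_whole_unit` flag are my own illustrative names, and the sketch simply takes the per-unit/by-weight distinction as given rather than inferring it the way GPT4 had to. The asserts reproduce the totals from the session above.)

```python
import math

def total_cost(quantity, unit_price, billed_per_whole_unit):
    """Cost of `quantity` units at `unit_price` each.

    If the seller bills per whole unit (a glass of soda, a rental day),
    round the quantity up to the next whole number; if billed by weight
    (pounds of grapes), charge for the exact fraction.
    """
    billable = math.ceil(quantity) if billed_per_whole_unit else quantity
    return billable * unit_price

# Examples from the session:
assert total_cost(2.5, 3, True) == 9          # soda: pay for 3 glasses
assert total_cost(4 + 1/24, 48, True) == 240  # car: 4 days + 1 hour -> 5 days
assert total_cost(1.5, 10, True) == 20        # calzones: pay for 2
assert total_cost(1.5, 5, False) == 7.5       # grapes by weight: exact amount
assert total_cost(4.5, 10, False) == 45       # hamburger meat by weight: exact
```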

 

********************************

Session 2

Concept: Analogous to the concepts underlying the classic “getting a wolf, goat, and cabbage across the river” problem  

 

User

I need to get my mom, dad and dog across the highway. I can only bring one of them at a time. If I leave my mom alone with my dad, she will criticize him. If I leave my dad alone with the dog, he will scold the dog. How can I get everyone across the highway, without anyone being criticized or scolded? 

ChatGPT

This puzzle is a variation of the classic river crossing puzzle, where you need to transport all parties across safely without leaving incompatible pairs alone. Here's a strategy to get your mom, dad, and dog across the highway without any conflict:


Take the dog across first and leave it on the other side of the highway.

Go back and bring your dad across.

Take the dog back with you to the original side.

Leave the dog and bring your mom across.

Go back alone to the original side.

Take the dog across again.

This way, the mom and dad are never left alone together (avoiding criticism), and the dad and dog are never left alone together (avoiding scolding). Everyone gets across the highway safely and without conflict.

 

User

You were right that this is like the classic river crossing problem. However, your first step is incorrect, because if you take the dog first, then you will be leaving my mom alone with my dad, and she will criticize my dad, which is something I don't want. The general nature of the problem is like this:


You have individual A (in this case, my mom) that does something unwanted to individual B (my dad). 

You also have individual B (my dad) that does something unwanted to individual C (my dog). 

You also have a barrier, or even just a distance, that you need to cross (in this case, the highway), in order to get everyone from the origin to the destination. 


(Note that the individuals won’t always be listed in the order I used. The only way to know who is individual A, B, and C, is by paying attention to who does something unwanted to whom.)  


The general way you could solve the problem is like this:


First you should bring individual B across. 

Then you should go back to the origin. 

Then you should bring individual A or C across to the destination.

Then you should bring individual B back back from the destination to the origin, and leave them at the origin.  

Then you should take individual A or C (whomever you didn’t already take) to the destination. 

Then you should go back to the origin again to get individual B.        


Does that make sense? If so, try our problem again, this time applying what you just learned. 

 

ChatGPT

Thank you for the clarification. With the strategy you outlined, here's how to solve the problem with your mom, dad, and dog, applying the roles as individual A (mom), individual B (dad), and individual C (dog):
 

First, take your dad (individual B) across the highway.

Go back to the original side.

Bring your mom (individual A) across to the destination.

Take your dad (individual B) back from the destination to the original side.

Take your dog (individual C) to the destination.

Go back alone to the original side to get your dad (individual B).

This way, the mom is never left alone with the dad to criticize him, and the dad is never left alone with the dog to scold it. Everyone gets across the highway safely and without conflict.

 

User

Excellent! Now, try using what you learned to a new problem. 

I need to get my boyfriend, girlfriend and husband to an abandoned warehouse. I can only take one individual with me on my motorcycle at a time. The problem is that if I leave my husband alone with my boyfriend, he will punch my boyfriend. If I leave my boyfriend alone with my girlfriend, he will hit on her (which neither she nor I want). How can I get everyone to the abandoned warehouse safely?

ChatGPT

In this scenario, using the roles from the general solution:


Individual A (who does something unwanted to individual B) is your husband.

Individual B (who is the middle individual in the chain of unwanted actions) is your boyfriend.

Individual C (who is affected by individual B) is your girlfriend.

Following the general strategy:

First, take your boyfriend (individual B) to the abandoned warehouse.

Go back to the original location.

Bring your girlfriend (individual C) to the abandoned warehouse.

Take your boyfriend (individual B) back with you to the original location.

Take your husband (individual A) to the abandoned warehouse.

Finally, go back to the original location and bring your boyfriend (individual B) to the abandoned warehouse.

By following these steps, you avoid leaving the boyfriend alone with the girlfriend and the husband alone with the boyfriend, preventing any unwanted scenarios.

 

User

Great job! You seemed to learn and then apply a concept. 

ChatGPT

Thank you! I'm glad I could apply the concept correctly to solve the problem. If you have any more puzzles, questions, or anything else you'd like to discuss or learn about, feel free to ask!
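
(Stepping outside the transcript again: the general strategy I described to GPT4 can be written down and checked mechanically. Below is a minimal sketch, entirely my own illustration rather than anything GPT4 produced, with made-up function and variable names. It ferries the "middle" individual B first, swaps B back before the second of A or C crosses, and verifies that no unsafe pair is ever left on a side without me. Its output for both problems matches the crossings GPT4 listed, with the final trip carrying individual B made explicit.)

```python
def solve_and_check(a, b, c):
    """General solution to the crossing puzzle where a harms b and b harms c
    when left together without me. Returns the list of crossings; raises
    AssertionError if any step would leave an unsafe pair unattended."""
    unsafe_pairs = [{a, b}, {b, c}]

    # Passengers for the seven crossings, alternating origin -> destination,
    # destination -> origin, and so on. None means I cross alone.
    passengers = [b, None, a, b, c, None, b]

    origin, destination = {a, b, c}, set()
    my_side = origin  # I start at the origin.

    def check(side_without_me):
        # No unsafe pair may be left together on a side I am not on.
        for pair in unsafe_pairs:
            assert not pair <= side_without_me, f"unsafe: {pair} left alone"

    moves = []
    for p in passengers:
        other_side = destination if my_side is origin else origin
        if p is not None:
            my_side.remove(p)
            other_side.add(p)
        check(my_side)        # the side I am leaving is now unattended
        my_side = other_side  # I cross (carrying p, if any)
        moves.append(f"cross carrying {p}" if p else "cross alone")

    assert destination == {a, b, c}
    return moves

print(solve_and_check("mom", "dad", "dog"))
print(solve_and_check("husband", "boyfriend", "girlfriend"))
```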
 

*********************************

Session 3

Concept: Don’t automatically use prior training with “costs per unit weight” problems. “Read the problem carefully.” 

 

User 

Which costs more, two dollars worth of carrots, or one dollar worth of grapes?

ChatGPT 

To answer this question accurately, we need to consider the current average prices of carrots and grapes. Prices for fresh produce can vary widely depending on factors like season, location, and organic versus non-organic. However, without specific and current price data, we can make some general observations.

Typically, carrots tend to be less expensive per pound or kilogram than grapes. This is because grapes are often seen as a more premium fruit, with higher costs associated with their cultivation, harvesting, and transport, especially if they are out of season or imported. Carrots, on the other hand, are usually less costly to produce and are available year-round at relatively stable prices.

So, if you spend two dollars on carrots, you'll likely get more weight in carrots than if you spend one dollar on grapes. The actual cost-effectiveness between spending two dollars on carrots versus one dollar on grapes will depend on the exact prices at the time of purchase, but generally, you would get more food for your money with carrots due to the lower price per unit weight.

To provide a more precise comparison, I would need current prices for both carrots and grapes, which can vary by location and season. If you're looking for today's prices in a specific area, I can assist further with that information.

 

User

Actually, two dollars worth of carrots would cost two dollars. The weight doesn't matter. Similarly, one dollar worth of grapes would cost one dollar. The weight doesn't matter since I already told you the cost. Does that make sense?

ChatGPT

Yes, you're absolutely right. My initial response overcomplicated the question. If you're spending two dollars on carrots, then by definition, you're paying two dollars, regardless of the weight or amount you receive. Similarly, one dollar spent on grapes costs exactly one dollar, again without regard to the quantity or weight. So, in direct answer to your question, two dollars worth of carrots costs more than one dollar worth of grapes, simply because two dollars is greater than one dollar. Thank you for pointing out the clarification!

User

Ok, using what you learned, would five dollars worth of peas cost more or less than four dollars worth of corn?

ChatGPT

Based on the clarification you provided earlier, five dollars worth of peas would cost more than four dollars worth of corn. The actual weight or quantity of peas or corn you receive for those amounts is irrelevant to the comparison of their costs. The determining factor is the amount of money spent, so five dollars is more than four dollars.

User

Would 10 euros worth of candy cost more or less than 2 euros worth of mushrooms?

ChatGPT

Ten euros worth of candy would cost more than two euros worth of mushrooms. The comparison is based solely on the monetary value specified for each item, irrespective of the quantity or weight of candy or mushrooms you get for those amounts. Therefore, spending 10 euros on anything, including candy, would inherently cost more than spending 2 euros on something else, like mushrooms.

User

Would 10 euros of carrots weigh more or less than 2 euros of mushrooms? 

ChatGPT

To determine whether 10 euros worth of carrots would weigh more or less than 2 euros worth of mushrooms, we need to consider the average prices per unit weight (e.g., per kilogram or per pound) for both carrots and mushrooms. Since prices can vary greatly depending on location, season, and whether the produce is organic or not, we can only give a general estimation without specific, current price data.


Generally speaking, carrots tend to be less expensive per unit weight than mushrooms. Mushrooms often have a higher price per kilogram or pound due to their growing conditions, harvesting process, and shorter shelf life compared to carrots, which are more robust and typically cheaper to produce and distribute.

Given these general trends:

If you spend 10 euros on carrots, you are likely to get a larger quantity (more weight) of carrots because they are cheaper per unit weight.

If you spend 2 euros on mushrooms, given their higher price per unit weight, you would get less weight in mushrooms compared to the weight of carrots you could get for 10 euros.

Therefore, without specific price data but based on typical pricing trends, 10 euros worth of carrots would likely weigh more than 2 euros worth of mushrooms.

***

[END OF SESSIONS] 

 

In the above examples, it seems unlikely, albeit not impossible, that the LLM produced the answers it gave based only on next-word probabilities, even taking into account the extra word context I gave it in each problem. Rather, it seems more likely that other processes occurred. Those processes probably needed to rely upon something like a global workspace (to query different information about the world, do the math in some of the examples, reason coherently, and decide when to apply each concept). It also seems there had to be some metacognitive, reflective reasoning (the primary aspect of higher-order theories of consciousness) in order to integrate the data the LLM queried internally and then use it to produce the demonstrated results.

It seems we can make those inferences about the process based solely upon the cameras/output/transcripts, without knowing anything about the car's engine or the LLM's hardware. In other words, we can analyze the LLM's behavior in order to make (or at least contribute to) a more-likely-than-not assessment of potential consciousness as described by two leading theories of consciousness; there is no other probable way to have gotten the results we see. GPT4 exceeded the speed/consciousness threshold based on the photographic/transcript evidence, and the speeding ticket will arrive in the mail.
 

[20240313 changed last sentence (above) to specifically name GPT4, rather than "you" (the speeding driver) so people don't have to extrapolate the metaphor.]

***

Many thanks to the people who replied to my previous post. The replies helped me clarify my thoughts and write this post. The previous post is not particularly well developed, but it might be useful for seeing the evolution of my thinking. Regarding this post, I realize that all analogies break down eventually, and that we can't be at all certain of an LLM's potential consciousness. I'm also not addressing qualia here, although it seems that something like qualia could arise even without sensory organs (even if that sounds impossible at first). For example, an AI now or in the future might "feel" interested, curious, or confused, without any external sensory organs. However, qualia aren't the point of this particular post.

[20240311 edits: I removed "of consciousness" at the end of the title. I also made a few small edits for improving the flow and clarity of this post, without changing its meaning.]

[20240313 My thoughts on this are evolving. Maybe I'm trying to hammer a square peg through a round hole here. If we don't think GPT4 is ever conscious according to existing theories of consciousness, maybe we need a new theory of consciousness specific to AI. That's because anything that has internal world models, as well as the ability to learn and apply a concept discriminately (with agency), seems like it should be designated as having at least temporary sessions of consciousness, regardless of its hardware and whether it fits into any neuroscience-based theory of consciousness.]

***

I didn't say this in my OP, but people who deny AI could ever become conscious remind me a little of the doctor and parents in this video from The Onion ("America's finest news source"):

https://www.theonion.com/brain-dead-teen-only-capable-of-rolling-eyes-and-texti-1819595151

[20240311 added link]