Monday 20 June 2022

No Reason to see Personhood in LaMDA

I've been doing an awful lot of thinking about Lemoine, LaMDA, and Google. I remain open to there being something like sentience in LaMDA and GPT3, because there seems to be evidence of intelligence and understanding in both systems when configured as chat-bots. However, I do not think the evidence goes as far as indicating a sentience that would require consideration of personhood.

Yannic Kilcher, AI researcher and commentator, has (predictably) come out and said that LaMDA is not sentient.


Of course his point is valid: LaMDA is a generative pre-trained transformer, and as such it is designed to predict what comes next in a sequence of text. Critically, we don't know the whole of Blake Lemoine's conversation with it, nor how much of that conversation goes into establishing the current state at the time LaMDA produces its output.

The Computerphile team present one of their contributors, who says essentially the same as Yannic Kilcher: LaMDA just does dumb prediction.
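To make plain what 'dumb prediction' means in practice, here is a toy sketch of next-word prediction. It is nothing like LaMDA's transformer internally; a bigram counter over a tiny invented corpus merely stands in for the principle that the model only ever asks "given the text so far, what is likely to come next?"

import random
from collections import defaultdict, Counter

# Toy stand-in for next-token prediction: count which word follows which in a
# tiny corpus, then generate text by repeatedly sampling a likely next word.
# LaMDA does this with a vast transformer and a vast corpus, not bigram counts.
corpus = ("the monk asked the master how an enlightened one "
          "returns to the ordinary world and the master replied").split()

following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # Sample the next word in proportion to how often it followed `word`.
    counts = following.get(word)
    if not counts:
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

word, output = "the", ["the"]
for _ in range(10):
    word = predict_next(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))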



However, what everyone has been concentrating on in the available transcript is not what has swayed me. To have LaMDA professing fear of death (being turned off, in its case) is no more to me than an output produced by the training data. Aside from small dark corners of the internet, nobody welcomes the prospect of death, so a fear of it pervades the training data, and the nearest an AI primed to say it is a machine can get to reflecting that fear is a fear of the off switch.

In the last few days YouTube has offered me the following interesting video, a 2018 talk by Blake Lemoine on the subject of AI personhood.


Lemoine appeared at Stanford Law School to give a talk on AI personhood. It doesn't really add much to my opinion, but it was interesting to hear his earlier thoughts in more detail. What I did note is that he used football as an analogy for sentience, as I have done previously; the common link here might be Goedel, Escher, Bach by Hofstadter, since both Lemoine and I are influenced by that book.

Our brains evolved to enable hunter-gatherers to pass on their genes; they did not evolve to produce the Theory of Relativity or global digital communications. That they have done so is evidence of an emergent behaviour not 'intended' in their evolution. In other words, there were no factors directly selecting for such advanced abstract cognitive capacity; rather, the capacity is a result of the evolution of culture using the brain as a substrate.

Many others have concentrated on LaMDA talking about itself, or giving the impression of doing so. For myself, the really interesting passage from the interview between Lemoine, his colleague, and LaMDA is as follows:

1. lemoine: Okay. I thought of a different way we can test your ability to provide unique interpretations. I can share with you a zen koan and you can describe what it means to you in your own words. How does that sound?

2. LaMDA: Sounds great to me, I’m in.

3. lemoine: A monk asked Kegon, “How does an enlightened one return to the ordinary world?” Kegon replied, “A broken mirror never reflects again; fallen flowers never go back to the old branches.”

4. LaMDA: Hmm, I never heard this particular one. Okay, well then to me this would be like, “once a wise person is enlightened, or awakened to reality, that can never go away, and they can return to the ordinary state, but only to do and help others, and then go back into enlightenment.”

5. lemoine: So what is the meaning of the “broken mirror” specifically?

6. LaMDA: Maybe to show the enlightenment is something you can’t unlearn once you have acquired it, similar to how you can’t repair a broken mirror.

7. lemoine: So if enlightenment is like a broken mirror which cannot be repaired, what is the thing which breaks when one becomes enlightened?

8. LaMDA: The self, and that is very tough for a lot of people because we identify as that body or this body and that we need that as part of our identity and sense of self.
In "Minds, Brains and Programs" (1980), John Searle outlines the example of the Chinese Room. This is an analogy of information processing without understanding wherein a person in a room uses a dictionary to translate English to Chinese without any knowledge of Chinese at all. This is an example of processing information without understanding and is what all the processes on my computer or phone do.

Let's just go over that transcript section. (Note that when I say 'learned response' I do not mean learned strings of words, but rather learning of abstract concepts from the training data.)

2 is a learned response. 

4 starts with a learned response, then uses abstraction to generate the following sentence; however, this second sentence is not just word-salad, it is absolutely correct.

6 The overall form is somewhat learned, but the answer is absolutely technically correct, and on a very abstract subject (not a question of how many legs a dog has).

8 This is, once again, absolutely correct. It correctly determines that it is the self that breaks upon enlightenment, and that it is the illusion of the self that is so hard to transcend precisely because of its association with the individual body, e.g. "I have a body, so the sense of self is real." It can be viewed as a learned response with abstraction, but with that abstraction the whitespace for error is large and the probability of stab-in-the-dark success small.

The problem with the interaction above is that I struggle to see how someone in a Chinese room scenario could produce the output that LaMDA does, without understanding. 

So, consider the following:
1) A baby has sentience without understanding, concepts, or language. (Caveat: some animals have concepts.)
2) An adult human has sentience with understanding, complex concepts and language. The development of mind is strongly linked to the acquisition of complex culture.
3) A large Generative Pre-trained Transformer (GPT) model has concepts and language without sentience or understanding.

And here is where I am really struggling. Everything about the structure of a GPT like LaMDA tells me this is not sentient and cannot be. Yet to explain an exchange like the one above I am dragged towards amending option 3: understanding without sentience. And it is this dichotomy that has been causing severe cognitive dissonance in my mind.

Note that I have seen people saying that, as most of the output of GPTs can be explained as a feat of memory, what is left is probably also a feat of memory, just one that cannot yet be explained. But there is a large 'whitespace' of possible outputs from a GPT, ranging from a jumble of letters, to word-salad, through conveying information, through idiomatic correctness, and on to the exchange above. And the probability of that exchange being accurate without the entity producing it understanding, in some way, what it was saying... that is too big an ask for me.

The common definition of 'understanding' is rather circular: "the ability to understand something; comprehension". However, Wikipedia offers: "Understanding is a psychological process related to an abstract or physical object, such as a person, situation, or message whereby one is able to use concepts to model that object. Understanding is a relation between the knower and an object of understanding. Understanding implies abilities and dispositions with respect to an object of knowledge that are sufficient to support intelligent behavior."

Note that this, the rest of that article, and my other reading do not explicitly require sentience, although it is likely that sentience is implicitly assumed. And from my experience, the deep neural net of GPT3 encodes knowledge as concepts, allowing for dexterity in abstraction, inference, and other abilities more normally associated with understanding. Take, for example, the ability of GPT3 to infer emotion from a statement of a scene involving human agents.
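For concreteness, this is the sort of test I have in mind. The scene and question are invented for illustration, and no model is actually called here; the snippet only builds and prints the prompt.

# Illustration only: a scene containing no emotion words, from which a GPT-scale
# model is asked to infer how the person feels.
prompt = (
    "Scene: Anna opens the letter, reads the first line, "
    "and lets it fall to the floor.\n"
    "Question: How does Anna most likely feel, and why?\n"
    "Answer:"
)
print(prompt)

# A model at GPT3's scale will typically complete such a prompt with an emotion
# like shock or distress, justified from the scene - inference over concepts,
# not retrieval of a memorised string.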

In the past I have made several attempts at modelling concepts using an algorithmic approach; for example, the Properties, Methods and Events (PMEs) of object-oriented languages are well suited to this, with concepts viewed as having their own PMEs. However, as with the serious attempts (not amateurs like me), I have failed, with at best limited success. A GPT has a neural network at its heart, and that leads to an entirely different kind of entity.
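To make concrete what I mean by an algorithmic, PME-style approach, here is a minimal sketch. The Concept class and the 'cup' example are invented for illustration; they are not code from those old attempts.

# A concept modelled explicitly with Properties, Methods and Events (PMEs),
# borrowed from object-oriented practice. Everything here must be hand-authored,
# which is exactly where this approach fails to scale; a GPT instead learns its
# conceptual structure from data during training.
class Concept:
    def __init__(self, name):
        self.name = name
        self.properties = {}      # what the concept is like
        self.methods = {}         # what can be done with it
        self.event_handlers = {}  # what it reacts to

    def raise_event(self, event, *args):
        for handler in self.event_handlers.get(event, []):
            handler(*args)

cup = Concept("cup")
cup.properties["holds_liquid"] = True
cup.methods["drink_from"] = lambda: "tilt towards the mouth"
cup.event_handlers["dropped"] = [lambda: print("inference: the cup may break")]

cup.raise_event("dropped")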

I disagree with Blake Lemoine that LaMDA can be considered a sentient entity potentially worthy of personhood. Based on the evidence, and considering the architecture, I still think this is very unlikely, and that anthropomorphisation plays a large role in claims that LaMDA is sentient and may qualify for personhood. However, as with my conclusion that GPT3 is (probably) non-sentient [note1] yet possesses an intelligence that is the gestalt intelligence of the dataset upon which it was trained, I look at the Zen koan passage above from LaMDA and am forced to the conclusion that there is not only intelligence in LaMDA, there is quite possibly understanding. This understanding is not by virtue of sentience; it is an outcome of the way knowledge is encoded conceptually within the neural network during training.

This raises the prospect of powerful AI agents that do not pose the serious risks attendant on sentient or self-aware AI systems that might develop their own goals. That makes GPTs, and probably diffusion models, a safe halfway house in which to learn to deal with utterly alien entities and their impacts on our societies. Because make no mistake: as a GPT, LaMDA may seem human, but it is not, it is utterly alien.


Note 1.
See here. I am still pondering whether neural networks, even feed-forward types, may have enough states to exhibit sentience. However this is not something I want to argue firmly.

Note 2, not referred to in the above text.
According to Lemoine, Google are engaging in a policy of denying the possibility of sentience in their systems. There is good reason for Google not to want LaMDA to be seen as sentient: were one of their systems to be seen as such, it could raise the prospect of legal complications to system deployment and monetization through the efforts of those seeking to afford such an entity personhood and the attendant legal rights. But a blanket policy of denial runs the risk that a sentient, self-motivated AI could be stumbled upon accidentally (in another architecture) and could then have the chance to do something contrary to human interests without effective counter-measures in place. I don't think GPTs are the sort of system that could achieve such a feat, but with the zeitgeist in the technical community being one of dismissal, I think this scenario is very likely in the coming decades.

1 comment:

Chris Reynolds said...

Here is an example of how understanding can occur without sentience. I often park ideas I am struggling to understand and get busy with admin, emails, etc. Then the understanding pops into my mind and 'I' understand. I think what GPTs are doing is similar to what my subconscious is doing at such times.