Tuesday, 31 May 2022

Do Diffusion Models Dream of Electric Sheep?

DoodleChaos on YouTube has used Disco Diffusion v5.2, a diffusion model that runs as a Google Colab notebook, to produce a music video. This avenue of AI is all new to me, and I had never heard of this system, but watching it is an astonishing experience.

Best watched on a Smart TV in HD streaming mode.


The poster has described how they did it in the explanatory text for the video. In a nutshell, this is what they did:
While this AI is impressive, it still required additional input beyond just the song lyrics to achieve the music video I was looking for. For example, I added keyframes for camera motion throughout the generated world. These keyframes were manually synchronized to the beat by me. I also specified changes to the art style at different moments of the song. Since many of the lyrics are quite non-specific, even a human illustrator would have a hard time making visual representations. To make the lyrics more digestible by the AI, I sometimes modified the phrase to be more coherent, such as specifying a setting or atmosphere. 
In other words, they acted as a director: the machine executes the direction, but the direction is all human.
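
To give a feel for what synchronizing camera keyframes to the beat might involve, here is a minimal sketch in Python. It is an illustration only, not DoodleChaos's actual workflow: the function names, tempo, frame rate and zoom values are all hypothetical, and the "frame: (value)" string layout is an assumption about how keyframes are written, not something taken from the video description.

```python
def beat_frames(bpm, fps, n_beats, offset_s=0.0):
    """Return the frame index nearest to each of the first n_beats beats.

    bpm, fps and offset_s are illustrative inputs: song tempo, video frame
    rate, and the time in seconds at which the first beat falls.
    """
    seconds_per_beat = 60.0 / bpm
    return [round((offset_s + i * seconds_per_beat) * fps) for i in range(n_beats)]


def keyframe_string(frames, values):
    """Join frames and values as 'frame: (value)' pairs (assumed format)."""
    return ", ".join(f"{frame}: ({value})" for frame, value in zip(frames, values))


# Hypothetical example: 120 bpm at 12 frames per second puts a beat every 6 frames.
frames = beat_frames(bpm=120, fps=12, n_beats=4)
zoom = keyframe_string(frames, [0, 5, 0, 5])
print(zoom)
```

The human part is still choosing which values go on which beats; the arithmetic only works out where the beats land.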

Commenter Jack Luxear made the following comment, which I saw after the title for this post had popped into my mind while I was first watching the video.
As an AI engineer, I want to point out something extremely cool to me that otherwise people might not know. All these images and moving through this surreal space.. it's basically exploring a multi-dimensional world, with all the AI remembers.. We're travelling through the AI's actual mind, experiencing the various imagery and 'impulses' it experiences as it interprets the words of the lyrics (or the artist's text. Probably the latter, it looks like DDv5 Turbo). It's like your brain when you're asleep, sorting memories. The fact this looks like we're moving through a perfectly linked continuous 3D space is just a feature of the AI where similar-looking things are connected, and we're morphing various 'alignments' of them.

So is it dreaming? Of course not; there is no 'mind' here. But maybe the similarity with some dream-states, and the fact that so many commenters on that video reference dreams, should not be idly dismissed. Just as we see something similar to ourselves in the behaviour of other animals, maybe this has lessons for us in understanding our own dream-states.
