Training AI Simulations

To gain insight into our speaking styles and generate the content for our ‘Simulated Selves’, we have been recording conversations between Bill and me, geared specifically around the focus of the project (so about us, our creative practice, our interests and views on AI, and what constitutes the self).

We then run the recordings through speech-to-text applications to create transcripts that record who is speaking.
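As a rough illustration, a speaker-attributed transcript of the kind these tools produce can be assembled from diarised segments. The segment structure and merging step below are assumptions for the sketch, not the output of any particular transcription app:

```python
# Minimal sketch: turning diarised speech-to-text segments into a
# speaker-labelled transcript. The (speaker, text) tuple format is a
# stand-in for whatever the transcription tool actually returns.

def format_transcript(segments):
    """segments: list of (speaker, text) tuples in spoken order."""
    turns = []
    for speaker, text in segments:
        # Merge consecutive segments from the same speaker into one turn.
        if turns and turns[-1][0] == speaker:
            turns[-1] = (speaker, turns[-1][1] + " " + text)
        else:
            turns.append((speaker, text))
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)

segments = [
    ("Bill", "So where does the history of AI start for you?"),
    ("Bill", "The Dartmouth workshop?"),
    ("Svenja", "Earlier, I think - automata, and ideas about artificial life."),
]
print(format_transcript(segments))
```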

Here’s an excerpt from one of our early conversations:

Screenshot of transcribed text from a conversation between Bill and Svenja about the history of AI.

The recorded conversations were a way of generating training data for our speaking style, but also form part of the narrative content of the ‘Simulated Selves'. If you experience the work, you may encounter the full conversation that the above excerpt is taken from.

Conversations included personal histories, an overview of creative practice, a brief history of AI, ethical implications of creating digital clones and other random meanderings including multiple references to chickens (an interest of mine).

In addition to the transcribed conversations, we each wrote a long personal history containing key facts about ourselves, and put together a range of documents (artist interviews, presentations, PhD exegeses, project descriptions and question/answer responses) that contain additional information about our artworks, interests and world views. These documents were then imported into ChatGPT-4 (the paid version) to train individual GPT models: one for Bill, one for Svenja, one for Bill and Svenja together, and one for converting conversations into a closer approximation of Bill and Svenja’s speaking styles.

OpenAI GPT model ‘Svenja’, trained on Svenja Kratz’s data.

We are using these GPTs to generate further conversations, not only to expand the archive but also to form part of the exhibition’s conversation content.
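Generating a conversation between the two persona GPTs amounts to a simple turn-taking loop. In this sketch, `generate_reply` is a hypothetical placeholder for the actual model call (via the ChatGPT interface or the OpenAI API), not part of any real library:

```python
# Sketch of generating a two-persona conversation by alternating turns.

def generate_reply(persona, history):
    # Hypothetical placeholder: a real implementation would send `history`
    # to the persona's GPT and return the model's reply.
    return f"[{persona}'s reply to: {history[-1][1]}]"

def simulate_conversation(opening, personas=("Bill", "Svenja"), turns=4):
    """Alternate turns between the two personas, starting from an opening line."""
    history = [(personas[0], opening)]
    for i in range(1, turns):
        speaker = personas[i % 2]  # alternate Bill / Svenja
        history.append((speaker, generate_reply(speaker, history)))
    return history

for speaker, line in simulate_conversation("What does a 'self' consist of?"):
    print(f"{speaker}: {line}")
```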

ChatGPT-generated conversation between Bill and Svenja.

The outputs are, on the whole, quite good, but at times the speaking style diverges from my more conversational voice towards my academic voice (drawn from papers and publications). In some ways this is fine, because it shows that the ‘self’ is not wholly consistent and changes depending on context. So if we are discussing science fiction or chickens, ‘Simulated Svenja’ is more laid back, but when it comes to more philosophical or technical topics, she can sound a bit academic.

It’s also worth mentioning that despite the training data given to our ‘Simulated Selves’, they have a tendency to be overly positive about AI. Everything is about balance and the wonders of human-machine co-creation. There is also a strong linguistic drift towards overly flowery and naff phrases like ‘tapestry of life’. As such, we have at times re-edited the content to better align with our positioning - but don’t worry, there are still plenty of references to ‘tapestry’ and ‘dance’ to ensure the GPT voice is captured as part of the conversation.
