another post in the wall

A vibrant pop-art style podcast studio with marionettes hosting, using microphones and equipment, with bright colours and bold comic-like elements.

Wow Us with your Simulacrum

Similar to Alan, as recently noted over on the cogdogblog, I’ve seen an uptick in people trying out a new tool, NotebookLM, to generate podcasts from other documents (docs, books, lists, etc.). I’d never heard of NotebookLM before this, but I am a daily podcast listener and even support a few. I listen to audiobooks regularly, a habit I started both to endure many highway hours driving the prairies of Canada and, in some cases, to ‘read’ the classics I knew I’d never read with my eyes (shout out to librivox). I also use Read Aloud, which is included in MS Edge, on a daily basis. More recently I’ve been using the Read Page feature in iOS as part of Safari’s reader. This is all to say I’ve become familiar, as a user, with a few different text-to-speech technologies. I, for one, have been pretty impressed by how far text-to-speech voices have come in the past few years.

One of the first machine voices I tried engaging with was an audio reading of Project Management for Instructional Designers, a project by David Wiley and co. that included audio readings of all the chapters. The linked book might have updated these, or maybe my memory is fuzzy, but I remember trying to listen to it years ago and it was awful. By comparison, now I can get by with the cadence and my choice of accents in Read Aloud. It’s leaps and bounds better now. So back to the cogdogblog. Alan shares,

Do not get me wrong, the technical accomplishment is impressive. But riddle me this– how many of these have you fully listened to start to finish? Or for more than 10 seconds?

Cogdog

I haven’t! Hadn’t. So challenge accepted. I gave it a solid college try. I put on my headphones and hit play on the file Alan shared while I prepped breakfast. This is something I would normally do. I don’t just sit and listen to a thing. Listening to podcasts is something I do while cleaning, eating, making coffee, etc. Reader, I made it a whopping 10 minutes before I had to shut it off. I tried so hard, but I just couldn’t. I thought maybe it was the content, so I thought I would try something else. I saw D’Arcy Norman had done this to his dissertation. While I follow his blog and am interested in his work, I’ve never even cracked the dissertation (sorry D’Arcy). This seemed like a chance to do it. As he describes on his blog,

Seriously – if you aren’t one of the half-dozen people on the planet who’s read the thing, but are curious about what I spent 6 years of my life working on, this’ll do the trick. Give it 12 minutes and you’re done.

D’Arcy Norman

Great, it’s about as long as I made it on the last one. I made it about 1:49 into this one before having to turn it off (really sorry D’Arcy). This time I knew it wasn’t the topic. The work is interesting and in a field related to where I spend my time. So what is the issue? It took a while to come to the word, and I needed a friend’s help the other night to get it, but NotebookLM doesn’t create a podcast; it creates a simulacrum (I know, that’s probably not some huge revelation). I noticed similar things to Alan,

  • The banter is remarkable, well at first. But listen closely and you can hear as one voice talking the other chipping in with “Totally”, “100%”, “that’s amazing”
  • The have inflection and intonation that is really not what we are used to for synthetic voices
  • I even heard a few “ums” in there. Weird.
  • The clichés are strong. I heard Buffy say at least 3 times “Work smarter, not harder”
  • The always refer to their “show” as a “Deep Dive”
  • Biff and Buffy carry the same exuberance for every damn topic. I wondered about uploading something really banal and seeing how the hep it up.
  • In one sample listen, you might be wowed. But over a series, Biff and Buffy sound like a bunch of gushing sycophants, those office but kissers you want to kick in the pants.
  • Judging a bit, but to me the voices sound very middle class white. I know the response will be, “they will add more voices” or “it will be improve”.
  • And they come across as hip experts on everything.
  • After about 45 minutes, I am ready to throw them out the window of my truck (I listened while driving)

Cogdog

Except he made it way further than I did. So what gives? How is this thing that looks like a duck and quacks like a duck so totally not a duck? I think, for me, it stems from my experience of podcasts. In an education context, I recall sitting in my supervisor’s office one day while we booted up an Elluminate session and recorded a walk-through of the topic we were to discuss. He introduced the topic and we walked through it with me asking questions, but also performing a think-aloud. The connections I made came from outside of what we were discussing. Something I found about the simulacrum was that both voices really shared the same perspective. They were both formed from the same uploaded docs, so while it may have included questions from one voice to the other, it was really just a back and forth of what felt like the same entity. Like a puppet master voicing different marionettes. What I have come to enjoy about podcasts, especially interview-style ones, is that you can tell there is a common interest but different backgrounds and experiences.

Ok, well what about those podcasts I listen to that are more storytelling or journalistic? Perhaps the LLM could do that? Not yet, at least. If we’ve learned anything about how LLMs generate stories, it is that the results are very shallow. Behind the Bastards (IIRC) has some really great episodes where they go into detail about GenAI-created children’s books overrunning Amazon, and the samples they read are just the most shallow attempts at storytelling.

Finally, one of the big things about either two-host podcasts or interview-style podcasts that keeps me hooked is slowly getting to know the people. At least their personas. Love them or hate them, listening is like hanging out at a weird house party and feeling a part of the conversation.

Do I still like text-to-speech? Heck yes. If a dissertation summary was created and communicated to me via Read Aloud, I’d probably have made it the whole 12 minutes. But the awkwardness of this pseudo-self conversation completely gets in the way of just learning about the content being spoken about.

3 responses to “Wow Us with your Simulacrum”

  1. As they say, “thanks for listening” but the stuff just keeps steamrolling ahead. As they say, “Totally!”

    1. (slightly cutting you off in a monotone voice) 100%!

  2. Totally. Today we’re going to take a deep dive into a new and interesting topic – have you heard of something called a “simulacrum”?

    Simu-wha?

    Simulacrum. It’s this new thing, and an academic in Canada is writing about it.

    Amazing.

    Totally.

    Anyway. You are forgiven for not reading 280 pages of dense academic text 🙂
