A while back I saw someone working on real-time AI image generation in VR and I had to bring it to your attention, because frankly I can't express how majestic it is to watch AI-modulated AR transform the world before us into glorious emergent dreamscapes.
Applying AI to augmented or virtual reality is not a new concept, but there have been certain limitations to its application: computing power has been one of the main barriers to its practical use. Stable Diffusion, however, is an image generation model pared down to run on consumer-grade hardware, and it has been released under the CreativeML Open RAIL-M license. This means that developers can not only use the technology to create and launch programs without renting large amounts of server silicon, but can also monetize their creations.
ScottieFox TTV is a creator who has been showing off his work with the VR algorithm on Twitter. “I was woken up in the middle of the night to conceptualize this project,” he says. As a creator, I understand that the Muses like to strike at ungodly hours.
What he brought to it is an amalgamation of Stable Diffusion VR and the TouchDesigner application-building engine, the results of which he calls “real-time immersive latent space.” This may sound like hippie nonsense to some, but latent space is a concept that has the world fascinated right now.
At a basic level, it’s a phrase that in this context describes the potential for expansion that AI brings to augmented reality, as it gathers insights from the vastness of the unknown. While it’s an interesting concept, it’s one for a feature at a later date. Right now I’m interested in how Stable Diffusion VR manages to work so well in real time without melting any consumer GPU (even the recent RTX 4090) into a steaming puddle.
“Stable Diffusion VR: Immersive real-time latent space. 🔥 Small clips are sent from the engine to be diffused. Once ready, they are queued back into the projection.” – ScottieFox TTV on Twitter, October 11, 2022
“Broadcasting small chunks into the environment saves resources,” Scotty explains. “Small clips are sent from the engine to be diffused. Once they’re ready, they’re queued back into the projection.” The blue boxes in the images here show the parts of the image that the algorithm is working on at any given time. It’s a much more efficient way to make it work in real time.
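To make that pattern concrete, here is a minimal sketch of how such a tile-by-tile loop could be wired up in Python. This is not Scottie's actual TouchDesigner setup: the Region type, the diffuse_region() stub, and the queue-based worker are all illustrative stand-ins for the real Stable Diffusion call and the engine's projection update.

```python
import queue
import threading
import time
from dataclasses import dataclass


@dataclass
class Region:
    """One small crop of the projected scene (one of the 'blue boxes')."""
    x: int
    y: int
    size: int
    pixels: bytes  # raw crop captured from the engine


def diffuse_region(region: Region, prompt: str) -> Region:
    """Stand-in for the real Stable Diffusion img2img call (assumption).

    A real diffusion pass takes on the order of seconds even on a fast GPU,
    so we simulate the delay and hand the crop straight back.
    """
    time.sleep(0.5)
    return region


def diffusion_worker(todo: "queue.Queue[Region]", done: "queue.Queue[Region]") -> None:
    """Pull crops off the work queue, diffuse them, and queue the results."""
    while True:
        region = todo.get()
        done.put(diffuse_region(region, prompt="lush forest at night"))
        todo.task_done()


def run_projection(done: "queue.Queue[Region]", seconds: float) -> None:
    """Render loop: it never blocks on the model, it just patches in finished crops."""
    deadline = time.time() + seconds
    while time.time() < deadline:
        try:
            region = done.get_nowait()
            # In the real pipeline this would update the projected texture at
            # (region.x, region.y); here we just report that the patch landed.
            print(f"patched {region.size}px crop at ({region.x}, {region.y})")
        except queue.Empty:
            pass
        time.sleep(1 / 90)  # keep drawing at roughly headset refresh rate


if __name__ == "__main__":
    todo: "queue.Queue[Region]" = queue.Queue()
    done: "queue.Queue[Region]" = queue.Queue()
    threading.Thread(target=diffusion_worker, args=(todo, done), daemon=True).start()

    # Queue a handful of crops, as the engine would while the headset looks around.
    for i in range(4):
        todo.put(Region(x=i * 128, y=0, size=128, pixels=b""))

    run_projection(done, seconds=3.0)
```

The design choice the sketch mirrors is the one Scotty describes: the render loop keeps drawing at headset refresh rate and never waits on the model, simply compositing in whichever small crops have finished diffusing.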
Anyone who has used an online image generation tool will know that a single image can take up to a minute to create. Even though the algorithm takes a while to work through each individual section, the results still feel immediate because you’re never sitting there waiting for one image to finish resolving. And while not yet at the level of photorealism they may one day reach, the videos Scotty is posting are absolutely breathtaking.
Flying fish in the living room, ever-changing interior design ideas, lush forests and nightscapes evolving before your eyes. With AI capable of projecting onto our physical world in real time, there’s a lot of potential for use in the gaming space.
Midjourney CEO David Holz describes the potential for games to one day be “dreams,” and it certainly seems like we are moving headlong in that direction. However, the next important step is navigating the minefield of copyright and data protection issues emerging around the datasets that algorithms like Stable Diffusion have trained on.