Someday soon, we’ll speak entire universes into existence.
This article is a guide to the companies building the generative artificial intelligence technology that will lead to these virtual worlds (games, simulations, metaverse applications).
Existing market maps of the generative AI landscape lack a convincing organization; they read like random boxes grouped by functionality. Since most of my readers are interested in the technologies and companies powering things like games, simulations and metaverse applications, you’ll find this map helpful in charting who is moving these specific experiences forward.
Here’s version 1.0 of the market map, which I’ll update regularly:
Large companies appear more than once in the chart if they have significant investments, research or operations in any of the categories. For smaller companies, I try to keep them in one focused category.
Read on to learn how to interpret the various layers of the value chain used to organize this chart, as well as what makes generative AI for virtual worlds so complicated.
Game development provides a view into how virtual worlds will be supported by generative AI. There are many types of creators: the studios that build the worlds (the normal mode of game development, where anywhere from one person to 1000+ people may build the virtual world); the modders who extend them; and the people who populate and create while playing. Even the worlds themselves may be imbued with generative features.
Virtual worlds are complex because of their emergent properties: the bigger and more varied their internals, the more they give rise to unexpected behaviors. They are not simply 3-dimensional worlds; they are many-dimensional, with time, social networks, economies and living narratives.
But they’re also complicated: there are myriad constantly-evolving and hard-to-fit jigsaw pieces in the production process. This diagram gives you a sense of just a few of these:
Just to make a 3D model, you need to go from concepts to modeling to optimizing to texturing to UV-unwrapping to rigging to animating to composing to lighting… Along the way you may return to early stages to make various improvements. And then you need to get this content out to participants in an ever-changing world. All this requires high amounts of expertise and a wide range of supporting technologies at each step.
You could add additional creative pipelines to the above diagram — like music, sound effects and voice-over — but if everything were added, you wouldn’t even be able to make out the diagram anymore.
Generative AI can help with the compositional aspects of this, making it easier to link together these various tasks with the right verbs a workflow requires. But there are also many missing pieces: today, the technology for generating a 3D model that’s readily usable in a virtual world is at its earliest stages.
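To make the shape of the 3D asset pipeline concrete, here is a minimal Python sketch. The stage names follow the order listed above; the `run_pipeline` function and its `revisions` argument are hypothetical names invented for illustration, and modeling "returning to an earlier stage" as a single jump back is a simplifying assumption, not a description of any real tool.

```python
# Stage order taken from the article's description of making a 3D model.
STAGES = [
    "concept", "modeling", "optimizing", "texturing", "uv_unwrapping",
    "rigging", "animating", "composing", "lighting",
]


def run_pipeline(revisions=None):
    """Return the sequence of stages visited while producing one asset.

    `revisions` (hypothetical) maps a stage to an earlier stage that the
    creator goes back to once after reaching it, e.g. returning to
    modeling after a first texturing pass.
    """
    pending = dict(revisions or {})
    visited = []
    i = 0
    while i < len(STAGES):
        stage = STAGES[i]
        visited.append(stage)
        target = pending.pop(stage, None)  # each revision fires only once
        if target is None:
            i += 1
            continue
        j = STAGES.index(target)
        if j >= i:
            raise ValueError("a revision must return to an earlier stage")
        i = j  # jump back and redo everything from the earlier stage
    return visited


straight_through = run_pipeline()
with_rework = run_pipeline({"texturing": "modeling"})
```

Even one trip back from texturing to modeling repeats three stages, which hints at why each added loop in a real production pipeline is costly.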
Let’s return to the map of the market. Here’s what the categories mean, and how they relate to each other:
- Experiences are the playgrounds, applications and virtual worlds that are most impacted by generative AI. To be included here, a company needs to have generative elements directly “in the loop” of the experience, rather than only being a product created with the productivity-enhancing aspects of generative technologies. For example, a game like AI Dungeon is an experience — and so is ChatGPT, which is essentially an application for playing with GPT-3.
- Discovery is the companies that make it easier to find and connect virtual world content and experiences. Companies here have a social, community or search aspect that directly leverages generative AI or supports creators as they build virtual worlds.
- Creator Economy is the companies that are making the tools and compositional frameworks that make it easier to create content for virtual worlds. It also includes SaaS or API-driven approaches to enabling AI applications, such as the approach used by OpenAI.
- Spatial Computing is the companies that are bridging the realm of generative AI technologies and 3D environments (such as generating models, animating models, neural radiance fields, and so forth).
- Decentralization is the companies that are making AI accessible to the world. While a great deal of AI software is quite centralized (such as almost everything from OpenAI), the exponential acceleration of advances in generative technologies is driven by the widespread dissemination of accessible research and models. This includes open-source AI communities (e.g., Hugging Face), open-source models (such as the work of Stability AI) and open-source libraries that are core to generative AI.
- Human Interface is the technology that makes it possible for us to make use of AI. In my market map of the metaverse, this is mostly packaged hardware products such as AR/VR devices. But for generative AI, this has mostly converged on natural language and voice as the simplest human interfaces to a wide range of creative tasks.
- Infrastructure is the fundamental technology that enables AI. This is the realm of physical machines: chip-making equipment from ASML, chip designers like NVIDIA, and the companies deploying networks of equipment.
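For readers who think in data structures, the layers above can be sketched as an ordered list plus a lookup table. This is only an illustration: the `LAYERS`, `MARKET_MAP` and `layers_for` names are made up for this example, and the placements are limited to the examples mentioned in the category descriptions, not the full market map.

```python
# Layers in the order the article lists them.
LAYERS = [
    "Experiences", "Discovery", "Creator Economy", "Spatial Computing",
    "Decentralization", "Human Interface", "Infrastructure",
]

# Only the examples named in the category descriptions; the real map has
# many more entries, and large companies can appear in several layers.
MARKET_MAP = {
    "Experiences": ["AI Dungeon", "ChatGPT"],
    "Creator Economy": ["OpenAI"],
    "Decentralization": ["Hugging Face", "Stability AI"],
    "Infrastructure": ["ASML", "NVIDIA"],
}


def layers_for(name):
    """Return every layer (in chart order) where `name` is placed."""
    return [layer for layer in LAYERS if name in MARKET_MAP.get(layer, [])]


nvidia_layers = layers_for("NVIDIA")
```

Returning a list rather than a single layer reflects the rule that large companies may appear more than once in the chart.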
The largest companies in AI have expansive investments that support virtual worlds:
- NVIDIA is a key enabler of all AI technologies, given that they make the most widely-used chips in AI. Given their strong background in enabling 3D graphics, it should come as no surprise that they have research in most categories related to virtual worlds. Their Omniverse is a platform that acts as a collaborative workspace for 3D creation, including generative inputs; and their research across many types of models allows them to codevelop semiconductors and software like few other companies.
- Meta has research and products in virtually every area: from supercomputing clusters (infrastructure) for training AI models, up through experiences for platforms like the Quest which benefit directly from generative technologies.
- Similarly, Google has products in virtually every category, from chips to end-user experiences.
- Microsoft’s current generative AI work is mostly oriented around creator economy technologies that enable others to build applications. That seems likely to expand dramatically, especially given their OpenAI investment.
- Apple is the most secretive and rarely publishes any research, but their chips now deliver world-class AI performance in their devices (the A16 Bionic, in mobile phones, does 17 TOPS on its neural engine — more than most standalone computers do at the start of 2023!)
- OpenAI is extremely strong in specific AI models (especially LLMs and images), but has mostly focused on API-oriented systems for the creator economy. ChatGPT is really an end-user application (which I consider an experience, even a virtual world, on its own) built on top of their underlying models.
There are other big companies with large investments in AI generally, like Tesla — but I didn’t include them simply because I couldn’t identify anything with applicability to virtual worlds (sorry, making Steam available on the center console doesn’t quite count). That may change if they start making their supercomputing infrastructure available for third-party generative use, or if generative elements of Optimus surface: I’ll be tracking closely.
Decentralized AI is also an interesting battleground to observe: there are companies like Stability who have made open-source access to models their mission. In contrast, companies like OpenAI guard their models closely, sealed behind APIs. The larger technology companies have thus far been hesitant to provide any access to their trained AI models at all. However, some of these same companies make significant contributions towards open-source software that directly supports decentralized AI development: for example, TensorFlow was invented at Google and Meta makes major contributions towards PyTorch; these are the two most popular software libraries for building AI systems.
Quick digression for geeks: gradients are a way of understanding how you can turn knobs deep inside a network to result in amazing, emergent properties. Similarly, value chains are just a way of looking at how a nudge in a fundamental technology affects the network of other interrelated and dependent technologies; they explain why something like improvements to advanced chip-making machines from ASML eventually means you’ll be able to speak whole worlds into existence from your home computer. Understanding these exponential slopes of where the market will be (not next month, but years in advance) is going to be the key to building successful strategies for R&D and investing. Likewise, seeing where we can optimize our loss functions shows us the greatest opportunities for value creation.
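For anyone who wants to see the "turning knobs" idea in code, here is a toy gradient descent in plain Python. It is a sketch under simple assumptions: a single knob `w`, a made-up quadratic loss whose minimum sits at 3, and an arbitrary learning rate. Real networks have billions of knobs, but the update rule has the same shape.

```python
def loss(w):
    """Toy loss: how far the knob `w` is from the (assumed) ideal value 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the loss: which way, and how steeply, loss changes with w."""
    return 2.0 * (w - 3.0)


def descend(w=0.0, learning_rate=0.1, steps=100):
    """Repeatedly nudge the knob against the gradient to reduce the loss."""
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


tuned = descend()
```

Each step shrinks the remaining distance to the minimum by a constant factor, so the knob settles very close to 3 well before the loop ends.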
There are quite a few startups solving important parts of the generative puzzle for virtual worlds, but I wanted to call out three in particular:
- Stability.ai (Decentralization): most people are familiar with them for Stable Diffusion, the generative AI model for 2D art. Stability is notable for creating an open-source version of their diffusion model, and being at the tip-of-the-spear for a more decentralized and open AI technology. They are investing in a wide range of models focused on the creative industries, such as music and audio. All of this will apply to games and virtual worlds.
- Scenario.gg (Creator Economy): they let you create game assets — and fine-tune your own model that helps you maintain artistic consistency. They’ll soon release an API that allows games to generate assets on-the-fly (while players are experiencing the game, rather than only prepared in advance). This is the sort of thing that will move some games up to Level 3 in the generative AI hierarchy in the near future.
- Midjourney (Creator Economy): generates 2D art that’s particularly good at making concept art and other assets that are ready-to-use. These days, I use Midjourney a lot more than stock photos for all my articles and presentations.
Scientific research within generative AI is a huge driver of new capabilities. Much of the research funding for generative AI comes from industry itself (NVIDIA, Meta, Google and OpenAI are at the forefront). Much also continues to rely on traditional institutional relationships.
The market map is focused on the role of commercial technologies: things that have moved out of the lab and into startups and products. Because of the importance of this emerging science, I’ll follow up in my next article with a summary of the state-of-the-art in the areas most relevant to the topics discussed above.
The application of generative AI to virtual worlds is in its infancy — but is going to grow faster than many expect.
- The Direct from Imagination Era has Begun describes the converging technologies that will result in the ability to “speak worlds into existence.”
- The Five Levels of Generative AI is an article on how to frame progress with generative technologies in virtual worlds (which began with procedural generation and various forms of automation, before generative AI became a thing).
- Market Map of the Metaverse is the foundational structure that inspired this slice into generative AI, and includes others who are creating core technologies, 3D graphics, creative tools and experiences such as games.
- Experiences of the Metaverse provides a guide for the many applications that may be placed in virtual worlds — not only games, but education, collaboration and other forms of media.