AI and Gaming: What is Next?
Artificial Intelligence (AI) is defined as “a technology that enables computers and machines to simulate human learning, comprehension, problem-solving, decision-making, creativity, and autonomy” (IBM). The term was coined in 1955 in a proposal for a Dartmouth summer research program, but the field's roots reach back much earlier, to discussions of “neural networks” in 1943 and “Bayesian inference” in 1763 (Forbes).
This week, we want to focus on recent developments in the field and how they impact the gaming industry: specifically, how these models are evolving, their technical limitations, and what this means for the future of games.
An AI Glossary
Many terms around AI are used interchangeably, so we want to set a baseline in how we are defining AI-related terms:
- Artificial Intelligence: The broad concept, coined in the 1950s, that machines can exhibit human intelligence.
- Machine Learning: AI systems that learn from historical data. The simplest version of this would be a linear regression model in which the system attempts to predict an outcome based on a series of historical input variables.
- Neural Networks: A neural network is a popular type of machine learning model inspired by the human brain. When trained on data, the model learns by finding patterns in the data that help it reach the correct conclusion. A common introductory example is a model that learns to identify handwritten digits; over time, the model becomes capable of recognizing patterns, such as distinguishing a "7" from a "1" (a toy version appears in the sketch after this list).
- Deep Learning: Deep learning uses neural networks with many more layers and parameters, making the model capable of digesting large amounts of unstructured data and powering tasks such as natural language processing (NLP) and computer vision.
- Generative AI: This refers to deep learning models that can generate original content (images, videos, sound, text, etc.). This area of AI has drawn significant investment in recent years.
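To make the glossary concrete, here is a minimal, illustrative sketch of the handwritten-digit example referenced above. It is our own toy example, not something from the article: we assume Python with scikit-learn, and the dataset and model choices are ours.

```python
# Minimal sketch: a small neural network that learns to recognize
# handwritten digits from historical (labeled) data.
# Assumes Python with scikit-learn installed; this is an illustrative
# choice of library, not one named in the article.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Load 8x8 pixel images of handwritten digits (0-9) along with their labels.
X, y = load_digits(return_X_y=True)

# Hold out some examples so we can check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A small neural network: a single hidden layer of 64 units.
model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=42)

# "Training": the model iteratively adjusts its weights to find patterns
# that map pixel values to the correct digit.
model.fit(X_train, y_train)

print(f"Accuracy on unseen digits: {model.score(X_test, y_test):.2%}")
```

The same basic pattern, learn from labeled historical examples and then check performance on examples the model has never seen, applies whether the model is a simple linear regression or a large deep learning system; the difference is in scale and in the kinds of data the model can digest.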
Generative AI Basics
Training
Generative AI applications today run on a foundational model: a deep learning neural network trained on vast amounts of data and capable of performing a wide variety of general tasks (AWS). These models act as the foundation for building other AI applications.
The cost and scale of training state-of-the-art models have risen dramatically over time; the process is both time-consuming and expensive. Some estimates suggest GPT-3 cost ~$4.6m to train (Lambda), Sam Altman has said GPT-4 cost north of $100m (WIRED), and Anthropic CEO Dario Amodei has suggested the next generation of models could cost billions or even tens of billions of dollars (Twitter). That last figure should be taken with a grain of salt, as Dario is an incumbent incentivized to discourage potential future competition.
Tuning
Tuning is a way to adjust a model to customize it for specific needs: helping it understand domain-specific language, focusing it on a specific task, layering in context awareness, and so on. In gaming, for example, this could be used to tune an image model to a company's specific art style. The process is delicate; while it can improve a model, it can also introduce negative side effects. A rough sketch of what tuning can look like in practice follows below.
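The sketch below fine-tunes a generic pretrained image classifier on a studio's own labeled art samples, retraining only the final layer. Everything here is an assumption for illustration: the use of PyTorch/torchvision, the `studio_art_samples/` folder, and the number of categories are ours, not anything from the article.

```python
# Illustrative sketch of tuning (fine-tuning): adapt a pretrained image model
# to a studio's specific art style by retraining only its final layer.
# Assumes PyTorch/torchvision; dataset path and class count are hypothetical.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Start from a model pretrained on a large, general dataset (ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the general-purpose layers so training only adjusts the new head.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer to predict our own labels (e.g., in-house asset styles).
num_styles = 5  # hypothetical number of style/asset categories
model.fc = nn.Linear(model.fc.in_features, num_styles)

# Hypothetical folder of the studio's own labeled images, one subfolder per label.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder("studio_art_samples/", transform=transform)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # a few passes over the small, domain-specific dataset
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```

Freezing the pretrained layers keeps the model's general visual knowledge intact while the new final layer learns the studio-specific distinctions. This is one reason tuning is far cheaper than training from scratch, and also why a poorly chosen tuning dataset can skew a model's behavior in unintended ways.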
Inference
Inference is the process of taking what the model has learned during training and creating an output. Expanding on the example above of the neural network trained to identify written numbers, inference is the process of using the trained model to identify a new number that the model has never seen. This is the model actually working and making predictions.
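Continuing the handwritten-digit sketch from the glossary (same assumptions: Python with scikit-learn, and the `model` and `X_test` variables defined there), inference is simply calling the trained model on an input it has never seen:

```python
# Illustrative sketch of inference, reusing the trained digit model above:
# the model is handed a new image it never saw during training and
# produces a prediction.
new_image = X_test[0].reshape(1, -1)  # one unseen 8x8 digit, flattened to a single row
predicted_digit = model.predict(new_image)[0]
print(f"The model predicts this digit is a {predicted_digit}")
```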
Challenges Remain
Despite the billions of dollars being spent to create the most powerful foundational models, the industry still faces many roadblocks:
- Scaling: Scaling refers to increasing the compute and data dedicated to training a model. One school of thought, articulated by Anthropic CEO Dario Amodei, is that scaling has continuously improved the performance of these models and will continue to drive performance in the future (Lex Fridman). Said another way, more compute and more data will be the core reason models get better.
- Compute: The cost of compute for training these models has risen dramatically, reaching into the billions of dollars. While efficiency gains can reduce this cost, it is becoming prohibitively expensive for the majority of businesses to build a large-scale foundational model, suggesting that we are moving toward a world with a handful of largely commoditized foundational models, much as cloud infrastructure has evolved today.
- Data Availability: A more pressing concern in the industry today is access to high-quality data. As the amount of compute used increases, the demand for data increases as well in order to maximize the efficiency of training, and we are quickly reaching the point where the available text data has been used up, raising the question, “where do the next data sets come from?” Some suggest that synthetic data (non-human-created data that mimics real-world data) could help. Additionally, some of the largest models have already begun to use reinforcement learning from human feedback (RLHF), which uses human feedback to further train the models. It is currently unclear whether these methodologies, and others like private datasets, will be sufficient to continue scaling at the rate we have seen over the past few years.
- Latency: In addition to quality, speed is an important factor for any model. This is especially true for use cases such as text-to-speech; call centers, for example, cannot tolerate substantial latency without frustrating customers. The balance between speed and power, while constantly shifting, is a limiting concern for some more advanced use cases. We have written before about how local AI has the potential to help by shifting some inference to the local device.
- Copyright Law: As we have written about in the past, determining what is “substantially transformative” from a copyright perspective is still a matter of debate, making the use of these models risky in a commercial setting. While we believe that most of these works would fall under Fair Use, the regulatory uncertainty could prohibit the broad proliferation of certain tools.
- Data Quality: The output provided by AI is not always accurate and can be inappropriate or just odd, depending on the context. These "hallucinations" are a product of how foundational models are trained and operated (MIT). These models are also extremely difficult to audit, meaning that even to developers, the internal workings of the model and the patterns it discovers to produce reasonable outputs are largely a black box.
AI Pain Points Transfer to Gaming
Despite some of these limitations, the use cases in gaming are powerful, with multiple companies focused on streamlining development with new tools, creating cheaper content, democratizing access to voice tools, and creating dynamic and emergent content. However, we still have a long way to go before the game development process is replaced by AI and we are playing in fully dynamic worlds where generative content emerges based on our actions. Many of the roadblocks are the same ones faced by the industry as a whole. Examples include:
- Data Availability: data for things like quests or genre-specific dialogue may not be as readily available as the trillions of words that more general models can digest.
- Latency: can be prohibitive when attempting to have characters respond and react in real time.
- Copyright Law: makes it difficult to leverage certain developer tools without fear of public or regulatory scrutiny.
- Data Quality: can break immersion in games where developers work rigorously to create a world that tells a cohesive story. It is also a major concern when introducing these models into children's games, where room for error is effectively zero.
Strauss Zelnick, the CEO of Take-Two, said in a recent interview: “All of our tools do help us become more efficient…that said, it’s going to become commoditized. Everyone is going to have access to the same tools. That is the history of toolsets. What it means, though, is our creative people will be able to do fewer mundane tasks and turn their attention to the really creative tasks. The machines can’t make the creative decisions for you” (Gizmodo). Tools are accelerants, not replacements for creativity.
We are bullish on the long-term benefits that developers will see from these tools, and we are excited about the pace of development in the industry. However, while the AI technology suite is a powerful platform to support game developers, it will not be the saving grace for mediocre content. Attracting a sustainable player base is one of the hardest feats for any developer; more than ten thousand games are released every year on Steam alone (Statista).
Takeaway: The rapid advancements in AI are exciting and will benefit the efficiency and creative output of all industries, not just gaming. Companies focused on development tools, unique content, and the democratization of creativity will help gaming companies save money and spur innovation. Although technical challenges remain, we believe these tools have the capability to elevate game creation.