NVIDIA and AWS Helping to Democratize the Metaverse

4Q 2021 | IN-6368

While the full metaverse is still a thing of the future, current 3D simulations are providing more effective and valuable connection possibilities.


Simulations/Digital Twins, Avatars, and AI/ML among the Key Metaverse Highlights at NVIDIA GTC 2021


NVIDIA made several announcements at its November Graphics Processing Unit (GPU) Technology Conference (GTC) 2021 event, many of them pertaining to the future metaverse and NVIDIA’s platform, Omniverse. First and foremost, NVIDIA pointed to the power of simulations and digital twins; the company envisions these as the bedrock for merging the real and virtual worlds. Simulations and digital twins can enable significant savings by shifting some testing requirements from physical to virtual, improving efficiency, and reducing costly repair or replacement of hardware and equipment through better prediction of maintenance needs and scheduling. Simulations also bring a level of precision and accuracy not possible with real-world testing, which may be limited by the number of trials (and inputs) and by uncontrollable variables. Simulations can account for a myriad of potential events and conditions by creating and using synthetic data, and NVIDIA has highlighted its Omniverse Replicator as a simulation framework to generate physically accurate synthetic data for training neural networks and predicting future outcomes. NVIDIA Modulus, the framework for developing physics-Machine Learning (ML) models, further extends the reach of physically accurate simulations to different verticals and industries, from those seeking to simulate worldwide events down to those working at the level of molecular biology.

Simulations and digital twins speak to more than the industrial, engineering, and manufacturing examples often showcased at these events. Companies in the media industry, for example, are also leveraging digital simulations of locations to create content and advertisements without having to deploy large onsite film crews. This saves money and also minimizes any potential impact on natural ecosystems.

The fusion between real and virtual also extends to users through natural language and avatars. NVIDIA’s NeMo Megatron offers a framework for training speech and language models that, for instance, could translate multiple languages in real time during video conferences (Microsoft was cited as a partner) and support customer call centers and multinational-enterprise workflows. NVIDIA has also announced Triton, which the company claims is the world’s first distributed inferencing engine that can work across multiple GPUs and nodes to deploy these types of models into the real world.

On the avatar front, NVIDIA has showcased its Omniverse Avatar for Project Maxine, which covers three areas: language processing and comprehension (enabled by NVIDIA Riva neural speech Artificial Intelligence (AI); the input/output of Maxine currently supports seven languages), computer vision (enabled by NVIDIA Metropolis, the computer vision framework for video analytics), and avatar animations (enabled by NVIDIA Video2Face and Audio2Face for nonverbal cues and expression of emotions). While use cases could extend widely, NVIDIA has called specific attention to five areas: live customer support, web customer support, video conferencing and telepresence, games, and robots. NVIDIA Merlin, the framework for deep-learning recommender systems, underpins the development of recommendation engines that are critical not only for customer service but also for virtual assistants and personalization.

More broadly speaking, NVIDIA is working through Omniverse to make simulation workflows interoperable and accessible to a wider range of companies. NVIDIA’s vision of Omniverse as a basis for a future 3D Internet built on Universal Scene Description (USD), which the company likens to HTML for 3D, may sound like a far-off future, but it resonates well with the necessary buildup to the metaverse.

Omniverse and 3D: Making the Virtual and 3D More Accessible


In truth, NVIDIA’s vision of this future is already taking shape: Omniverse has already been evaluated by over 700 companies (including Ericsson, BMW Group, Lockheed Martin, and WPP) and 70,000 individuals. NVIDIA has further announced the availability of Omniverse Enterprise with an annual subscription license starting at US$9,000 per year (a workgroup of two creators, ten reviewers, and four Nucleus subscriptions). Companies such as Ericsson, which uses city-scale 5G simulations, and BMW, which uses factory simulations, have already used and benefited from the NVIDIA Omniverse platform, and as more companies shift workflows to hybrid environments, the need and demand for these types of solutions and platforms will grow in kind. This demand will extend well beyond the basic need to connect remote and in-office employees.

Perhaps most critically, Omniverse is making 3D and virtual worlds more accessible to a wider audience. Companies without deep expertise in AI/ML and natural language models will be able to deploy new, conversational AI assistants to improve customer service from point of sale (e.g., kiosks) to service calls. Colleagues from around the world will be able to interact in lifelike virtual environments with avatars that fully reflect their real-world nonverbal cues while breaking down language barriers. Workflows that have been disjointed or siloed will become increasingly interconnected and run in parallel in real time, greatly increasing the efficiency and value of collaboration.

NVIDIA isn’t alone in bringing these technologies and tools to a wider audience. At its re:Invent 2021 event, Amazon Web Services (AWS) announced AWS Internet of Things (IoT) TwinMaker to make it easier to build and deploy digital twins (for more on TwinMaker, see Insight IN-6376). AWS has also built on SageMaker and has announced Amazon SageMaker Canvas, which allows companies with little data science expertise to run entire ML workflows using a point-and-click user interface. All of these solutions and efforts will accelerate the adoption of technologies and tools that will benefit the buildup toward the metaverse.

Further, these use cases are not gated on the need to visualize or interact with virtual environments or assets in 3D, nor are they dependent on hardware like Augmented Reality (AR) and Virtual Reality (VR) Head-Mounted Displays (HMDs). Mesh for Microsoft Teams, for example, brings together virtual avatars and video feeds in a unified collaboration session without the need for any HMDs. These experiences still lack the fluidity of a real office setting, but the gap is closing. New features like language translation and tracking of nonverbal cues (and simulated eye contact) are beginning to bring heightened value to these types of communications, and as a result, more companies will invest in these platforms and services. For companies that are already heavily invested in 3D workflows, trials of platforms like Omniverse need to be on the short list for near-term planning.

Focus on Value, not on the Spectacle and the Buildup to the Metaverse


When speaking of the metaverse, it is easy to get wrapped up in the longer-term vision and perhaps write it off as too distant for current needs and planning. To counter this notion, NVIDIA GTC has provided ample reasons why companies across industries should begin evaluating the opportunities that new forms of Communications and Collaboration (C&C), enabled by platforms like Omniverse, bring today and in the near future. It’s important to read past some of the current characterizations of these early metaverse building blocks; even Microsoft has positioned Mesh for Microsoft Teams as a way to make collaboration in the metaverse “personal” and “fun.” These early implementations may seem better aligned to the creative or “fun” side of how we communicate, but there is plenty of research to attest to the value and importance of nonverbal cues and emotional expressions for meaningful communication. This is also why companies like Meta are working on hardware that will track users’ facial expressions and gazes.

Companies that have already embraced a more immersive take on C&C (e.g., immersive collaborative platforms like Spatial and VirBELA) must plug into new platforms like NVIDIA Omniverse to reach a wider audience. For immersive C&C platforms, this might mean an expansion beyond the Unity or Unreal engines, but the scalability and depth of simulations offered by Omniverse will allow these immersive C&C companies to target new use cases and applications. Unity and Unreal are still key players here, since both have moved well beyond gaming and have expanded into the enterprise space; Unreal also has a USD-based connector to Omniverse (Unity is expected to add a connector as well). Flexibility will be the key. Although it is early in the process, momentum is building, and there is ample opportunity for companies to innovate and differentiate by taking some early actions and capturing some first-mover advantages.