Moving Multimodal AI to the Edge


1Q 2019 | IN-5387


What is Multimodal and Why is it Shifting to the Edge?


Modality refers to the way in which something happens or is experienced. In AI training and inference, a modality can be thought of as a single type of data input, such as sound, vision, language, or any other kind of sensor data. Multimodal, then, refers to the incorporation or interaction of multiple modalities of data in a single system or application. From the 1980s until the 2010s, multimodal systems were built on rules-based or heuristic techniques. Since 2010, driven by advances in deep learning, deep neural networks have been incorporated into multimodal systems, making them more robust, more accurate, and capable of generating unique insights in situations where many parameters are at play simultaneously. An example of a multimodal AI system is a voice assistant, which must combine audio data with a catalogue of natural language data and then make a decision on how best to respond.
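The voice-assistant example above can be sketched as a late (decision-level) fusion pipeline, where each modality is scored independently and the scores are then combined. This is a minimal illustrative sketch, not any specific vendor's implementation; all function names, feature values, and fusion weights are hypothetical.

```python
# Minimal sketch of late fusion in a hypothetical multimodal
# voice-assistant pipeline. All names and values are illustrative.

def audio_model(audio_features):
    # Hypothetical unimodal audio model: confidence that the user
    # issued a "play music" command, from acoustic feature values.
    return 0.7 if sum(audio_features) > 1.0 else 0.2

def language_model(text_tokens):
    # Hypothetical unimodal language model: confidence from a
    # simple keyword match against the transcribed utterance.
    return 0.9 if "play" in text_tokens else 0.1

def fuse(scores, weights):
    # Late fusion: weighted average of the per-modality confidences.
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

audio_score = audio_model([0.6, 0.8])           # illustrative feature values
text_score = language_model(["play", "music"])  # illustrative transcript
decision = fuse([audio_score, text_score], weights=[0.4, 0.6])
print("respond with music playback" if decision > 0.5 else "ask to clarify")
```

In practice, production systems learn the fusion step (and often fuse earlier, at the feature level, with a neural network), but the structure is the same: per-modality evidence is combined before a single decision is made.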

Today, like in the case of most voice assistants, multimodal systems are mostly implemented in the cloud or in the…
