Four Benefits Multimodal Interfaces Offer Small Businesses
Velinov was responding to two AI developments: the release of OpenAI’s GPT-4o, by his estimation the first truly multimodal model, and the announcement of Google’s Project Astra, a universal, Gemini-powered assistant prototype that supports multimodal inputs.
“Today’s AI models are evolving toward more advanced and diverse data processing capabilities across text, audio and video. Over the last two years, we have seen improvements in the quality of each modality,” notes a recent McKinsey report. “For example, Google’s Gemini Live has improved audio quality and latency and can now deliver a human-like conversation with emotional nuance and expressiveness. Also, demonstrations of Sora by OpenAI show its ability to translate text to video.”
How will these advancements in multimodal interfaces impact small businesses?
- Easy Upskilling: Multimodal interfaces that can carry on a natural dialogue and even offer visual feedback are easier for employees to navigate and learn from. With these interfaces poised to become a workplace standard, early adopters will reap dividends as their businesses evolve and scale.
- One-Stop Shops: Employees are used to interacting with a variety of applications to complete their work, but multimodal interfaces eliminate the need to transition between them. For example, the AI can understand a voice command initiating a video call and identify hand gestures used to advance a slide deck.
- Lower Overhead: The fewer apps and pieces of associated hardware (such as cameras and microphones) that businesses need, the greater their savings.
- Promoting Diversity: Interfaces that respond to verbal commands and hand gestures are more accessible to employees with diverse needs and thus more inclusive. In general, they allow workers to communicate via their preferred method.
Thousands of Multimodal Models To Choose From
Small businesses often have small IT teams and budgets. About 60% of the 2,000 small and midsize business leaders interviewed for a 2023 Connected Commerce Council survey said they plan to use AI tools to save time and money.
Early use cases for multimodal interfaces include the medical field, where clinicians are using them to process conversations with patients and analyze medical imaging to identify tumors and other issues. Human resources teams are using the interfaces to handle claims more efficiently.
Businesses can choose from thousands of multimodal models in Microsoft’s Azure AI Foundry.
“Ultimately, these multimodal AI agents will make us exponentially more efficient and creative, driving human productivity and discovery in unprecedented ways,” Velinov wrote. “Our relationships and interactions with technology will be fundamentally transformed — becoming more personal, intelligent and collaborative than ever before imagined.”