Bias is a problem for AI interoperability. Every voice AI interprets human language slightly differently. Learn why we use a grammar based on "who else?" questions as the protocol for our AI namespace.
Currently, there are 1000+ voice AI providers worldwide. But different AI ecosystems usually can't be easily connected.
AI research is happening across languages and use cases at different speeds. Smaller languages are generally at a disadvantage (e.g. Hebrew).
Today's voice AIs are only narrowly intelligent and understand only the use cases they are trained for. Human-like intelligence is still a dream.
Our idea is to make voice commands comparable by providing a simplified language to store the intent of a language-based request. The inspiration for this idea comes from Noam Chomsky's theory that parts of language are hard-wired in human understanding.
Universal grammar is the theory that certain parts of language appear to be hard-wired in human thinking. Since its introduction in the 1960s it has been considered a cornerstone theory of modern linguistics.
Max Planck researchers demonstrated in 2013 that 'Huh?' carries characteristics of universal grammar. They later won the Ig Nobel ("alternative" Nobel) Prize.
We demonstrate that "who else?" grammar appears to exhibit similar properties. Maybe this is new evidence for universal grammar. It is definitely language everyone knows how to use.
Since it is language apparently available to every kind of user, and applicable to every kind of question, we use "who else?" questions as the design for the intent catalog. Initially, we provide it for the 200 most common request types in human language. Voice AI providers can use this catalog as a shared protocol to store the content of voice commands.
Voice assistants can look up other voice assistants and forward user requests using this standardized language as the protocol.
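To make this concrete, the sketch below shows what a single entry in such an intent catalog could look like. The field names, the "whoelse" namespace key, and the example values are our own illustrative assumptions, not a finished specification.

# Minimal sketch of a hypothetical "who else?" intent catalog entry.
# All names and values below are illustrative assumptions, not a final spec.
from dataclasses import dataclass, field

@dataclass
class IntentEntry:
    """One of the ~200 common request types, addressed as a 'who else?' question."""
    namespace: str                              # e.g. "whoelse.ride_share"
    question: str                               # canonical "who else?" phrasing
    slots: dict = field(default_factory=dict)   # normalized parameters of the request

# A voice command such as "find me a ride-share to the airport" could be
# stored in the simplified, vendor-agnostic format like this:
ride_share = IntentEntry(
    namespace="whoelse.ride_share",
    question="Ride-share, who else?",
    slots={"destination": "airport", "language": "en"},
)

Any NLP provider that recognizes the same request type would store it under the same namespace, which is what makes commands comparable across vendors.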
With our work we support the development of publicly available and vendor-agnostic NLP API standards. This makes AI integration much easier and cheaper.
The vision is a network of connected voice AI technologies. If successful, one day every AI may be able to use "who else?" to look up other voice AIs.
Executive Summary
On average, however, people recall only 3.7 brands during 80% of their time. Users actually prefer to speak with voice AIs the way they would speak to other humans, and they do not want to recall a different name for every feature. This is why our long-term idea is to establish "who else?" questions as universal wake words that all voice AIs understand and every user knows. It is much nicer to remember a single phrase instead of many brands: "Apartment, who else?", "Delivery, who else?", "Date, who else?". This is, however, not essential for our business model: voice AI users can speak however they want. whoelse.ai initially only provides a protocol to store voice commands in a simplified format.
Vision
This project proposes the installation of a publicly available address system for voice AI. Similar to how the Domain Name System (DNS) and the Hypertext Transfer Protocol (HTTP) were needed for the original “Text Internet” to succeed, we suggest the development of a shared namespace system for “Voice Internet” Natural Language Processing (NLP) AI technologies (e.g. smart speakers, voice assistants, chatbots).
AI business models (e.g. Amazon Alexa, Google Search, Apple Siri) depend on access to consumer data (e.g. preferences, search queries, voice commands). It is particularly difficult for OEMs (e.g. car companies, home appliances, FMCG electronics) to adapt: as experts in combining technology suppliers, European OEMs do not have enough AI know-how and data access of their own to reach parity with US and Chinese companies.
Natural language processing (NLP) and voice assistant technologies (smart speakers, chatbots, voice assistants) are expected to account for two thirds of all Internet search requests by 2025. Every first customer contact will be a bot. Voice-based e-commerce will become a $55bn p.a. market during the same period (Gartner).
The potential of voice interfaces stems from the usability and availability of the medium: everybody speaks. Language is a tool that users of all ages already know. But integrating voice technologies is a huge challenge for European OEMs such as car manufacturers, FMCG electronics companies, and telecommunication service providers.
They have two options: either they use Alexa and become the microphone of the Amazon business model, or they choose a white-label NLP (e.g. Nuance, Cerence, Watson) and build a custom voice assistant.
NLP is a young domain, and currently more than 1,000 voice AI companies, research projects, and industry solutions compete worldwide across different languages and use cases.
Examples for voice AI use cases & applications:
- Smart speaker
- Voice assistants
- Biometric user identification
- Customer hotlines
- Chatbots
- Voice-based COVID-19 diagnostics
- Industry-specific (e.g. banking, insurances) voice assistants
- Digital receptionists
- Toys
- Text classification (e.g. legal contracts, medical files)
- Text generation (e.g. marketing, advertising)
- Search engines
It is a situation of AI bias: voice AIs are usually only good in the domain they were designed and trained for. NLP R&D breakthroughs happen every week, and it is difficult to predict which voice AI will be best suited for a product to be released in 2-4 years.
Furthermore, the localization of voice interfaces is a problem. To sell a German car with voice assistant features, or a home appliance with an integrated smart speaker, to customers in e.g. China, the USA, and France, OEMs must integrate multiple NLP technologies, because each market has a different technology leader.
Voice AI development is taking place at different speeds across different languages. The market for Swedish NLP is, for instance, much smaller than for technologies with Mandarin capabilities. The Government of Israel recently announced that it will sponsor the development of Hebrew voice AI capabilities at Amazon and Google.
Enter whoelse.ai - the first universal language for all AIs. To make combining different NLP technologies easier, we provide voice AIs with a simplified language to store and exchange voice-based user requests (intents) in a standardized format.
This way, a voice interface can contain multiple voice AI technologies, and user requests can be answered by the voice assistant best suited to respond:
Example intent catalog implementation:
Smart Speaker for Co-Working Spaces
├── NLP 1: IBM Watson (WeWork AI)
│   ├── Air Conditioning
│   ├── Room Booking
│   ├── Catering
│   └── Register Guest
│
└── whoelse.ai
    ├── NLP 2: Cisco Mindmeld (PwC AI)
    │   ├── Tax Filings
    │   ├── HR Management
    │   └── Digital Lawyer
    ├── NLP 3: Nuance Mix (Lufthansa AI)
    │   ├── Ticket Booking
    │   ├── Hotel Reservation
    │   └── Rental Cars
    ├── NLP 4: Deepgram (no white label)
    │   ├── Transcription Service
    │   ├── Task Automation
    │   └── Meeting Translator
    └── NLP 5: Alexa (Amazon)
        ├── ..
        ├── ..
        └── ..
User journey:
Voice AI 1: Welcome to WeWork - how can I help you?
User Input: I want to file my taxes!
Voice AI 1: I cannot help you with that personally, but I will find the best AI available!
[Voice AI 1 searches the whoelse.ai intent catalog]
Voice AI 2: Welcome to PwC. Please tell me your tax code first (..)
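The lookup-and-forward step in this journey can be sketched in a few lines. The catalog contents, the namespace keys, and the route() helper below are purely illustrative assumptions about how an implementation might look, not part of a published API.

# Hypothetical sketch of the whoelse.ai lookup step in the journey above.
# Registry contents and function names are illustrative assumptions only.
CATALOG = {
    "whoelse.tax_filing":     "NLP 2: Cisco Mindmeld (PwC AI)",
    "whoelse.ticket_booking": "NLP 3: Nuance Mix (Lufthansa AI)",
    "whoelse.transcription":  "NLP 4: Deepgram",
}

def route(namespace: str) -> str:
    """Look up which registered voice AI should receive the forwarded request."""
    handler = CATALOG.get(namespace)
    if handler is None:
        return "Sorry, no registered voice AI can handle this request yet."
    return f"Forwarding the request to {handler}."

# "I want to file my taxes!" is first mapped by Voice AI 1 to the shared
# 'who else?' intent, then forwarded to the best-suited provider:
print(route("whoelse.tax_filing"))   # -> Forwarding the request to NLP 2: Cisco Mindmeld (PwC AI)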
We detailed this concept during the organisation of a DIN industry standard initiative for NLP API interoperability. As consortium initiators, we worked together with industry partners and validated the demand for such an AI exchange.
The Domain Name System was once needed for the Text Internet to succeed. This project proposes to develop the technologies needed for the first address system of a new kind of Voice Internet.
But standardization itself is not a business model. In the current environment, de-facto monopolists like “GAFA” usually control, through their market dominance, the adoption of technology specifications and SDKs in the industry.
The same is now happening in the field of voice AI interoperability. In September 2019, Amazon announced a new NLP standards initiative. The Alexa consortium decided that wake words (Alexa, Einstein, ..) will control which voice assistant responds to a user request.
This selection logic will, in our opinion, not work, because it is unlikely that different OEMs will be able to agree on the ownership of arbitrary language. Example: “Voice AI, find me a ride-share” - who should decide whether this command is directed to e.g. BMW or VW? Will it be the user, the AI, or the interface provider?
The Amazon consortium's naming logic favours the best-known brands and is thus designed to position Alexa in the best way possible. Research also shows that consumers do not want to remember multiple brands for voice assistants and prefer natural language over synthetic input dialogs. Naming voice AIs will remain an ongoing topic of concern in the industry.