One of the delights of being a millennial is that every new technology fad looks completely novel, even if it isn’t. That thought crossed my mind when attending a DialogFlow workshop at Google Brussels a couple of weeks ago.
In our times of ever-accelerating change, technological generations keep getting shorter. The World Wide Web (better known as the Internet) went mainstream 25 years ago; mobile (voice) telephony followed 7 years later. The smartphone as we know it is barely 10 years old. Data science reached the age of consent (sexiest job, remember) just 5 years ago. And if you believe the newspapers, the age of artificial intelligence has now finally arrived. With, would you believe it, the chatbot aka conversational interface as one of its poster children.
Back to Google. Looking around in their classroom, I seemed to be the only attendee older than 35. By a wide margin, I must admit. Logically, most of the digital natives in the room started their professional life around or after the arrival of the iPhone. Which means they were not yet around during the early years of speech-driven phone applications and call center automation, say the beginning of this century and millennium.
Nostalgia alert: who still remembers the fully automated speech services offered in the (phone) cloud by companies such as Tellme Networks (a former employer of mine), BeVocal, and Voxeo? Let alone 20-year-old technologies like VoiceXML or the Speech Synthesis Markup Language, aka SSML?
I’m afraid I do. For the simple reason that for the better part of a decade, I made a living out of building voice applications with said platforms and technologies. Pleasant surprise: SSML has survived the generational gap. It is actively supported in the Google Actions Simulator.
So when I asked our instructor whether Google Assistant offered shared revenue models like the toll numbers of the years before the smartphone GUI replaced the voice channel for mobile information access, he was speechless at first. Then he asked politely if I could repeat the question. By the way: the answer was no, even though technically speaking, there is certainly room for such business models on the fulfillment side, using some form of account linking and an automated payment provider.
Since the workshop, I have dabbled with a toy chatbot for Google DialogFlow and Actions. My first impression is that the democratization of chatbot development is both a curse and a blessing. Because spoken conversation is innately volatile, the quality of the interaction depends at least as much on conversational interface design as on the technical platform it is implemented on. Nothing new there, I’m afraid: user-friendly, intuitive web interfaces like DialogFlow won’t change that. In fact, despite Google’s own agent design guidelines, the ease of the point & click development interface will trick some developers into thinking that chatbots are easy to build. Quod non.
A few years into the VoiceXML era, the advent of integrated, multi-platform tools for application development like VoiceObjects made it easier for the industry to shift from bare-bones VoiceXML programming (a coding job) to the art of conversational interface design (a voice user interface specialist’s job). In the Amazon Alexa and Google Home era, voice application framework providers like jovo may benefit from studying these ancient predecessors, lest they reinvent the wheel.
In that respect, a timeless book still worth reading is Voice User Interface Design by Michael H. Cohen, James P. Giangola, and Jennifer Balogh. I bought it in … 2004.