This post is part of a series about building a personal assistant app designed for voice as the primary user interface. More posts in the series:
In previous posts we built a Google Assistant skill that lets us track daily water intake by voice or by text written in natural language (post 1). The skill can be customised with the user’s personal data – it can call us by name and can use our timezone to determine at what time our day starts (post 2).
Google Assistant skills are not only Voice User Interfaces — our app can operate on a variety of surfaces like Google Home (audio-only) or mobile devices (audio + display).
Today we’ll customise our app‘s user experience for both cases. But before we do, let’s take a look at our options when it comes to surface capabilities support.
According to the official documentation, there are a couple of ways to handle different surfaces:
App-level surface capabilities
We can define upfront which surfaces our app supports. To do this, open your Actions on Google developer project, then go to: Overview -> Surface capabilities.
If users try to invoke the app on an unsupported surface, they receive a “device is unsupported” error.
Runtime-level surface capabilities
We can also handle specific surfaces at runtime. There are a couple of ways to do this:
- Response branching — show a response adjusted to the current surface, so you can reply with a simplified message on Google Home or with rich card(s) including additional text and links on a mobile device.
- Conversation branching — the entire conversation can look different depending on the current surface. This can be useful when we want to provide a simplified flow for Google Home (e.g. repeat the last transaction) or a completely custom one for a mobile device (e.g. find the cheapest flight to a destination and buy it).
- Multi-surface conversations — sometimes we need to move the user from one surface to another. For example, a user who asks Google Home for directions would probably like to see a map, so it would be reasonable to transition from audio to display.
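As a rough sketch of the last option, assuming the actions-on-google v1 `DialogflowApp` API (the function name, context and notification strings here are hypothetical, not from the WaterLog code):

```javascript
// Hypothetical sketch: offer to move the conversation from an audio-only
// device to one of the user's devices that has a screen.
function maybeMoveToScreen(app) {
  const screen = app.SurfaceCapabilities.SCREEN_OUTPUT;
  // Only offer the transfer when the current device lacks a screen
  // but another of the user's devices has one available.
  if (!app.hasSurfaceCapability(screen) &&
      app.hasAvailableSurfaceCapabilities(screen)) {
    app.askForNewSurface(
      'Do you want to see the directions on your phone?', // context shown to user
      'Directions to your destination',                   // notification title
      [screen]);
    return true;
  }
  return false;
}
```

If the user accepts, the conversation continues on the new device, where the follow-up request can be detected and answered with a visual response.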
All surface capabilities and examples are well described in the official documentation:
Response branching in WaterLog
Now we’ll implement simple response branching in the WaterLog app. We’ll add some facts about how much water we should drink during the day. An example scenario:
User: How much water should I drink?
WaterLog: According to The Telegraph, in climates such as the UK, we should be drinking around 1–2 litres of water. In hotter climates, the body will usually need more.
//End of conversation
User: How much water should I drink?
WaterLog: Here are some facts I found about drinking water:
// continue with rich card:
We’ll start with a Dialogflow agent update. This time we need to add one new intent:
This intent is fired when the user asks how much water they should drink, for example:
- How much water should I drink during the day? — asked during the conversation with WaterLog
- Ask WaterLog why should I drink water? — asked from the main context of Google Assistant.
— Config —
✅ Use webhook

Google Assistant:
✅ End conversation
That’s it — nothing new compared to the previous posts. If you would like to see the full Dialogflow config, you can download it from the repository and import it into your agent (WaterLog.zip file, tag: v0.2.1).
getFactForDrinkingWater() is pretty straightforward:
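The original snippet isn’t embedded here, so this is a minimal sketch of what the handler might look like, assuming the actions-on-google v1 `DialogflowApp` API (the responses are taken from the example dialogue above; the structure is an assumption):

```javascript
// Sketch of the intent handler: branch the response on the current surface.
function getFactForDrinkingWater(dialogflowApp) {
  const hasScreen = dialogflowApp.hasSurfaceCapability(
    dialogflowApp.SurfaceCapabilities.SCREEN_OUTPUT);
  if (hasScreen) {
    // Devices with a display get a short intro; in the full app
    // a rich card with the fact follows here.
    dialogflowApp.tell('Here are some facts I found about drinking water:');
  } else {
    // Audio-only devices (e.g. Google Home) get the fact spoken out loud.
    dialogflowApp.tell('According to The Telegraph, in climates such as ' +
      'the UK, we should be drinking around 1-2 litres of water. ' +
      'In hotter climates, the body will usually need more.');
  }
}
```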
In the handler we can see how to detect what kind of surface the user is currently on. According to the documentation, we can check whether a screen is available (dialogflowApp.SurfaceCapabilities.SCREEN_OUTPUT) or audio output (dialogflowApp.SurfaceCapabilities.AUDIO_OUTPUT) by calling dialogflowApp.hasSurfaceCapability().
Now that we know the user’s surface, we can build a rich (display) or simplified (audio) response:
In our case, for the rich response we build a Basic Card with a title, text, a link button, and an image. There are also other options, such as a List Selector, Carousel Selector, or Suggestion Chips. You can see all of them in the documentation.
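A sketch of such a Basic Card, assuming the actions-on-google v1 fluent builders (the card text, button label, and URLs below are placeholders, not the app’s real assets):

```javascript
// Build a rich response containing a Basic Card with title, body text,
// a link button, and an image (all content here is illustrative).
function buildFactCard(app) {
  return app.buildRichResponse()
    .addSimpleResponse('Here are some facts I found about drinking water:')
    .addBasicCard(app.buildBasicCard(
        'In climates such as the UK we should drink around 1-2 litres of water a day.')
      .setTitle('How much water should you drink?')
      .addButton('Read more', 'https://example.com/drinking-water')   // placeholder URL
      .setImage('https://example.com/water.png', 'Glass of water'));  // placeholder image
}
```

The handler then passes the result to `app.tell()` on screen surfaces.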
We could end this post here, but since we know when the user is on a screen surface, why not use this knowledge somewhere else?
Let’s move back to our Conversation.js code:
Now, thanks to GREETING_USER_SUGGESTION_CHIPS: ['100ml', '200ml', '500ml', '1L'], we can greet the user with action suggestions:
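A sketch of that greeting, again assuming the actions-on-google v1 API (the greeting copy is a placeholder; the chips come from the constant above):

```javascript
const GREETING_USER_SUGGESTION_CHIPS = ['100ml', '200ml', '500ml', '1L'];

// Greet the user, attaching suggestion chips only on devices with a
// display (chips are not supported on audio-only surfaces).
function greetUser(app) {
  const greeting = 'Hey! How much water did you drink today?'; // placeholder copy
  if (app.hasSurfaceCapability(app.SurfaceCapabilities.SCREEN_OUTPUT)) {
    app.ask(app.buildRichResponse()
      .addSimpleResponse(greeting)
      .addSuggestions(GREETING_USER_SUGGESTION_CHIPS));
  } else {
    app.ask(greeting);
  }
}
```

Tapping a chip sends its text back as the user’s next utterance, so each chip maps straight onto the existing “log water” intent.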
And that’s it. Conversation with WaterLog is even easier now! 🙂
$ npm test
All clear 👍
The full source code of WaterLog app with:
- Firebase Cloud Functions
- Dialogflow agent configuration
- Assets required for app distribution
can be found on GitHub:
The code described in this post can be found under release/tag v0.2.1.
Thanks for reading! 😊