Currently the Glass API kicks off based on the phrase "okay glass...". It would be nice to have the ability to create custom phrases based on authenticated applications. That way, if my app needed something that sounds more natural in a conversation, it could use a phrase that fits.
As a bonus, it would be nice if this could be done programmatically and not just through something like a manifest file. That way, if my users wanted to customize a phrase for my app, they could.
Comment #1
Posted on Apr 17, 2013 by Massive Elephant
I do have very specific examples that I am developing, and I would discuss them further if requested.
Comment #2
Posted on Apr 17, 2013 by Massive Rhino
(No comment was entered for this change.)
Comment #3
Posted on Apr 17, 2013 by Massive Rhino
Issue 7 has been merged into this issue.
Comment #4
Posted on Apr 17, 2013 by Happy Giraffe
One of the apps I've started building is a todo app and I would love it if my users could say "okay glass do the dishes tonight" and have my app be able to grab the text dictation to add it to their list.
Comment #5
Posted on Apr 17, 2013 by Quick Camel
I still have reservations about widespread implementation of this. I really like the idea - but I can easily see Glassware registering tons of hooks which have no relevance, or registering overlapping hooks. This is the equivalent of a home menu screen, and history has shown you don't want it to become too "cluttered", or whatever the equivalent may be.
Comment #6
Posted on Apr 17, 2013 by Massive Elephant
I did think of that this morning also. Maybe another option is registering command groups. So you would say "okay glass" (like a verbal home button), then call your group or app: "... Dining". This would then add the "calorie count of..." command to your top-level commands. I'm totally making up the command example and it may not be perfect, but you should get the idea.
Comment #7
Posted on Apr 17, 2013 by Happy Kangaroo
I can see the potential for abuse, though I don't think it would be a big issue if the user can manage which voice commands from their subscriptions will be active.
In the event that there is still a conflict, it could function the way Android does when multiple apps handle the same function, and let the user choose (through a follow-up voice command).
Say, for example, I subscribed to two to-do apps and said "okay glass, do the dishes tonight..." Glass would then show me which two apps can handle the command, and then I could either say "...with To-Do-App-Name", "...with Google Keep", or even "...with both." Doing this in such a way as to complete the command in a single sentence would flow nicely.
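To make the "...with X" follow-up concrete, here is a minimal Python sketch of how a trailing clause could pick between conflicting apps. Nothing here is an actual Glass API: the function name `choose_handlers` and the list of candidate app names are hypothetical, and the sketch assumes speech has already been converted to text.

```python
def choose_handlers(utterance: str, candidates: list[str]) -> tuple[str, list[str]]:
    """Split a trailing 'with <app>' clause off the utterance (hypothetical helper).

    Returns the command text and the apps that should receive it. An empty
    list means the choice is still ambiguous and the user has to be asked.
    """
    text = utterance.strip()
    lowered = text.lower()

    # "... with both" sends the command to every candidate app.
    if lowered.endswith(" with both"):
        return text[: -len(" with both")], list(candidates)

    # "... with <app name>" picks exactly one app.
    for app in candidates:
        suffix = f" with {app.lower()}"
        if lowered.endswith(suffix):
            return text[: -len(suffix)], [app]

    # No clause given: only unambiguous when a single app can handle it.
    if len(candidates) == 1:
        return text, list(candidates)
    return text, []


apps = ["To-Do-App-Name", "Google Keep"]
command, chosen = choose_handlers("do the dishes tonight with Google Keep", apps)
```

With no "with" clause and two candidates, the sketch returns an empty list, which is the point where Glass would have to show the choice of apps.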
Comment #8
Posted on Apr 17, 2013 by Happy Giraffe
I think that since the trigger is voice activated it will be easier to handle a larger number of keywords. E.g. third-party ones could be hidden until the user starts speaking the correct trigger phrase. And as a user gets further into a trigger phrase, the options that don't match could get hidden. I don't think it will be as bad as Android, where when you select share you get a list of 20 apps that can handle text.
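As a rough illustration of hiding options that no longer match, this Python sketch filters a list of trigger phrases by the words spoken so far. The function name and the sample triggers are made up for the example, and it assumes the partial utterance is already available as text.

```python
def matching_triggers(spoken_so_far: str, triggers: list[str]) -> list[str]:
    """Keep only the trigger phrases whose leading words match what was said."""
    words = spoken_so_far.lower().split()
    return [
        trigger
        for trigger in triggers
        # Compare word by word so "take" does not match "takeaway menu".
        if trigger.lower().split()[: len(words)] == words
    ]


triggers = ["take a picture", "take a note", "record a video"]
still_visible = matching_triggers("take a", triggers)
```

Before the user says anything, every trigger matches, so a real menu would still need some rule for which third-party phrases to show up front.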
Comment #9
Posted on Apr 18, 2013 by Helpful Lion
How about "okay glass, activate app_name"?
The service would send an "initiation" timeline card with available actions.
Comment #10
Posted on Apr 18, 2013 by Massive Elephant
Yup. That is what I think would be nice if voice commands get out of control.
Comment #11
Posted on Apr 18, 2013 by Happy Horse
I'd like to see this too. Imagine I wanted to make Glass control my Sky TV STB. I might want to be able to say things like:
"Okay Glass; Sky Planner" - Open Planner (recorded shows)
"Okay Glass; Channel {blah}" - Switch to channel {blah}
The user should be shown all of these prefixes at install time, like other permissions. Whether they need to be customisable, I'm not sure (currently, you can't install an Android app but block one of its permissions, as much as I wish you could).
It'd also be good to grab phrases following, so you could do:
"Okay glass; Sky Planner"... wait for it to open "Play {show name}"
However, this would be more complicated to do in a way that ensures an app doesn't just soak up all audio without the user knowing. There would need to be something on-screen alerting the user that they were still interacting with the app. I'm sure you could figure something out.
Comment #12
Posted on Apr 18, 2013 by Quick Ox
(No comment was entered for this change.)
Comment #13
Posted on Apr 18, 2013 by Quick Camel
The more I think about it, the more I think this would be unnecessary and very un-Glass-like. Particularly since we already have the ability to send messages to contacts and Glassware are valid contact destinations. The biggest change I might want would be to change the phrase "send a message" to "tell" (or possibly add it, along with other aliases such as "remind"). So commands would sound like:
"ok glass, tell HAL to open the pod bay doors"
"ok glass, tell weps to fire the photon torpedoes"
"ok glass, remind my task list to do the dishes"
"ok glass, send a message to my tv to play {show name}"
These are all more natural sounding than most of the alternatives, they don't require any configuration beyond what is already necessary, are entirely consistent with how Glass currently behaves, they seem to meet the needs of having "apps installed on the phone", and can be largely implemented today.
Comment #14
Posted on Apr 18, 2013 by Massive Elephant
While I agree with you that the examples that have been discussed aren't very 'glass-like', I have specific examples (that I can't share due to NDA) where it would be very strange to say "okay glass" in front of each request.
Comment #15
Posted on Apr 18, 2013 by Massive Rhino
(No comment was entered for this change.)
Comment #16
Posted on Apr 22, 2013 by Massive Rhino
Issue 19 has been merged into this issue.
Comment #17
Posted on Apr 22, 2013 by Quick Camel
As Jenny noted on StackOverflow (http://stackoverflow.com/questions/16137974/how-do-i-send-a-message-directly-to-my-app), glassware isn't a valid contact.
I still maintain making it a valid contact might be a better and more consistent solution.
Comment #18
Posted on Apr 22, 2013 by Happy Horse
Making it a contact would be a nasty hack. Given it's near impossible to take away from an API, it should be done properly. Your app is not a contact and the user shouldn't need to use strange phrases to interact with it as if it were. "OK glass, tell Sky to record Game of Thrones" vs "OK glass, record Game of Thrones". It should be natural.
Comment #19
Posted on Apr 22, 2013 by Quick Giraffe
Danny, the problem with that is that everyone and their cat is going to want to "own" the initiation phrase "record." If Google wants devs to make lots of Glassware, then they should expect users to want to install and use lots of Glassware as well. Which means apps will need to differentiate themselves.
Comment #20
Posted on Apr 22, 2013 by Happy Horse
I don't see how that's a problem; Android handles more than one app registering for "Share". Users should be in full control. If I only want one app to respond to Record, I should be able to have that option. If I want multiple, I can make a selection. If I want a specific app, it can register a more specific phrase in addition, such as "sky record".
Using contacts to represent apps makes no sense. What if an app wants to handle two types of commands?
There's also no requirement for apps to be fake contacts to have a similar API; but it certainly shouldn't be the only way; it would be terribly restricting and awkward.
Comment #21
Posted on Apr 22, 2013 by Grumpy Horse
I think the idea of prefixes is important. Prefixes of different apps may conflict, but I think users should be able to choose which one wins out. We could allow the user to change an app's prefix if they have two that do conflict.
On Android, multiple apps can handle the same type of resource. If I hit a link it asks me if I want Chrome or Browser or LastPass, etc. to handle the link, always or this time only. Glass could work the same way on prefixes.
I still can't imagine the voice command doing much else than sending the command text to the glassware for handling. All Google can really do is do the speech-to-text conversion, work out which app handles a prefix, and then send the command and get the response from our glassware.
Comment #22
Posted on Apr 22, 2013 by Quick Camel
Android "handles" this by bringing up an additional confirmation screen. Somehow I don't think you're proposing that the interaction be something like this:
"ok glass, record Game of Thrones"
[Card pops up with choice of apps]
[You swipe to select the app]
I can see you suggesting that this choice be made when they enable the contact, but this is a procedure that already seems more complex than it should be, and I can't imagine adding one more step to the process would make it easier.
If an app wants to handle two types of commands, then it would be sent the commands. This is part of the simplicity of my proposal - everything after "tell " is sent to the app. I can have the app handle hundreds of commands if it made sense for the app:
"ok glass, tell the tv to record Game of Thrones"
"ok glass, tell the tv to delete this show"
"ok glass, tell the tv to skip the commercial"
"ok glass, tell the tv to go back 30 seconds"
"ok glass, tell the tv to start over"
"ok glass, tell the tv to pause"
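A minimal Python sketch of that routing idea, for illustration only: it assumes the utterance is already text, that Glass knows the enabled contact names, and that a leading "to" can be dropped from the message. The function `route_tell` and the contact list are hypothetical, not anything Glass provides.

```python
from typing import Optional


def route_tell(utterance: str, contacts: list[str]) -> Optional[tuple[str, str]]:
    """Return (contact, message) for 'ok glass, tell <contact> ...', else None."""
    prefix = "ok glass, tell "
    if not utterance.lower().startswith(prefix):
        return None
    rest = utterance[len(prefix):]

    # Try longer names first so "the tv box" wins over "the tv".
    for name in sorted(contacts, key=len, reverse=True):
        if rest.lower().startswith(name.lower() + " "):
            message = rest[len(name) + 1:]
            # Drop the connecting "to" so the app only sees the command itself.
            if message.lower().startswith("to "):
                message = message[len("to "):]
            return name, message
    return None


contacts = ["the tv", "HAL"]
routed = route_tell("ok glass, tell the tv to record Game of Thrones", contacts)
```

The app behind "the tv" would then be responsible for making sense of "record Game of Thrones", "pause", and so on.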
Comment #23
Posted on Apr 26, 2013 by Helpful Panda
As a twist on Comment #9, Google Now currently supports the method of invoking an app using "Launch Pandora", or "Open Netflix" when the app is actively listening for audible input. Would this not be possible for the main Glass screen as a means of navigating to a registered service?
So announcing at the home screen, "OK, Glass, launch New York Times" would jump to that service's bundle of bundles? Obviously this wouldn't apply for things like Skitch that are share-only commands, but the concept is more for quick navigation.
Comment #24
Posted on Apr 26, 2013 by Massive Elephant
While that is an awesome feature, it still doesn't help with the "okay glass..." part. That is what I am wanting to customize. Now I am not trying to just change things from "okay glass" to something like "computer" (which would be geeky cool) but more on an app-by-app level. I actually think that something as common as "record" isn't necessary; however, there are certain business reasons that a more custom initialization phrase would be helpful.
Comment #25
Posted on Apr 26, 2013 by Happy Horse
I think this needs splitting into two cases?
- Ability to change the term "OK Glass"
- Ability to register voice actions of some sort (that will still be prefixed with "OK Glass")
Comment #26
Posted on Apr 26, 2013 by Happy Dog
I completely agree with splitting this into a new case that is #2 above.
Also, there could be a #3 which would be:
As Jenny noted on StackOverflow (http://stackoverflow.com/questions/16137974/how-do-i-send-a-message-directly-to-my-app), glassware isn't a valid contact.
I still maintain making it a valid contact might be a better and more consistent solution.
Comment #27
Posted on Apr 26, 2013 by Massive Elephant
Well, I would also want a #3: the ability to change the term "okay glass" on an application-by-application basis, and maybe even programmatically.
Comment #28
Posted on Apr 26, 2013 by Quick Camel
Ok, I'll go on record (again) to state the unpopular decisions:
Regarding changing "ok glass" - I strongly disagree that this should be tampered with in any way, shape, or form. I can understand there are use cases that might benefit from keying off a different phrase (though I'm not sure I can think of any offhand). I can understand the desire to personalize your own Glass. I can understand wanting to change it as a security measure. But I think the potential for abuse of this is strong as well. One of the arguments for keeping "ok glass" is that people around us will be aware that we are doing something with Glass. This is a direct counter-argument to the notion that we can be doing something secretly. Being able to change this phrase to something like "hi there" would severely undermine that argument.
I am not convinced that other voice actions are necessary if Glassware can be set as a contact for voice messages. I could be convinced that other voice actions might be set up as aliases for "send a message" ("tell" and "ask" come to mind), but I don't think that should be on an app-by-app basis.
In the SO message that Cecilia mentioned, Jenny suggested to star this issue and discuss it here. I don't have a problem splitting the issues - but it sounds like the Glass team is considering it as one thing.
Comment #29
Posted on Apr 27, 2013 by Quick Cat
My 2 cents: Don't mess with 'ok glass', but allow adding apps as contacts, so that I can send a message to my app.
HOWEVER, allow the app to create a contact that is immediately active. The extra step of having the user go to google.com/myglass to add the contact, and then enable it as a sharing contact, is just too much of a hurdle.
Comment #30
Posted on Apr 27, 2013 by Happy Dog
Although slightly off-topic, I agree with not having the extra step of enabling a share target on Glass. It should be part of the OAuth process and by default it should appear enabled. I understand the need to disable it at a later time if the user wants to review the active share targets at any time.
Comment #31
Posted on Apr 27, 2013 by Helpful Rabbit
How difficult might "strung commands" be with the Glass API? In other words, if you want to give Glass a series of commands like so:
"OK Glass, command series: Take a photo THEN Record a video THEN Send an email"
In other words, rather than having to repeat the initial command "OK Glass", you save some time and strain on the user. The trick is whether or not Glass is smart enough to accept "THEN" or "FINALLY" etc type commands, plus smartly monitoring logical pauses or cadence in someone's speech.
For example, I tend to stop talking/pause between thoughts. That would be a real problem in a situation like this unless you enter a mode for the command such as "command series" (I am sure someone could come up with a more sexy phrase for this concept).
-Jesse
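To show what the text-side half of this might look like, here is a small Python sketch that splits a "command series" utterance on "THEN" / "FINALLY". It is only an illustration: the function name is made up, it assumes speech has already been transcribed, and it does nothing about the harder problem of pauses and cadence.

```python
import re


def split_command_series(utterance: str) -> list[str]:
    """Split 'command series: A THEN B THEN C' into individual commands."""
    body = utterance.strip()
    prefix = "command series:"
    if body.lower().startswith(prefix):
        body = body[len(prefix):]

    # Treat "then" / "finally" as separators, ignoring case.
    parts = re.split(r"\s+(?:then|finally)\s+", body.strip(), flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]


steps = split_command_series(
    "command series: Take a photo THEN Record a video THEN Send an email"
)
```

One obvious weakness: any command that itself contains the word "then" would be split in the wrong place, which is part of why an explicit mode such as "command series" seems necessary.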
Comment #32
Posted on May 3, 2013 by Helpful Monkey
Just wanted to hop in.
Glass is Glass, and glassware exists on Glass. IMO the wake-up command for Glass ("Ok Glass") should not be editable, at least until the commonfolk are more familiar with Glass.
That said, I can certainly see the functionality of letting apps either
1) register one top-level command (Quora registering "ask Quora", so I can say something like, "Ok Glass, ask Quora 'What is the best sushi restaurant in Rolla, Missouri?'"), or
2) be opened with a generic open command, so you can say something like, "Ok Glass, open Baby Monitor".
Allowing some flexibility for glassware to be opened in ways other than sharing content into them would open up the kinds of apps available on Glass, while limiting the way they can be opened prevents accidental opens and confusion in what exactly you need to say to interact with a new "app".
Comment #33
Posted on May 9, 2013 by Massive Rhino
(No comment was entered for this change.)
Comment #34
Posted on May 17, 2013 by Massive Bird
My two cents: I view one of the main advantages of Glass as its ability to be used with no hand interaction. E.g., I'm washing dishes so my hands are wet but I want to send a text message. So the ability to add custom commands greatly extends the potential of Glass.
Comment #35
Posted on Jul 7, 2013 by Grumpy Horse
I also want to add custom commands for the apps I have developed. The use case that I have in mind right now is something like, "Okay glass, how much money is left in my checking account?" or something like that.
I also would really love a Wolfram Alpha hook so I could say, "Okay Glass, Wolfram Alpha The air speed velocity of an African swallow".
I say allowing the users to change which voice hooks are displayed/activated is the correct way to do this as well.
Comment #36
Posted on Jul 22, 2013 by Massive Rhino
Issue 144 has been merged into this issue.
Comment #37
Posted on Oct 8, 2013 by Massive Rhino
We added two voice commands that you can use in your Glassware: https://developers.google.com/glass/contacts#declaring_voice_menu_commands
You can also request that new commands be added: https://services.google.com/fb/forms/glassvoicecommand/
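For readers skimming the thread, a minimal Python sketch of what a contact declaring a voice command might look like as a JSON body. The field names (`id`, `displayName`, `acceptCommands`, `type`) and the command value `TAKE_A_NOTE` are written from memory of the Mirror API contact resource and should be treated as assumptions; check the linked documentation before relying on them. The contact id and display name are hypothetical, and the sketch only builds the payload without sending it.

```python
import json


def build_contact(contact_id: str, display_name: str, commands: list[str]) -> dict:
    """Build a contact body that declares the given voice menu commands."""
    return {
        "id": contact_id,
        "displayName": display_name,
        # One entry per voice command the Glassware wants to receive (assumed shape).
        "acceptCommands": [{"type": command} for command in commands],
    }


contact = build_contact("my-todo-app", "My To-Do App", ["TAKE_A_NOTE"])
payload = json.dumps(contact, indent=2)
```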
Comment #38
Posted on Oct 8, 2013 by Massive Panda
Really shouldn't be marked as fixed. There are examples directly above that can't be implemented with "take a note" and "post an update", like checking a bank account.
Really disappointing Google went this direction instead of letting app developers add any voice command label they want and letting users choose which apps they want to install. This is very limiting and not the way Android and Google Play became so successful. Really disappointing the APIs on Glass are intentionally crippled and it will never have the amazing ecosystem Google Play has.
Comment #39
Posted on Oct 8, 2013 by Massive Rhino
@Inanek: That's what the linked form is for. If there are other voice commands you'd like Glass to support, please let us know: https://services.google.com/fb/forms/glassvoicecommand/
Comment #40
Posted on Oct 8, 2013 by Helpful Horse
I'm a PM on the Glass team for our voice experience, so I wanted to chime in a bit on our thinking here.
Right now, we do have a policy that we need to explicitly approve all voice commands. Part of the reason is that we want to make sure that voice commands are consistent. To give an example, we want the command to describe what you're trying to accomplish "ok glass, get directions to Pizza Hut" instead of what software you want to use to accomplish it "ok glass, open Google Maps".
But the bigger reason is that for every voice command we add, we build a hand-tuned acoustic model for recognizing that command. We make sure that the voice command doesn't overlap too closely with our existing commands, and tune those existing models if necessary (this avoids false positives). We make sure that the voice command handles different accents. This hand tuning is why the voice recognition on Glass is so high quality and fast. We very much want to keep it that way.
As Jenny said, if you've got a voice command that you want us to support then please submit it to our form: https://services.google.com/fb/forms/glassvoicecommand/
Lately I've been replying to all voice requests within a few days ... so hopefully we can find the right command for you and kick off the process of getting a voice model built soon.
Comment #41
Posted on Oct 8, 2013 by Happy Dog
Thanks Jeff for your clear explanation.
One question for you, in the case of glassware that would accept multiple commands, I know you do discourage the use of free recognition text to indicate commands, but to what extent does that suggestion go? Like in the case of Genie, there are multiple commands such as Add to shopping list, add to to do list, add to log, etc.
One possibility would be that through the take a note voice command, all I'm taking are notes, but there are different types of notes that the glassware could pick up in the case specific words are mentioned.
The other possibility would be to have a ton of different voice commands and that would make the main list of voice commands very crowded.
How far do we go in the adding new voice commands line versus adding some context on the one command itself?
Comment #42
Posted on Oct 8, 2013 by Quick Wombat
@cecilia There is the possibility for multiple share contacts that could work with one voice command phrase.
Comment #43
Posted on Oct 8, 2013 by Quick Camel
Concur with Cecilia's question, and the general tone of the inquiry.
I hope that the "top level" voice commands are as generic and broad as possible - giving a similar launching point to multiple Glassware services, but allowing the Glassware to provide multiple contacts that can handle further details. I would NOT want the top menu to be littered with overly specific voice command launches.
Comment #44
Posted on Oct 8, 2013 by Helpful Horse
That's a great question and honestly not something that I think we know the long-term answer to. We don't currently have the ability to add context to our voice commands (something like an "ok glass add to..." command that can be followed by a fixed list of possible things). I don't want to commit to anything, but it's definitely something we've debated.
I think for now, we'll push y'all to create different commands for different actions. The idea being that when you see the command on your screen you should have a very clear sense of what it will do: "ok glass, add to my shopping list" is really clear about what it will do, whereas "ok glass, add to..." is not.
We're also worried about the possibility of cluttering up the voice menu with too many commands. We may need to address that by intelligently ranking the commands in the menu, or by letting users choose which commands to expose from their installed application, or by doing what you suggest and having specific possible phrases that must follow commands. But we're going to cross that bridge as it comes.
Comment #45
Posted on Oct 8, 2013 by Helpful Bear
For contacts, you have to manually add them via the Glass app before they show up on your "shortlist". Why can't voice commands work the same way? If the user uses the todo app a bunch, they can add it to their top-level commands, or if that is an algorithmic problem then just subset all user commands with the top-level command "with". So therefore:
("ok glass", "take a picture")
("ok glass", "with twitfacegram", "take a picture")
Looks something like this: ("ok glass", "with ", "")
I think this solution would cover most use cases. Please correct me if I'm wrong.
Comment #46
Posted on Oct 29, 2013 by Massive Wombat
super3, I think you have the perfect solution: 'ok glass with X do X'. Google, please make this happen ASAP!
Comment #47
Posted on Nov 18, 2013 by Happy Wombat
I would love to change Ok Glass to Ok Jarvis.
Comment #48
Posted on Feb 14, 2014 by Swift Giraffe
Please note, the form for suggesting a voice command now resides here:
Comment #49
Posted on Apr 21, 2014 by Grumpy Monkey
Not to be overly negative here, but I don't know how the restriction on custom voice triggers does anything BUT make the entire experience worse. I mean, I guess I'll just launch my immersion app and try to keep users there, because I can't have them multitasking and getting back to my app with any sort of reasonable actions.
Also, the request for custom triggers states generic enough for other apps to be able to respond to the same command etc... and then some of the defaults are "learn a song" and "start a round of golf". Wtf?
I totally echo the suggestions above where even if you have a context of someapp and do "ok someapp start some action", that would be way more logical / common sense.
I guess for now, I'll just use a hugely inaccurate existing command like "check me in", because even though that has NOTHING to do with our software, I guess it's the least wrong.
Status: Fixed
Labels:
Type-Enhancement
Priority-Medium
log-7975520
Component-Mirror-API