Understanding what the user is looking at
We'll explore the intricacies of eye-based input and unveil strategies to surmount its inherent limitations.
No single definition of smart glasses (or smartglasses) exists. Nevertheless, a practical product-oriented definition might be head-worn devices that provide the familiar benefits of glasses and additional capabilities enabled by sensors, audio and visual components, and processors.
This definition excludes virtual reality (VR) and mixed reality (MR) devices that use passthrough video to see the world. However, it could include augmented reality (AR) devices that use see-through optics.
Is it important that smart glasses retain the function of conventional glasses? Well, would a smartphone omit calling capability? Or would a smartwatch be incapable of telling time? Let’s assume that “smart” means enhancement on top of the original functionality.
It should be noted that smart glasses, to be practical, must be comfortable enough to be worn for extended periods, like ordinary glasses. In practice, this implies that their size and weight should resemble those of normal glasses and that, to a certain extent, the design should even be fashionable.
Today’s consumer smart glasses allow listening to music, making calls, summoning voice assistants, and taking pictures and videos. Devices with displays might have features such as maps, texting, notifications, calendars, and language translation. These applications are not unique to smart glasses and should be familiar to anyone with a smartphone. The difference is that you don’t need to pull out your smartphone to access those features, much like the advantages of smartwatches.
Compared to smartwatches, smart glasses don’t require you to look down at your hands at all – your attention can stay on the world around you and the activity at hand. Most smart glasses also have built-in speakers, so you can forgo earbuds, further increasing your situational awareness and showing others that you can hear them. Smart glasses let you be connected to your digital life without being disconnected from the physical world.
Eye tracking can provide natural and novel ways for user input, accelerating AI-driven use cases and streamlining usability. It can also act as a sensor to allow the technology to adapt to the wearer.
Smart glasses can collect a tremendous amount of information about the wearer’s surroundings through conveniently placed cameras, microphones, and other sensors. This makes glasses well-suited for AI use cases such as visual search and multimodal conversation.
AI is more effective when it has ample context for a request and can clearly understand what is being asked. Understanding people is the goal of attention computing technologies like eye tracking. While an image of the scene in front of a user is valuable context, knowing what the user is focused on will enable specific, efficient responses. It makes the difference between generalized information and attentive, to-the-point answers.
For example, a visitor at a museum may look at a painting and ask the AI, “What is that?” AI could either answer, “Those are framed artworks hanging on a wall,” or “That is Gustav Klimt’s Portrait of a Lady.” Similarly, questions like “How do I use that?”, “Do I plug the cable there?”, or “What is he holding?” are far less ambiguous when the user’s object of attention is known.
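To make this concrete, here is a minimal sketch, assuming the glasses deliver a gaze point normalized to the scene-camera image: it computes a region around the point of regard and packages it with the question so that a vision-language model can answer about the object of attention rather than the whole scene. The `GazeSample` type, the payload fields, and the image size are illustrative assumptions, not any particular product’s API.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class GazeSample:
    """One gaze estimate, normalized to the scene-camera image (0..1)."""
    x: float
    y: float

def gaze_crop_box(image_size: Tuple[int, int], gaze: GazeSample,
                  size_frac: float = 0.25) -> Tuple[int, int, int, int]:
    """Compute a crop box centered on the point of regard.

    Cropping the scene image around the gaze point gives a multimodal
    model a focused view of what the wearer is actually looking at.
    """
    w, h = image_size
    half_w, half_h = int(w * size_frac) // 2, int(h * size_frac) // 2
    cx, cy = int(gaze.x * w), int(gaze.y * h)
    return (max(0, cx - half_w), max(0, cy - half_h),
            min(w, cx + half_w), min(h, cy + half_h))

def build_attentive_query(question: str, scene_image: str,
                          image_size: Tuple[int, int],
                          gaze: GazeSample) -> dict:
    """Package the question, the scene, and the gazed-at region.

    The payload layout is purely illustrative; a real assistant would
    forward something like this to whichever vision-language model the
    glasses are paired with.
    """
    return {
        "question": question,
        "scene_image": scene_image,
        "focus_box": gaze_crop_box(image_size, gaze),
        "hint": "focus_box marks the region the user is looking at",
    }

# "What is that?" asked while fixating slightly left of image center.
print(build_attentive_query("What is that?", "scene.jpg", (1920, 1080),
                            GazeSample(x=0.42, y=0.55)))
```

Sending the full scene together with the focus box keeps the wider context available while still telling the model what the question is actually about.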
In Google’s Project Astra teaser video, we see an example of visual focus being leveraged by AI. The user drew a red arrow while asking the AI, “What can I add here…?” The answer revealed an understanding of the user’s intention and attention. If the diagram had been in a book or on a screen, or not a diagram at all but a physical machine, a hand-drawn arrow might not have worked, but simply looking at the point of interest could have. Eye tracking would enable that style of attention-enhanced query.
Google’s Project Astra leverages attention cues to produce precise responses.
This kind of assistance with a shared view mirrors the interaction between people during a remote support session. The user is engaged in a task, while a remote expert watches and provides guidance. Smart glasses enable “see what I see” support, supercharged with attention awareness made possible by eye tracking.
One challenge of having a device located on the face is figuring out the best way to control it. The relaxed resting position of our hands is far from any manual controls on the device. This can make touch interactions cumbersome and tiring. Some smart glasses may attempt to overcome this limitation with wireless controllers, but that may not be an option during hands-free use. Voice commands are a convenient way to interact with smart glasses, but some situations require silent or discreet interactions.
Eye tracking opens up the possibility of controlling the device with our eyes:
Gaze direction can act like an invisible laser pointer to indicate attention.
Gaze + blink can provide acknowledgment or serve as a trigger.
These gestures can tell the smart glasses to dismiss a reminder, scan a QR code in view, read aloud the text message that just arrived, or hang up an active call.
While the degree of control is limited, eye gestures may be appropriate for simple interactions like UI navigation, dialog responses, and activating frequently used features.
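As an illustration of how such a gesture could be recognized, here is a minimal sketch in Python, assuming the glasses expose a stream of timestamped gaze samples with an eyes-open flag; the `GazeSample` fields, the dwell-then-blink rule, and all thresholds are assumptions for the example, not any specific eye-tracking SDK.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class GazeSample:
    t: float         # timestamp, seconds
    x: float         # horizontal gaze angle, degrees
    y: float         # vertical gaze angle, degrees
    eyes_open: bool  # False while the eyes are closed (blink)

def gaze_blink_trigger(samples: Iterable[GazeSample],
                       dwell_s: float = 0.4,
                       fixation_radius_deg: float = 2.0,
                       blink_min_s: float = 0.15,
                       blink_max_s: float = 0.5) -> Optional[float]:
    """Return the timestamp of a 'gaze + blink' trigger, or None.

    Rule of thumb used here: the gaze must stay within a small radius for
    `dwell_s` seconds (a fixation), and the blink that follows must last
    between `blink_min_s` and `blink_max_s`. All thresholds are
    illustrative; a product would tune them so that natural blinks, which
    typically last only a few hundred milliseconds, do not fire the
    trigger accidentally.
    """
    anchor = None          # first sample of the current fixation
    fixation_start = 0.0   # when the current fixation began
    fixated = False        # has the dwell time been reached?
    blink_start = None     # timestamp when the eyes closed

    for s in samples:
        if not s.eyes_open:
            if blink_start is None:
                blink_start = s.t
            continue

        # Eyes are open again: did a deliberate blink just end?
        if blink_start is not None:
            blink_len = s.t - blink_start
            if fixated and blink_min_s <= blink_len <= blink_max_s:
                return s.t
            blink_start = None
            anchor, fixated = None, False   # start over after the blink

        # Track whether the gaze stays within a small radius (fixation).
        if anchor is None or math.hypot(s.x - anchor.x,
                                        s.y - anchor.y) > fixation_radius_deg:
            anchor, fixation_start, fixated = s, s.t, False
        elif s.t - fixation_start >= dwell_s:
            fixated = True
    return None
```

Dwell alone (simply holding the gaze on an element) can serve the same role where blinking feels unnatural; the trade-off is between speed and the risk of accidental activation, since our eyes routinely rest on things we have no intention of selecting.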
Some near-to-eye display technologies only work when the display and the eye are properly aligned. You might have experienced optical misalignment when using a microscope or binoculars that let you see clearly only when your eye is within a small sweet spot. An eye tracker can tell the display exactly where the eye is, momentarily or continuously, allowing the display system to light up selectively or steer itself in the optimal direction despite frequent eye movements.
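A minimal sketch of that feedback loop, assuming the eye tracker reports the pupil center as a millimeter offset from the display’s nominal eye-box center; the eye-box size and the idea of a steerable exit pupil are simplifications for illustration, since real display architectures differ widely.

```python
from dataclasses import dataclass

@dataclass
class EyePosition:
    """Tracked pupil center, as an offset from the nominal eye-box center (mm)."""
    x_mm: float
    y_mm: float

@dataclass
class SteeringCommand:
    in_sweet_spot: bool  # was the eye already inside the eye box?
    shift_x_mm: float    # how far to move the exit pupil horizontally
    shift_y_mm: float    # how far to move the exit pupil vertically

def steer_toward_eye(eye: EyePosition,
                     eyebox_half_width_mm: float = 4.0,
                     eyebox_half_height_mm: float = 3.0) -> SteeringCommand:
    """Re-center the display's exit pupil on the tracked eye.

    The eye-box dimensions are made-up placeholders; the point is only
    that knowing where the pupil is lets the display decide whether a
    correction is needed and how large it should be.
    """
    inside = (abs(eye.x_mm) <= eyebox_half_width_mm and
              abs(eye.y_mm) <= eyebox_half_height_mm)
    return SteeringCommand(in_sweet_spot=inside,
                           shift_x_mm=eye.x_mm,
                           shift_y_mm=eye.y_mm)

# Example: the eye has drifted 6 mm to the right of the nominal sweet spot.
print(steer_toward_eye(EyePosition(x_mm=6.0, y_mm=1.0)))
```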
When the original iPhone was revealed, it was introduced as the fusion of a music player, a mobile phone, and an Internet communicator. Of those three functions, the first two were already in many people’s pockets. The Internet communicator is now the most used function of smartphones according to a 2023 survey by Qualcomm. Smartphones paired with Internet search gave us anytime, anywhere access to the world’s knowledge.
Smart glasses can liberate the hands and eyes held captive for hours a day by smartphones. AI assistants, capable of seeing through our eyes, will provide just-in-time guidance and enrich our experience of the world. Where smartphones brought us knowledge, smart glasses will bring us know-how and let us return to the physical world where we belong.
Do you want to know more about how eye tracking can make your eyewear products smarter and more human-centered? Let Tobii share our decades of experience bringing attention-awareness to products.