Augmented Reality with Google Tablets and Glasses

Internship Description

Wearable devices have attracted considerable attention from the research community. These devices offer a window into a user's day-to-day life. Many of them carry multi-modal sensors that can record and/or transmit sensory data, including video and audio. Coupling this source of data with network connectivity enables a wide range of augmented reality (AR) applications that enrich the user's experience and inform his/her decisions. For example, a wearable device backed by intelligent computer vision and machine learning methods can automatically infer information about the surrounding place and situation and relay it directly to the user. This gives the user more information on which to base a particular decision, e.g. whether or not to buy a product in a store, given reviews and competitor pricing found online. Moreover, these augmented capabilities could be very beneficial for people with sensory impairments: a visually impaired person wearing an AR device can be warned (through audio) of immediate obstacles in his/her way, while a hearing impaired person can be notified (through words on a display) that someone is calling out to him/her.


In this project, we aim to build an AR system based on the Google Glass and a Google Tablet, which will automatically acquire visual and audio data and transfer it to a central processing station for analysis. Information inferred from this data will be transferred back to the Glass, where it is conveyed to the user in visual or audio form. This is possible because the Glass supports both visual and audio sensors. One possible outcome of this project is the ability to project onto the Glass display automatically generated results of recognizing (i.e. labeling) and detecting (i.e. localizing) the objects in front of the user.
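The round trip described above can be sketched as a simple message exchange between the Glass and the central processing station. The wire format below (a 4-byte length prefix followed by a JSON payload) and all field names are illustrative assumptions, not part of any existing SDK:

```python
import json
import struct

# Hypothetical wire format for Glass <-> station traffic: a 4-byte big-endian
# length prefix followed by a UTF-8 JSON body. Frames travel Glass -> station;
# inferred meta-data travels back the same way.

def pack_message(payload: dict) -> bytes:
    """Serialize a message (e.g. a frame descriptor or inferred labels)."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack(">I", len(body)) + body

def unpack_message(data: bytes) -> dict:
    """Inverse of pack_message; checks the length prefix for truncation."""
    (length,) = struct.unpack(">I", data[:4])
    body = data[4:4 + length]
    assert len(body) == length, "truncated message"
    return json.loads(body.decode("utf-8"))

# Example: the station sends recognition results back to the Glass.
meta = {"objects": [{"label": "coffee cup", "bbox": [120, 80, 64, 64]}]}
wire = pack_message(meta)
assert unpack_message(wire) == meta
```

A real implementation would run this framing over a TCP or HTTP connection and carry compressed image data alongside the JSON meta-data; the sketch only fixes the envelope in which both directions of the exchange are wrapped.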




Deliverables

A software module based on the Google Glass SDK to acquire and transfer still images and videos from the Glass to a central processing station, and to transfer meta-data in the opposite direction.

An API for the central processing station to invoke automatic computer vision and machine learning algorithms on the received images and videos.

A software module based on the Google Glass SDK that conveys to the user the meta-data acquired from the central processing station on the Glass display.

A large-scale dataset of videos and still images captured with a Google Glass during day-to-day activities. The important objects and activities in these videos will be manually labeled and used for training as well as testing the overall system. This dataset will be made publicly available to the research community for future algorithm evaluation and comparison.

Faculty Name

Bernard Ghanem

Field of Study

Computer, Electrical and Mathematical Sciences and Engineering