Augmented reality: nearly but not quite
A wave of apps for iPhones and Android phones has sparked public interest in augmented
reality. It won’t be long before superimposing images and content onto the real
world through the camera view of your smartphone will change the way you look
at things.
Imagine walking down a street in Rome, unsure of where to eat. You hold up your
phone to a restaurant window; superimposed in the camera view are translated reviews
of customers who have been there. This is the promise of augmented reality, a
technology that superimposes text, graphics and animations on top of physical
locations. Virtual reality has been in the headlines for years, but now researchers
are hoping that augmented reality (AR) may have hit the mainstream.
The concept could be used for a variety of applications, such as medical imaging:
surgeons using AR-capable glasses could see data layered on top of their patients,
such as blood flow information and X-ray or MRI imaging. In industrial or
manufacturing settings, engineers could see electrical routing information and
operational data about the parts that they are working on. Such applications are
getting closer, but there are still some challenges for the technology to overcome.
When Jim Vallino at the Rochester Institute of Technology returned to AR research
after a ten-year hiatus, he was surprised to discover how far things had progressed...and
how far they hadn't. "Things that 10-15 years ago would have taken six months
to build into a working system can be done with a laptop, a webcam and the AR
Toolkit," says Vallino, a faculty member in the institute's Department of Software
Engineering. "But it turns out that the problems that
were hard ten years ago still haven't been solved. And to do it in a seamless
fashion is still going to take a lot of work. The big thing with augmented reality
is the registration issue. It's the registration of what the user sees in the
real world with the synthetic one," he says. Getting synthetic objects to line
up against real ones – and stay there as you move the camera around – is a challenge.
Vallino's system uses features such as lines and corners in the real-world environment
as reference points to keep the computerized content correctly registered. However,
that still leaves the problem of geographic accuracy.
dependent on GPS
For an augmented reality system to truly work, the device being used must understand
where it is so that it knows what it is looking at. That may be possible in a
limited environment, such as an operating theater, where a device can be triangulated
to within millimeters of its true location using local wireless modules. But if
you're anywhere in the world, you'll need GPS to pinpoint your location.
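
As a rough illustration of how a device might be located from local wireless modules, here is a minimal sketch in Python. It assumes a flat 2D room, three beacons at known positions, and exact distance measurements to each; the function name `trilaterate` and the beacon layout are hypothetical and do not come from any system described in this article. A real installation would have to cope with noisy measurements and a third dimension.

```python
import math


def trilaterate(beacons, distances):
    """Estimate a 2D position from three beacons and the distance to each.

    Subtracting the first circle equation from the other two leaves two
    linear equations in x and y, which are solved with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    d1, d2, d3 = distances

    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("beacons must not lie on a single line")

    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Hypothetical room: beacons in three corners, device actually at (1, 1).
beacons = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
device = (1.0, 1.0)
distances = [math.hypot(device[0] - bx, device[1] - by) for bx, by in beacons]
estimate = trilaterate(beacons, distances)
print(estimate)
```

With perfect distances the estimate lands on the true position; in practice each distance carries some error, which is why this kind of precision is only realistic in a small, controlled space.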
This factor is becoming more important as we start to see these developments
in the consumer mobile space. Smartphone devices are now good enough to handle
rudimentary augmented reality. For example, Layar, an augmented reality browser
for the iPhone, displays places of interest and local search results superimposed
on the phone's camera image as you hold it in front of you. Another, Wikitude,
does the same. And in December, Google announced Goggles for Android and iPhones,
an app that uses image recognition software and its own vast databases to tell
the user what they are looking at.
Unfortunately, GPS positioning isn't so much of a pinpoint as a smear, says Daniel
Wagner, postdoctoral researcher at Graz University of Technology in Austria. "If
you are walking down the street and you want to see something layered on top of
a shop, then you are out of luck, because you could be 50 meters out and seeing
another shop's details altogether, or even the details of something behind you,"
Wagner warns. He is working on a GPS system that could locate a device to within
2 cm, but that would be highly expensive and certainly would not fit into a consumer
device.
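
Wagner's point can be made concrete with a small Python sketch. It assumes a flat street measured in meters (east, north), a camera with a 60-degree field of view, and two invented shops; the function `shop_in_view` and all the coordinates are hypothetical. The app simply labels the nearest shop that falls inside the camera's view from wherever GPS says the phone is.

```python
import math


def shop_in_view(position, heading_deg, shops, fov_deg=60.0):
    """Return the nearest shop inside the camera's field of view, or None."""
    best_name, best_dist = None, float("inf")
    for name, (east, north) in shops.items():
        dx, dy = east - position[0], north - position[1]
        # Compass-style bearing: 0 degrees is north, increasing clockwise.
        bearing = math.degrees(math.atan2(dx, dy))
        # Signed angle between the camera heading and the shop, in [-180, 180).
        offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        dist = math.hypot(dx, dy)
        if abs(offset) <= fov_deg / 2 and dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


# Invented street: two shop fronts 50 meters apart.
shops = {"Trattoria": (0.0, 20.0), "Bookshop": (50.0, 20.0)}

# The user really stands at (0, 0) facing north, looking at the trattoria.
true_label = shop_in_view((0.0, 0.0), 0.0, shops)

# GPS reports a position 50 meters to the east; the heading is unchanged.
drifted_label = shop_in_view((50.0, 0.0), 0.0, shops)

print(true_label, drifted_label)
```

In this toy setup the true position labels the trattoria, while the drifted position labels the bookshop, even though the camera has not moved.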
The other option is to use computer vision to further lock down a device's
location; if a device can recognize what is in front of it, in conjunction with
a broad location, it could solve the problem. However, computer vision
is notoriously processor-intensive, and developers probably wouldn't
want to attempt it on a handheld device.
Blair MacIntyre, associate professor at Georgia Tech in the U.S., has an innovative
approach to the problem. He suggests using “crowdsourced” photographs from sites
such as Flickr to create a densely photographed model of the real-world environment
in question. This would enable an AR browser to conduct real-time pattern matching
against a database of approximate images.
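
The idea of matching a camera view against a database of geotagged images can be sketched in a few lines of Python. This is a toy illustration, not MacIntyre's method: it assumes each stored photo has already been reduced to a short binary descriptor (here a 16-bit integer) and a position in meters, and every name and value below is invented. A rough GPS fix narrows the search, then the closest descriptor wins.

```python
import math


def hamming(a, b):
    """Count the bits that differ between two integer descriptors."""
    return bin(a ^ b).count("1")


def best_match(query_descriptor, coarse_position, database, radius_m=100.0):
    """Return the nearby database entry whose descriptor is closest to the query."""
    nearby = [
        entry for entry in database
        if math.hypot(entry["position"][0] - coarse_position[0],
                      entry["position"][1] - coarse_position[1]) <= radius_m
    ]
    if not nearby:
        return None
    return min(nearby, key=lambda entry: hamming(query_descriptor, entry["descriptor"]))


# Invented database of geotagged photos (positions in meters).
database = [
    {"label": "Trattoria", "position": (0.0, 20.0), "descriptor": 0b1011001110001111},
    {"label": "Bookshop", "position": (50.0, 20.0), "descriptor": 0b0100110001110000},
    {"label": "Lookalike", "position": (5000.0, 20.0), "descriptor": 0b1011001110001010},
]

# The camera sees the trattoria, with two descriptor bits changed by lighting or angle.
query = 0b1011001110001111 ^ 0b101
match = best_match(query, (30.0, 0.0), database)
print(match["label"])
```

The distant "Lookalike" entry has a descriptor identical to the query, but the coarse location filter excludes it, which is how a broad GPS fix and image matching can complement each other.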
a world of pictures
"If we crowdsource all the geotagged images from Flickr, then you can generate
the model that you need," he says. "You could use correspondences between these
pictures so that they can figure out where the cameras were for each image. That
gives the system the information it needs to do the visual tracking."
Researchers have already crowdsourced images to create 3D versions of cities
including Rome and Dubrovnik. With companies such as Google already generating
entire street-level photomontages of cities, the dataset necessary to make this
happen is growing daily.
While technologists grapple with these problems, there must also be advances
made in display technology. Holding a phone up to the world is one thing, but
it limits the user's experience to “pulling” information on request at ad hoc
points throughout the day, MacIntyre explains. For a fully immersive experience
in which information is constantly pushed to the user without their thinking about
it, a head-up display will be necessary, and these are already appearing. Researchers
at the University of Washington, for example, are working on bionic contact lenses
that can virtually “float” an 8x8 grid of pixels in front of the human eye.
In order for AR to be really useful, researchers must chisel away at the technical
problems facing it on both the software and the hardware fronts. As companies
and their customers begin recognizing the concept's inherent value, they will
adopt rudimentary AR techniques, paving the way for more sophisticated solutions
as they emerge over the next few years. It will be an exciting spectacle.