TECHNOLOGY HORIZONS|VISION SYSTEMS
and skin colour in the images. Stenger
admitted: ‘It was a shamefully simplistic
representation that did not take into
account shading and lighting or motion
blur.’ The hardest part of the tracking
was the sheer number of possible hand
gestures in the video that the computer
needed to compare against his model.
He said: ‘The real trick was to develop a
way of quickly matching the image data
to the model.’
To resolve that, he used a data glove
with sensors in the joints to capture a
variety of hand poses that a user might
employ to interface with the system in
real-life scenarios. By doing so, he was
able to reduce the search space
dramatically and therefore the
computational complexity of the task.
He also developed a hierarchical
processing scheme to make the
computer matching more efficient.
In Stenger’s system, the computer
identifies a close match to the hand
gesture presented to it, then explores
only others that are similar. It works
down a hierarchy of possibilities, finally
selecting the most promising match from
a distribution of potential solutions. The
result is then propagated to the next
frame of the image as the most likely
match. Then, the procedure is carried
out again and again as each new image
is presented.
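
The coarse-to-fine search described above can be illustrated with a minimal Python sketch. It is not Stenger's code: the `PoseNode` tree, the two-number "templates", the squared-distance cost and the `prior_weight` and `beam` parameters are all assumptions made for illustration. Only the most promising branches are explored at each level, and the match from one frame biases the search in the next.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence

Vector = Sequence[float]


@dataclass
class PoseNode:
    """Hypothetical hierarchy node: a representative pose and its finer variants."""
    name: str
    template: Vector
    children: List["PoseNode"] = field(default_factory=list)


def distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance, standing in for a real image-matching cost."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hierarchical_match(roots: List[PoseNode], observation: Vector,
                       prior: Optional[PoseNode] = None,
                       prior_weight: float = 0.01, beam: int = 1) -> PoseNode:
    """Descend the tree, expanding only the `beam` best nodes at each level."""
    def cost(node: PoseNode) -> float:
        c = distance(node.template, observation)
        if prior is not None:
            # Favour poses close to the previous frame's match.
            c += prior_weight * distance(node.template, prior.template)
        return c

    frontier = list(roots)
    best: Optional[PoseNode] = None
    best_cost = float("inf")
    while frontier:
        kept = sorted(frontier, key=cost)[:beam]
        frontier = []
        for node in kept:
            if node.children:
                frontier.extend(node.children)  # refine only promising branches
            elif cost(node) < best_cost:
                best, best_cost = node, cost(node)
    if best is None:
        raise ValueError("pose hierarchy is empty")
    return best


def track(roots: List[PoseNode], frames: List[Vector]) -> List[str]:
    """Match each frame in turn, propagating the result to the next frame."""
    prior: Optional[PoseNode] = None
    names = []
    for observation in frames:
        prior = hierarchical_match(roots, observation, prior)
        names.append(prior.name)
    return names


# Toy hierarchy: two coarse poses, each with two finer variants.
ROOTS = [
    PoseNode("open", [1.0, 1.0], [PoseNode("open_wide", [1.2, 1.0]),
                                  PoseNode("open_flat", [0.9, 1.1])]),
    PoseNode("fist", [0.0, 0.0], [PoseNode("fist_tight", [0.0, 0.1]),
                                  PoseNode("fist_loose", [0.2, 0.2])]),
]
```

With this toy tree, each frame is compared with four templates rather than all six; the saving grows with the depth and width of the hierarchy.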
The program Stenger wrote for his
doctorate won him the British Machine
Vision Association Sullivan Thesis Prize,
awarded annually to the best thesis in
the UK in the field of vision.
During his final year at Cambridge, he
applied for a position on the Toshiba
research fellowship programme which,
supported by the EPSRC, offered the
opportunity to live and work in Japan.
Stenger was delighted when he was
accepted and flew to Japan to join a
team of researchers at the Toshiba R&D
Centre in Kawasaki, near Tokyo.
While there, he extended the
techniques he had developed for his
PhD to track the human body. ‘The
hand-tracking algorithm could easily be
employed to analyse images of a human
walking, turning and dancing,’ said
Stenger. He and his colleagues were
asked to build a system to do that. ‘We
developed a “virtual fashion show” that
would display a CGI image of a model
on a screen that would move in tandem
with a real fashion model walking on a
catwalk.’
[Picture caption: Toshiba aims to implement hand recognition technology on a standard laptop]
The system Stenger developed tracks
the movement of a model using two
cameras and captures a 3D image while
she is walking down the catwalk. A
simplified version of the captured image
is then compared with a database of
postures created from laser scans of the
body. A computer algorithm then
estimates the posture of the model,
again using a hierarchical scheme, from
which it is able to display the virtual
equivalent on a big display in real time.
In Japan, Stenger was also asked to
look again at developing a hand
recognition system, this time for the
Toshiba Qosmio laptop. Toshiba wanted
to demonstrate that the computer, with a
built-in camera, could recognise a few
simple hand gestures to enable a user to
control simple functions such as audio
and video.
Stenger realised his earlier work
would be inappropriate in a commercial
system. ‘For a commercial system we
needed to develop a better system that
was less complex, meaning fewer
gestures, while at the same time being
much more robust,’ he said.
He used cascaded classifiers to
detect a number of hand poses in each
video frame independently. But that did
not prove fast enough to recognise a
hand gesture in real time. To do that, the
detection algorithm was optimised for
multi-core processors by distributing the
operations to multiple cores and
minimising the data transmission
between them.
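
As a rough illustration of the two ideas in this passage, the Python sketch below runs a small classifier cascade over image windows and shares the work between workers. The features (`mean`, `contrast`), the thresholds and the chunking scheme are hypothetical, not Toshiba's implementation. Each worker receives one contiguous batch of windows and returns only the indices of its detections, which keeps the data passed between workers small.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Window = Sequence[float]                          # pixel intensities of one window
Stage = Tuple[Callable[[Window], float], float]   # (feature function, threshold)


def mean(window: Window) -> float:
    return sum(window) / len(window)


def contrast(window: Window) -> float:
    return max(window) - min(window)


# Hypothetical two-stage cascade: the cheapest test comes first.
CASCADE: List[Stage] = [(mean, 0.3), (contrast, 0.5)]


def passes_cascade(window: Window, cascade: List[Stage] = CASCADE) -> bool:
    """Reject a window as soon as any stage fails; accept only if all pass."""
    for feature, threshold in cascade:
        if feature(window) < threshold:
            return False
    return True


def detect_chunk(chunk: List[Tuple[int, Window]]) -> List[int]:
    """Run the cascade over one batch and return only the matching indices."""
    return [index for index, window in chunk if passes_cascade(window)]


def split_into_chunks(items: list, parts: int) -> List[list]:
    """Split items into at most `parts` contiguous batches of similar size."""
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def detect_parallel(windows: List[Window], executor: Executor,
                    workers: int) -> List[int]:
    """Distribute windows over `workers` batches and merge the detections."""
    indexed = list(enumerate(windows))
    if not indexed:
        return []
    hits: List[int] = []
    for part in executor.map(detect_chunk, split_into_chunks(indexed, workers)):
        hits.extend(part)
    return sorted(hits)


if __name__ == "__main__":
    frame_windows = [[0.1, 0.1, 0.1], [0.2, 0.9, 0.8],
                     [0.5, 0.5, 0.5], [0.0, 1.0, 0.6]]
    # Threads keep the demo simple; for CPU-bound pure-Python code a
    # ProcessPoolExecutor would be needed to occupy several cores.
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(detect_parallel(frame_windows, pool, workers=2))
```

Most windows fail the first, cheapest stage, so the later stages run on only a small fraction of each frame.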
Toshiba was planning to open a
computer vision group at the research
laboratory it had formed in Cambridge
and Stenger was invited to join a team
of three.
Back in the UK, he has been working
on refining the hand recognition system
and implementing it on a standard
laptop.
‘I have been investigating ways of
improving the algorithm and to make it
run fast on standard low-cost hardware,’
he said.
He believes one day such systems
will be all-pervasive — on TVs, remote
controls and public displays.
TECHNOLOGY HORIZONS AUTUMN 2008