Right on cue
Artificial intelligence software will identify suspicious sounds and make CCTV cameras respond
Portsmouth University
researchers are using sound cues
to identify potential crime in
progress and trigger CCTV cameras
to turn towards the source.
CCTV surveillance has enjoyed
much success in identifying perpetrators
after a crime has taken
place, but does not always catch it
in progress. It can also be difficult
for a security officer in a control
room to keep an eye on several
cameras at once.
The three-year project, sponsored
by the EPSRC, aims to adapt
artificial intelligence (AI) software
developed by Neuron Systems, a
spin-out founded by Portsmouth
University’s Dr James Hui, which
currently identifies visual patterns.
Dr David Brown of Portsmouth’s
Institute of Industrial Research said:
‘This software identifies salient features
in objects, such as if a car has
got an aerial up, or a wing mirror in
a certain position, or a dent. For the
next stage, we want to get the
motorised camera to pivot if it hears
a type of sound. So in a car park, if
someone bashes in a window, it
turns to look at them.’
Brown said the system would
not identify a specific speech pattern,
but would use fuzzy logic to
identify a type of noise, such as a
crowd or breaking glass. The
researchers will take the software
developed by Neuron Systems and
adapt it to sound cues.
‘We are looking for templates of
sound — riser response, shapes of
sound,’ he said. ‘If you close your
eyes, you can imagine running your
fingers along a profile. In the same
way, we can look at the shape of a
sound profile, and from the shape,
say it is the same shape as breaking
glass, for example. So it’s a very
fast, real-time method of doing it.’
The project builds on the team’s
previous study of waveform
shapes. The software will fit an AI
template to the waveform and use
fuzzy logic if the fit is not exact. For
example, different panes of glass
breaking will have different waveforms
but the same generic shape.
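The idea of fitting a template to a waveform shape and tolerating an inexact fit can be sketched roughly as follows. This is only an illustration, not the Portsmouth team's method: the article does not describe the algorithm, so the envelope extraction, the normalised correlation score and the ramp-shaped fuzzy membership (with its `low` and `high` thresholds) are all assumptions chosen for simplicity, and every function name is hypothetical.

```python
import math

def envelope(samples, frame):
    """Peak absolute amplitude per frame: a crude 'shape' of the sound."""
    return [max(abs(s) for s in samples[i:i + frame])
            for i in range(0, len(samples) - frame + 1, frame)]

def normalise(shape):
    """Scale a shape to zero mean and unit length so loudness does not matter."""
    mean = sum(shape) / len(shape)
    centred = [v - mean for v in shape]
    norm = math.sqrt(sum(v * v for v in centred))
    return [v / norm for v in centred] if norm else centred

def similarity(shape, template):
    """Correlation in [-1, 1] between two equal-length shapes."""
    a, b = normalise(shape), normalise(template)
    return sum(x * y for x, y in zip(a, b))

def fuzzy_match(score, low=0.5, high=0.9):
    """Assumed ramp membership: 0 below `low`, 1 above `high`, linear between."""
    return min(1.0, max(0.0, (score - low) / (high - low)))

# Hypothetical template: a sharp attack followed by a decay.
glass_template = [1.0, 0.6, 0.35, 0.2, 0.1, 0.05]
# A louder event with a slightly different decay, and a steady hum.
other_pane = [2.1, 1.1, 0.8, 0.35, 0.2, 0.1]
steady_hum = [0.5] * 6

for name, shape in [("other pane", other_pane), ("steady hum", steady_hum)]:
    score = similarity(shape, glass_template)
    print(name, round(score, 3), fuzzy_match(score))
```

Normalising before comparing is what lets two events of different loudness count as the same generic shape, while the membership value gives a graded answer rather than a hard yes or no when the fit is not exact.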
A key challenge for the project is
that the system needs to respond
to an anomaly in real time, on a
scale comparable to human
response time, which is about 300
milliseconds. The software will
identify a problem and instantly
swing the camera in that direction,
just as a person would turn their
head if they heard a scream.
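To see why the 300 millisecond target is demanding, a rough latency budget can be worked through. Only the 300 ms figure comes from the article; the sample rate, window sizes, processing time and camera actuation time below are hypothetical numbers picked for illustration.

```python
HUMAN_RESPONSE_MS = 300  # figure quoted in the article

def detection_latency_ms(window_samples, sample_rate_hz, compute_ms, actuation_ms):
    """Time to capture one analysis window, classify it and start the camera moving."""
    capture_ms = 1000.0 * window_samples / sample_rate_hz
    return capture_ms + compute_ms + actuation_ms

def within_budget(latency_ms, budget_ms=HUMAN_RESPONSE_MS):
    """True if the whole chain responds at least as fast as the budget."""
    return latency_ms <= budget_ms

# Assumed values: 16 kHz audio, 20 ms to classify, 50 ms before the motor responds.
for window in (4096, 2048):
    latency = detection_latency_ms(window, 16000, compute_ms=20, actuation_ms=50)
    print(window, latency, within_budget(latency))
```

Under these assumptions, simply waiting for a longer audio window can use up most of the budget before any analysis starts, which is one reason a fast shape-matching method would be attractive.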
The software will work alongside
CCTV-based human motion analysis
that has been developed at the
same institute.
‘If the camera is pointing in a
direction because an aggressive
sound has been identified, the
motion software can identify
whether a person is punching
another or running away from the
scene,’ said Brown.
‘In a similar way, it can look at a
template of a body shape carrying
out an action, so it could spot the
difference between reaching out to
pick a jumper off the shelf or
punching the assistant, or spot
someone running away in a football
crowd.’
[Photo: Researchers say the system will trigger security equipment to respond to a crime in progress]
The Engineer, 2–15 June 2008
The project could also help with
another problem: having to search
through hours of security video to
identify a specific object or action.
Instead of having
to watch an eight-hour tape looking
for a white van, for example, the
software could identify the object
quickly on- or offline. With sound
cues added, the camera could be
relied on to point in the right
direction.
It could also make life easier for
camera operators. ‘If you sit and
look at a camera in one of these
control rooms in a council office
and you move it, it’s disorienting,’
said Brown. ‘Panning cameras
manually is not as easy as it
appears. It’s like looking through a
telescope, then swinging it to
another location — you don’t know
where you’re looking. But if you’re
looking at the sound being generated
— a car being broken into, a
kid shouting — you know you’re
looking at an important scene.’
The potential market for software
incorporating the algorithm
created from the research could
include local councils, private security
firms, car parks, shopping
centres, football stadiums and
public transport.
By the end of the three-year
project, the team hopes to have
generated algorithms that can be
incorporated into a commercial
software suite, with each generation
of algorithms becoming more
sophisticated as the project
progresses.
‘Because it’s AI, the longer it’s in
the software, the more it learns,’
said Brown. ‘The later versions will
get cleverer as time goes on, perhaps
identifying certain words
being said or violent sounds.’
Berenice Baker