A drone surveillance system capable of highlighting “violent individuals” in real time in a group of two to ten people has been built, as reported by the UK-based Register.
This seems like a great concept to us. The application is crowd surveillance by exception, where aggressive behavior is flagged for review by a person. The system is not designed to handle large crowds yet, since the examples used focus on between 2 and 10 people, but it is easy to envision its use to monitor very large crowds for aggressive behavior such as punching, stabbing, or shooting. Current uses might include reducing “boredom fatigue,” where a person watching a monitor misses something out of boredom. It is much like drone inspections that focus on the parts that are broken, not the parts that are working fine.
Not discussed, but not hard to imagine, would be this type of technology being used by lifeguards over a crowded beach to spot erratic swimming patterns, after a football game to spot fights, or to monitor large events such as marathons or crowds entering and leaving stadiums.
In many ways this seems better to us than the straight use of AI where facial recognition is on the lookout for bad guys, because it monitors behavior instead. While they quote about 85% accuracy and the system would have to be reworked for large crowds, it is a step in the right direction. I would not, for example, be excited to be questioned after my son playfully punches me when leaving a baseball game.
The inventors are based at the University of Cambridge, England, India’s National Institute of Technology, and the Indian Institute of Science.
The Register writes:
“The artificially intelligent technology uses a video camera on a hovering quadcopter to study the body movements of everyone in view. It then raises an alert when it identifies aggressive actions, such as punching, stabbing, shooting, kicking, and strangling, with an accuracy of about 85 per cent. It doesn’t perform any facial recognition – it merely detects possible violence between folks.
And its designers believe the system could be expanded to automatically spot people crossing borders illegally, detect kidnappings in public areas, and set off alarms when vandalism is observed.
“Law enforcement agencies have been motivated to use aerial surveillance systems to surveil large areas,” the team of researchers noted in their paper detailing the technology, which was made public this week.
“Governments have recently deployed drones in war zones to monitor hostiles, to spy on foreign drug cartels, conducting border control operations, as well as finding criminal activity in urban and rural areas.
“One or more soldiers pilot most of these drones for long durations which makes these systems prone to mistakes due to the human fatigue.”
The model works in two steps. First, the feature pyramid network, a convolutional neural network, detects individual humans in images from the drone’s camera. Second, it uses a scatternet, also a convolutional neural network, tacked onto a regression network to analyze and ascertain the pose of each human in the image.
It breaks down the outline of the body into 14 key points to work out the position of the person’s arms, legs, and face in order to identify the different classes of violent behavior specified in the training process.
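The two-step flow described above (detect each person, then estimate a 14-point pose and classify it) can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: the three callables stand in for the trained networks, and the names `analyse_frame`, `flag_violent`, `Detection`, and the "neutral" label are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # x_min, y_min, x_max, y_max
Keypoints = List[Tuple[float, float]]     # one (x, y) pair per body key point

NUM_KEYPOINTS = 14  # the article says the body outline is reduced to 14 key points


@dataclass
class Detection:
    box: Box
    keypoints: Keypoints
    label: str  # e.g. "punching", "kicking", or the assumed "neutral"


def analyse_frame(
    frame,
    detect_people: Callable[[object], List[Box]],        # step 1: person detector
    estimate_pose: Callable[[object, Box], Keypoints],   # step 2a: pose estimator
    classify_pose: Callable[[Keypoints], str],           # step 2b: pose classifier
) -> List[Detection]:
    """Run the two-step pipeline on a single frame."""
    results = []
    for box in detect_people(frame):
        keypoints = estimate_pose(frame, box)
        if len(keypoints) != NUM_KEYPOINTS:
            raise ValueError(f"expected {NUM_KEYPOINTS} key points, got {len(keypoints)}")
        results.append(Detection(box, keypoints, classify_pose(keypoints)))
    return results


def flag_violent(detections: List[Detection]) -> List[Detection]:
    """Keep only the people whose pose was given a non-neutral label."""
    return [d for d in detections if d.label != "neutral"]
```

Note that a person the first step fails to detect never reaches the second step, which is consistent with the explanation given later for why accuracy falls as more people appear in the frame.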
Here’s a video of how it works.
The system was trained on the Aerial Violent Individual dataset compiled by the researchers. Twenty-five people were called in to act out attempts at punching, stabbing, shooting, kicking, and strangling to create 2,000 annotated images. Each image featured two to ten people, so this system isn’t, right now, equipped to handle very large crowds.
The accuracy level is highest when the system has to deal with fewer people. With one person in the image, the accuracy was 94.1 per cent, but it drops to 84 per cent for five people, and goes down to 79.8 per cent for ten people. “The fall in accuracy was mainly because some humans were not detected by the system,” Amarjot Singh, a coauthor of the paper, said.
It’s difficult to really judge how accurate the drone system is considering it hasn’t really been tested on normal people in real settings yet – just volunteers hired by the research team. In other words, it was trained by people pretending to attack each other, and was tested by showing it people, well, pretending to attack each other. On the other hand, it is a research project, rather than a commercial offering (yet).
The images fed into the system were also recorded when the drone was two, four, six, and eight metres away. So, that gives you an idea of how close it had to be. And considering how loud the quadcopters are, it’s probably not very discreet. In real crowds and brawls, the gizmos would be a few hundred feet away, reducing visibility somewhat.
The live video analysis was carried out on Amazon’s cloud service, in real time, using two Nvidia Tesla GPUs, while the drone’s built-in hardware directed its flight movements. The tech was trained using a single Tesla GPU on a local machine.
“The system detected the violent individuals at five frames per second to 16 frames per second for a maximum of ten and a minimum of two people, respectively, in the aerial image frame,” the paper stated.
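To put those frame rates in perspective, they can be converted into the time available to process each frame. The short sketch below does only that arithmetic; the helper name `frame_budget_ms` is made up for the example.

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates quoted from the paper: 5 fps with ten people, 16 fps with two.
for people, fps in ((10, 5), (2, 16)):
    print(f"{people} people: {fps} fps = {frame_budget_ms(fps)} ms per frame")
```

At five frames per second each frame takes 200 ms to handle, against 62.5 ms at 16 frames per second, so the per-frame time roughly triples as the group grows from two people to ten.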
Performing inference in the cloud is a potential security and privacy hazard, seeing as you’re streaming footage of people into a third-party computer system. To mitigate any legal headaches, the trained neural network processes each frame received from the drone in the cloud and, apparently, deletes it after the image is processed.
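The process-then-delete idea can be illustrated with a small sketch. It is an assumption about how such a step might look, not a description of the researchers' cloud setup: `process_and_discard` and the `analyse` callable are hypothetical names, and clearing an in-memory buffer like this does not by itself guarantee that no copy survives elsewhere in the system.

```python
from typing import Callable, List


def process_and_discard(frame: bytearray, analyse: Callable[[bytearray], List[str]]) -> List[str]:
    """Analyse one frame, then wipe its buffer whether or not analysis succeeded."""
    try:
        return analyse(frame)
    finally:
        # Overwrite the pixel data with zeros, then empty the buffer,
        # so only the analysis result outlives this call.
        frame[:] = bytes(len(frame))
        del frame[:]
```

Putting the wipe in a `finally` block means the frame is also discarded when the analysis raises an error, so the data is held only for as long as it is needed.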
“This adds data safely layer as we retain the data in the cloud only for the time it is required,” Singh, a PhD student at the University of Cambridge, told The Register.
The use of AI for surveillance is concerning. Similar technologies involving actual facial recognition, such as Amazon’s Rekognition service, have been employed by the police. These systems often suffer from high false positives, and aren’t very accurate at all, so it’ll be a while before something like this can be combined with drones.
In this case, an overly sensitive system could produce false positives for people playing football together, and think that they were kicking one another. At the moment, the technology identifies what the team labels violence – but this definition could be expanded to praying, kissing, holding hands, whatever a harsh or authoritarian government wants to detect and stamp out.
Singh said he is a little worried the final form of the technology could be widened by its users to track more than thugs causing trouble.
“The system [could potentially] be used to identify and track individuals who the government thinks is violent but in reality might not be,” Singh said. “The designer of the [final] system decides what is ‘violent’ which is one concern I can think of.”
The researchers used the Parrot AR Drone, a fairly cheap gizmo, to carry out their experiments. Jack Clark, strategy and communications director at OpenAI, previously told The Register that he believed commercially available drones could in the future be reprogrammed using “off-the-shelf software, such as open-source pre-trained models,” to create “machines of terror.”
It’s also pretty cheap to run. This experiment cost about $0.100 per hour to run on Amazon’s platform, so it’s not too expensive after the system has been trained.
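As a rough illustration of what that hourly figure implies, the sketch below multiplies it out over longer periods. Only the $0.100 per hour rate comes from the article; the function name and the durations are assumptions, and the one-off training cost is not included.

```python
from decimal import Decimal

HOURLY_RATE_USD = Decimal("0.100")  # running cost quoted in the article


def running_cost_usd(hours: int) -> Decimal:
    """Return the cloud running cost for a whole number of hours at the quoted rate."""
    if hours < 0:
        raise ValueError("hours must not be negative")
    return HOURLY_RATE_USD * hours


# Hypothetical durations: one day and one 30-day month of continuous use.
for label, hours in (("24 hours", 24), ("30 days", 24 * 30)):
    print(f"{label}: ${running_cost_usd(hours)}")
```

At that rate, a full day of continuous analysis comes to $2.40 and a 30-day month to $72.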
Singh admitted that “one could potentially use this system for malicious applications, but training this system for those applications will require large amounts of data which requires numerous resources. I am hoping that some oversight would come with those resources which can prevent the misuse of this technology.”
But he believed the concerns of hobbyists reprogramming drones for nefarious reasons were unwarranted. “Buying the drone is indeed easy but designing an algorithm which can identify the violent individuals requires certain expertise related to designing deep systems which are not easy to acquire. I don’t think that these systems are easy to implement,” he said.
The researchers are planning to test their system in real settings during two music festivals, and monitoring national borders in India. If it performs well, they said they hoped to commercialize it.
“Artificial intelligence is a potent technology which has been used to design several systems such as google maps, face detection in cameras, etc which have significantly improve our lives,” Singh said.
“One such application of AI is in surveillance systems! AI can help develop powerful surveillance systems which can assist in identifying pernicious individuals which will make the society a safer place. Therefore I think it is a good thing and is also necessary.
“That being said, I also think that AI is extremely powerful and should be regulated for specific applications like defense, similar to nuclear technology.”
The research will be presented at a workshop during the International Conference on Computer Vision and Pattern Recognition (CVPR) 2018 in Salt Lake City, Utah, USA, in June. ®