Is it Nemo or Dory? Fast and accurate object detection for IoT and edge devices
Reading group: Dimitrije Panic presented "Is it Nemo or Dory? Fast and accurate object detection for IoT and edge devices" (IoT'21) in 4A312 on 10/2/2023 at 11h00.
Abstract
Current state-of-the-art object detection neural networks, such as YOLO and SSD, are trained and developed on server-class GPUs. These neural networks do not scale down well to resource-constrained devices, with both accuracy and precision taking a significant hit at the expense of speed. This is particularly concerning as object detection algorithms are often used for low-powered devices, such as surveillance and smart-home cameras, where accuracy is critical. Therefore, these devices generally tend to forward data to servers for processing, which adds network latency. In other cases, algorithms developed for these devices shrink neural networks to reduce computation at the expense of accuracy, which is often not acceptable.
We create an alternative object detection scheme for static-camera systems such as those used in surveillance and smart-home settings. Our model does not require positions of objects in training data, enabling us to retain only the relevant parts of a video frame for training data and reduce data storage cost while preserving privacy of other subjects in the video. When evaluated on static-camera video feeds using an NVIDIA Jetson Nano (a hybrid Arm and GPU IoT embedded device platform), our method increases throughput compared to Tiny YOLOv3. Further, it takes half of the time that Tiny YOLOv3 takes when run on OpenCV’s CUDA-optimized DNN framework per frame while having high accuracy of bounding box prediction and classification. The lessons learned from this work also suggest other strategies for tailoring the deployment of deep learning algorithms at the edge.
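The abstract does not describe how the relevant parts of a frame are selected. As a rough illustration only, the sketch below assumes a static camera lets us compare each frame against a fixed background image and keep just the bounding box of the pixels that changed. The function names (changed_region, crop), the threshold value, and the list-of-lists grayscale frame representation are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]          # grayscale image: rows of 0-255 intensities
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def changed_region(background: Frame, frame: Frame, threshold: int = 30) -> Optional[Box]:
    """Return the bounding box of pixels that differ from the background.

    Hypothetical helper: a simple per-pixel absolute difference, assumed
    here for illustration and not confirmed as the paper's method.
    """
    if not frame or not frame[0]:
        return None
    height, width = len(frame), len(frame[0])
    # Rows and columns containing at least one pixel above the threshold.
    rows = [y for y in range(height)
            if any(abs(frame[y][x] - background[y][x]) > threshold for x in range(width))]
    cols = [x for x in range(width)
            if any(abs(frame[y][x] - background[y][x]) > threshold for y in range(height))]
    if not rows:
        return None  # nothing moved: the frame matches the background
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(frame: Frame, box: Box) -> Frame:
    """Keep only the pixels inside the box; the rest of the frame is discarded."""
    top, left, bottom, right = box
    return [row[left:right] for row in frame[top:bottom]]


# Usage: a 6x8 empty scene with a bright 2x3 object placed at rows 2-3, columns 3-5.
background = [[0] * 8 for _ in range(6)]
frame = [row[:] for row in background]
for y in (2, 3):
    for x in (3, 4, 5):
        frame[y][x] = 200

box = changed_region(background, frame)
patch = crop(frame, box) if box is not None else []
```

Under these assumptions only the small patch would need to be stored or passed to a classifier, which is one way the storage and privacy benefits mentioned in the abstract could arise.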