Object Detection and Recognition using Deep Learning

  • Aparna Gullapelly, Barnali Gupta Banik


Object detection and recognition is a collection of related computer vision tasks that involve identifying objects in digital photographs. Image classification predicts the class of a single image. Object localization identifies the location of one or more objects in a single image and draws a bounding box around each of them.

Object detection combines these two tasks: it localizes one or more objects in an image and classifies each of them. The term object recognition is often used to mean object detection. Beginners can find it challenging to tell these related computer vision tasks apart. An object detection system finds real-world objects in digital images and videos, where the objects may belong to any class, such as humans, cars, or other movable objects. To do this, the system relies on components such as model databases and feature extractors. This paper presents a concept for detecting, localizing, categorizing, and extracting objects in images, and for examining the data contained in images.

Several algorithms are used for object detection. Here we discuss three that we believe are effective, efficient, and fast. We define them briefly below and expand on them in later sections of the paper.

  1. Haar Cascade Classifier:

The Haar cascade classifier is an efficient object detection approach proposed by Paul Viola and Michael Jones in their paper “Rapid Object Detection using a Boosted Cascade of Simple Features”. It is a machine learning based approach in which a cascade function is trained on a large number of positive and negative images and is then used to detect objects in other images.
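As a rough illustration of the cascade idea, the sketch below computes a two-rectangle Haar-like feature from an integral image and passes a window through a sequence of stages, rejecting it at the first stage that fails. This is a simplification under stated assumptions: each stage here is a single hand-picked feature and threshold (hypothetical values), whereas a trained cascade learns boosted combinations of many features from the positive and negative images.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half: responds to vertical edges."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def cascade_accepts(ii, stages):
    """Run the stages in order and reject at the first one that fails."""
    for (x, y, w, h, threshold) in stages:
        if two_rect_feature(ii, x, y, w, h) < threshold:
            return False
    return True


# Toy 4x4 window: bright left half, dark right half.
window = [[9, 9, 1, 1] for _ in range(4)]
ii = integral_image(window)

# Hypothetical stages (x, y, w, h, threshold); a real cascade learns these.
stages = [(0, 0, 4, 4, 32), (0, 0, 2, 2, 0)]
accepted = cascade_accepts(ii, stages)
```

Because most windows in an image contain no object, rejecting them in the first cheap stages is what makes the cascade fast.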

  2. MTCNN:

MTCNN stands for Multi-task Cascaded Convolutional Neural Networks, a machine learning approach consisting of three stages that detects the bounding boxes of faces in an image along with their five-point facial landmarks. Each stage progressively refines the detection results by passing its inputs through a CNN, which returns candidate bounding boxes with their scores, followed by non-maximum suppression.
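The non-maximum suppression step applied to the candidate boxes can be sketched as follows. This is a generic greedy version, not the exact routine of any MTCNN implementation; the box format `(x1, y1, x2, y2)`, the function names, and the 0.5 IoU threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate candidates for one face, plus one separate candidate.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

In this toy input the second box overlaps heavily with the higher-scoring first box and is dropped, while the distant third box survives.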

  3. YOLO v3:

YOLO is a fully convolutional network, and its final output is generated by applying a 1 x 1 kernel on a feature map. In YOLO v3, detection is performed by applying 1 x 1 detection kernels on feature maps of three different sizes at three different places in the network.
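To make the three detection scales concrete, the sketch below computes the output shape at each scale and shows what a 1 x 1 kernel does to a feature map. The specific numbers are assumptions taken from the commonly used YOLO v3 configuration and are not stated above: a 416 x 416 input, strides of 32, 16, and 8, three anchor boxes per scale, and 80 classes. The function names are hypothetical.

```python
def yolo_v3_output_shapes(input_size=416, num_classes=80,
                          anchors_per_scale=3, strides=(32, 16, 8)):
    """Return (height, width, depth) of the detection output at each scale."""
    # Each anchor predicts 4 box coordinates, 1 objectness score, and class scores.
    depth = anchors_per_scale * (5 + num_classes)
    return [(input_size // s, input_size // s, depth) for s in strides]


def conv1x1(feature_map, weights):
    """Apply a 1 x 1 convolution to an H x W x C_in nested list.

    weights is C_out x C_in; every cell's channel vector is mapped
    independently, so the spatial size is unchanged.
    """
    return [
        [[sum(w * c for w, c in zip(kernel, cell)) for kernel in weights]
         for cell in row]
        for row in feature_map
    ]


shapes = yolo_v3_output_shapes()

# Toy 1 x 2 feature map with 2 channels, mapped to 3 output channels.
toy_map = [[[1, 2], [3, 4]]]
toy_weights = [[1, 0], [0, 1], [1, 1]]
toy_out = conv1x1(toy_map, toy_weights)
```

Under these assumptions the three outputs are 13 x 13, 26 x 26, and 52 x 52 grids, each with depth 255, so the coarsest grid tends to handle large objects and the finest grid small ones.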

Keywords-Object detection, Haar cascade, Gun Object, Face detection, YOLOv3, MTCNN, Helmet detection, Eye detection