The document provides an overview of object detection, detailing its definition, approaches including classical and deep learning methods, and introduces the TensorFlow Object Detection API for training and evaluating detection models. It discusses various applications of object detection, such as facial recognition and self-driving cars, highlighting the technology's importance in security and navigation. Additionally, it offers installation instructions, dataset creation tips, and links to further resources.
Deep learning approach
OverFeat - published in 2013, a multi-scale sliding window algorithm using Convolutional Neural Networks (CNNs).
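The multi-scale sliding window idea can be illustrated with a minimal sketch. It only enumerates the candidate windows; in a detector, each window would then be scored by a CNN. The function name, window size, stride, and scales are assumptions chosen for illustration, and the explicit loop is not how OverFeat itself is implemented.

```python
def sliding_windows(width, height, window=64, stride=32, scales=(1.0, 0.5)):
    """Yield (scale, x, y, w, h) windows in original-image coordinates.

    All parameter values are illustrative assumptions, not taken from OverFeat.
    """
    for scale in scales:
        # Size of the image after rescaling by `scale`.
        scaled_w, scaled_h = int(width * scale), int(height * scale)
        # Slide a fixed-size window over the rescaled image.
        for y in range(0, scaled_h - window + 1, stride):
            for x in range(0, scaled_w - window + 1, stride):
                # Map the window back to original-image coordinates.
                yield (scale, x / scale, y / scale, window / scale, window / scale)


boxes = list(sliding_windows(128, 128))
for box in boxes:
    print(box)
```

A smaller scale shrinks the image, so the same fixed window covers a larger part of the original picture; this is how one window size can match objects of different sizes.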
R-CNN - Regions with CNN features. Three-stage approach:
- Extract possible objects using a region proposal method (the most popular one being Selective Search).
- Extract features from each region using a CNN.
- Classify each region with SVMs.
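The three stages above can be sketched as a small pipeline. This is only a structural sketch: `rcnn_detect` and the three `toy_*` functions are hypothetical stand-ins for Selective Search, a CNN feature extractor, and SVM classifiers, and the image is a tiny grid of numbers.

```python
def rcnn_detect(image, propose_regions, extract_features, classify):
    """Run the three R-CNN stages using caller-supplied stand-in functions."""
    detections = []
    for box in propose_regions(image):           # stage 1: region proposals
        features = extract_features(image, box)  # stage 2: one feature pass per region
        label, score = classify(features)        # stage 3: per-region classification
        detections.append((box, label, score))
    return detections


# Toy stand-ins (assumptions for illustration only).
def toy_propose(image):
    return [(0, 0, 2, 2), (2, 2, 4, 4)]  # boxes as (x1, y1, x2, y2)


def toy_features(image, box):
    x1, y1, x2, y2 = box
    pixels = [image[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return [sum(pixels) / len(pixels)]  # mean intensity as a one-number "feature"


def toy_classify(features):
    score = features[0]
    return ("object" if score > 0.5 else "background", score)


image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
detections = rcnn_detect(image, toy_propose, toy_features, toy_classify)
print(detections)
```

Note that the feature extractor runs once per proposed region, which is the main cost of this design.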
Fast R-CNN - Similar to R-CNN, it used Selective Search to generate object proposals, but instead of extracting features from each proposal independently and using SVM classifiers, it applied the CNN to the complete image and then used Region of Interest (RoI) Pooling on the feature map, with a final feed-forward network for classification and regression.
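RoI pooling can be illustrated with a minimal pure-Python sketch: a region of the shared feature map is split into a fixed grid of bins and each bin is max-pooled, so every proposal yields an output of the same size regardless of its shape. The function name, the single-channel feature map, and the bin-boundary rounding are assumptions for illustration; real implementations operate on multi-channel tensors.

```python
def roi_max_pool(feature_map, roi, output_size=2):
    """Max-pool the region roi = (x1, y1, x2, y2) into an output_size x output_size grid.

    Assumes a single-channel feature map (list of rows) and a RoI at least 1x1.
    """
    x1, y1, x2, y2 = roi
    roi_w, roi_h = x2 - x1, y2 - y1
    pooled = []
    for i in range(output_size):
        # Row span of bin i: floor for the start, ceiling for the end.
        r0 = y1 + (i * roi_h) // output_size
        r1 = y1 + ((i + 1) * roi_h + output_size - 1) // output_size
        row = []
        for j in range(output_size):
            # Column span of bin j, computed the same way.
            c0 = x1 + (j * roi_w) // output_size
            c1 = x1 + ((j + 1) * roi_w + output_size - 1) // output_size
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(roi_max_pool(feature_map, (0, 0, 4, 4)))  # whole map
print(roi_max_pool(feature_map, (1, 1, 4, 4)))  # a 3x3 region
```

Because the output size is fixed, proposals of any shape can be fed to the same feed-forward layers after a single CNN pass over the image.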
YOLO - You Only Look Once: a simple convolutional neural network approach that combines great results with high speed, allowing real-time object detection for the first time.
Faster R-CNN - Faster R-CNN added what its authors called a Region Proposal Network (RPN), in an attempt to get rid of the Selective Search algorithm and make the model completely trainable end-to-end.
SSD and R-FCN
Finally, there are two notable papers: Single Shot Detector (SSD), which takes on YOLO by using convolutional feature maps of multiple sizes, achieving better results and speed, and Region-based Fully Convolutional Networks (R-FCN), which takes the architecture of Faster R-CNN but uses only convolutional networks.
Selecting a model
The TensorFlow Object Detection (OD) API provides a collection of detection models pre-trained on the COCO dataset, the KITTI dataset, and the Open Images dataset. For each model, the collection lists:
- model name: corresponds to a config file that was used to train this model.
- speed: running time in ms per 600x600 image.
- mAP: stands for mean average precision, which indicates how well the model performed on the COCO dataset.
- output types (Boxes, and Masks if applicable).
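To make the mAP figure more concrete, here is a minimal sketch of the two ingredients behind it: intersection over union (IoU) between boxes, and average precision (AP) for one class. It assumes a single image, a single IoU threshold of 0.5, and an uninterpolated AP; the official COCO metric is more involved, so treat the function names and simplifications here as illustrative assumptions. mAP is then the mean of AP over all classes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """Uninterpolated AP for one class; detections are (score, box) pairs."""
    matched = set()          # ground-truth boxes already claimed by a detection
    true_positives = 0
    precision_sum = 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    for rank, (_score, box) in enumerate(ranked, start=1):
        # Find the best still-unmatched ground-truth box for this detection.
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this hit
    return precision_sum / len(ground_truth) if ground_truth else 0.0


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [
    (0.9, (0, 0, 10, 10)),    # correct
    (0.8, (50, 50, 60, 60)),  # false positive
    (0.7, (20, 20, 30, 30)),  # correct
]
ap = average_precision(detections, ground_truth)
print(round(ap, 4))
```

A higher AP means that correct detections tend to be ranked above false positives and that few ground-truth objects are missed.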
Training & Evaluating
# Evaluate a trained model.
# From the tensorflow/models/research directory
python object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path=${PATH_TO_YOUR_PIPELINE_CONFIG} \
    --checkpoint_dir=${PATH_TO_TRAIN_DIR} \
    --eval_dir=${PATH_TO_EVAL_DIR}

# Train a model.
# From the tensorflow/models/research directory
python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=/tensorflow/models/object_detection/samples/configs/ssd_mobilenet_v1_pets.config \
    --train_dir=${PATH_TO_ROOT_TRAIN_FOLDER}
Facial Recognition:
Researchers at Facebook have developed a deep learning facial recognition system called “DeepFace”, which identifies human faces in a digital image very effectively. Google uses its own facial recognition system in Google Photos, which automatically groups photos by the person in the image. Facial recognition relies on several facial features, such as the eyes, nose, mouth, and eyebrows.
Self-Driving Cars:
Self-driving cars are the future; there is little doubt about that. But the way they work is tricky, as they combine a variety of techniques to perceive their surroundings, including radar, laser light, GPS, odometry, and computer vision.
Advanced control systems interpret sensory information to identify appropriate navigation paths as well as obstacles. Once the image sensor detects any sign of a living being in the car's path, the car stops automatically. This happens very quickly and is a big step towards driverless cars.
Security: Object detection plays a very important role in security, be it Apple's Face ID or the retina scans seen in sci-fi movies.
Governments also use it to scan security feeds and match them against their existing databases to find criminals or to detect robbers' vehicles.
The applications are limitless.