Neural Net

YOLO - "You Only Look Once". Looks at the whole image in one pass, with one neural net, though it divides the image into an SxS grid of cells.

bounding box: cx, cy, w, h, confidence. cx, cy is the box centre; w and h are expressed relative to the full image dimensions (values between 0 and 1).

IOU - intersection over union: the area of the intersection of two boxes (ground truth vs. predicted) divided by the area of their union. Used as the confidence score in YOLO: during training, the confidence target is the IOU between the predicted box and the ground truth.
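A minimal sketch of the IOU calculation for two boxes in the (cx, cy, w, h) format above. It assumes all four values are normalised to the 0..1 range; the function names to_corners and iou are made up for this note and are not taken from any YOLO codebase.

```python
def to_corners(cx, cy, w, h):
    """Convert a centre-format box to (x_min, y_min, x_max, y_max)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(box_a, box_b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax1, ay1, ax2, ay2 = to_corners(*box_a)
    bx1, by1, bx2, by2 = to_corners(*box_b)

    # Width and height of the overlap; clamped to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the part counted twice.
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0.5, 0.5, 0.2, 0.2)
predicted = (0.6, 0.5, 0.2, 0.2)
print(iou(ground_truth, predicted))
```

Here the predicted box is shifted right by half its width, so the overlap is half of one box and the IOU comes out at roughly one third. Identical boxes give 1, disjoint boxes give 0.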

analyze image: predict objects in image
inputs: image, model, ground truth
output: prediction

pro - fast enough to be used for real-time object detection, 24 FPS

con - because it segments the image into a grid, and allows only one object per cell, it does not do well with small and/or overlapping objects.
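To see why the one-object-per-cell limit hurts small or overlapping objects, here is a rough sketch that maps object centres to grid cells and reports cells claimed by more than one object. It assumes normalised centre coordinates and uses S = 7 purely as an illustrative grid size; grid_cell and find_collisions are hypothetical helper names, not part of any YOLO implementation.

```python
def grid_cell(cx, cy, s):
    """Return (row, col) of the SxS cell containing a normalised centre."""
    # min() keeps a centre at exactly 1.0 inside the last cell.
    col = min(int(cx * s), s - 1)
    row = min(int(cy * s), s - 1)
    return row, col


def find_collisions(centres, s):
    """Map each cell holding more than one centre to the object indices in it."""
    cells = {}
    for i, (cx, cy) in enumerate(centres):
        cells.setdefault(grid_cell(cx, cy, s), []).append(i)
    return {cell: ids for cell, ids in cells.items() if len(ids) > 1}


# Two small objects close together, plus one far away (made-up coordinates).
centres = [(0.50, 0.50), (0.52, 0.51), (0.10, 0.90)]
print(find_collisions(centres, s=7))
```

The first two centres land in the same cell, so under the one-object-per-cell rule only one of them could be predicted from that cell. Small objects packed closely together, as in aerial imagery, make this kind of collision more likely.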

comes in several versions: YOLOv1, YOLOv3, YOLOv5s, etc.

VisDrone dataset