Comparative Analysis of YOLO, Faster R-CNN, RetinaNet, and DETR for Autonomous Vehicle Object Detection

Kandlakunta Sumana Mounya

Received: 23-Aug-2024
Accepted: 15-May-2025
Published: 01-Jul-2025

Keywords:

Autonomous Vehicles, Object Detection, YOLO, Faster R-CNN, RetinaNet, DETR, Feature Pyramid Network (FPN), Transformer

Kandlakunta Sumana Mounya

Research Scholar, Gandhi Institute of Technology, Visakhapatnam, India

Abstract

Object detection is a cornerstone of autonomous vehicle (AV) perception, enabling
the identification of vehicles, pedestrians, traffic signs, and other road objects in real
time. This paper presents a comprehensive literature review comparing four leading object
detection architectures—YOLO (You Only Look Once), Faster R-CNN (with Feature
Pyramid Networks), RetinaNet (with Bidirectional FPN enhancements), and the Detection
Transformer (DETR)—with a focus on their application in autonomous driving systems.
We examine their architectures, training methodologies, inference speeds, detection accuracy,
and suitability for deployment under stringent AV constraints. Challenges such as
multi-scale detection, occlusions, class imbalance, and adverse environmental conditions
are analyzed using results from domain-specific benchmarks (KITTI, BDD100K, nuScenes).
Our findings indicate that one-stage detectors (YOLO, RetinaNet) generally achieve higher
frame rates suitable for on-board inference, while two-stage detectors (Faster R-CNN) often
offer superior accuracy at the cost of speed. The transformer-based DETR introduces a
new paradigm with fewer hand-designed heuristics and a streamlined pipeline, though it requires
specialized improvements for small-object detection and efficient training. We conclude with future
research directions, including convergence between convolutional and transformer architectures,
multi-modal sensor fusion, and efficiency optimizations to meet AV safety and latency
requirements.

How to Cite

Kandlakunta Sumana Mounya. (2025). Comparative Analysis of YOLO, Faster R-CNN, RetinaNet, and DETR for Autonomous Vehicle Object Detection. Doupe Journal of Top Trending Technologies, 1(2). Retrieved from https://www.doupe.in/index.php/ttt/article/view/12

Issue

Vol. 1 No. 2 (2025): Recent Advances and Emerging Trends in Computer Vision

Section

Articles
