Transformer-Based Models in Image Segmentation and Classification: A New Era in Vision AI

Atako, Nelson Rachael

Abstract

Over the past decade, deep learning has revolutionized computer vision, with
convolutional neural networks (CNNs) dominating tasks such as image classification and segmentation.
However, a new paradigm has emerged: transformer-based models, originally
developed for natural language processing, have begun to surpass CNN-based approaches
across vision tasks. This marks a new era in Vision AI, in which transformers’ ability
to capture long-range dependencies and global context is reshaping how vision
systems are designed. Transformer models have achieved state-of-the-art performance in image classification
(assigning a label to an entire image) and segmentation (partitioning an image into labeled
regions), often with simpler pipelines and stronger results than their CNN predecessors.

Article Details

How to Cite
Atako, Nelson Rachael. (2025). Transformer-Based Models in Image Segmentation and Classification: A New Era in Vision AI. Doupe Journal of Top Trending Technologies, 1(2), 36–45. Retrieved from https://www.doupe.in/index.php/ttt/article/view/15
Section
Articles