Last Updated: Mar 27, 2024

Detectron2 for Object Detection

Author Rupal Saluja


How many times have you lost the TV remote, and how much time did you spend finding it? It happens to most of us, and it is a frustrating experience. What if I told you that a computer algorithm could solve the problem in a few milliseconds?

Object Detection is a method to solve such problems. Have you ever tried an Object Detection model using a dataset of your choice?


In this article, we will take a detailed look at Detectron2 for Object Detection. This includes its introduction, its origin, getting started, and an end-to-end implementation.

Introduction and Origin of Detectron2

Detectron2 is a library released by Facebook AI Research (FAIR) in 2019 for object detection and segmentation tasks. It originates from maskrcnn-benchmark, and a CUDA-capable GPU is strongly recommended for the heavy computations involved. It supports tasks such as bounding box detection, keypoint detection, and instance segmentation. It ships with pre-trained models that you can load easily and adapt to your requirements.

The previous framework, Detectron, was implemented in Caffe2. Detectron2, however, is implemented in PyTorch. Most of the models in its model zoo are pre-trained on the COCO dataset.

Detectron2 for Object Detection

Now that you have a brief idea about Detectron2, let's start with Detectron2 for Object Detection.

Getting started with Detectron2

To start with Detectron2, we will install the necessary dependencies, check the installed libraries, and import a few necessary packages.

We will start by installing a few prerequisites, such as torchvision and the COCO API. Then we will check whether CUDA is available. Finally, we will install Detectron2 using the following piece of code.

# installing dependencies:
!pip install -U torch==1.5 torchvision==0.6 -f
!pip install cython pyyaml==5.1
!pip install -U 'git+'
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
# install detectron2:
!pip install detectron2==0.1.3 -f
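
If the check above prints False, no CUDA device is visible to PyTorch. The snippet below is a minimal sketch of choosing a device string in that case. The helper name `pick_device` is hypothetical and not part of the original tutorial; the assumption is that you would assign the result to Detectron2's `MODEL.DEVICE` config key before building a predictor or trainer.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

Training on a CPU is possible in principle but is expected to be much slower than on a GPU.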

Importing a few necessary packages

import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import cv2
import random
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog

Implementation of Detectron2 with Datasets

This section covers the complete steps involved in using Detectron2 for Object Detection.

Step-1: Preparing the Dataset

Some datasets have built-in support in Detectron2, and they are listed in the builtin datasets folder. If you want to use a dataset of your choice, you must register it.

First, we will train a text detection model, starting from a model pre-trained on the COCO dataset. The text detection dataset has three classes: Hindi, English, and Others. Annotations can be stored in several formats. This tutorial uses a COCO-style layout, in which a JSON file holds all the image details and annotations; Detectron2 can consume such data once the dataset is registered.
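
To make the JSON layout concrete, here is a minimal sketch of one image record and a small sanity check. The file name, image size, and box values are made up for illustration, and the helper `check_record` is hypothetical. It assumes each annotation carries a `bbox` in [x, y, width, height] order, which is what the `XYWH_ABS` box mode used in Step 2 implies.

```python
import json

# Hypothetical record: file name, image size, and box values are made up.
record = {
    "file_name": "img_001.jpg",
    "height": 480,
    "width": 640,
    "image_id": 1,
    "annotations": [
        {"bbox": [100, 120, 50, 30], "category_id": "1"},
    ],
}

CLASSES = ["HINDI", "ENGLISH", "OTHER"]


def check_record(rec, num_classes=len(CLASSES)):
    """Raise ValueError if a record lacks the fields the tutorial code reads."""
    if "file_name" not in rec or "annotations" not in rec:
        raise ValueError("record needs file_name and annotations")
    for ann in rec["annotations"]:
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            raise ValueError("box width and height must be positive")
        if not 0 <= int(ann["category_id"]) < num_classes:
            raise ValueError("category_id out of range")
    return True


# Round-trip through JSON, as the records would be stored on disk.
dataset = json.loads(json.dumps([record]))
print(all(check_record(r) for r in dataset))
```

Running a check like this before registration can surface malformed annotations early, before they cause errors during training.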

Step-2: Registering the Dataset

Use the following code to register the dataset.

import json
from detectron2.structures import BoxMode
from detectron2.data import DatasetCatalog, MetadataCatalog

def get_board(imgdir):
    # load the annotation file stored next to the images
    json_file = imgdir + "/dataset.json"
    with open(json_file) as file:
        dataset = json.load(file)
    for i in dataset:
        # prefix each image file name with its directory
        i["file_name"] = imgdir + "/" + i["file_name"]
        for j in i["annotations"]:
            j["bbox_mode"] = BoxMode.XYWH_ABS  # boxes are [x, y, width, height]
            j["category_id"] = int(j["category_id"])
    return dataset

# register the train and val splits; d=d binds the current split name
for d in ["train", "val"]:
    DatasetCatalog.register("boardetect_" + d, lambda d=d: get_board("Text_Detection_Dataset_COCO_Format/" + d))
    MetadataCatalog.get("boardetect_" + d).set(thing_classes=["HINDI", "ENGLISH", "OTHER"])
board_metadata = MetadataCatalog.get("boardetect_train")
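
The `lambda d=d:` in the registration loop matters because Python closures look up loop variables when they are called, not when they are defined. The sketch below reproduces the effect with a plain dictionary standing in for the catalog; the `registry` name and the `data/` paths are illustrative only.

```python
registry = {}

# Without a default argument, every lambda looks up d when it is called,
# so both entries see the last value of the loop variable.
for d in ["train", "val"]:
    registry["late_" + d] = lambda: "data/" + d

# Binding d as a default argument freezes the value for each iteration.
for d in ["train", "val"]:
    registry["bound_" + d] = lambda d=d: "data/" + d

print(registry["late_train"]())   # data/val
print(registry["bound_train"]())  # data/train
```

Without the default argument, both registered names would load the validation split.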

Step-3: Starting with the Training Set

We will randomly pick two pictures from our training dataset and check what the bounding boxes look like. Use the following Python code for the same.

dataset = get_board("Text_Detection_Dataset_COCO_Format/train")
for d in random.sample(dataset, 2):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=board_metadata)
    vis = visualizer.draw_dataset_dict(d)
    cv2_imshow(vis.get_image()[:, :, ::-1])
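
The boxes drawn here are stored as `XYWH_ABS`, that is, a top-left corner plus a width and height in pixels. The following minimal sketch shows how that relates to the corner format (`XYXY_ABS`). The helper `xywh_to_xyxy` is hypothetical and written only for illustration; Detectron2 itself provides `BoxMode.convert` for this purpose.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x0, y0, x1, y1] in absolute pixels."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


print(xywh_to_xyxy([100, 120, 50, 30]))  # [100, 120, 150, 150]
```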


example images

Step-4: Training the Model

Here, we will just fine-tune our model on the dataset. You can use the example configuration below for your reference.

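from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
import os

con = get_cfg()
# load the Faster R-CNN config that matches the pre-trained weights
con.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
con.DATASETS.TRAIN = ("boardetect_train",)
con.DATASETS.TEST = ("boardetect_val",)
con.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
con.MODEL.ROI_HEADS.NUM_CLASSES = 3  # HINDI, ENGLISH, OTHER
con.SOLVER.IMS_PER_BATCH = 4
con.SOLVER.BASE_LR = 0.0125  # pick a good learning rate
con.SOLVER.MAX_ITER = 1500  # number of training iterations
os.makedirs(con.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(con)
trainer.resume_or_load(resume=False)
trainer.train()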
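
To judge whether 1500 iterations is reasonable, it helps to translate the solver settings into passes over the data. The following is a rough sketch: the helper `approx_epochs` is hypothetical, and the dataset size of 500 images is a made-up number, since the article does not state how many training images there are.

```python
def approx_epochs(max_iter, ims_per_batch, num_train_images):
    """Approximate number of passes over the training set."""
    return max_iter * ims_per_batch / num_train_images


# 1500 iterations x 4 images per iteration = 6000 images seen in total.
# num_train_images=500 is a made-up dataset size for illustration.
print(approx_epochs(1500, 4, 500))  # 12.0
```

If your dataset is much larger or smaller, you may want to scale the iteration count accordingly.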

Step-5: Example using the Trained Model

Use the code below to load the trained weights and run the model on two random validation images.

from detectron2.utils.visualizer import ColorMode

# load the weights saved by the training step
con.MODEL.WEIGHTS = os.path.join(con.OUTPUT_DIR, "model_final.pth")
con.DATASETS.TEST = ("boardetect_val", )
predictor = DefaultPredictor(con)
dataset = get_board("Text_Detection_Dataset_COCO_Format/val")
for d in random.sample(dataset, 2):
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)
    # OpenCV loads images as BGR; the Visualizer expects RGB
    v = Visualizer(im[:, :, ::-1],
                   metadata=board_metadata,
                   instance_mode=ColorMode.IMAGE)
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(v.get_image()[:, :, ::-1])
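
Besides drawing them, you can inspect the predictions directly: `outputs["instances"]` carries `pred_boxes`, `scores`, and `pred_classes`. Below is a minimal sketch of turning class indices and scores into readable labels. It uses plain Python lists as stand-ins for the tensors (with real outputs you could call `.tolist()` on `pred_classes` and `scores`), and both the `summarize` helper and the sample values are made up for illustration.

```python
CLASSES = ["HINDI", "ENGLISH", "OTHER"]


def summarize(pred_classes, scores, threshold=0.5):
    """Return (class name, score) pairs whose score meets the threshold."""
    return [
        (CLASSES[c], s)
        for c, s in zip(pred_classes, scores)
        if s >= threshold
    ]


# Made-up predictions for one image.
print(summarize([0, 1, 2], [0.92, 0.40, 0.75]))
# [('HINDI', 0.92), ('OTHER', 0.75)]
```

A similar cutoff can be applied on the model side through the `MODEL.ROI_HEADS.SCORE_THRESH_TEST` config key.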


example images

Step-6: Evaluation of the Trained Model

Use the following code to evaluate the trained model on the validation set.

from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator("boardetect_val", con, False, output_dir="/output/")
val_loader = build_detection_test_loader(con, "boardetect_val")
inference_on_dataset(predictor.model, val_loader, evaluator)
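
The COCO-style average precision reported by the evaluator is built on Intersection over Union (IoU): a predicted box counts as a match only if its IoU with a ground-truth box reaches a threshold, and the headline AP averages over thresholds from 0.50 to 0.95. Here is a minimal sketch of IoU for two boxes in corner format; the helper `iou` is written for illustration and is not the evaluator's own implementation.

```python
def iou(a, b):
    """IoU of two boxes given as [x0, y0, x1, y1]."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes overlapping by half: intersection 50, union 150.
print(iou([0, 0, 10, 10], [5, 0, 15, 10]))
```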


Frequently Asked Questions

Differentiate between Object Recognition and Object Detection.

Object recognition identifies which object categories are present in an image. Object detection goes further: it finds each instance of an object and locates it, typically with a bounding box. This allows multiple objects to be identified and located within the same image.

How will you learn Object Detection?

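Object detection can be approached with classical machine learning or with deep learning. The classical machine learning approach requires features to be defined by hand using various methods and then passed to a classification technique, whereas deep learning models learn the features directly from the data.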

What are the several types of models Detectron2 offers?

Detectron2 offers several types of models, including Box, Mask, Keypoint, DensePose, and Semantic Segmentation models.


Overall, we understood the various concepts of Detectron2 for Object Detection. This includes its introduction, its applications, and its implementation.

We hope the above discussion helped you understand Detectron2 for Object Detection more clearly and that you can use it for future reference whenever needed. To learn more about Object Detection, you can refer to blogs on Object Detection using Deep Learning, Feature Detection with Haar Cascade, and Detecting Objects with Xpath in Katalon.


Happy Coding!
