Important: This guide is based on the official tutorial and may overlap with this tutorial from the Roboflow team.

Introduction

In this notebook, we use the TensorFlow 2 Object Detection API to train a model on your own dataset.

We will take the following steps to train a model from the TensorFlow 2 Detection Model Zoo on our custom data:

  • Install TensorFlow2 Object Detection Dependencies
  • Download Custom TensorFlow2 Object Detection Dataset
  • Write Custom TensorFlow2 Object Detection Training Configuration
  • Train Custom TensorFlow2 Object Detection Model
  • Export Custom TensorFlow2 Object Detection Weights
  • Use Trained TensorFlow2 Object Detection For Inference on Test Images

When you are done, you will have a custom detector that you can use to run inference on your own images.

Install TensorFlow2 Object Detection Dependencies

To install the TensorFlow2 Object Detection API on Google Colab, run the following steps.

import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
  while "models" in pathlib.Path.cwd().parts:
    os.chdir('..')
elif not pathlib.Path('models').exists():
  !git clone --depth 1 https://github.com/tensorflow/models
%%bash
# Install the Object Detection API
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install . --quiet

Run the TF2 model builder tests to make sure our environment is up and running.

#run model builder test to ensure everything is up and running
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py

If successful, you should see the following output at the end of the cell execution printouts:

[ RUN      ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
[       OK ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
----------------------------------------------------------------------
Ran 20 tests in 52.705s

OK (skipped=1)

To install on a custom machine, see the official guide: Installation
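For reference, here is a minimal sketch of the same setup on a local machine, assuming git, protoc, and pip are already available on your PATH (these mirror the Colab cells above):

git clone --depth 1 https://github.com/tensorflow/models
cd models/research
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .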

Download the data:

For this task we are going to use the Oxford-IIIT Pet dataset. It contains 37 pet categories with roughly 200 images per class. Each annotation is a tight bounding box (ROI) around the head of the animal.

#Download the Oxford-IIIT Pet dataset (images and annotations)
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
!tar -xf annotations.tar.gz
!tar -xf images.tar.gz

Before training, let us create a folder /content/workspace/.

It is within this workspace that we will store all our training set-ups; it will contain all files related to our model training.

#We will store all the required files in the workspace folder
!mkdir /content/workspace/
!mkdir /content/workspace/images/ # store images
!mkdir /content/workspace/annotations/ # store xml annotation files
!mkdir /content/workspace/images/train # train images
!mkdir /content/workspace/images/test # test images
!mkdir /content/workspace/annotations/train # train annotations
!mkdir /content/workspace/annotations/test # test annotations
!mkdir /content/workspace/data/ # directory to store the tf_records & the label_map
import os
import pathlib
import logging
import re
import shutil
import glob
import pandas as pd
import xml.etree.ElementTree as ET
from tqdm import tqdm
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import tarfile
import time
from collections import defaultdict
from io import BytesIO
from matplotlib import pyplot as plt
from PIL import Image
from IPython.display import display
import numpy as np


IMAGE_DIR = "/content/images"
ANNOT_DIR = "/content/annotations/xmls"

pd.set_option("display.max_colwidth", None)
os.chdir("/content/")

%load_ext tensorboard
%load_ext autoreload
%matplotlib inline
%autoreload 2
import tensorflow.compat.v1 as tf1
import contextlib2
import tensorflow as tf

from object_detection.utils import dataset_util
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import ops as utils_ops
from object_detection.utils import colab_utils
from object_detection.builders import model_builder
from object_detection.dataset_tools import tf_record_creation_util


# Enable GPU dynamic memory allocation
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

Prepare Tensorflow 2 Object Detection Training Data

The TensorFlow Object Detection API expects the data to be in TFRecord format. In this part we are going to convert our data from Pascal-VOC format into TFRecords.

To do this we will implement the following steps:

  • Iterate over all the annotations and partition them into train and test datasets. The train annotations and images will be saved to /content/workspace/annotations/train & /content/workspace/images/train respectively. Similarly, the test data will be saved to /content/workspace/annotations/test & /content/workspace/images/test.

  • Convert all the *.xml annotation files into a single Pandas DataFrame object.

  • Create a TensorFlow 2 Object Detection format label-map, which will be used in training/evaluating the model.

  • Use this Pandas DataFrame to create TFRecords for the train and test datasets. The TFRecords will be saved to /content/workspace/data/.

1. Partition the Dataset:

If we look at the data saved in /content/images/ & /content/annotations/ we will see that not all images have a corresponding annotation, and that the images and annotations are saved as {filename}.jpg & {filename}.xml respectively.

We will first split the images into train and test datasets using sklearn's train_test_split. Then we will check for the corresponding annotation for each image. If the annotation file exists, we will copy the image and annotation into their respective directories under /content/workspace.

all_images    = os.listdir(IMAGE_DIR)

#Split the images into train and test datasets
train_images, test_images = train_test_split(all_images, test_size=0.2, random_state = 123) 

#Grab the list of all the annotations for the train and test images
#Some annotations may not exist; we will filter these out in the next cell
train_xmls = [f.split(".")[0] + ".xml" for f in train_images]
test_xmls  = [f.split(".")[0] + ".xml" for f in test_images ]
def move_file(fileList : list, src: str, dest: str):
    """
    Copies the files in fileList from src to dest,
    if the file exists.

    Args:
        fileList: List containing all the files present in the src directory.
        src     : source directory for the files.
        dest    : destination to copy the files in fileList to.
    """
    for f in tqdm(fileList):
        fileName = os.path.join(src, f)
        #Check if the file exists; if it does, copy it from src to dest
        if os.path.exists(fileName):
            shutil.copy2(src=fileName, dst=os.path.join(dest, f))



#Move images and annotations to workspace directory
move_file(train_images, src=IMAGE_DIR, dest="/content/workspace/images/train/")
move_file(test_images,  src=IMAGE_DIR, dest="/content/workspace/images/test/")

move_file(train_xmls, src=ANNOT_DIR, dest="/content/workspace/annotations/train/")
move_file(test_xmls,  src=ANNOT_DIR, dest="/content/workspace/annotations/test/")
100%|██████████| 5914/5914 [00:02<00:00, 2955.61it/s]
100%|██████████| 1479/1479 [00:00<00:00, 2300.28it/s]
100%|██████████| 5914/5914 [00:00<00:00, 15798.62it/s]
100%|██████████| 1479/1479 [00:00<00:00, 18011.92it/s]

2. Create Pandas DataFrame Object:

Now that we have partitioned our dataset and the images/annotations are present in their respective directories, we will create a pandas DataFrame from the *.xml files. The DataFrame will contain the following information:

  • filename (str): Path to the image file.
  • width (float/int): Absolute width of the image.
  • height (float/int): Absolute height of the image.
  • labels (str): The class of the object present in the bounding box.
  • xmin (float/int): Absolute xmin co-ordinate for the bounding box.
  • ymin (float/int): Absolute ymin co-ordinate for the bounding box.
  • xmax (float/int): Absolute xmax co-ordinate for the bounding box.
  • ymax (float/int): Absolute ymax co-ordinate for the bounding box.
  • encoded_label (int): The integer label for the object in the bounding box. 0 always represents the background class.
#Regular expression to extract the class of the object from the filename.
exp = r"/([^/]+)_\d+.jpg$"
exp = re.compile(exp)

#sklearn.LabelEncoder will be used to convert the class of the object into integer format.
le  = LabelEncoder()

def xml2pandas(annot_dir):
    """
    Fn converts the xml files into a pandas dataframe.

    Args:
        annot_dir: Directory where all the *.xml annotation files are stored
    """
    xml_list = []
    for xml_file in tqdm(glob.glob(annot_dir + '/*.xml')):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (
                root.find('filename').text,
                int(root.find('size')[0].text),
                int(root.find('size')[1].text),
                member[0].text,
                int(member[4][0].text),
                int(member[4][1].text),
                int(member[4][2].text),
                int(member[4][3].text)
                )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'labels', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    logging.info("DataFrame generated!")
    return xml_df


def process_data(annotDir, imageDir, image_set="train"):
    """
    Fn creates a pandas DataFrame object from the annotations in annotDir
    and the images in imageDir. It also extracts the name of the class from the
    filename and converts it into integer labels starting from 1, as 0 is always
    reserved for the background class.

    Args:
        annotDir  : directory where the *.xml annotation files are stored.
        imageDir  : directory where all the images are stored.
        image_set : one of `train` or `test`; used when converting 
                    the class labels into integer format.
    """
    data = xml2pandas(annotDir)
    #modify the filename to point to the original filename
    data.filename = [os.path.join(imageDir, fname) for fname in data.filename.values]
    #extract the class labels from the filenames
    data["labels"] = [exp.search(data.filename[idx]).group(1).lower() for idx in range(len(data))]
    #encode the labels into integers starting from 1
    if image_set == "train" :
        data["encoded_label"] = le.fit_transform(data.labels) + 1
    elif image_set == "test" :
        data["encoded_label"] = le.transform(data.labels) + 1
    
    return data
TRAIN_IMAGE_DIR = "/content/workspace/images/train/"
TEST_IMAGE_DIR  = "/content/workspace/images/test/"
TRAIN_ANNOTATION_DIR = "/content/workspace/annotations/train/"
TEST_ANNOTATION_DIR  = "/content/workspace/annotations/test/"

#Create pandas datafame from the *.xml files
train_data = process_data(TRAIN_ANNOTATION_DIR, TRAIN_IMAGE_DIR, "train")
test_data  = process_data(TEST_ANNOTATION_DIR, TEST_IMAGE_DIR, "test")
100%|██████████| 2982/2982 [00:08<00:00, 356.66it/s]
100%|██████████| 706/706 [00:01<00:00, 704.56it/s]
#Cross check for missing files
for f in train_data.filename:
    if not os.path.exists(f):
        #remove the missing file
        print(f"{f} is missing in train_data")
        train_data = train_data[train_data.filename != f]
        train_data.reset_index(inplace=True, drop=True)


for f in test_data.filename:
    if not os.path.exists(f):
        #remove the missing file
        print(f"{f} missing in test_data")
        test_data = test_data[test_data.filename != f]
        test_data.reset_index(inplace=True, drop=True)

Our datasets are going to look something like this:

The train_data :

filename width height labels xmin ymin xmax ymax encoded_label
0 /content/workspace/images/train/Persian_191.jpg 500 333 persian 229 36 315 132 24
1 /content/workspace/images/train/beagle_18.jpg 336 500 beagle 43 31 291 204 5
2 /content/workspace/images/train/Sphynx_192.jpg 500 333 sphynx 334 20 412 109 34
3 /content/workspace/images/train/boxer_181.jpg 500 333 boxer 259 8 362 112 9
4 /content/workspace/images/train/Birman_126.jpg 334 500 birman 78 135 180 239 7

The test_data :

filename width height labels xmin ymin xmax ymax encoded_label
0 /content/workspace/images/test/Siamese_131.jpg 500 423 siamese 14 16 385 348 33
1 /content/workspace/images/test/Bombay_115.jpg 600 428 bombay 44 87 234 335 8
2 /content/workspace/images/test/Abyssinian_140.jpg 500 333 abyssinian 231 87 323 154 1
3 /content/workspace/images/test/Russian_Blue_124.jpg 500 375 russian_blue 29 25 164 159 28
4 /content/workspace/images/test/basset_hound_179.jpg 500 375 basset_hound 152 162 340 317 4

3. Create Label Map:

TensorFlow requires each dataset to have a label map associated with it. This label map defines a mapping from string class names to integer class IDs. The label map should be a StringIntLabelMap text protobuf. Label map files have the extension .pbtxt, and we will place ours under /content/workspace/data along with the TFRecord files which we will create in the next step.

unique_labels  = list(train_data.labels.unique())
integer_labels = le.transform(unique_labels) + 1

label_dict = {unique_labels[i] : integer_labels[i] for i in range(len(unique_labels))}

label_map = "/content/workspace/data/label_map.pbtxt"
categories = train_data.labels.unique()
categories.sort()

#Write one `item` entry per class.
#Open in 'w' mode so that re-running the cell does not append duplicate entries.
with open(label_map, 'w') as f:
    for name in categories:
        f.write("item {\n")
        f.write(f"  id: {label_dict[name]}\n")
        f.write(f"  name: '{name}'\n")
        f.write("}\n\n")

Our label_map.pbtxt file will look like this:

item {
  id: 1
  name: 'abyssinian'
}

item {
  id: 2
  name: 'american_bulldog'
}

item {
  id: 3
  name: 'american_pit_bull_terrier'
}

item {
  id: 4
  name: 'basset_hound'
}
...
...

The label_map.pbtxt file has been placed under /content/workspace/data/label_map.pbtxt
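As an optional sanity check (a quick sketch using the label_map_util helper we imported earlier), we can parse the label map back and confirm that all 37 classes were written:

#Optional sanity check: parse the label map back and count the classes
category_index_check = label_map_util.create_category_index_from_labelmap(label_map, use_display_name=True)
print(len(category_index_check))  #expect 37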

4. Create TensorFlow Records:

In this step we will convert the annotations in our pandas DataFrame objects into the TFRecord format.

For every example in our dataset, we should have the following information:

  • An RGB image for the dataset encoded as jpeg or png.
  • Bounding box coordinates for each object (with origin in the top left corner), defined by 4 floating point numbers [ymin, xmin, ymax, xmax].
  • The class of the object in the bounding box.

Note: For the bounding boxes, the normalized coordinates (x / width, y / height) are stored in the TFRecord dataset. For example, an xmin of 229 in a 500-pixel-wide image is stored as 229 / 500 = 0.458.

Since our dataset has a fairly large number of annotations, we will shard it into multiple files: instead of writing all tf.Example protos to a single file, we will spread the dataset over several TFRecord shards.

Our dataset is going to look something like this:

/{directory_path}/dataset.record-00000-00010
/{directory_path}/dataset.record-00001-00010
...
/{directory_path}/dataset.record-00009-00010

Our train dataset is going to be stored as :

/content/workspace/data/train.record-00000-of-00010
/content/workspace/data/train.record-00001-of-00010
...
/content/workspace/data/train.record-00009-of-00010

Similarly for the test dataset:

/content/workspace/data/test.record-00000-of-00010
/content/workspace/data/test.record-00001-of-00010
...
/content/workspace/data/test.record-00009-of-00010
def create_tf_example(fname, data):
    """
    Creates a tf.Example proto from a single image
    from the given data

    Args:
        fname: filename of a single image from data.
        data : a pandas dataframe object in the format 
               specified in step 2. 

    Returns:
        example: The created tf.Example.
    """
    curr_data = data.loc[data.filename == fname]
    
    filename = fname.encode('utf8') # Filename of the image
    height = curr_data["height"].values[0] # Image height
    width = curr_data["width"].values[0] # Image width
    
    image_format = b'jpeg' # b'jpeg' or b'png'
    
    # List of normalized left x coordinates in bounding box (1 per box).
    xmins = list(curr_data["xmin"].values / width) 
    # List of normalized right x coordinates in bounding box (1 per box).
    xmaxs = list(curr_data["xmax"].values / width) 
    # List of normalized top y coordinates in bounding box (1 per box).
    ymins = list(curr_data["ymin"].values / height)
    # List of normalized bottom y coordinates in bounding box (1 per box).
    ymaxs = list(curr_data["ymax"].values / height)
    
    # List of string class name of bounding box (1 per box)
    classes_text = list(curr_data["labels"].values)
    classes_text = [text.encode('utf8') for text in classes_text]
    
    # List of integer class id of bounding box (1 per box)
    classes = list(curr_data["encoded_label"].values) 

    with tf1.gfile.GFile(filename, 'rb') as fid:
        encoded_image_data = fid.read() # Encoded image bytes

    features = tf1.train.Example(features=tf1.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
      }))
    
    return features


def create_records(output_path, data, shards=10):
    """
    Fn iterates over all the annotations in dataset and creates a 
    sharded TFRecord dataset and additionally saves the sharded TFRecord dataset
    to output path.

    Args:
        output_path: Path where to save the dataset
        data       : A pandas Dataframe object as specified in step-2.
        shards     : Number of shards over which to save the dataset.
                     The dataset is going to be saved across `shards` files.
    """
    fnames = list(data.filename.unique())

    with contextlib2.ExitStack() as tf_record_close_stack:
        output_tfrecords = tf_record_creation_util.open_sharded_output_tfrecords(
            tf_record_close_stack, output_path, shards)
        #enumerate over all the unique images present in the dataset
        #and create a tf.Example proto for the particular annotations.
        for index, fname in enumerate(fnames):
            tf_example = create_tf_example(fname, data)
            output_shard_index = index % shards
            output_tfrecords[output_shard_index].write(tf_example.SerializeToString())
print("Creating TFRecords ..... ", end='')
start_time = time.time()

create_records("/content/workspace/data/train.record", data=train_data)
create_records("/content/workspace/data/test.record",  data=test_data )

end_time = time.time()
elapsed_time = end_time - start_time
print('Done! Took {} seconds'.format(elapsed_time))
Creating TFRecords ..... Done! Took 4.879507780075073 seconds

Our dataset is now prepared for training using a model from the TensorFlow 2 Detection Model Zoo.
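As an optional check before training (a small sketch using the standard tf.data and tf.train.Example APIs), we can read one serialized example back from the train shards:

#Optional sanity check: read one example back from the sharded train TFRecords
raw_dataset = tf.data.TFRecordDataset(tf.io.gfile.glob("/content/workspace/data/train.record-*"))
for raw_record in raw_dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    #print the stored filename and class names for this image
    print(example.features.feature['image/filename'].bytes_list.value)
    print(example.features.feature['image/object/class/text'].bytes_list.value)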

The directory structure for the workspace should look something like this at this stage:

/content/workspace
├── annotations
│   ├── test [706 entries exceeds filelimit, not opening dir]
│   └── train [2982 entries exceeds filelimit, not opening dir]
├── data
│   ├── label_map.pbtxt
│   ├── test.record-00000-of-00010
│   ├── test.record-00001-of-00010
│   ├── test.record-00002-of-00010
│   ├── test.record-00003-of-00010
│   ├── test.record-00004-of-00010
│   ├── test.record-00005-of-00010
│   ├── test.record-00006-of-00010
│   ├── test.record-00007-of-00010
│   ├── test.record-00008-of-00010
│   ├── test.record-00009-of-00010
│   ├── train.record-00000-of-00010
│   ├── train.record-00001-of-00010
│   ├── train.record-00002-of-00010
│   ├── train.record-00003-of-00010
│   ├── train.record-00004-of-00010
│   ├── train.record-00005-of-00010
│   ├── train.record-00006-of-00010
│   ├── train.record-00007-of-00010
│   ├── train.record-00008-of-00010
│   └── train.record-00009-of-00010
└── images
    ├── test [1479 entries exceeds filelimit, not opening dir]
    └── train [5914 entries exceeds filelimit, not opening dir]

7 directories, 21 files

Configure Custom TensorFlow2 Object Detection Training Configuration

In this section we will download a pretrained model from the TF2 OD Model Zoo and set up our training configuration.

In this tutorial we are going to use EfficientDet-D0, the lightweight, smallest member of the state-of-the-art EfficientDet family.

We will create a directory called pre_trained_models in our workspace folder.

We will download the latest pre-trained network for the model we wish to use. This can be found in the TensorFlow 2 Detection Model Zoo.

Once the *.tar.gz file has been downloaded, we will extract its contents into /content/workspace/pre_trained_models/.

# Download the latest-pretrained weights for the efficientdet_d0 model and the config file

#LINK : http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz
model_name = "efficientdet_d0_coco17_tpu-32"
model = "efficientdet_d0_coco17_tpu-32.tar.gz"

os.makedirs("/content/workspace/pre_trained_models/", exist_ok=True)
download_tar = f"http://download.tensorflow.org/models/object_detection/tf2/20200711/{model}"
!wget {download_tar} -P "/content/workspace/pre_trained_models/"

tar = tarfile.open(f"/content/workspace/pre_trained_models/{model}")
tar.extractall(path="/content/workspace/pre_trained_models/")
tar.close()

os.unlink(f"/content/workspace/pre_trained_models/{model}")

The directory structure for the workspace should look something like this at this stage:

/content/workspace
├── annotations
│   ├── test [706 entries exceeds filelimit, not opening dir]
│   └── train [2982 entries exceeds filelimit, not opening dir]
├── data
│   ├── label_map.pbtxt
│   ├── test.record-00000-of-00010
│   ├── test.record-00001-of-00010
│   ├── test.record-00002-of-00010
│   ├── test.record-00003-of-00010
│   ├── test.record-00004-of-00010
│   ├── test.record-00005-of-00010
│   ├── test.record-00006-of-00010
│   ├── test.record-00007-of-00010
│   ├── test.record-00008-of-00010
│   ├── test.record-00009-of-00010
│   ├── train.record-00000-of-00010
│   ├── train.record-00001-of-00010
│   ├── train.record-00002-of-00010
│   ├── train.record-00003-of-00010
│   ├── train.record-00004-of-00010
│   ├── train.record-00005-of-00010
│   ├── train.record-00006-of-00010
│   ├── train.record-00007-of-00010
│   ├── train.record-00008-of-00010
│   └── train.record-00009-of-00010
├── images
│   ├── test [1479 entries exceeds filelimit, not opening dir]
│   └── train [5914 entries exceeds filelimit, not opening dir]
└── pre_trained_models
    └── efficientdet_d0_coco17_tpu-32
        ├── checkpoint
        │   ├── checkpoint
        │   ├── ckpt-0.data-00000-of-00001
        │   └── ckpt-0.index
        ├── pipeline.config
        └── saved_model
            ├── assets
            ├── saved_model.pb
            └── variables
                ├── variables.data-00000-of-00001
                └── variables.index

13 directories, 28 files

Now that we have downloaded and extracted our pre-trained model, let's create a directory for our training job. Under /content/workspace/, create a new directory named models; this is where we will store all the configurations, model checkpoints, and logs for our custom-trained model.

Under the /content/workspace/models/ directory, create a directory named efficientdet_d0_coco17_tpu-32 and copy the /content/workspace/pre_trained_models/efficientdet_d0_coco17_tpu-32/pipeline.config file into it.

os.makedirs("/content/workspace/models/", exist_ok=True)
os.makedirs(f"/content/workspace/models/{model_name}", exist_ok=True)

config_path = f"/content/workspace/pre_trained_models/{model_name}/pipeline.config"
shutil.copy2(config_path, f"/content/workspace/models/{model_name}")
'/content/workspace/models/efficientdet_d0_coco17_tpu-32/pipeline.config'

Each model has a model_name, a pipeline.config file, and a pretrained checkpoint.

The pipeline.config file is a shell of a training configuration specific to each model type, provided by the authors of the TF2 OD repository.

The pretrained_checkpoint is the location of a pretrained weights file saved from when the object detection model was pretrained on the COCO dataset.

We will start from these weights and then fine-tune on our particular custom dataset. By using pretrained weights, our model does not need to start from square one in identifying which features might be useful for object detection.

We will map our training data files to variables for use in our computer vision training pipeline configuration.

We will now edit /content/workspace/models/efficientdet_d0_coco17_tpu-32/pipeline.config to point to our custom data and the pretrained checkpoint, and we will also specify some training parameters.

test_record_fname = "/content/workspace/data/test.record-?????-of-00010"
train_record_fname = "/content/workspace/data/train.record-?????-of-00010"

#Path to the TensorFlow Object Detection format label_map
label_map_pbtxt_fname = "/content/workspace/data/label_map.pbtxt"

#Path to the pipeline.config file
config_path = f"/content/workspace/models/{model_name}/pipeline.config"

#Path to the pretrained model checkpoints 
fine_tune = f"/content/workspace/pre_trained_models/{model_name}/checkpoint/ckpt-0"

#if you can fit a large batch in memory, it may speed up your training
batch_size = 16
#The more steps, the longer the training
epochs  = 30
num_steps =  len(train_data) // batch_size * epochs

model_dir = f"/content/workspace/models/{model_name}"


def get_num_classes(pbtxt_fname):
    """Get total number of classes from label_map.pbtxt file"""
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())


num_classes = get_num_classes(label_map_pbtxt_fname)

print("CUSTOM CONFIGURATION PARAMETERS : ")
print("-"*40)
print("Config Path: ", config_path)
print("Checkpoint Path: ", fine_tune)
print("Label Map: ", label_map_pbtxt_fname)
print("Train TFRecords: ", train_record_fname)
print("Test TFRecords: ", test_record_fname)
print("Total Steps: ", num_steps)
print("Num classes: ", num_classes)
print("-"*40)
CUSTOM CONFIGURATION PARAMETERS : 
----------------------------------------
Config Path:  /content/workspace/models/efficientdet_d0_coco17_tpu-32/pipeline.config
Checkpoint Path:  /content/workspace/pre_trained_models/efficientdet_d0_coco17_tpu-32/checkpoint/ckpt-0
Label Map:  /content/workspace/data/label_map.pbtxt
Train TFRecords:  /content/workspace/data/train.record-?????-of-00010
Test TFRecords:  /content/workspace/data/test.record-?????-of-00010
Total Steps:  5580
Num classes:  37
----------------------------------------
with open(config_path) as f:
    s = f.read()

with open(config_path, 'w') as f:
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"', 'fine_tune_checkpoint: "{}"'.format(fine_tune), s)
    # tfrecord files train and test
    s = re.sub('(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub('(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)
    # label_map_path
    s = re.sub('label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)
    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+','batch_size: {}'.format(batch_size), s)
    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+', 'num_steps: {}'.format(num_steps), s)
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+','num_classes: {}'.format(num_classes), s)
    #fine-tune checkpoint type
    s = re.sub('fine_tune_checkpoint_type: "classification"', 'fine_tune_checkpoint_type: "{}"'.format('detection'), s)
        
    f.write(s)

The modified config file will be saved as /content/workspace/models/efficientdet_d0_coco17_tpu-32/pipeline.config
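To confirm the substitutions took effect, we can optionally re-parse the edited file with the config_util helper we imported earlier. A quick sketch (note that EfficientDet is defined as an `ssd` model inside the TF2 OD config schema):

#Optional sanity check: re-parse the edited pipeline.config
configs = config_util.get_configs_from_pipeline_file(config_path)
print(configs['model'].ssd.num_classes)    #expect 37
print(configs['train_config'].batch_size)  #expect 16
print(configs['train_config'].fine_tune_checkpoint)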

Let's check the directory structure:

/content/workspace
├── annotations
│   ├── test [706 entries exceeds filelimit, not opening dir]
│   └── train [2982 entries exceeds filelimit, not opening dir]
├── data
│   ├── label_map.pbtxt
│   ├── test.record-00000-of-00010
│   ├── test.record-00001-of-00010
│   ├── test.record-00002-of-00010
│   ├── test.record-00003-of-00010
│   ├── test.record-00004-of-00010
│   ├── test.record-00005-of-00010
│   ├── test.record-00006-of-00010
│   ├── test.record-00007-of-00010
│   ├── test.record-00008-of-00010
│   ├── test.record-00009-of-00010
│   ├── train.record-00000-of-00010
│   ├── train.record-00001-of-00010
│   ├── train.record-00002-of-00010
│   ├── train.record-00003-of-00010
│   ├── train.record-00004-of-00010
│   ├── train.record-00005-of-00010
│   ├── train.record-00006-of-00010
│   ├── train.record-00007-of-00010
│   ├── train.record-00008-of-00010
│   └── train.record-00009-of-00010
├── images
│   ├── test [1479 entries exceeds filelimit, not opening dir]
│   └── train [5914 entries exceeds filelimit, not opening dir]
├── models
│   └── efficientdet_d0_coco17_tpu-32
│       └── pipeline.config
└── pre_trained_models
    └── efficientdet_d0_coco17_tpu-32
        ├── checkpoint
        │   ├── checkpoint
        │   ├── ckpt-0.data-00000-of-00001
        │   └── ckpt-0.index
        ├── pipeline.config
        └── saved_model
            ├── assets
            ├── saved_model.pb
            └── variables
                ├── variables.data-00000-of-00001
                └── variables.index

15 directories, 29 files

Train Custom TF2 Object Detector

To initiate a new training job, we need to run the script /content/models/research/object_detection/model_main_tf2.py

  • config_path: path to the configuration file defined above when writing the custom training configuration.

  • model_dir: the location where TensorBoard logs and saved model checkpoints will be written.

!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={config_path} \
    --num_train_steps={num_steps} \
    --model_dir={model_dir} \
    --alsologtostderr

To evaluate our model on the COCO evaluation metrics, we run the same script, /content/models/research/object_detection/model_main_tf2.py, this time with a --checkpoint_dir argument.

Note: This process automatically evaluates the model on the latest checkpoint that the training job generates, so we can also run this script in the background while the model keeps training; as new checkpoints are generated, the script will automatically evaluate them on the COCO metrics. A sketch of running it in the background follows the command below.

!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={config_path} \
    --model_dir={model_dir} \
    --checkpoint_dir={model_dir} \
    --alsologtostderr \
    --eval_timeout=10
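
As mentioned in the note above, one way to keep evaluation running alongside training is to detach the evaluation job with nohup. This is only a sketch; the log path /content/eval_log.txt is an arbitrary choice and not part of the official tutorial:

#Run the evaluation job in the background, writing its output to a log file.
#Without --eval_timeout the script falls back to its default and keeps waiting
#longer for new checkpoints.
!nohup python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={config_path} \
    --model_dir={model_dir} \
    --checkpoint_dir={model_dir} \
    > /content/eval_log.txt 2>&1 &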

Monitor Training Job Progress using TensorBoard:

We can use either of the following two commands:

To open in a terminal:

tensorboard --logdir "/content/workspace/models/efficientdet_d0_coco17_tpu-32/"

For a Jupyter environment:

%load_ext tensorboard
%tensorboard --logdir "/content/workspace/models/efficientdet_d0_coco17_tpu-32/"

We will have logs that look similar to this:

[TensorBoard training logs]

Exporting a Trained Inference Graph

Once your training job is complete, you need to export the newly trained inference graph, which will later be used to perform object detection. This can be done as follows:

%ls "/content/workspace/models/efficientdet_d0_coco17_tpu-32/"
checkpoint                  ckpt-3.index                ckpt-6.index
ckpt-1.data-00000-of-00001  ckpt-4.data-00000-of-00001  eval/
ckpt-1.index                ckpt-4.index                pipeline.config
ckpt-2.data-00000-of-00001  ckpt-5.data-00000-of-00001  train/
ckpt-2.index                ckpt-5.index
ckpt-3.data-00000-of-00001  ckpt-6.data-00000-of-00001
# Using the /content/models/research/object_detection/exporter_main_v2.py script provided by the TensorFlow team,
# we will export our model as an inference graph into this newly created folder
os.makedirs(f"/content/workspace/exported_models/{model_name}", exist_ok=True)

#path to save the exporter inference graph
output_directory = f"/content/workspace/exported_models/{model_name}/"

#path to trained model checkpoints
checkpoint_dir = f"/content/workspace/models/{model_name}/"

#run script to export model weights
!python /content/models/research/object_detection/exporter_main_v2.py \
    --trained_checkpoint_dir {checkpoint_dir} \
    --output_directory {output_directory} \
    --pipeline_config_path {config_path}

After the whole process (training/evaluating/exporting) we are going to have the following files in the following folders:

/content/workspace
├── annotations
│   ├── test [706 entries exceeds filelimit, not opening dir]
│   └── train [2982 entries exceeds filelimit, not opening dir]
├── data [21 entries exceeds filelimit, not opening dir]
├── exported_models
│   └── efficientdet_d0_coco17_tpu-32
│       ├── checkpoint
│       │   ├── checkpoint
│       │   ├── ckpt-0.data-00000-of-00001
│       │   └── ckpt-0.index
│       ├── pipeline.config
│       └── saved_model
│           ├── assets
│           ├── saved_model.pb
│           └── variables
│               ├── variables.data-00000-of-00001
│               └── variables.index
├── images
│   ├── test [1479 entries exceeds filelimit, not opening dir]
│   └── train [5914 entries exceeds filelimit, not opening dir]
├── models
│   └── efficientdet_d0_coco17_tpu-32
│       ├── checkpoint
│       ├── ckpt-1.data-00000-of-00001
│       ├── ckpt-1.index
│       ├── ckpt-2.data-00000-of-00001
│       ├── ckpt-2.index
│       ├── ckpt-3.data-00000-of-00001
│       ├── ckpt-3.index
│       ├── ckpt-4.data-00000-of-00001
│       ├── ckpt-4.index
│       ├── ckpt-5.data-00000-of-00001
│       ├── ckpt-5.index
│       ├── ckpt-6.data-00000-of-00001
│       ├── ckpt-6.index
│       ├── eval
│       │   └── events.out.tfevents.1604410855.57f7c64792d3.12442.7946.v2
│       ├── pipeline.config
│       └── train
│           └── events.out.tfevents.1604404487.57f7c64792d3.556.10071.v2
└── pre_trained_models
    └── efficientdet_d0_coco17_tpu-32
        ├── checkpoint
        │   ├── checkpoint
        │   ├── ckpt-0.data-00000-of-00001
        │   └── ckpt-0.index
        ├── pipeline.config
        └── saved_model
            ├── assets
            ├── saved_model.pb
            └── variables
                ├── variables.data-00000-of-00001
                └── variables.index

23 directories, 30 files

In the next part we are going to use this exported graph to run inference on custom images.

Run Inference on Test Images with Custom TensorFlow2 Object Detector

In this section we will run inference on images using the exported graph of our custom TensorFlow2 object detector.

To run inference we will create a few helper functions first:

def load_image_into_numpy_array(path):
  """Load an image from file into a numpy array.

  Puts image into numpy array to feed into tensorflow graph.
  Note that by convention we put it into a numpy array with shape
  (height, width, channels), where channels=3 for RGB.

  Args:
    path: the file path to the image

  Returns:
    uint8 numpy array with shape (img_height, img_width, 3)
  """
  img_data = tf.io.gfile.GFile(path, 'rb').read()
  image = Image.open(BytesIO(img_data))
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)


def load_model(model_dir):
    model_dir = pathlib.Path(model_dir)/"saved_model"
    model = tf.saved_model.load(str(model_dir))
    return model

Load in the category_index, a dictionary mapping classes to their integer labels, and load the custom TensorFlow2 object detector from the exported graph.

PATH_TO_LABELS = "/content/workspace/data/label_map.pbtxt"
#generate a category index dictionary from the label map
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

#path to the images in the test directory; we will run inference on these images
PATH_TO_TEST_IMAGES_DIR = pathlib.Path("/content/workspace/images/test/")
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))

exported_model_path = f"/content/workspace/exported_models/{model_name}"

#Load model from the exported model graph
print(f'Loading model from {exported_model_path} ...', end='')
start_time = time.time()
detection_model = load_model(exported_model_path)
end_time = time.time()
elapsed_time = end_time - start_time
print('Done! Took {} seconds'.format(elapsed_time))
Loading model from /content/workspace/exported_models/efficientdet_d0_coco17_tpu-32 ...Done! Took 24.72133469581604 seconds
def run_inference_for_single_image(model, image):
    """
    Fn to run inference on a single image
    """
    image = np.asarray(image)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis,...]
    # Run inference
    model_fn = model.signatures['serving_default']
    output_dict = model_fn(input_tensor)
    
    # All outputs are batched tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension
    # We're only interested in the first num_detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key:value[0, :num_detections].numpy() for key,value in output_dict.items()}
    output_dict['num_detections'] = num_detections
    # detection_classes should be ints.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)
    
    # Handle models with masks
    if 'detection_masks' in output_dict:
        # Reframe the bbox mask to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dict['detection_masks'], output_dict['detection_boxes'],
            image.shape[0], 
            image.shape[1]
            )      
        
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()
    
    return output_dict


def show_inference(model, image_path, threshold = 0.5):
    """
    Runs inference on the image at the given image_path and
    draws the detected bounding boxes over the image.
    """
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    print('Running inference for {}... '.format(image_path), end='')
    image_np = np.array(Image.open(image_path))
    # Actual detection.
    output_dict = run_inference_for_single_image(model, image_np)
    # Visualization of the results of a detection.
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        max_boxes_to_draw=100,
        min_score_thresh=threshold,
        )
    
    print('Done !')
    display(Image.fromarray(image_np))

Using the helper functions defined above, let's run inference on images:

path = TEST_IMAGE_PATHS[np.random.randint(0, len(TEST_IMAGE_PATHS))]
#Run inference over the image and display the results
show_inference(detection_model, path)
Running inference for /content/workspace/images/test/newfoundland_112.jpg... Done !
path = TEST_IMAGE_PATHS[np.random.randint(0, len(TEST_IMAGE_PATHS))]
#Run inference over the image and display the results
show_inference(detection_model, path)
Running inference for /content/workspace/images/test/boxer_93.jpg... Done !
path = TEST_IMAGE_PATHS[np.random.randint(0, len(TEST_IMAGE_PATHS))]
#Run inference over the image and display the results
show_inference(detection_model, path)
Running inference for /content/workspace/images/test/Bengal_72.jpg... Done !

Note: The cells given below will work only in Google Colab.

#use google colab to load in a random image from local machine
from google.colab import files

#Upload file
fname = files.upload()
fname = list(fname.keys())[0]
show_inference(detection_model, fname, threshold=0.35)
Running inference for cat_dog.jpg... Done !

Congrats!

Hope you enjoyed this!

Thanks for reading!