TensorFlow Object Detection API Tutorial
TensorFlow recently announced that the TF Object Detection API models are now TensorFlow 2 compatible. In this tutorial we will go over how to train an object detection model on a custom dataset using the TensorFlow Object Detection API 2.
Introduction
In this notebook, we use the TensorFlow 2 Object Detection library for training on your own dataset.
We will take the following steps to implement a model from TensorFlow 2 Detection Model Zoo on our custom data:
- Install TensorFlow2 Object Detection Dependencies
- Download Custom TensorFlow2 Object Detection Dataset
- Write Custom TensorFlow2 Object Detection Training Configuration
- Train Custom TensorFlow2 Object Detection Model
- Export Custom TensorFlow2 Object Detection Weights
- Use Trained TensorFlow2 Object Detection For Inference on Test Images
When you are done you will have a custom detector that you can use. It will produce detections like this:
To install the TensorFlow2 Object Detection API on Google Colab, run the following steps.
import os
import pathlib
# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
while "models" in pathlib.Path.cwd().parts:
os.chdir('..')
elif not pathlib.Path('models').exists():
!git clone --depth 1 https://github.com/tensorflow/models
# Install the Object Detection API
%%bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install . --quiet
Run the TF2 model builder tests to make sure our environment is up and running. If successful, you should see the following output at the end of the cell execution printout.
[ RUN ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
[ OK ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
----------------------------------------------------------------------
Ran 20 tests in 52.705s
OK (skipped=1)
#run model builder test to ensure everything is up and running
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py
To install on a custom machine, check: Installation
For this task we are going to use the Oxford-IIIT Pets dataset. This is a 37-category pet dataset with roughly 200 images for each class. The annotations contain a tight bounding box (ROI) around the head of each animal.
#Download the Oxford-IIIT Pet
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
!tar -xf annotations.tar.gz
!tar -xf images.tar.gz
Before training, let us create a folder /content/workspace/.
It is within this workspace that we will store all of our training set-up; it will contain all the files related to our model training.
#We will store all the required files in the workspace folder
!mkdir /content/workspace/
!mkdir /content/workspace/images/ # store images
!mkdir /content/workspace/annotations/ # store xml annotation files
!mkdir /content/workspace/images/train # train images
!mkdir /content/workspace/images/test # test images
!mkdir /content/workspace/annotations/train # train annotations
!mkdir /content/workspace/annotations/test # test annotations
!mkdir /content/workspace/data/ # directory to store the tf_records & the label_map
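After running the cells above, the workspace layout should look roughly like this (a sketch of the tree we just created; the data/ directory will be filled with the label map and TFRecords in later steps):

/content/workspace/
├── annotations/
│   ├── train/   # train *.xml annotation files
│   └── test/    # test *.xml annotation files
├── images/
│   ├── train/   # train images
│   └── test/    # test images
└── data/        # will hold label_map.pbtxt and the TFRecord shards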
import os
import pathlib
import logging
import re
import shutil
import glob
import pandas as pd
import xml.etree.ElementTree as ET
from tqdm import tqdm
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
import tarfile
import time
from collections import defaultdict
from io import BytesIO
from matplotlib import pyplot as plt
from PIL import Image
from IPython.display import display
import numpy as np
IMAGE_DIR = "/content/images"
ANNOT_DIR = "/content/annotations/xmls"
pd.set_option("display.max_colwidth", None)
os.chdir("/content/")
%load_ext tensorboard
%load_ext autoreload
%matplotlib inline
%autoreload 2
import tensorflow.compat.v1 as tf1
import contextlib2
import tensorflow as tf
from object_detection.utils import dataset_util
from object_detection.utils import label_map_util
from object_detection.utils import config_util
from object_detection.utils import visualization_utils as viz_utils
from object_detection.utils import colab_utils
from object_detection.builders import model_builder
from object_detection.dataset_tools import tf_record_creation_util
from object_detection.utils import ops as utils_ops
# Enable GPU dynamic memory allocation
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
The TensorFlow Object Detection API expects the data to be in the form of TFRecords. In this part we are going to convert our data, currently in Pascal VOC format, into TFRecords.
To do this we will implement the following steps:
1. Iterate over all the annotations and partition them into train and test datasets. The train annotations and images will be saved to /content/workspace/annotations/train and /content/workspace/images/train respectively. Similarly, the test data will be saved to /content/workspace/annotations/test and /content/workspace/images/test.
2. Convert all the *.xml annotation files into a single Pandas DataFrame object.
3. Create a TensorFlow 2 Object Detection format label map, which will be used when training/evaluating the model.
4. Use this Pandas DataFrame to create TFRecords for the train and test datasets. The TFRecords will be saved to /content/workspace/data/.
1. Partition the Dataset:
If we look at the data saved in /content/images/ and /content/annotations/, we will see that not every image has a corresponding annotation, and that the images and annotations are saved as {filename}.jpg and {filename}.xml respectively.
We will first split the images into a train and a test dataset using sklearn's train_test_split. Then we will check for the corresponding annotation for each image; if the annotation file exists, we will copy the image and annotation into their respective directories under /content/workspace.
all_images = os.listdir(IMAGE_DIR)
#Split the images into train and test datasets
train_images, test_images = train_test_split(all_images, test_size=0.2, random_state = 123)
#Grab the list of all the annotations for the train and test images
#Some annotations may not exist; we will filter these out in the next cell
train_xmls = [f.split(".")[0] + ".xml" for f in train_images]
test_xmls = [f.split(".")[0] + ".xml" for f in test_images ]
def move_file(fileList: list, src: str, dest: str):
    """
    Copies each file in fileList from src to dest, if the file exists.

    Args:
        fileList: List containing all the files present in the src directory.
        src : source directory for the files.
        dest : destination where to copy the files present in fileList.
    """
    for f in tqdm(fileList):
        fileName = os.path.join(src, f)
        #Check if the file exists; if it does, copy it from src to dest
        if os.path.exists(fileName):
            shutil.copy2(src=fileName, dst=os.path.join(dest, f))
#Move images and annotations to workspace directory
move_file(train_images, src=IMAGE_DIR, dest="/content/workspace/images/train/")
move_file(test_images, src=IMAGE_DIR, dest="/content/workspace/images/test/")
move_file(train_xmls, src=ANNOT_DIR, dest="/content/workspace/annotations/train/")
move_file(test_xmls, src=ANNOT_DIR, dest="/content/workspace/annotations/test/")
2. Create Pandas DataFrame Object :
Now that we have partitioned our dataset and the images/annotations are present in their respective directories, we will create a pandas DataFrame from the *.xml files. The DataFrame will contain the following information:
- filename (str): Path to the image file.
- width (float/int): Absolute width of the image.
- height (float/int): Absolute height of the image.
- labels (str): The class of the object present in the bounding box.
- xmin (float/int): Absolute xmin co-ordinate of the bounding box.
- ymin (float/int): Absolute ymin co-ordinate of the bounding box.
- xmax (float/int): Absolute xmax co-ordinate of the bounding box.
- ymax (float/int): Absolute ymax co-ordinate of the bounding box.
- encoded_label (int): The integer label for the object in the bounding box. 0 always represents the background class.
#regex to extract the pet's class name from the filename
exp = r"/([^/]+)_\d+.jpg$"
exp = re.compile(exp)
#sklearn's LabelEncoder will be used to convert the class of the object into integer format
le = LabelEncoder()
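As a quick illustration of what this regular expression captures (the path below is hypothetical, used only for demonstration), it extracts the breed name that precedes the trailing _<number>.jpg in a filename:
#hypothetical path, used only to illustrate what the regex captures
sample_path = "/content/workspace/images/train/Abyssinian_12.jpg"
print(exp.search(sample_path).group(1).lower())  # prints: abyssinian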
def xml2pandas(annot_dir):
    """
    Converts the *.xml annotation files into a pandas DataFrame.

    Args:
        annot_dir: Directory where all the *.xml annotation files are stored
    """
    xml_list = []
    for xml_file in tqdm(glob.glob(annot_dir + '/*.xml')):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (
                root.find('filename').text,
                int(root.find('size')[0].text),
                int(root.find('size')[1].text),
                member[0].text,
                int(member[4][0].text),
                int(member[4][1].text),
                int(member[4][2].text),
                int(member[4][3].text)
            )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'labels', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    logging.info("DataFrame Generated!")
    return xml_df
def process_data(annotDir, imageDir, image_set="train"):
    """
    Creates a pandas DataFrame object from the annotations in annotDir
    and the images in imageDir. This function also extracts the name of the
    class from the filename and converts it into integer labels starting
    from 1, as 0 is always reserved for the background class.

    Args:
        annotDir : directory where the *.xml annotation files are stored.
        imageDir : directory where all the images are stored.
        image_set : one of either `train` or `test`; this is used when
                    converting the class names into integer labels.
    """
    data = xml2pandas(annotDir)
    #modify the filename to point to the full image path
    data.filename = [os.path.join(imageDir, fname) for fname in data.filename.values]
    #extract the class labels from the filenames
    data["labels"] = [exp.search(data.filename[idx]).group(1).lower() for idx in range(len(data))]
    #encode the labels into integers starting from 1
    if image_set == "train":
        data["encoded_label"] = le.fit_transform(data.labels) + 1
    elif image_set == "test":
        data["encoded_label"] = le.transform(data.labels) + 1
    return data
TRAIN_IMAGE_DIR = "/content/workspace/images/train/"
TEST_IMAGE_DIR = "/content/workspace/images/test/"
TRAIN_ANNOTATION_DIR = "/content/workspace/annotations/train/"
TEST_ANNOTATION_DIR = "/content/workspace/annotations/test/"
#Create pandas dataframes from the *.xml files
train_data = process_data(TRAIN_ANNOTATION_DIR, TRAIN_IMAGE_DIR, "train")
test_data = process_data(TEST_ANNOTATION_DIR, TEST_IMAGE_DIR, "test")
#Cross check for missing files
for f in train_data.filename:
    if not os.path.exists(f):
        #remove the missing file from the dataframe
        print(f"{f} is missing in train_data")
        train_data = train_data[train_data.filename != f]
train_data.reset_index(inplace=True, drop=True)

for f in test_data.filename:
    if not os.path.exists(f):
        #remove the missing file from the dataframe
        print(f"{f} is missing in test_data")
        test_data = test_data[test_data.filename != f]
test_data.reset_index(inplace=True, drop=True)
Our datasets are going to look something like this :
The train_data :
The test_data :
3. Create Label Map :
TensorFlow requires each dataset to have an associated label map. This label map defines a mapping from string class names to integer class IDs. The label map should be a StringIntLabelMap text protobuf. Label map files have the extension .pbtxt, and we will place ours under /content/workspace/data along with the TFRecord files, which we will create in the next step.
unique_labels = list(train_data.labels.unique())
integer_labels = le.transform(unique_labels) + 1
label_dict = {unique_labels[i] : integer_labels[i] for i in range(len(unique_labels))}
label_map = "/content/workspace/data/label_map.pbtxt"
categories = train_data.labels.unique()
categories.sort()
end = '\n'
s = ' '
for name in categories:
    out = ''
    out += 'item' + s + '{' + end
    out += s*2 + 'id:' + ' ' + (str(label_dict[name])) + end
    out += s*2 + 'name:' + ' ' + '\'' + name + '\'' + end
    out += '}' + end*2
    with open(label_map, 'a') as f:
        f.write(out)
Our label_map.pbtxt file will look like this:
item {
  id: 1
  name: 'abyssinian'
}
item {
  id: 2
  name: 'american_bulldog'
}
item {
  id: 3
  name: 'american_pit_bull_terrier'
}
item {
  id: 4
  name: 'basset_hound'
}
...
...
The label_map.pbtxt file has been placed under /content/workspace/data/label_map.pbtxt
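As an optional sanity check (a small sketch using the label_map_util module we imported earlier), we can parse the generated file back into a dictionary to confirm it is a valid label map:
#parse the label map back to confirm the generated .pbtxt is valid
label_map_dict = label_map_util.get_label_map_dict(label_map)
print(len(label_map_dict))            # expected: 37 classes
print(label_map_dict["abyssinian"])   # expected: 1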
4. Create TensorFlow Records :
In this step we will convert our annotations present in the pandas DataFrame object into TFRecord format.
For every example in our dataset, we should have the following information:
- An RGB image for the dataset, encoded as jpeg or png.
- Bounding box coordinates for each object (with origin in the top-left corner), defined by 4 floating point numbers [ymin, xmin, ymax, xmax].
- The class of the object in the bounding box.
Since our dataset has a fairly large number of annotations, we will shard it across multiple files: instead of writing all tf.Example protos to a single file, we will split the dataset over several record files.
Our dataset is going to look something like this:
/{directory_path}/dataset.record-00000-of-00010
/{directory_path}/dataset.record-00001-of-00010
...
/{directory_path}/dataset.record-00009-of-00010
Our train dataset is going to be stored as :
/content/workspace/data/train.record-00000-of-00010
/content/workspace/data/train.record-00001-of-00010
...
/content/workspace/data/train.record-00009-of-00010
Similary for the test dataset :
/content/workspace/data/test.record-00000-of-00010
/content/workspace/data/test.record-00001-of-00010
...
/content/workspace/data/test.record-00009-of-00010
def create_tf_example(fname, data):
    """
    Creates a tf.Example proto for a single image
    from the given data.

    Args:
        fname: filename of a single image from data.
        data : a pandas DataFrame object in the format
               specified in step 2.
    Returns:
        example: The created tf.Example.
    """
    curr_data = data.loc[data.filename == fname]
    filename = fname.encode('utf8')  # Filename of the image
    height = curr_data["height"].values[0]  # Image height
    width = curr_data["width"].values[0]  # Image width
    image_format = b'jpeg'  # b'jpeg' or b'png'

    # List of normalized left x coordinates in bounding box (1 per box).
    xmins = list(curr_data["xmin"].values / width)
    # List of normalized right x coordinates in bounding box (1 per box).
    xmaxs = list(curr_data["xmax"].values / width)
    # List of normalized top y coordinates in bounding box (1 per box).
    ymins = list(curr_data["ymin"].values / height)
    # List of normalized bottom y coordinates in bounding box (1 per box).
    ymaxs = list(curr_data["ymax"].values / height)

    # List of string class names of the bounding boxes (1 per box)
    classes_text = list(curr_data["labels"].values)
    classes_text = [text.encode('utf8') for text in classes_text]
    # List of integer class ids of the bounding boxes (1 per box)
    classes = list(curr_data["encoded_label"].values)

    with tf1.gfile.GFile(filename, 'rb') as fid:
        encoded_image_data = fid.read()  # Encoded image bytes

    features = tf1.train.Example(features=tf1.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_image_data),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return features
def create_records(output_path, data, shards=10):
    """
    Iterates over all the annotations in the dataset, creates a
    sharded TFRecord dataset and saves it to output_path.

    Args:
        output_path: Base path under which to save the dataset.
        data : A pandas DataFrame object as specified in step 2.
        shards : Number of shards over which to save the dataset.
                 The dataset is going to be saved across `shards` files.
    """
    fnames = list(data.filename.unique())
    with contextlib2.ExitStack() as tf_record_close_stack:
        output_tfrecords = tf_record_creation_util.open_sharded_output_tfrecords(
            tf_record_close_stack, output_path, shards)
        #enumerate over all the unique images present in the dataset
        #and create a tf.Example proto for each image's annotations
        for index, fname in enumerate(fnames):
            tf_example = create_tf_example(fname, data)
            output_shard_index = index % shards
            output_tfrecords[output_shard_index].write(tf_example.SerializeToString())
print("Creating TFRecords ..... ", end='')
start_time = time.time()
create_records("/content/workspace/data/train.record", data=train_data)
create_records("/content/workspace/data/test.record", data=test_data )
end_time = time.time()
elapsed_time = end_time - start_time
print('Done! Took {} seconds'.format(elapsed_time))
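To confirm the shards were written correctly, a quick sketch like the one below reads a single record back (using the same ?????-of-00010 glob pattern that the training configuration will use later) and prints a couple of its fields:
#glob the train shards and parse one record back as a sanity check
train_shards = tf.io.gfile.glob("/content/workspace/data/train.record-?????-of-00010")
print(f"Found {len(train_shards)} train shards")
for raw_record in tf.data.TFRecordDataset(train_shards).take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    print(example.features.feature['image/object/class/text'].bytes_list.value)
    print(example.features.feature['image/object/class/label'].int64_list.value)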
Our dataset is now prepared for training with a model from the TensorFlow 2 Detection Model Zoo.
The directory structure for the workspace should look something like this at this stage:
Configure Custom TensorFlow2 Object Detection Training Configuration
In this section we will download a pre-trained model from the TF2 OD model zoo and set up our training configuration.
In this tutorial we are going to use EfficientDet-D0, the lightweight, smallest member of the state-of-the-art EfficientDet family.
We will create a directory called pre_trained_models in our workspace folder.
We will download the latest pre-trained weights for the model we wish to use. These can be found in the TensorFlow 2 Detection Model Zoo.
Once the *.tar.gz file has been downloaded, we will extract its contents into /content/workspace/pre_trained_models/.
# Download the latest-pretrained weights for the efficientdet_d0 model and the config file
#LINK : http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d0_coco17_tpu-32.tar.gz
model_name = "efficientdet_d0_coco17_tpu-32"
model = "efficientdet_d0_coco17_tpu-32.tar.gz"
os.makedirs("/content/workspace/pre_trained_models/", exist_ok=True)
download_tar = f"http://download.tensorflow.org/models/object_detection/tf2/20200711/{model}"
!wget {download_tar} -P "/content/workspace/pre_trained_models/"
tar = tarfile.open(f"/content/workspace/pre_trained_models/{model}")
tar.extractall(path="/content/workspace/pre_trained_models/")
tar.close()
os.unlink(f"/content/workspace/pre_trained_models/{model}")
The directory structure for the workspace should look something like this at this stage:
Now that we have downloaded and extracted our pre-trained model, let's create a directory for our training job. Under /content/workspace/, create a new directory named models; this is the folder where we will store all the configurations, model checkpoints and logs for our custom-trained model.
Under the /content/workspace/models/ directory, create a directory named efficientdet_d0_coco17_tpu-32 and copy the /content/workspace/pre_trained_models/efficientdet_d0_coco17_tpu-32/pipeline.config file into this newly created directory.
os.makedirs("/content/workspace/models/", exist_ok=True)
os.makedirs(f"/content/workspace/models/{model_name}", exist_ok=True)
config_path = f"/content/workspace/pre_trained_models/{model_name}/pipeline.config"
shutil.copy2(config_path, f"/content/workspace/models/{model_name}")
Each model has a model_name, a pipeline.config file and a pretrained checkpoint.
The pipeline.config file is a shell of a training configuration specific to each model type, provided by the authors of the TF2 OD repository.
The pretrained checkpoint is the location of a pretrained weights file saved from when the object detection model was pretrained on the COCO dataset.
We will start from these weights, and then fine-tune on our particular custom dataset task. By using pretraining, our model does not need to start from square one in identifying which features might be useful for object detection.
We will map our training data files to variables for use in our training pipeline configuration.
We will now edit /content/workspace/models/efficientdet_d0_coco17_tpu-32/pipeline.config to point to our custom data and the pretrained checkpoint, and we will also specify some training parameters.
test_record_fname = "/content/workspace/data/test.record-?????-of-00010"
train_record_fname = "/content/workspace/data/train.record-?????-of-00010"
#Path to the TensorFlow Object Detection format label_map
label_map_pbtxt_fname = "/content/workspace/data/label_map.pbtxt"
#Path to the pipeline.config file
config_path = f"/content/workspace/models/{model_name}/pipeline.config"
#Path to the pretrained model checkpoints
fine_tune = f"/content/workspace/pre_trained_models/{model_name}/checkpoint/ckpt-0"
#if you can fit a large batch in memory, it may speed up your training
batch_size = 16
#The more steps, the longer the training
epochs = 30
num_steps = len(train_data) // batch_size * epochs
model_dir = f"/content/workspace/models/{model_name}"
def get_num_classes(pbtxt_fname):
    """Get total number of classes from the label_map.pbtxt file"""
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
num_classes = get_num_classes(label_map_pbtxt_fname)
print("CUSTOM CONFIGURATION PARAMETERS : ")
print("-"*40)
print("Config Path: ", config_path)
print("Checkpoint Path: ", fine_tune)
print("Label Map: ", label_map_pbtxt_fname)
print("Train TFRecords: ", train_record_fname)
print("Test TFRecords: ", test_record_fname)
print("Total Steps: ", num_steps)
print("Num classes: ", num_classes)
print("-"*40)
with open(config_path) as f:
    s = f.read()

with open(config_path, 'w') as f:
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"', 'fine_tune_checkpoint: "{}"'.format(fine_tune), s)
    # tfrecord files for train and test
    s = re.sub('(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub('(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)
    # label_map_path
    s = re.sub('label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)
    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+', 'batch_size: {}'.format(batch_size), s)
    # Set number of training steps, num_steps
    s = re.sub('num_steps: [0-9]+', 'num_steps: {}'.format(num_steps), s)
    # Set number of classes, num_classes.
    s = re.sub('num_classes: [0-9]+', 'num_classes: {}'.format(num_classes), s)
    # fine-tune checkpoint type
    s = re.sub('fine_tune_checkpoint_type: "classification"', 'fine_tune_checkpoint_type: "{}"'.format('detection'), s)
    f.write(s)
The modified config file will be saved as /content/workspace/models/efficientdet_d0_coco17_tpu-32/pipeline.config
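As a final check on the substitutions (a sketch using the config_util module imported earlier; EfficientDet is built on the SSD meta-architecture in the TF2 OD API, hence the .ssd attribute below), we can parse the edited file back:
#parse the edited pipeline.config to confirm the substitutions took effect
configs = config_util.get_configs_from_pipeline_file(config_path)
print(configs["model"].ssd.num_classes)      # expected: 37
print(configs["train_config"].batch_size)    # expected: 16
print(configs["train_input_config"].label_map_path)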
Let's check the directory structure :
Train Custom TF2 Object Detector
To initiate a new training job, we need to run the script /content/models/research/object_detection/model_main_tf2.py with:
- config_path: path to the configuration file defined above in the custom training configuration step.
- model_dir: the location where TensorBoard logs and saved model checkpoints will be written.
!python /content/models/research/object_detection/model_main_tf2.py \
--pipeline_config_path={config_path} \
--num_train_steps={num_steps} \
--model_dir={model_dir} \
--alsologtostderr
To evaluate our model on the COCO evaluation metrics, we need to run the same script, /content/models/research/object_detection/model_main_tf2.py, this time also passing the checkpoint directory.
!python /content/models/research/object_detection/model_main_tf2.py \
--pipeline_config_path={config_path} \
--model_dir={model_dir} \
--checkpoint_dir={model_dir} \
--alsologtostderr \
--eval_timeout=10
Monitor Training Job Progress using TensorBoard:
We can use either of the two commands below:
To open in a terminal :
tensorboard --logdir "/content/workspace/models/efficientdet_d0_coco17_tpu-32/"
For a Jupyter-Environment:
%load_ext tensorboard
%tensorboard --logdir "/content/workspace/models/efficientdet_d0_coco17_tpu-32/"
We will have logs that are going to look similar to this :
Once your training job is complete, you need to extract the newly trained inference graph, which will later be used to perform object detection. This can be done as follows:
%ls "/content/workspace/models/efficientdet_d0_coco17_tpu-32/"
# using the /content/models/research/object_detection/exporter_main_v2.py given by TensorFlow Team
# we will export our model into a model graph in this folder created
os.makedirs(f"/content/workspace/exported_models/{model_name}", exist_ok=True)
#Once your training job is complete, you need to extract the newly trained inference graph,
#which will be later used to perform the object detection
#path to save the exporter inference graph
output_directory = f"/content/workspace/exported_models/{model_name}/"
#path to trained model checkpoints
checkpoint_dir = f"/content/workspace/models/{model_name}/"
#run script to export model weights
!python /content/models/research/object_detection/exporter_main_v2.py \
--trained_checkpoint_dir {checkpoint_dir} \
--output_directory {output_directory} \
--pipeline_config_path {config_path}
After the whole process (training/evaluating/exporting) we are going to have the following files in the following folders:
In the next part we are going to use this exported graph to run inference on custom images.
In this section we will run inference on images using the exported graph of our custom TensorFlow2 object detector.
To run inference, we will first create a few helper functions:
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
        path: the file path to the image
    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
def load_model(model_dir):
    """Loads the exported SavedModel from the given directory."""
    model_dir = pathlib.Path(model_dir) / "saved_model"
    model = tf.saved_model.load(str(model_dir))
    return model
Load the category_index, which is a dictionary mapping integer class IDs to class names, and load the custom TensorFlow2 object detector from the exported graph.
PATH_TO_LABELS = "/content/workspace/data/label_map.pbtxt"
#generate a category index dictionary from the label map
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
#path to the images from the test directory; we will run inference on these images
PATH_TO_TEST_IMAGES_DIR = pathlib.Path("/content/workspace/images/test/")
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))
exported_model_path = f"/content/workspace/exported_models/{model_name}"
#Load model from the exported model graph
print(f'Loading model from {exported_model_path} ...', end='')
start_time = time.time()
detection_model = load_model(exported_model_path)
end_time = time.time()
elapsed_time = end_time - start_time
print('Done! Took {} seconds'.format(elapsed_time))
def run_inference_for_single_image(model, image):
    """
    Runs inference on a single image.
    """
    image = np.asarray(image)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]

    # Run inference
    model_fn = model.signatures['serving_default']
    output_dict = model_fn(input_tensor)

    # All outputs are batched tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key: value[0, :num_detections].numpy() for key, value in output_dict.items()}
    output_dict['num_detections'] = num_detections

    # detection_classes should be ints.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

    # Handle models with masks
    if 'detection_masks' in output_dict:
        # Reframe the bbox masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dict['detection_masks'], output_dict['detection_boxes'],
            image.shape[0],
            image.shape[1]
        )
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()
    return output_dict
def show_inference(model, image_path, threshold=0.5):
    """
    Runs inference on the image at image_path and
    draws the detected bounding boxes over the image.
    """
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    print('Running inference for {}... '.format(image_path), end='')
    image_np = np.array(Image.open(image_path))
    # Actual detection.
    output_dict = run_inference_for_single_image(model, image_np)
    # Visualization of the results of a detection.
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks_reframed', None),
        use_normalized_coordinates=True,
        max_boxes_to_draw=100,
        min_score_thresh=threshold,
    )
    print('Done!')
    display(Image.fromarray(image_np))
Using the helper functions defined above, let's run inference on some test images:
path = TEST_IMAGE_PATHS[np.random.randint(0, len(TEST_IMAGE_PATHS))]
#Run inference over the image and display the results
show_inference(detection_model, path)
path = TEST_IMAGE_PATHS[np.random.randint(0, len(TEST_IMAGE_PATHS))]
#Run inference over the image and display the results
show_inference(detection_model, path)
path = TEST_IMAGE_PATHS[np.random.randint(0, len(TEST_IMAGE_PATHS))]
#Run inference over the image and display the results
show_inference(detection_model, path)
#use google colab to load in a random image from local machine
from google.colab import files
#Upload file
fname = files.upload()
fname = list(fname.keys())[0]
show_inference(detection_model, fname, threshold=0.35)