With object detection, the predicted bounding boxes tell us where the target objects are. Sometimes, though, knowing the location is not enough: we also want a mask that overlays each object and traces its exact border. This task is called “instance segmentation”. Here is a nice figure that compares it with related tasks:
Instance segmentation can be achieved with Mask R-CNN. In this article, I will give a step-by-step guide to using Detectron2 to load the weights of Mask R-CNN. In the end, we will create a predictor that draws a mask over the mangoes in each picture 🥭🥭
This article will cover:
- Preparing our custom dataset
- Training the network
- Predicting with our trained network
Preparing our custom dataset
To label our own images, we first need a handy tool that helps us annotate them and produces the annotation file required for training later. Here I use VGG Image Annotator (VIA).
We first upload 53 images as our training dataset (I prepared another 14 images for validation).
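To give an idea of what the annotation file is used for later, here is a minimal sketch of turning a VIA export into the list of per-image dicts that Detectron2 expects for a custom dataset. It assumes the VIA 2.x JSON layout with polygon regions and a single class (mango). The function name `via_to_dataset_dicts`, the `image_sizes` mapping and the sample file name are made up for illustration; in practice the height and width would be read from the image files themselves.

```python
import json
import os


def via_to_dataset_dicts(via_json, image_sizes, image_dir="", category_id=0):
    """Convert a parsed VIA export into Detectron2-style dataset dicts.

    via_json:    dict parsed from the VIA JSON export
    image_sizes: hypothetical mapping of filename -> (height, width)
    """
    records = []
    for image_id, entry in enumerate(via_json.values()):
        filename = entry["filename"]
        height, width = image_sizes[filename]

        regions = entry["regions"]
        # Older VIA exports store regions as a dict, newer ones as a list.
        if isinstance(regions, dict):
            regions = list(regions.values())

        annotations = []
        for region in regions:
            shape = region["shape_attributes"]
            if shape.get("name") != "polygon":
                continue  # this sketch only handles polygon masks
            xs = shape["all_points_x"]
            ys = shape["all_points_y"]
            # Flatten to [x1, y1, x2, y2, ...], the polygon format used
            # for segmentation.
            polygon = [coord for point in zip(xs, ys) for coord in point]
            annotations.append({
                # Box as (x_min, y_min, x_max, y_max). Before handing this
                # to Detectron2, also set "bbox_mode" to
                # detectron2.structures.BoxMode.XYXY_ABS (omitted here to
                # keep the sketch free of dependencies).
                "bbox": [min(xs), min(ys), max(xs), max(ys)],
                "segmentation": [polygon],
                "category_id": category_id,
            })

        records.append({
            "file_name": os.path.join(image_dir, filename),
            "image_id": image_id,
            "height": height,
            "width": width,
            "annotations": annotations,
        })
    return records


# Tiny made-up export with one triangular region on one image.
sample_export = json.loads("""
{
  "mango1.jpg1234": {
    "filename": "mango1.jpg",
    "size": 1234,
    "regions": [
      {
        "shape_attributes": {
          "name": "polygon",
          "all_points_x": [10, 50, 30],
          "all_points_y": [20, 25, 60]
        },
        "region_attributes": {}
      }
    ],
    "file_attributes": {}
  }
}
""")

records = via_to_dataset_dicts(sample_export, {"mango1.jpg": (100, 200)})
print(records[0]["annotations"][0]["bbox"])
```

Each polygon drawn in the annotator becomes one annotation holding both a bounding box and a mask outline, which is the information Mask R-CNN needs for every mango.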