Building a Damage Detection Model with Faster R-CNN

April 8, 2025
6 min read

Package damage during transit is a common issue in logistics, leading to inefficiencies and customer dissatisfaction. In this project, we designed a computer vision-based solution to automatically identify and classify damage in parcel boxes using object detection.

Data Collection and Annotation

We compiled a custom dataset of 1,200 images of parcel boxes from different countries and manually annotated 750 of them using VoTT, labeling damage regions into three categories:

  • Damage_A: Minor sticker or surface damage
  • Damage_B: Moderate compression or creases
  • Damage_C: Severe tearing, holes, or wet boxes

Bounding box annotations were exported in PASCAL VOC format and processed in Roboflow to produce training-ready datasets (images resized to 640×640).
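As an illustration of what the annotation pipeline consumes, here is a minimal sketch of reading a PASCAL VOC XML annotation into box records. The field layout follows the VOC convention; the helper name and the sample annotation below are hypothetical, not taken from our dataset:

```python
import xml.etree.ElementTree as ET

def parse_voc_boxes(xml_text):
    """Parse a PASCAL VOC annotation string into (label, xmin, ymin, xmax, ymax) records."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(bb.findtext("xmin")), int(bb.findtext("ymin")),
            int(bb.findtext("xmax")), int(bb.findtext("ymax")),
        ))
    return boxes

# Hypothetical annotation for one parcel image with a single Damage_B region.
sample = """
<annotation>
  <filename>parcel_001.jpg</filename>
  <object>
    <name>Damage_B</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>340</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""
print(parse_voc_boxes(sample))  # [('Damage_B', 120, 80, 340, 260)]
```

Tools like Roboflow convert between this format and others (e.g., COCO JSON), which is what makes the exported annotations reusable across frameworks.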

Model Architecture

We used Detectron2's implementation of Faster R-CNN with a ResNeXt-101 (X101) backbone and a Feature Pyramid Network (FPN). The Region Proposal Network (RPN) generated candidate object locations, which the detection head then classified and refined.

Training Setup

The model was trained on a GPU for 2,000 iterations with a learning rate of 0.001. We registered the dataset using Detectron2's dataset and metadata catalogs and trained the model with Detectron2's DefaultTrainer class.
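The setup above can be sketched with Detectron2's config API. This is a hedged outline rather than our exact script: the dataset name, annotation paths, and the choice of model-zoo config file are placeholders, and it assumes the annotations have been converted to COCO JSON (e.g., via Roboflow):

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

# Register the training split (name and paths are placeholders).
register_coco_instances("parcel_damage_train", {}, "annotations/train.json", "images/train")

cfg = get_cfg()
# Faster R-CNN with an X101-FPN backbone from the Detectron2 model zoo.
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("parcel_damage_train",)
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3   # Damage_A, Damage_B, Damage_C
cfg.SOLVER.BASE_LR = 0.001            # learning rate used in this project
cfg.SOLVER.MAX_ITER = 2000            # 2,000 iterations
cfg.SOLVER.IMS_PER_BATCH = 2

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```

DefaultTrainer handles the training loop, checkpointing, and logging, which is why it suits a small project like this one.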

Evaluation

Table 1: Classification Results for Faster R-CNN Model

Correct Category   Predicted A   Predicted B   Predicted C   Not detected   Total   Accuracy (%)
A                           28             1             0              5      34          82.35
B                            5            21            11             28      68          30.88
C                            1             2            38             10      51          74.51

Total samples: 150
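The per-class accuracy and miss-rate figures in Table 1 can be reproduced directly from the confusion matrix. A small sketch, using the counts and row totals exactly as reported in the table and reading the last column as boxes that were never detected:

```python
# Confusion matrix from Table 1.
# Each row: ([predicted A, predicted B, predicted C, not detected], reported total).
rows = {
    "A": ([28, 1, 0, 5], 34),
    "B": ([5, 21, 11, 28], 68),
    "C": ([1, 2, 38, 10], 51),
}

for i, (cls, (counts, total)) in enumerate(rows.items()):
    accuracy = 100 * counts[i] / total   # diagonal cell over row total
    missed = 100 * counts[3] / total     # undetected share of this class
    print(f"{cls}: accuracy {accuracy:.2f}%, missed {missed:.2f}%")
```

Running this reproduces the 82.35% / 30.88% / 74.51% accuracies reported above, and shows that 41% of category-B samples were never detected at all.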

Model performance was evaluated using COCO metrics (AP50, AP70, AP75). While bounding box accuracy was modest, classification accuracy was strong for two out of three categories:

  • Damage_A: 82.35%
  • Damage_C: 74.51%
  • Damage_B: 30.88% (with a high number of missed detections)

Overall accuracy across 150 validation images was 58%, largely due to overlap between damage categories and the inherent difficulty of labeling subtle compression.
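For readers unfamiliar with the COCO metrics mentioned above: AP50 counts a detection as correct only if its predicted box overlaps the ground-truth box with an intersection-over-union (IoU) of at least 0.5. A minimal IoU sketch, with boxes as (xmin, ymin, xmax, ymax) tuples and made-up example coordinates:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A prediction shifted slightly from the ground truth still clears the 0.5 threshold.
gt = (100, 100, 300, 300)
pred = (120, 110, 320, 310)
print(round(iou(gt, pred), 3))  # 0.747
```

Raising the threshold (AP75) demands tighter localization, which is why our modest bounding-box accuracy hurts those stricter metrics more than classification accuracy.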

Categories A and C achieved classification accuracies of roughly 75% or higher. Category B, however, struggled to predict its own class, with a large share of B-type damage images going undetected: 41% of B samples were never predicted at all, against an overall undetected rate of about 28%. This indicates that while the model performs well for prominent damage types, it struggles with ambiguous or overlapping cases, particularly moderate compression (category B).

In conclusion, categorizing damage into three distinct levels was challenging, especially because the middle category (Damage_B) shares characteristics with both the minor and severe classes. Additionally, subjective variations in labeling due to the complexity of identifying damage levels may have influenced results. Nonetheless, the model performed well in classifying categories A and C, indicating its potential in practical logistics damage detection scenarios.

Takeaways

This project demonstrated the feasibility of using R-CNN models for freight damage detection, especially for clearly distinguishable categories. Future work could involve using instance segmentation (e.g., Mask R-CNN), refining annotations, or augmenting the dataset with synthetic images to boost accuracy.