Model Details
Model Description
DeepLab V3+ is an advanced deep learning model designed for semantic image segmentation tasks. It extends the previous DeepLab V3 model by incorporating an additional decoder module that enhances the spatial resolution of segmentation results, making it well-suited for tasks requiring precise boundary detection.
Developed by: Google
Shared by:
Model type: Computer vision
License:
Resources for more information:
Training Details
Training Data
The model was trained to segment 21 different classes.
For more information about the training data, check the .
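The dataset reference above is missing from the card; a 21-class setup, together with the VOC evaluation below, suggests the PASCAL VOC 2012 label set (20 object classes plus background). Purely as an illustrative assumption, that class list in Python:

    # Assumed label set: PASCAL VOC 2012 (an inference, not stated by the card).
    # Verify the actual index order against the model's head configuration.
    VOC_CLASSES = [
        "background", "aeroplane", "bicycle", "bird", "boat", "bottle",
        "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
        "horse", "motorbike", "person", "pottedplant", "sheep", "sofa",
        "train", "tvmonitor",
    ]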
Testing Details
Metrics
The evaluation was done on the validation set of the VOC dataset.
mIoU: 87.8
Results are taken from .
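mIoU (mean Intersection over Union) averages, over all classes, the overlap between predicted and ground-truth pixels divided by their union. A minimal NumPy sketch of the computation (an illustration, not the exact evaluation code behind the number above):

    import numpy as np

    def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 21) -> float:
        # pred and target are integer class-index masks of the same shape.
        ious = []
        for c in range(num_classes):
            pred_c = pred == c
            target_c = target == c
            union = np.logical_or(pred_c, target_c).sum()
            if union == 0:
                continue  # class absent in both prediction and ground truth
            intersection = np.logical_and(pred_c, target_c).sum()
            ious.append(intersection / union)
        return float(np.mean(ious))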
Technical Specifications
Input/Output Details
Input:
Name: image
Info: NCHW BGR un-normalized image
Output:
Name: mask
Info: segmentation masks for 21 different classes
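To make this I/O contract concrete, the sketch below (assuming OpenCV and NumPy; the 256x256 size and the raw score-map shape are placeholders, not taken from the card) prepares an NCHW BGR tensor and reduces a 21-channel score map to a per-pixel class-index mask:

    import cv2
    import numpy as np

    # Input: convert an HWC BGR frame to NCHW, leaving pixel values un-normalized.
    frame = cv2.imread("input.jpg")                  # (H, W, C), BGR, uint8
    frame = cv2.resize(frame, (256, 256))            # illustrative input size
    tensor = frame.transpose(2, 0, 1)[np.newaxis]    # (1, C, H, W)

    # Output: if the raw result is per-class scores of shape (1, 21, H, W),
    # the segmentation mask is the argmax over the class axis.
    scores = np.zeros((1, 21, 256, 256), dtype=np.float32)  # placeholder output
    mask = scores.argmax(axis=1)[0]                  # (H, W) class indices in [0, 20]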
Model Architecture
Backbone: MobileNetV2 - This lightweight architecture serves as the feature extractor. MobileNetV2 is efficient and optimized for mobile and embedded devices, using depthwise separable convolutions to reduce the number of parameters while maintaining strong feature representation.
Neck: The ASPP (Atrous Spatial Pyramid Pooling) module in DeepLabV3+ captures multi-scale context by applying atrous convolutions with different dilation rates, allowing the model to gather information at different spatial scales without reducing resolution.
Head: The decoder upsamples the features extracted from the ASPP module and refines them using low-level features from earlier layers in the backbone. This helps recover spatial details, especially for accurate object boundaries in the segmentation map.
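For readers who want the three stages spelled out, here is a heavily simplified PyTorch sketch of this backbone/neck/head structure (an illustration, not the released implementation; channel counts follow torchvision's MobileNetV2, everything else is a placeholder):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torchvision

    class ASPP(nn.Module):
        # Neck: parallel atrous convolutions with different dilation rates.
        def __init__(self, in_ch: int, out_ch: int = 256):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Conv2d(in_ch, out_ch, 1),
                nn.Conv2d(in_ch, out_ch, 3, padding=6, dilation=6),
                nn.Conv2d(in_ch, out_ch, 3, padding=12, dilation=12),
                nn.Conv2d(in_ch, out_ch, 3, padding=18, dilation=18),
            ])
            self.project = nn.Conv2d(4 * out_ch, out_ch, 1)

        def forward(self, x):
            return self.project(torch.cat([b(x) for b in self.branches], dim=1))

    class DeepLabV3PlusSketch(nn.Module):
        def __init__(self, num_classes: int = 21):
            super().__init__()
            mobilenet = torchvision.models.mobilenet_v2(weights=None).features
            self.low_level = mobilenet[:4]    # backbone: early high-resolution features
            self.high_level = mobilenet[4:]   # backbone: deep semantic features
            self.aspp = ASPP(1280)
            self.reduce = nn.Conv2d(24, 48, 1)
            self.head = nn.Conv2d(256 + 48, num_classes, 3, padding=1)

        def forward(self, x):
            low = self.low_level(x)
            high = self.aspp(self.high_level(low))
            # Head: upsample ASPP features and refine with low-level features.
            high = F.interpolate(high, size=low.shape[2:], mode="bilinear", align_corners=False)
            out = self.head(torch.cat([high, self.reduce(low)], dim=1))
            return F.interpolate(out, size=x.shape[2:], mode="bilinear", align_corners=False)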
* Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
* Parameters and FLOPs are obtained from the package.
Quantization
The RVC4 version of the model was quantized using the General dataset in HubAI.
Utilization
Models converted for RVC Platforms can be used for inference on OAK devices.
DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)).
Below, we present the key steps for using this particular model.
Please consult the docs for more information.
The model's output is decoded by a SegmentationParser, which outputs a message containing the segmentation mask for the 21 classes.
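A minimal pipeline-setup sketch (assuming DepthAI v3 and the depthai-nodes package; the model slug shown is illustrative and should be taken from the model's HubAI page):

    import depthai as dai
    from depthai_nodes.node import ParsingNeuralNetwork

    with dai.Pipeline() as pipeline:
        camera = pipeline.create(dai.node.Camera).build()
        # "luxonis/deeplab-v3-plus" is an assumed slug; check HubAI for the exact one.
        nn = pipeline.create(ParsingNeuralNetwork).build(camera, "luxonis/deeplab-v3-plus")
        parser_output_queue = nn.out.createOutputQueue()
        pipeline.start()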
Get parsed output(s):
    while pipeline.isRunning():
        # get() blocks until the next parsed segmentation frame arrives
        parser_output: dai.ImgFrame = parser_output_queue.get()
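        # The remainder of the loop below is an illustrative sketch, not part of
        # the original card: it assumes cv2 and numpy are imported and that the
        # frame holds per-pixel class indices, and colorizes them for display.
        mask = parser_output.getFrame()              # (H, W) class indices
        colored = cv2.applyColorMap((mask * 12).astype(np.uint8), cv2.COLORMAP_JET)
        cv2.imshow("segmentation", colored)
        if cv2.waitKey(1) == ord("q"):
            break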
Example
You can quickly run the model using our script.
It automatically downloads the model, creates a DepthAI pipeline, runs the inference, and displays the results using our DepthAI visualizer tool.
To try it out, run: