DeepLab-V3-Plus
    Multi-class image segmentation model.
    Model Details
    Model Description
    DeepLab V3+ is an advanced deep learning model designed for semantic image segmentation tasks. It extends the previous DeepLab V3 model by incorporating an additional decoder module that enhances the spatial resolution of segmentation results, making it well-suited for tasks requiring precise boundary detection.
    • Developed by: Google
    • Shared by:
    • Model type: Computer vision
    • License: MIT
    • Resources for more information:
    Training Details
    Training Data
    The model was trained on 21 different classes. For more information about the training data, check the .
    Testing Details
    Metrics
    The evaluation was done on the validation set of the VOC dataset.

    Metric | Value
    mIOU   | 87.8

    Results are taken from .
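    For context, mIOU (mean intersection over union) averages the per-class IoU across the 21 classes. Below is a minimal numpy sketch of the metric, purely illustrative and not the evaluation code used for the table above; pred and target are assumed to be HxW arrays of class indices:

    import numpy as np

    def mean_iou(pred, target, num_classes=21):
        """Average per-class IoU; classes absent from both arrays are skipped."""
        ious = []
        for c in range(num_classes):
            pred_c, target_c = pred == c, target == c
            union = np.logical_or(pred_c, target_c).sum()
            if union == 0:
                continue  # class not present in prediction or ground truth
            ious.append(np.logical_and(pred_c, target_c).sum() / union)
        return float(np.mean(ious)) if ious else 0.0
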
    Technical Specifications
    Input/Output Details
    • Input:
      • Name: image
      • Info: NCHW BGR un-normalized image
    • Output:
      • Name: mask
      • Info: segmentation masks for 21 different classes (see the decoding sketch below)
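    Since the mask output holds one channel per class, a per-pixel label map is obtained by taking the argmax over the channel axis. A minimal numpy sketch, assuming the [1, 21, H, W] output layout listed in the variant specifications below (the array here is a placeholder, not real model output):

    import numpy as np

    scores = np.zeros((1, 21, 512, 512), dtype=np.float32)  # placeholder output tensor
    class_map = scores[0].argmax(axis=0)  # HxW map of class indices in [0, 20]
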
    Model Architecture
    • Backbone: MobileNetV2 - This lightweight architecture serves as the feature extractor. MobileNetV2 is efficient and optimized for mobile and embedded devices, using depthwise separable convolutions to reduce the number of parameters while maintaining strong feature representation.
    • Neck: The ASPP (Atrous Spatial Pyramid Pooling) module in DeepLabV3+ captures multi-scale context by applying atrous convolutions with different dilation rates, allowing the model to gather information at different spatial scales without reducing resolution (see the sketch after this list).
    • Head: The decoder upsamples the features extracted from the ASPP module and refines them using low-level features from earlier layers in the backbone. This helps recover spatial details, especially for accurate object boundaries in the segmentation map.
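    To make the ASPP idea concrete, here is a minimal PyTorch-style sketch of parallel atrous convolutions at several dilation rates. This is an illustrative simplification (the real module also includes an image-pooling branch and batch normalization), not the exact DeepLabV3+ implementation:

    import torch
    import torch.nn as nn

    class ASPP(nn.Module):
        """Parallel atrous convolutions capture context at multiple scales."""

        def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv2d(in_ch, out_ch,
                          kernel_size=3 if r > 1 else 1,
                          padding=r if r > 1 else 0,
                          dilation=r)
                for r in rates
            )
            # Fuse the concatenated branch outputs back to out_ch channels
            self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

        def forward(self, x):
            return self.project(torch.cat([b(x) for b in self.branches], dim=1))
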
    Throughput
    Model variant: deeplab-v3-plus:person-256x256
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 35.02                 | N/A
    Model variant: deeplab-v3-plus:512x512
    • Input shape: [1, 3, 512, 512]
    • Output shape: [1, 21, 512, 512]
    • Params (M): 5.208
    • GFLOPs: 30.956
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 1.83                  | N/A
    RVC4     | INT8      | 200.30                | 5.45
    Model variant: deeplab-v3-plus:256x256
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 17.62                 | N/A
    Model variant: deeplab-v3-plus:512x288
    • Input shape: [1, 3, 288, 512]
    • Output shape: [1, 21, 288, 512]
    • Params (M): 5.208
    • GFLOPs: 17.566
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 3.63                  | N/A
    RVC4     | INT8      | 333.92                | 5.85
    Model variant: deeplab-v3-plus:513x513
    • Input shape: [1, 3, 513, 513]
    • Output shape: [1, 21, 513, 513]
    • Params (M): 5.798
    • GFLOPs: 27.256
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 3.**                  | N/A
    RVC4     | INT8      | 177.92                | 4.09
    Model variant: deeplab-v3-plus:person-513x513
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 6.35                  | N/A
    * Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
    * Parameters and FLOPs are obtained from the package.
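    The profiling package referenced above is not named in the source; as a generic illustration, a trainable-parameter count of the kind shown in the variant specifications can be computed directly in PyTorch (the torchvision MobileNetV2 here is a stand-in, not this exact model):

    import torchvision

    model = torchvision.models.mobilenet_v2()  # stand-in backbone, weights untrained
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Params (M): {n_params / 1e6:.3f}")
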
    Quantization
    The RVC4 version of the model was quantized using the General dataset in HubAI.
    Utilization
    Models converted for RVC Platforms can be used for inference on OAK devices. DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in the model head(s)). Below, we present the most important utilization steps for this particular model. Please consult the docs for more information.
    Install the DepthAI v3 (DAIv3) and depthai-nodes libraries:
    pip install depthai
    pip install depthai-nodes
    
    Define model:
    import depthai as dai
    from depthai_nodes import ParsingNeuralNetwork  # import path may differ across depthai-nodes versions

    pipeline = dai.Pipeline()

    model_description = dai.NNModelDescription(
        "luxonis/deeplab-v3-plus:513x513"
    )

    # <CameraNode> is a placeholder for your camera node instance
    nn = pipeline.create(ParsingNeuralNetwork).build(
        <CameraNode>, model_description
    )
    
    Inspect model head(s):
    • SegmentationParser that outputs a dai.ImgFrame message (segmentation mask for 21 classes).
    Get parsed output(s):
    while pipeline.isRunning():
        # Each parsed output is an ImgFrame carrying the class-index segmentation mask
        parser_output: dai.ImgFrame = parser_output_queue.get()
    
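    As a quick follow-up, the parsed mask can be colorized for display. A minimal sketch continuing the loop above, assuming parser_output carries a single-channel class-index mask and that OpenCV is installed (the getCvFrame() behavior and colormap scaling are assumptions, not taken from the source):

    import cv2
    import numpy as np

    # Assumed: the ImgFrame holds an HxW mask of class indices in [0, 20]
    mask = parser_output.getCvFrame().astype(np.uint8)
    colored = cv2.applyColorMap(mask * 12, cv2.COLORMAP_JET)  # 20 * 12 = 240 <= 255
    cv2.imshow("segmentation", colored)
    cv2.waitKey(1)
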
    Example
    You can quickly run the model using our script. It automatically downloads the model, creates a DepthAI pipeline, runs inference, and displays the results using our DepthAI visualizer tool. To try it out, run:
    python3 main.py \
        --model luxonis/deeplab-v3-plus:512x288 \
        --overlay
    
    License: MIT (commercial use permitted)
    Downloads: 11526
    Tasks: Semantic Segmentation
    Model Types: IR, ONNX