DeepLab-V3-Plus
    Multi-class image segmentation model.
    Model Details
    Model Description
    DeepLab V3+ is an advanced deep learning model designed for semantic image segmentation tasks. It extends the previous DeepLab V3 model by incorporating an additional decoder module that enhances the spatial resolution of segmentation results, making it well-suited for tasks requiring precise boundary detection.
    • Developed by: Google
    • Shared by:
    • Model type: Computer vision
    • License: MIT
    • Resources for more information:
    Training Details
    Training Data
    The model was trained on 21 different classes. For more information about the training data, check the .
    Testing Details
    Metrics
    The evaluation was done on the validation set of the VOC dataset.

    Metric | Value
    mIOU   | 87.8

    Results are taken from .
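    For context, mIOU (mean intersection over union) averages the per-class IoU across the 21 classes. Below is a minimal numpy sketch of the metric, purely illustrative and not the evaluation code used for the table above; pred and target are assumed to be HxW arrays of class indices:

    import numpy as np

    def mean_iou(pred, target, num_classes=21):
        """Average per-class IoU; classes absent from both arrays are skipped."""
        ious = []
        for c in range(num_classes):
            pred_c, target_c = pred == c, target == c
            union = np.logical_or(pred_c, target_c).sum()
            if union == 0:
                continue  # class not present in prediction or ground truth
            ious.append(np.logical_and(pred_c, target_c).sum() / union)
        return float(np.mean(ious)) if ious else 0.0
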
    Technical Specifications
    Input/Output Details
    • Input:
      • Name: image
      • Info: NCHW BGR un-normalized image
    • Output:
      • Name: mask
      • Info: segmentation masks for 21 different classes (see the decoding sketch below)
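    Since the mask output holds one channel per class, a per-pixel label map is obtained by taking the argmax over the channel axis. A minimal numpy sketch, assuming the [1, 21, H, W] output layout listed in the variant specifications below (the array here is a placeholder, not real model output):

    import numpy as np

    scores = np.zeros((1, 21, 512, 512), dtype=np.float32)  # placeholder output tensor
    class_map = scores[0].argmax(axis=0)  # HxW map of class indices in [0, 20]
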
    Model Architecture
    • Backbone: MobileNetV2 - This lightweight architecture serves as the feature extractor. MobileNetV2 is efficient and optimized for mobile and embedded devices, using depthwise separable convolutions to reduce the number of parameters while maintaining strong feature representation.
    • Neck: The ASPP (Atrous Spatial Pyramid Pooling) module in DeepLabV3+ captures multi-scale context by applying atrous convolutions with different dilation rates, allowing the model to gather information at different spatial scales without reducing resolution (see the sketch after this list).
    • Head: The decoder upsamples the features extracted from the ASPP module and refines them using low-level features from earlier layers in the backbone. This helps recover spatial details, especially for accurate object boundaries in the segmentation map.
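    To make the ASPP idea concrete, here is a minimal PyTorch-style sketch of parallel atrous convolutions at several dilation rates. This is an illustrative simplification (the real module also includes an image-pooling branch and batch normalization), not the exact DeepLabV3+ implementation:

    import torch
    import torch.nn as nn

    class ASPP(nn.Module):
        """Parallel atrous convolutions capture context at multiple scales."""

        def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv2d(in_ch, out_ch,
                          kernel_size=3 if r > 1 else 1,
                          padding=r if r > 1 else 0,
                          dilation=r)
                for r in rates
            )
            # Fuse the concatenated branch outputs back to out_ch channels
            self.project = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

        def forward(self, x):
            return self.project(torch.cat([b(x) for b in self.branches], dim=1))
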
    Throughput
    Model variant: deeplab-v3-plus:person-256x256
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 35.02                 | N/A
    Model variant: deeplab-v3-plus:512x512
    • Input shape: [1, 3, 512, 512]
    • Output shape: [1, 21, 512, 512]
    • Params (M): 5.208
    • GFLOPs: 30.956
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 1.83                  | N/A
    RVC4     | INT8      | 200.30                | 5.45
    Model variant: deeplab-v3-plus:256x256
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 17.62                 | N/A
    Model variant: deeplab-v3-plus:512x288
    • Input shape: [1, 3, 288, 512]
    • Output shape: [1, 21, 288, 512]
    • Params (M): 5.208
    • GFLOPs: 17.566
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 3.63                  | N/A
    RVC4     | INT8      | 333.92                | 5.85
    Model variant: deeplab-v3-plus:513x513
    • Input shape: [1, 3, 513, 513]
    • Output shape: [1, 21, 513, 513]
    • Params (M): 5.798
    • GFLOPs: 27.256
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 3.**                  | N/A
    RVC4     | INT8      | 177.92                | 4.09
    Model variant: deeplab-v3-plus:person-513x513
    Platform | Precision | Throughput (infs/sec) | Power Consumption (W)
    RVC2     | FP16      | 6.35                  | N/A
    * Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
    * Parameters and FLOPs are obtained from the package.
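    The profiling package referenced above is not named in the source; as a generic illustration, a trainable-parameter count of the kind shown in the variant specifications can be computed directly in PyTorch (the torchvision MobileNetV2 here is a stand-in, not this exact model):

    import torchvision

    model = torchvision.models.mobilenet_v2()  # stand-in backbone, weights untrained
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"Params (M): {n_params / 1e6:.3f}")
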
    Quantization
    The RVC4 version of the model was quantized using the General dataset in HubAI.
    Utilization
    Models converted for RVC Platforms can be used for inference on OAK devices. DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in the model head(s)). Below, we present the most important utilization steps for this particular model. Please consult the docs for more information.
    Install the DepthAI v3 (DAIv3) and depthai-nodes libraries:
    pip install depthai
    pip install depthai-nodes
    
    Define model:
    import depthai as dai
    from depthai_nodes import ParsingNeuralNetwork  # import path may differ across depthai-nodes versions

    pipeline = dai.Pipeline()

    model_description = dai.NNModelDescription(
        "luxonis/deeplab-v3-plus:513x513"
    )

    # <CameraNode> is a placeholder for your camera node instance
    nn = pipeline.create(ParsingNeuralNetwork).build(
        <CameraNode>, model_description
    )
    
    Inspect model head(s):
    • SegmentationParser that outputs a dai.ImgFrame message (segmentation mask for 21 classes).
    Get parsed output(s):
    while pipeline.isRunning():
        # Each parsed output is an ImgFrame carrying the class-index segmentation mask
        parser_output: dai.ImgFrame = parser_output_queue.get()
    
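    As a quick follow-up, the parsed mask can be colorized for display. A minimal sketch continuing the loop above, assuming parser_output carries a single-channel class-index mask and that OpenCV is installed (the getCvFrame() behavior and colormap scaling are assumptions, not taken from the source):

    import cv2
    import numpy as np

    # Assumed: the ImgFrame holds an HxW mask of class indices in [0, 20]
    mask = parser_output.getCvFrame().astype(np.uint8)
    colored = cv2.applyColorMap(mask * 12, cv2.COLORMAP_JET)  # 20 * 12 = 240 <= 255
    cv2.imshow("segmentation", colored)
    cv2.waitKey(1)
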
    Example
    You can quickly run the model using our script. It automatically downloads the model, creates a DepthAI pipeline, runs inference, and displays the results using our DepthAI visualizer tool. To try it out, run:
    python3 main.py \
        --model luxonis/deeplab-v3-plus:512x288 \
        --overlay
    
    License: MIT (commercial use permitted)
    Downloads: 11526
    Tasks: Semantic Segmentation
    Model Types: IR, ONNX