Model Details
Model Description
DM-Count is a model for crowd counting.
It uses distribution matching to estimate a crowd density map, from which the count is computed.
The model outperforms detection-then-counting approaches on images of large crowds.
For small crowds, we suggest using the latter instead (e.g. see the YuNet or YOLOv6 models).
We offer three versions of the model, trained on the ShanghaiTech (SHA, SHB) and UCF-QNRF (QNRF) datasets.
Try all models and use the one that works best for your use case (e.g. use SHA for higher crowd density).
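To make the counting step concrete, the sketch below sums a density map to obtain the estimated count; the array shape and values here are hypothetical, not the exact model output layout:

```python
import numpy as np

# Hypothetical model output: a density map laid out as 1 x 1 x H x W.
density_map = np.random.rand(1, 1, 68, 120).astype(np.float32) * 0.01

# The crowd count is the integral (sum) of the density map.
estimated_count = float(density_map.sum())
print(f"Estimated crowd count: {estimated_count:.1f}")
```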
Developed by: Boyu Wang et al.
Shared by: the original repository
Model type: Computer Vision
License:
Resources for more information:
Photo by RUN 4 FFWPU.
Training Details
Training Data
The SHA and SHB models were trained on parts A and B of the ShanghaiTech dataset, consisting of 300 and 400 training images of crowded scenes, respectively.
Part A consists of images with significantly higher crowd density than part B.
The QNRF model was trained on the UCF-QNRF dataset, consisting of 1200 images with diverse crowd densities and viewing angles.
Testing Details
Metrics
MAE and MSE were calculated on the test sets of the corresponding datasets (see the Training Data section).
The results are taken from the original repository.
Beware that the metrics were calculated for models with dynamic input shapes, while we offer models with a fixed input shape.
This might affect their performance if significant input image resizing is performed.
| Model | MAE   | MSE    |
|-------|-------|--------|
| SHA   | 61.39 | 98.56  |
| SHB   | 7.68  | 12.66  |
| QNRF  | 88.97 | 154.11 |
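For reference, a minimal sketch of how these metrics are conventionally computed in crowd-counting benchmarks (the counts below are hypothetical; note that "MSE" in this literature usually denotes the root of the mean squared error):

```python
import numpy as np

# Hypothetical ground-truth and predicted counts over a test set.
gt_counts = np.array([152.0, 47.0, 310.0, 88.0])
pred_counts = np.array([160.0, 51.0, 295.0, 90.0])

errors = pred_counts - gt_counts
mae = np.abs(errors).mean()
# Crowd-counting benchmarks conventionally report the root of the
# mean squared error under the name "MSE".
mse = np.sqrt((errors ** 2).mean())
print(f"MAE: {mae:.2f}, MSE: {mse:.2f}")
```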
Technical Specifications
Input/Output Details
Input:
Name: image
Info: NCHW BGR image
Output:
Name: density_map
Info: Estimated crowd density
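If you prepare inputs yourself (outside a DepthAI pipeline), the image must match this layout. A minimal preprocessing sketch, assuming the 960x540 variants and a hypothetical file name:

```python
import cv2
import numpy as np

# Hypothetical input image; OpenCV loads images as HWC BGR by default.
frame = cv2.imread("crowd.jpg")

# Resize to the fixed input shape of the 960x540 model variants.
frame = cv2.resize(frame, (960, 540))

# HWC -> CHW, then add the batch dimension to get NCHW.
# Apply any normalization the exported model expects here (check the model config).
input_tensor = np.transpose(frame, (2, 0, 1))[np.newaxis, ...].astype(np.float32)
print(input_tensor.shape)  # (1, 3, 540, 960)
```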
Model Architecture
Backbone: VGG-19 (with final pooling and fully connected layers removed)
The model architecture is based on the work of Zhiheng Ma et al. (see the paper for more information).
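A sketch of how such a truncated VGG-19 backbone can be constructed in PyTorch; this is illustrative only, not the exported model definition:

```python
import torch
from torchvision import models

# VGG-19 feature extractor with the final max-pool dropped and the
# fully connected classifier left out entirely.
vgg = models.vgg19(weights=None)
backbone = torch.nn.Sequential(*list(vgg.features.children())[:-1])

x = torch.randn(1, 3, 540, 960)  # dummy NCHW input
print(backbone(x).shape)  # 512-channel feature map at reduced resolution
```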
Throughput
Model variants: dm-count:shb-960x540, dm-count:sha-960x540, dm-count:qnrf-960x540
* Benchmarked using 2 threads (and the DSP runtime in balanced mode for RVC4).
* Parameters and FLOPs are obtained from a model profiling package.
Quantization
RVC4 models were quantized on 100-image subsets of the appropriate training datasets (see the Training Data section).
Utilization
Models converted for RVC Platforms can be used for inference on OAK devices.
DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)).
Below, we present the most crucial utilization steps for this particular model.
Please consult the docs for more information.
The model output is post-processed by the MapOutputParser, which outputs a Map2D message (the crowd density map).
Get parsed output(s):

```python
while pipeline.isRunning():
    parser_output: Map2D = parser_output_queue.get()
```
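For context, here is a minimal sketch of how such a pipeline might be assembled end to end. It assumes the ParsingNeuralNetwork helper from the depthai-nodes package and the luxonis/dm-count:sha-960x540 model slug; both are assumptions, so consult the docs for the exact API:

```python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork
# Import path of the Map2D message may differ across depthai-nodes versions.
from depthai_nodes.message import Map2D

with dai.Pipeline() as pipeline:
    # Camera node feeding frames to the neural network.
    camera = pipeline.create(dai.node.Camera).build()

    # ParsingNeuralNetwork wires the model together with its MapOutputParser head.
    nn = pipeline.create(ParsingNeuralNetwork).build(
        camera, "luxonis/dm-count:sha-960x540"  # assumed HubAI model slug
    )
    parser_output_queue = nn.out.createOutputQueue()

    pipeline.start()
    while pipeline.isRunning():
        parser_output: Map2D = parser_output_queue.get()
        # The estimated crowd count is the sum of the density map values.
        print(f"Count: {parser_output.map.sum():.1f}")
```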
Example
You can quickly run the model using our example.
The example demonstrates how to build a 1-stage DepthAI pipeline consisting of a crowd density estimation model and how to process its outputs.
It automatically downloads the model, creates a DepthAI pipeline, runs the inference, and displays the results using our DepthAI visualizer tool.
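If you prefer to render the density map yourself instead of using the visualizer, a simple heatmap overlay sketch (the function and names are illustrative):

```python
import cv2
import numpy as np

def overlay_density(frame: np.ndarray, density_map: np.ndarray) -> np.ndarray:
    """Overlay a crowd density map on the original frame as a heatmap."""
    # Normalize the density map to 0-255 for colormapping.
    norm = cv2.normalize(density_map, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    heatmap = cv2.applyColorMap(norm, cv2.COLORMAP_JET)
    # Resize the heatmap to the frame size and blend the two images.
    heatmap = cv2.resize(heatmap, (frame.shape[1], frame.shape[0]))
    return cv2.addWeighted(frame, 0.6, heatmap, 0.4, 0)
```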