Model Details
Model Description
DM-Count is a model for crowd counting.
It uses distribution matching to estimate a crowd density map, from which the count is computed.
The model outperforms detection-then-counting approaches on images of large crowds.
For small crowds, we suggest using the latter instead (e.g. see the YuNet or YOLOv6 models).
We offer three versions of the model, trained on the ShanghaiTech (SHA, SHB) and UCF-QNRF (QNRF) datasets.
Try all models and use the one that works best for your use case (e.g. use SHA for higher crowd density).
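To make the counting step concrete, the sketch below sums a density map to obtain the estimated count; the array shape and values here are hypothetical, not the exact model output layout:

```python
import numpy as np

# Hypothetical model output: a density map laid out as 1 x 1 x H x W.
density_map = np.random.rand(1, 1, 68, 120).astype(np.float32) * 0.01

# The crowd count is the integral (sum) of the density map.
estimated_count = float(density_map.sum())
print(f"Estimated crowd count: {estimated_count:.1f}")
```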
Developed by: Boyu Wang et al.
Shared by: the original repository
Model type: Computer Vision
License:
Resources for more information:
Photo by RUN 4 FFWPU.
Training Details
Training Data
The SHA and SHB models were trained on parts A and B of the ShanghaiTech dataset, consisting of 300 and 400 training images of crowded scenes, respectively.
Part A consists of images with significantly higher crowd density than part B.
The QNRF model was trained on the UCF-QNRF dataset, consisting of 1200 images with diverse crowd densities and viewing angles.
Testing Details
Metrics
MAE and MSE were calculated on the test sets of the corresponding datasets (see the Training Data section).
The results are taken from the original repository.
Beware that the metrics were calculated for models with dynamic input shapes, while we offer models with a fixed input shape.
This might affect their performance if significant input image resizing is performed.
| Model | MAE   | MSE    |
|-------|-------|--------|
| SHA   | 61.39 | 98.56  |
| SHB   | 7.68  | 12.66  |
| QNRF  | 88.97 | 154.11 |
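For reference, a minimal sketch of how these metrics are conventionally computed in crowd-counting benchmarks (the counts below are hypothetical; note that "MSE" in this literature usually denotes the root of the mean squared error):

```python
import numpy as np

# Hypothetical ground-truth and predicted counts over a test set.
gt_counts = np.array([152.0, 47.0, 310.0, 88.0])
pred_counts = np.array([160.0, 51.0, 295.0, 90.0])

errors = pred_counts - gt_counts
mae = np.abs(errors).mean()
# Crowd-counting benchmarks conventionally report the root of the
# mean squared error under the name "MSE".
mse = np.sqrt((errors ** 2).mean())
print(f"MAE: {mae:.2f}, MSE: {mse:.2f}")
```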
Technical Specifications
Input/Output Details
Input:
Name: image
Info: NCHW BGR image
Output:
Name: density_map
Info: Estimated crowd density
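If you prepare inputs yourself (outside a DepthAI pipeline), the image must match this layout. A minimal preprocessing sketch, assuming the 960x540 variants and a hypothetical file name:

```python
import cv2
import numpy as np

# Hypothetical input image; OpenCV loads images as HWC BGR by default.
frame = cv2.imread("crowd.jpg")

# Resize to the fixed input shape of the 960x540 model variants.
frame = cv2.resize(frame, (960, 540))

# HWC -> CHW, then add the batch dimension to get NCHW.
# Apply any normalization the exported model expects here (check the model config).
input_tensor = np.transpose(frame, (2, 0, 1))[np.newaxis, ...].astype(np.float32)
print(input_tensor.shape)  # (1, 3, 540, 960)
```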
Model Architecture
Backbone: VGG-19 (with final pooling and fully connected layers removed)
The model architecture is based on the work of Zhiheng Ma et al. (see the paper for more information).
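A sketch of how such a truncated VGG-19 backbone can be constructed in PyTorch; this is illustrative only, not the exported model definition:

```python
import torch
from torchvision import models

# VGG-19 feature extractor with the final max-pool dropped and the
# fully connected classifier left out entirely.
vgg = models.vgg19(weights=None)
backbone = torch.nn.Sequential(*list(vgg.features.children())[:-1])

x = torch.randn(1, 3, 540, 960)  # dummy NCHW input
print(backbone(x).shape)  # 512-channel feature map at reduced resolution
```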
Throughput
Model variants: dm-count:shb-960x540, dm-count:sha-960x540, dm-count:qnrf-960x540
* Benchmarked using 2 threads (and the DSP runtime in balanced mode for RVC4).
* Parameters and FLOPs are obtained from a model profiling package.
Quantization
RVC4 models were quantized on 100-image subsets of the appropriate training datasets (see the Training Data section).
Utilization
Models converted for RVC Platforms can be used for inference on OAK devices.
DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)).
Below, we present the most crucial utilization steps for this particular model.
Please consult the docs for more information.
The model output is post-processed by the MapOutputParser, which outputs a Map2D message (the crowd density map).
Get parsed output(s):

```python
while pipeline.isRunning():
    parser_output: Map2D = parser_output_queue.get()
```
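For context, here is a minimal sketch of how such a pipeline might be assembled end to end. It assumes the ParsingNeuralNetwork helper from the depthai-nodes package and the luxonis/dm-count:sha-960x540 model slug; both are assumptions, so consult the docs for the exact API:

```python
import depthai as dai
from depthai_nodes.node import ParsingNeuralNetwork
# Import path of the Map2D message may differ across depthai-nodes versions.
from depthai_nodes.message import Map2D

with dai.Pipeline() as pipeline:
    # Camera node feeding frames to the neural network.
    camera = pipeline.create(dai.node.Camera).build()

    # ParsingNeuralNetwork wires the model together with its MapOutputParser head.
    nn = pipeline.create(ParsingNeuralNetwork).build(
        camera, "luxonis/dm-count:sha-960x540"  # assumed HubAI model slug
    )
    parser_output_queue = nn.out.createOutputQueue()

    pipeline.start()
    while pipeline.isRunning():
        parser_output: Map2D = parser_output_queue.get()
        # The estimated crowd count is the sum of the density map values.
        print(f"Count: {parser_output.map.sum():.1f}")
```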
Example
You can quickly run the model using our example.
The example demonstrates how to build a 1-stage DepthAI pipeline consisting of a crowd density estimation model and how to process its outputs.
It automatically downloads the model, creates a DepthAI pipeline, runs the inference, and displays the results using our DepthAI visualizer tool.
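If you prefer to render the density map yourself instead of using the visualizer, a simple heatmap overlay sketch (the function and names are illustrative):

```python
import cv2
import numpy as np

def overlay_density(frame: np.ndarray, density_map: np.ndarray) -> np.ndarray:
    """Overlay a crowd density map on the original frame as a heatmap."""
    # Normalize the density map to 0-255 for colormapping.
    norm = cv2.normalize(density_map, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    heatmap = cv2.applyColorMap(norm, cv2.COLORMAP_JET)
    # Resize the heatmap to the frame size and blend the two images.
    heatmap = cv2.resize(heatmap, (frame.shape[1], frame.shape[0]))
    return cv2.addWeighted(frame, 0.6, heatmap, 0.4, 0)
```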