Model Details
Model Description
Lite-HRNet is an efficient human pose estimator from the HRNet (high-resolution network) model family.
It detects a single human pose with 17 body keypoints (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles).
For optimal performance, it is advisable to pair the model with a person detector and use cropped person images as input.
We implement the Lite-HRNet-18 and Lite-HRNet-30 variants of the model here.
Developed by: Changqian Yu et al.
Shared by:
Model type: Computer Vision
License:
Resources for more information:
Training Details
Training Data
The model was trained on a dataset consisting of approximately 50K real-world images with 150K human pose instances.
Testing Details
Metrics
Model performance was evaluated by computing various Average Precision (AP) and Average Recall (AR) metrics, summarized in the table below.
| Model | Input Size | AP | AP50 | AP75 | APM | APL | AR |
|---|---|---|---|---|---|---|---|
| Lite-HRNet-18 | 256x192 | 64.8 | 86.7 | 73.0 | 62.1 | 70.5 | 71.2 |
| Lite-HRNet-18 | 384x288 | 67.6 | 87.8 | 75.0 | 64.5 | 73.7 | 73.7 |
| Lite-HRNet-30 | 256x192 | 67.2 | 88.0 | 75.0 | 64.3 | 73.1 | 73.3 |
| Lite-HRNet-30 | 384x288 | 70.4 | 88.7 | 77.7 | 67.5 | 76.3 | 76.2 |
Technical Specifications
Input/Output Details
Input:
Name: image
Info: NCHW BGR image
Output:
Name: heatmaps
Info: 2D grids where each grid corresponds to a particular keypoint
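To make the output format concrete, here is a minimal NumPy sketch of how such heatmaps can be decoded into keypoints. This is an illustration only, not the official parser (in DepthAI, HRNetParser handles decoding); it assumes 17 heatmaps at a typical stride-4 resolution of 64x48 for a 256x192 input.

```python
import numpy as np

def decode_heatmaps(heatmaps: np.ndarray) -> list[tuple[float, float, float]]:
    """Decode (K, H, W) keypoint heatmaps into (x, y, confidence) triples.

    Coordinates are normalized to [0, 1] relative to the heatmap grid,
    so they can be rescaled to the original crop size.
    """
    keypoints = []
    num_kpts, h, w = heatmaps.shape
    for k in range(num_kpts):
        grid = heatmaps[k]
        # The peak of each heatmap marks the most likely keypoint location.
        idx = np.argmax(grid)
        y, x = divmod(int(idx), w)
        conf = float(grid[y, x])
        keypoints.append((x / w, y / h, conf))
    return keypoints

# Synthetic example: 17 heatmaps of 64x48 (a 256x192 input at stride 4).
hm = np.zeros((17, 64, 48), dtype=np.float32)
hm[0, 10, 20] = 0.9  # pretend the "nose" heatmap peaks at (x=20, y=10)
kpts = decode_heatmaps(hm)
```

Production decoders often add sub-pixel refinement around the peak, but the argmax step above is the core idea.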
Model Architecture
The model combines HRNet's architecture with ShuffleNet's shuffle block. Additionally, it replaces the expensive pointwise (1x1) convolutions in the shuffle block with "conditional channel weighting".
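For intuition on the shuffle block's mixing step, here is a small NumPy sketch of ShuffleNet's channel-shuffle operation. This is an illustrative standalone snippet, not Lite-HRNet code.

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """ShuffleNet-style channel shuffle on an NCHW tensor.

    Channels are split into `groups`, then interleaved so that
    the next grouped convolution mixes information across groups.
    """
    n, c, h, w = x.shape
    assert c % groups == 0
    # Reshape to (N, groups, C//groups, H, W), swap the two channel axes,
    # and flatten back so adjacent channels come from different groups.
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

# A 1x4x1x1 tensor with channels [0, 1, 2, 3]; with 2 groups the
# channels interleave to [0, 2, 1, 3].
x = np.arange(4, dtype=np.float32).reshape(1, 4, 1, 1)
y = channel_shuffle(x, groups=2)
```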
For more details, refer to the Lite-HRNet paper.
* Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
* Parameters and FLOPs are obtained from the package.
Quantization
RVC4 models were not quantized to int8 due to issues with sigmoid quantization.
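As a rough illustration of why sigmoid outputs are awkward to quantize (this sketch does not reproduce the exact issue encountered, only the general resolution problem): uniform int8 quantization spreads 256 levels over the whole [0, 1] output range, so the quantization step dwarfs sigmoid's resolution in its saturated tails.

```python
import numpy as np

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

# Sigmoid outputs over a typical logit range.
z = np.linspace(-8.0, 8.0, 1001)
y = sigmoid(z)

# Uniform affine int8 quantization: 256 levels over the observed range.
y_min, y_max = float(y.min()), float(y.max())
scale = (y_max - y_min) / 255.0
zero_point = -128
q = np.clip(np.round((y - y_min) / scale) + zero_point, -128, 127).astype(np.int8)
y_hat = (q.astype(np.float32) - zero_point) * scale + y_min

# The step between adjacent levels (~0.004) exceeds the spacing of sigmoid
# outputs in the tails, so distinct activations collapse to one level.
max_err = float(np.abs(y - y_hat).max())
```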
Utilization
Models converted for RVC Platforms can be used for inference on OAK devices.
DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)).
Below, we present the most crucial utilization steps for the particular model.
Please consult the docs for more information.
The model output is parsed with HRNetParser, which outputs a Keypoints message (detected body skeleton keypoints).
Get parsed output(s):
while pipeline.isRunning():
    parser_output: Keypoints = parser_output_queue.get()
Example
You can quickly run the model using our example.
The example demonstrates how to build a 2-stage DepthAI pipeline for human pose estimation. The pipeline consists of a pose detection model and a pose estimation model.
It automatically downloads the models, creates a DepthAI pipeline, runs the inference, and displays the results using our DepthAI visualizer tool.