Model Details
Model Description
SCRFD is a fast, efficient, high-accuracy face-detection model that can detect faces at different scales. For this reason, it is suitable for detecting faces both near the camera and very far from it. SCRFD performs well in crowded scenes and challenging lighting conditions. Besides face detection, it also detects 5 keypoints on every face (2 for the eyes, 1 for the nose, and 2 for the corners of the mouth).
Developed by: InsightFace
Shared by:
Model type: Computer Vision
License:
Resources for more information:
Training Details
Training Data
The model was trained on the WIDERFace dataset, which is split into training, validation, and test sets. WIDERFace is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset. Its authors chose 32,203 images and labeled 393,703 faces with a high degree of variability in scale, pose, and occlusion. The dataset is organized into 61 event classes. For each event class, 40%/10%/50% of the data is randomly selected for the training, validation, and test sets, respectively. Based on the detection rate of EdgeBox (Zitnick & Dollár, 2014), three levels of difficulty (Easy, Medium, and Hard) are defined by incrementally incorporating hard samples.
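To make the quoted figures concrete, the short sketch below works out the nominal split sizes implied by the 40%/10%/50% rule and the average number of labeled faces per image. It uses only the numbers stated above; the function name is illustrative, and because the split is drawn per event class, the published split sizes may differ slightly from these nominal values.

```python
# Figures quoted in the dataset description above.
TOTAL_IMAGES = 32_203
TOTAL_FACES = 393_703
SPLIT_PERCENT = {"train": 40, "val": 10, "test": 50}


def nominal_split_sizes(total, percents):
    """Return floor(total * pct / 100) for each split, using integer arithmetic."""
    return {name: total * pct // 100 for name, pct in percents.items()}


sizes = nominal_split_sizes(TOTAL_IMAGES, SPLIT_PERCENT)
faces_per_image = TOTAL_FACES / TOTAL_IMAGES

print(sizes)
print(f"{faces_per_image:.1f} labeled faces per image on average")
```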
Testing Details
Metrics
Evaluation of the model on WIDERFace dataset for all three categories: Easy, Medium and Hard. Results are taken from .
Category    mAP
Easy        95.16
Medium      93.87
Hard        83.05
Technical Specifications
Input/Output Details
Input:
Name: input.1
Info: NCHW BGR un-normalized image
Output:
Name: Multiple (please consult NN archive config.json)
Info: Classification scores, bounding boxes, and keypoints for a multitude of detections.
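As an illustration of the input layout named above, the following sketch reorders a tiny image from the common height-width-channel (HWC) layout into a batched NCHW tensor, keeping BGR channel order and raw 0-255 values (no normalization). It uses nested Python lists so it has no dependencies; the helper names are hypothetical, and the real input resolution is not assumed here. On an OAK device this conversion is normally handled by the pipeline, so the sketch is only meant to show what the layout means.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from [H][W][C] to [C][H][W]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


def to_nchw_batch(image):
    """Wrap a single CHW image in a batch dimension, giving [N=1][C][H][W]."""
    return [hwc_to_chw(image)]


# A 2x2 image; every pixel is (B, G, R) with un-normalized 0-255 values.
tiny_bgr = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
]

batch = to_nchw_batch(tiny_bgr)
blue_plane, green_plane, red_plane = batch[0]
```

After the conversion, each channel is a separate height-by-width plane, with the blue plane first because the channel order is BGR.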
* Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
* Parameters and FLOPs are obtained from the package.
Quantization
The RVC4 version of the model was quantized using a custom dataset.
This was created by taking a 40-image subset of the dataset.
Utilization
Models converted for RVC Platforms can be used for inference on OAK devices.
DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)).
Below, we present the most crucial utilization steps for the particular model.
Please consult the docs for more information.
The model head is parsed by SCRFDParser, which outputs an ImgDetectionsExtended message (bounding boxes and confidence scores for every detected face).
Get parsed output(s):
while pipeline.isRunning():
    # Blocks until the parser emits the next message.
    parser_output: ImgDetectionsExtended = parser_output_queue.get()
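Once detections have been read from the queue, a typical next step is to keep only the confident ones. The sketch below shows one way to do that. The FaceDetection class is a hypothetical stand-in defined here for illustration; it is not the depthai-nodes message type, and its field names, the normalized coordinates, and the 0.5 threshold are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class FaceDetection:
    """Hypothetical stand-in for one parsed face; not a depthai-nodes class."""
    confidence: float
    bbox: tuple  # (x_min, y_min, x_max, y_max), assumed normalized to [0, 1]
    keypoints: list = field(default_factory=list)  # five (x, y) pairs


def keep_confident(detections, threshold=0.5):
    """Return detections with confidence >= threshold, highest first."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


# Made-up example values, used only to exercise the function.
faces = [
    FaceDetection(0.42, (0.10, 0.10, 0.20, 0.25)),
    FaceDetection(0.91, (0.40, 0.30, 0.60, 0.55)),
    FaceDetection(0.67, (0.70, 0.20, 0.80, 0.35)),
]

confident_faces = keep_confident(faces, threshold=0.5)
```

Sorting by confidence is a convenience for display or for picking the single most likely face; it is not required by the parser.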
Example
You can quickly run the model using our script.
It automatically downloads the model, creates a DepthAI pipeline, runs the inference, and displays the results using our DepthAI visualizer tool.
To try it out, run: