Model Details
Model Description
The ArcFace model is a deep face recognition model with a ResNet100 backbone, trained with the Additive Angular Margin Loss (ArcFace). ArcFace is a supervision signal that adds an angular margin to the angle between an embedding and its target class center inside the softmax loss, which enhances the discriminative power of the softmax loss.
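The additive angular margin described above can be sketched for a single training sample as follows. This is a minimal illustration, not the training code behind this model: the function names are hypothetical, and the scale `s=64.0` and margin `m=0.5` are assumed defaults that this card does not state.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arcface_loss(embedding, class_weights, label, s=64.0, m=0.5):
    """Additive angular margin loss for one sample (illustrative sketch).

    embedding:     feature vector of the sample
    class_weights: one weight vector (class center) per identity
    label:         index of the ground-truth identity
    s, m:          feature scale and angular margin (assumed values)
    """
    e = l2_normalize(embedding)
    logits = []
    for j, weight in enumerate(class_weights):
        w = l2_normalize(weight)
        cos_theta = sum(a * b for a, b in zip(e, w))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        theta = math.acos(cos_theta)
        if j == label:
            # The margin is added to the target-class angle only.
            # Real implementations also handle theta + m > pi; omitted here.
            theta += m
        logits.append(s * math.cos(theta))
    # Numerically stable softmax cross-entropy over the scaled logits.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


centers = [[1.0, 0.0], [0.0, 1.0]]
loss_without_margin = arcface_loss([1.0, 1.0], centers, label=0, m=0.0)
loss_with_margin = arcface_loss([1.0, 1.0], centers, label=0, m=0.5)
```

With the margin enabled, the same sample incurs a larger loss, so training has to pull the embedding closer to its class center than plain softmax would require.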
Developed by: Jiankang Deng et al.
Shared by:
Model type: Computer Vision
License:
Resources for more information:
Training Details
Training Data
The model was trained using the MS1MV3 and IBUG500K datasets with a ResNet100 backbone, as these datasets were found to deliver the best performance on the evaluation benchmarks (LFW and YTF).
Training Datasets:
The MS1MV3 dataset (also called MS1M-RetinaFace) is a cleaned version of the MS-Celeb-1M dataset; all its face images have been pre-processed by the RetinaFace detector and are of size 112 × 112 pixels. It consists of 93K identities and 5.1 million images/videos.
The IBUG500K dataset, proposed in the ArcFace paper, consists of 493K identities and 11.96 million images/videos.
Evaluation Datasets:
The LFW (Labeled Faces in the Wild) dataset is a public benchmark for face verification, focusing on unconstrained face recognition. It contains over 13,000 face images collected from the web, each labeled with the individual's name. Among them, 1,680 people have two or more distinct photos.
The YTF (YouTube Faces) dataset is a public benchmark of face videos designed for studying the problem of unconstrained face recognition in videos. The dataset contains 3,425 videos of 1,595 different people. All the videos were downloaded from YouTube. An average of 2.15 videos are available for each subject. The shortest clip is 48 frames long, the longest is 6,070 frames, and the average clip length is 181.3 frames.
Testing Details
Metrics
These results show the performance of ArcFace on the LFW and YTF benchmarks, the most widely used benchmarks for unconstrained face verification on images and videos, respectively. The results are obtained from the ArcFace paper.
| Evaluation Dataset | Training Dataset | Recognition accuracy (%) |
|---|---|---|
| LFW | MS1MV3 | 99.83 |
| LFW | IBUG500K | 99.83 |
| YTF | MS1MV3 | 98.02 |
| YTF | IBUG500K | 98.01 |
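As a rough illustration of how a verification accuracy like the ones reported above is computed, the sketch below compares pairs of embeddings by cosine similarity against a threshold. The function names, the toy pairs, and the fixed threshold are assumptions for illustration; the official LFW protocol selects thresholds by 10-fold cross-validation, which is not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verification_accuracy(pairs, threshold):
    """Percentage of pairs classified correctly as same/different identity.

    pairs: list of (embedding_a, embedding_b, same_identity) tuples.
    """
    correct = 0
    for emb_a, emb_b, same_identity in pairs:
        predicted_same = cosine_similarity(emb_a, emb_b) >= threshold
        if predicted_same == same_identity:
            correct += 1
    return 100.0 * correct / len(pairs)


# Toy 2-D "embeddings"; real ArcFace embeddings have many more dimensions.
toy_pairs = [
    ([1.0, 0.0], [0.9, 0.1], True),
    ([1.0, 0.0], [0.0, 1.0], False),
    ([0.6, 0.8], [0.8, 0.6], True),
    ([0.6, 0.8], [-0.8, 0.6], False),
]
accuracy = verification_accuracy(toy_pairs, threshold=0.5)
```

Raising the threshold makes the decision stricter: with `threshold=0.97` the third pair (similarity 0.96) is rejected and the toy accuracy drops.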
Technical Specifications
Input/Output Details
Input:
Name: images
Info: NCHW, BGR un-normalized image
Output:
Name: output
Info: NF, the output embeddings of the model
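The input layout above can be illustrated with a small conversion sketch. It assumes the source image arrives as height × width × channel RGB data, which this card does not state, and the helper name is hypothetical. Plain nested lists are used to keep the example dependency-free; in practice an array library would do the same with a channel flip and a transpose. Resizing to the model's input resolution is out of scope here.

```python
def hwc_rgb_to_nchw_bgr(image):
    """Convert an [H][W][3] RGB image to a [1][3][H][W] BGR batch.

    Pixel values are passed through unchanged, since the model expects
    an un-normalized image.
    """
    height = len(image)
    width = len(image[0])
    channel_order = (2, 1, 0)  # pick B, G, R out of R, G, B
    planes = [
        [[image[y][x][c] for x in range(width)] for y in range(height)]
        for c in channel_order
    ]
    return [planes]  # leading batch dimension N = 1


# A 2 x 2 RGB image: red, green / blue, dark gray-ish.
tiny_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
]
batch = hwc_rgb_to_nchw_bgr(tiny_image)
```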
Model Architecture
ArcFace is built around the Additive Angular Margin Loss, which enhances class separability and discriminative power. It uses Sub-center ArcFace, where each class has multiple sub-centers so that clean and challenging samples are grouped separately, which helps manage label noise and improves model robustness in real-world conditions. Additionally, ArcFace can generate identity-preserved face images using gradients and Batch Normalization, without needing extra generative components. ArcFace delivers strong performance in both face recognition and face synthesis tasks.
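The sub-center idea can be sketched as follows: each class keeps several sub-centers, the class similarity is the maximum cosine over them, and the angular margin is then applied to the target class as usual. This is a minimal sketch under assumed names and assumed values for `s` and `m`, not the implementation used to train this model.

```python
import math


def unit(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def subcenter_cosine(embedding, sub_centers):
    """Largest cosine similarity between the embedding and any sub-center."""
    e = unit(embedding)
    return max(sum(a * b for a, b in zip(e, unit(c))) for c in sub_centers)


def subcenter_arcface_logits(embedding, classes, label, s=64.0, m=0.5):
    """Scaled logits with max-pooling over sub-centers (illustrative).

    classes: one list of sub-center vectors per identity
    label:   index of the ground-truth identity
    s, m:    feature scale and angular margin (assumed values)
    """
    logits = []
    for j, sub_centers in enumerate(classes):
        cos_theta = max(-1.0, min(1.0, subcenter_cosine(embedding, sub_centers)))
        theta = math.acos(cos_theta)
        if j == label:
            theta += m  # margin on the target class only
        logits.append(s * math.cos(theta))
    return logits


# Two identities with two sub-centers each.
toy_classes = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
toy_logits = subcenter_arcface_logits([0.0, 2.0], toy_classes, label=0)
```

Because only the closest sub-center determines the class similarity, a hard or noisy sample can attach to a non-dominant sub-center without dragging the dominant one toward it.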
* Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
* Parameters and FLOPs are obtained from the package.
Utilization
Models converted for RVC Platforms can be used for inference on OAK devices.
DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)).
Below, we present the most crucial utilization steps for the particular model.
Please consult the docs for more information.