Luxonis
    Model Details
    Model Description
    The ArcFace model is a deep face recognition model with a ResNet100 backbone, trained with the Additive Angular Margin Loss (ArcFace). The loss adds an angular margin as an additive term inside the softmax loss, which enhances the discriminative power of the learned embeddings.
    • Developed by: Jiankang Deng et al.
    • Shared by:
    • Model type: Computer Vision
    • License:
    • Resources for more information:
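The additive angular margin from the model description can be sketched for a single sample as follows. This is a minimal pure-Python illustration, not the training code of this model: the function names are hypothetical, and the defaults `scale=64.0` and `margin=0.5` are the values commonly quoted from the ArcFace paper rather than details stated in this model card.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]


def arcface_loss(embedding, class_centers, target, scale=64.0, margin=0.5):
    """Additive angular margin softmax loss for one sample (illustrative)."""
    x = l2_normalize(embedding)
    logits = []
    for index, center in enumerate(class_centers):
        w = l2_normalize(center)
        # Cosine of the angle between the embedding and this class center.
        cosine = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, w))))
        if index == target:
            # Add the margin to the angle of the ground-truth class only.
            # (Real implementations also guard the case angle + margin > pi.)
            cosine = math.cos(math.acos(cosine) + margin)
        logits.append(scale * cosine)
    # Numerically stable softmax cross-entropy on the scaled logits.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]
```

With `margin=0.0` the function reduces to a plain softmax cross-entropy on scaled cosine logits; a positive margin makes the ground-truth class harder to satisfy, which pushes embeddings of the same identity closer together during training.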
    Training Details
    Training Data
    The model was trained using the MS1MV3 and IBUG500K datasets with a ResNet100 backbone, as these datasets were found to deliver the best performance on the evaluation benchmarks (LFW and YTF).
    • Training Datasets:
      • The MS1MV3 dataset (also called MS1M-RetinaFace) is a cleaned version of the MSCeleb1M dataset; all its face images have been pre-processed by the RetinaFace detector and are of size 112 × 112 pixels. It consists of 93K identities and 5.1 million images/videos.
      • The IBUG500K dataset, proposed in the ArcFace paper, consists of 493K identities and 11.96 million images/videos.
    • Evaluation Datasets:
      • The LFW (Labeled Faces in the Wild) dataset is a public benchmark for face verification, focusing on unconstrained face recognition. It contains over 13,000 face images collected from the web, each labeled with the individual's name. Among them, 1,680 people have two or more distinct photos.
      • The YTF (YouTube Faces) dataset is a public benchmark of face videos designed for studying the problem of unconstrained face recognition in videos. The dataset contains 3,425 videos of 1,595 different people. All the videos were downloaded from YouTube. An average of 2.15 videos are available for each subject. The shortest clip duration is 48 frames, the longest clip is 6,070 frames, and the average length of a video clip is 181.3 frames.
    Testing Details
    Metrics
    These results showcase the performance of ArcFace on the LFW and YTF benchmarks, since they are the most widely used benchmarks for unconstrained face verification on images and videos. The results are obtained from the .
    Evaluation Dataset    Training Dataset    Recognition accuracy (%)
    LFW                   MS1MV3              99.83
    LFW                   IBUG500K            99.83
    YTF                   MS1MV3              98.02
    YTF                   IBUG500K            98.01
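Verification benchmarks of this kind are typically scored on labeled pairs: a pair counts as correct when its similarity score falls on the right side of a decision threshold. The sketch below shows that computation only; it is an assumption about the general protocol, not a description of how the numbers above were produced, and the scores and threshold are made up for illustration.

```python
def verification_accuracy(scored_pairs, threshold):
    """Fraction of pairs classified correctly at a similarity threshold.

    scored_pairs: iterable of (similarity, is_same_person) tuples.
    """
    pairs = list(scored_pairs)
    correct = sum(
        1 for similarity, is_same in pairs
        if (similarity >= threshold) == is_same
    )
    return correct / len(pairs)


# Hypothetical similarity scores, for illustration only.
pairs = [(0.82, True), (0.15, False), (0.41, True), (0.47, False)]
print(f"{100 * verification_accuracy(pairs, threshold=0.45):.2f}%")  # prints 50.00%
```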
    Technical Specifications
    Input/Output Details
    • Input:
      • Name: images
        • Info: NCHW, BGR un-normalized image
    • Output:
      • Name: output
        • Info: NF, the output embeddings of the model
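To make the input layout concrete, the sketch below rearranges one height × width × 3 BGR image into the NCHW layout with a batch size of 1, keeping the raw 0-255 values (no normalization). It uses nested lists so that it runs without extra packages; the helper name is hypothetical, and this step only matters when preparing inputs by hand, for example when testing the model off-device.

```python
def hwc_bgr_to_nchw(image):
    """Convert one H x W x 3 image (nested lists, BGR) to 1 x 3 x H x W."""
    height, width = len(image), len(image[0])
    channels = len(image[0][0])
    planes = [
        [[float(image[y][x][c]) for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]
    return [planes]  # leading batch dimension N = 1


# A tiny 2 x 2 BGR image with raw 0-255 values (no normalization applied).
tiny = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [10, 20, 30]],
]
tensor = hwc_bgr_to_nchw(tiny)
# Shape as (N, C, H, W); prints: 1 3 2 2
print(len(tensor), len(tensor[0]), len(tensor[0][0]), len(tensor[0][0][0]))
```

For the real model the same layout applies with H = W = 112, giving the input shape [1, 3, 112, 112].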
    Model Architecture
    ArcFace is built around the Additive Angular Margin Loss, which enhances class separability and discriminative power. It uses Sub-center ArcFace, where each class has multiple sub-centers, to manage label noise by grouping clean and challenging samples separately. This improves model robustness in real-world conditions. Additionally, ArcFace can generate identity-preserved face images using gradients and Batch Normalization, without needing extra generative components. ArcFace delivers strong performance in both face recognition and face synthesis tasks.
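The sub-center idea can be illustrated with a short sketch: each class keeps several center vectors, and the class score is the cosine similarity to whichever sub-center is closest. The function names and the toy 2-D vectors are hypothetical; this shows only the pooling step, under the assumption that the per-class score then feeds the angular margin loss as usual.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def subcenter_class_scores(embedding, subcenters_per_class):
    """For each class, keep the similarity of its closest sub-center."""
    return [
        max(cosine(embedding, center) for center in subcenters)
        for subcenters in subcenters_per_class
    ]


# Two classes with two sub-centers each (toy 2-D vectors).
classes = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[-1.0, 0.0], [0.0, 1.0]],
]
scores = subcenter_class_scores([1.0, 0.0], classes)
```

Because only the closest sub-center contributes, hard or mislabeled samples can attach to a secondary sub-center without dragging the dominant one away from the clean samples.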
    Throughput
    Model variant: arcface:lfw-112x112
    • Input shape: [1, 3, 112, 112]
    • Output shape: [1, 512]
    Platform    Precision    Throughput (infs/sec)    Power Consumption (W)
    RVC2        FP16         7.61                     N/A
    RVC4        FP16         110.98                   6.91
    * Benchmarked with , using 2 threads (and the DSP runtime in balanced mode for RVC4).
    * Parameters and FLOPs are obtained from the package.
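The throughput figures can be turned into a rough time and energy budget per inference. The sketch below reads the rows above as 7.61 and 110.98 inferences per second and 6.91 W for RVC4. Since the benchmark ran with 2 threads, the result is an average per-inference figure at that throughput, not a single-request latency; the helper names are hypothetical.

```python
benchmarks = {
    "RVC2": {"throughput": 7.61, "power_w": None},  # power not reported
    "RVC4": {"throughput": 110.98, "power_w": 6.91},
}


def ms_per_inference(throughput):
    """Average milliseconds spent per inference at a given throughput."""
    return 1000.0 / throughput


def millijoules_per_inference(power_w, throughput):
    """Average energy per inference in millijoules (W / (inf/s) = J)."""
    return 1000.0 * power_w / throughput


for platform, row in benchmarks.items():
    line = f"{platform}: {ms_per_inference(row['throughput']):.1f} ms per inference"
    if row["power_w"] is not None:
        energy = millijoules_per_inference(row["power_w"], row["throughput"])
        line += f", {energy:.1f} mJ per inference"
    print(line)
```

This works out to roughly 131.4 ms per inference on RVC2, and about 9.0 ms and 62.3 mJ per inference on RVC4.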
    Utilization
    Models converted for RVC Platforms can be used for inference on OAK devices. DepthAI pipelines are used to define the information flow linking the device, inference model, and the output parser (as defined in model head(s)). Below, we present the most crucial utilization steps for the particular model. Please consult the docs for more information.
    Install DAIv3 and depthai-nodes libraries:
    pip install depthai
    pip install depthai-nodes
    
    Define model:
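    import depthai as dai
    from depthai_nodes.node import ParsingNeuralNetwork

    # `pipeline` is assumed to be an existing dai.Pipeline;
    # <CameraNode> is a placeholder for your camera node.
    model_description = dai.NNModelDescription(
        "luxonis/arcface:lfw-112x112"
    )
    
    nn = pipeline.create(ParsingNeuralNetwork).build(
        <CameraNode>, model_description
    )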
    
    Inspect model head(s):
    • EmbeddingsParser that outputs dai.NNData containing the output embeddings of the model.
    Get parsed output(s):
    # `parser_output_queue` is assumed to be an output queue created from the parser output.
    while pipeline.isRunning():
        parser_output: dai.NNData = parser_output_queue.get()
        embeddings = parser_output.getTensor("output")  # embeddings of shape (1, 512)
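Once two embeddings are available, a common way to compare faces is cosine similarity. The sketch below assumes each embedding has been flattened to a plain list of floats; the function names and the 0.5 threshold are hypothetical and would need tuning on your own data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_same_person(a, b, threshold=0.5):
    """Decide whether two embeddings match (threshold is an assumption)."""
    return cosine_similarity(a, b) >= threshold


# Toy 4-D vectors standing in for 512-D ArcFace embeddings.
reference = [0.2, 0.9, -0.1, 0.4]
candidate = [0.25, 0.85, -0.05, 0.35]
print(is_same_person(reference, candidate))
```

Cosine similarity ignores vector length, so it is unaffected by any overall scaling of the embeddings.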
    
    Example
    You can check out the . You can run it with:
    python3 main.py \
        -det luxonis/scrfd-face-detection:10g-640x640 \
        -rec luxonis/arcface:lfw-112x112
    
    ArcFace
    A deep face recognition model with ResNet100 backbone and ArcFace loss
    • License: MIT (commercial use)
    • Downloads: 321
    • Tasks: Image Embedding
    • Model Types: ONNX
    • Model Variants: available for RVC2, RVC4 (created 8 months ago)