Facial Recognition and Swapping

Photo sharing has become the norm on social networking sites. Even with so much else happening on social media, self-portraits, or selfies, continue to dominate. The growing selfie phenomenon has led to the emergence of face-related applications embedded in cameras and social media platforms. These applications can detect and track human faces in real time, or even categorize photos by the faces in them. They can also be used for verification, as in Alipay's face login, which uses a person's face as a personal ID.

These applications are based on either face detection or facial recognition technology. Face detection locates the faces in an image; facial recognition extends it by matching the unique characteristics of a face in order to identify a person. Collectively, both technologies are termed face recognition technology.

Outlined below are some of the different categories of facial effect applications.

Categories of Effects

Face Warp

Face warp reshapes or resizes specific parts of a face, as if the image were viewed through an irregular lens. This is achieved by remapping pixel coordinates.
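As a rough illustration of coordinate remapping, the sketch below applies a simple "bulge" lens to a grayscale image. It is not taken from any of the apps mentioned here: the GrayImage type, the bulgeWarp function, and its radius and strength parameters are all hypothetical names chosen for this example, and nearest-neighbor sampling is used only to keep the code short.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Backward mapping: for every destination pixel inside the lens circle,
// compute the source coordinate it is sampled from. Sampling closer to the
// centre than the destination pixel magnifies the centre of the lens.
GrayImage bulgeWarp(const GrayImage& src, double cx, double cy,
                    double radius, double strength) {
    assert(src.width > 0 && src.height > 0);
    assert(src.pixels.size() ==
           static_cast<std::size_t>(src.width) * src.height);
    assert(radius > 0.0 && strength >= 0.0);

    GrayImage dst = src;  // pixels outside the lens are left unchanged
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            const double dx = x - cx;
            const double dy = y - cy;
            const double r = std::sqrt(dx * dx + dy * dy);
            if (r >= radius) continue;

            // scale is in [0, 1]; strength == 0 gives scale == 1 (identity).
            const double scale = std::pow(r / radius, strength);
            int sx = static_cast<int>(std::lround(cx + dx * scale));
            int sy = static_cast<int>(std::lround(cy + dy * scale));
            sx = std::max(0, std::min(src.width - 1, sx));
            sy = std::max(0, std::min(src.height - 1, sy));

            dst.pixels[static_cast<std::size_t>(y) * src.width + x] =
                src.at(sx, sy);
        }
    }
    return dst;
}
```

Calling bulgeWarp(image, eyeX, eyeY, 40.0, 1.0) with a landmark position as the centre would enlarge the area around that point; a production implementation would more likely interpolate between neighboring pixels instead of rounding.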

Facial Textures and Accessories

Numerous applications, such as MeiTu and Snapchat, make use of these types of effects. Once a human face has been identified, the app lets users apply different textures and accessories to the photo. Most of these effects can also be applied to real-time video.

Face Swap

Face swap is primarily applicable to group photos. First, the user identifies a source face and then swaps it with the target face in the same photo. The swapped faces are then processed with an image fusion technology to make the swap appear more realistic.

Face Morph

Similar to a face swap, face morph requires two faces but combines them into a single face. Face morph is also applicable to animated figures or animal faces.

Face Animation

This category is typically a combination of multiple face effects, such as a combination of face warp and textures. The images are then animated to enhance the effects.

Implementation Principles

[Figure: the painting with the user's face swapped in; the result is on the right]

In this example, the user's face replaces the one in the painting. The photo on the right shows the result of the replacement. Algorithmically, the process consists of six steps: face detection, key point location, lens deformation, regional extraction, color transfer, and edge fusion.

Face Detection

Face detection is a technology that identifies a human face in digital images.
This example uses DLib for face detection and the code is as follows:

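// Create dlib's frontal face detector
dlib::frontal_face_detector detector = dlib::get_frontal_face_detector();
// Wrap the OpenCV image (cvImg, a cv::Mat) without copying it. rgb_pixel assumes cvImg
// was already converted to RGB; for OpenCV's default BGR order use dlib::bgr_pixel.
dlib::cv_image<dlib::rgb_pixel> img(cvImg);
std::vector<dlib::rectangle> faces = detector(img);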

The detector returns one bounding box (dlib::rectangle) for each face it finds.

Key Point Location

Once a face has been detected, DLib is used to locate its key points. Key points, also known as landmarks, identify the key features of the face.
DLib provides a 68-point landmark detection function:

dlib::shape_predictor sp;

// Load the trained 68-point landmark model
dlib::deserialize(LandMarksModelFile) >> sp;

// Locate the landmarks of the first detected face (faces must not be empty)
std::vector<dlib::point> landmarks;
dlib::full_object_detection shape = sp(img, faces[0]);
for (unsigned long i = 0; i < shape.num_parts(); i++) {
    landmarks.push_back(shape.part(i));
}

The 68 landmarks are coordinates of various parts of the human face stored in the following order:

{
    IdxRange jaw;       // [0 , 16]
    IdxRange rightBrow; // [17, 21]
    IdxRange leftBrow;  // [22, 26]
    IdxRange nose;      // [27, 35]
    IdxRange rightEye;  // [36, 41]
    IdxRange leftEye;   // [42, 47]
    IdxRange mouth;     // [48, 59]
    IdxRange mouth2;    // [60, 67]
}
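To show how these index ranges might be used, here is a minimal sketch that slices one facial part out of the full landmark list. The article does not show the definition of IdxRange, so the struct below (an inclusive first/last pair), the Point2 stand-in for dlib::point, the constant names, and the slice helper are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for dlib::point so the sketch has no dependencies.
struct Point2 {
    long x;
    long y;
};

// Assumed definition: an inclusive [first, last] range of landmark indices.
struct IdxRange {
    std::size_t first;
    std::size_t last;
};

// Index ranges copied from the layout listed above.
const IdxRange kJaw{0, 16};
const IdxRange kRightBrow{17, 21};
const IdxRange kLeftBrow{22, 26};
const IdxRange kNose{27, 35};
const IdxRange kRightEye{36, 41};
const IdxRange kLeftEye{42, 47};
const IdxRange kMouth{48, 59};
const IdxRange kMouth2{60, 67};

// Return the landmarks that belong to one facial part.
std::vector<Point2> slice(const std::vector<Point2>& landmarks, IdxRange r) {
    assert(r.first <= r.last && r.last < landmarks.size());
    const auto begin = landmarks.begin() + static_cast<std::ptrdiff_t>(r.first);
    const auto end = landmarks.begin() + static_cast<std::ptrdiff_t>(r.last) + 1;
    return std::vector<Point2>(begin, end);
}
```

For example, slice(landmarks, kLeftEye) would return the six points that outline the left eye, which could then serve as the centre of a warp or as an anchor for an accessory.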

Lens Deformation

The lens deformation effect in this example is performed using a homography transformation. A homography H is a 3x3 matrix that maps points on one plane to points on another. Here, each face is treated as approximately planar, and H maps the landmarks of one face onto the corresponding landmarks of the other:

// Estimate the homography between the two faces from their corresponding landmarks
cv::Mat H = cv::findHomography(face1.landMarks, face2.landMarks);

// Apply homography transformation to the entire photo
cv::warpPerspective(im1, warpIm1, H, im2.size());
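To make the role of H more concrete, the sketch below applies a 3x3 homography to a single point by hand. Conceptually, cv::warpPerspective performs the same mapping for every pixel. The Vec2 and Mat3 types and the applyHomography function are hypothetical names for this example and are not part of OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical 2-D point and row-major 3x3 matrix types.
struct Vec2 {
    double x;
    double y;
};
using Mat3 = std::array<double, 9>;

// Map p through H using homogeneous coordinates:
//   [u, v, w]^T = H * [x, y, 1]^T, then divide by w.
Vec2 applyHomography(const Mat3& H, Vec2 p) {
    const double u = H[0] * p.x + H[1] * p.y + H[2];
    const double v = H[3] * p.x + H[4] * p.y + H[5];
    const double w = H[6] * p.x + H[7] * p.y + H[8];
    assert(std::abs(w) > 1e-12);  // w == 0 would map the point to infinity
    return {u / w, v / w};
}
```

The division by w is what distinguishes a homography from an affine transform: it lets the mapping express perspective changes, such as a face turned slightly away from the camera.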

The transformation result is shown in the figure below. We can see that the angle and posture of the transformed face are similar to those of the face in the painting.

[Figure: the source photo warped by the homography to match the pose of the face in the painting]

Regional Extraction

Regional extraction discards everything outside the face itself, such as the hair and neck. The aim is to build a mask that covers only the area spanned by the facial landmarks. To obtain the mask, a Gaussian blur is first applied to the initial mask, which spreads nonzero values slightly beyond its boundary. Thresholding at zero then converts the blurred result back into a binary image, so the selected region ends up slightly expanded:

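// Kernel size (must be odd); a larger value expands the mask further
int blurAmount = 5;
cv::Mat maskBlur;
// Blurring spreads nonzero values just outside the original mask boundary
cv::GaussianBlur(histMask, maskBlur, cv::Size(blurAmount, blurAmount), 0);
// Every pixel that is now nonzero becomes 255, giving a slightly larger binary mask
cv::threshold(maskBlur, histMask, 0, 255, cv::THRESH_BINARY);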
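The net effect of blurring and then thresholding at zero is roughly that of a morphological dilation: a pixel ends up in the mask if any pixel in its neighborhood was already in it. The sketch below implements that dilation directly, as a stand-in for the two OpenCV calls rather than a reimplementation of them. The Mask type and the expandMask function are hypothetical, and the exact result of the OpenCV version can differ slightly because of rounding inside the blur.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical binary mask: row-major, 0 = background, 255 = face region.
struct Mask {
    int width;
    int height;
    std::vector<std::uint8_t> data;
};

// Mark a pixel as foreground if any pixel inside the kernelSize x kernelSize
// window centred on it is foreground in the input mask.
Mask expandMask(const Mask& in, int kernelSize) {
    assert(kernelSize > 0 && kernelSize % 2 == 1);
    assert(in.width > 0 && in.height > 0);
    assert(in.data.size() == static_cast<std::size_t>(in.width) * in.height);

    const int half = kernelSize / 2;
    Mask out = in;
    for (int y = 0; y < in.height; ++y) {
        for (int x = 0; x < in.width; ++x) {
            bool hit = false;
            for (int dy = -half; dy <= half && !hit; ++dy) {
                for (int dx = -half; dx <= half && !hit; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= in.width || ny >= in.height)
                        continue;  // ignore neighbours outside the image
                    hit = in.data[static_cast<std::size_t>(ny) * in.width + nx] != 0;
                }
            }
            out.data[static_cast<std::size_t>(y) * in.width + x] = hit ? 255 : 0;
        }
    }
    return out;
}
```

With a kernel size of 5, the mask grows by about two pixels in every direction, which gives the later fusion step a small margin around the landmarks to work with.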

Color Transfer

The aim of color transfer is to make the color of the current face similar to that of the face being replaced. While various ways exist to achieve such a transfer, this example adopts the histogram adjustment method, which is comparatively easy to implement. It involves the following steps:
1) Calculate the color histograms of the current image and the target image
2) Adjust the histogram of the current image to make it consistent with that of the target image
3) Apply the adjusted histogram to the current image
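The three steps above can be sketched for a single 8-bit channel as follows. This is one common way to do histogram matching (via cumulative histograms and a lookup table) and is not necessarily how the demo implements it. The Channel alias and the function names are hypothetical, and a color image would presumably be processed one channel at a time over the pixels inside the face mask.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Channel = std::vector<std::uint8_t>;  // hypothetical: one color channel

// Step 1: cumulative histogram, normalised so the last bin equals 1.
std::array<double, 256> cumulativeHistogram(const Channel& channel) {
    assert(!channel.empty());
    std::array<double, 256> cdf{};
    for (std::uint8_t v : channel) cdf[v] += 1.0;
    double running = 0.0;
    for (double& bin : cdf) {
        running += bin;
        bin = running / static_cast<double>(channel.size());
    }
    return cdf;
}

// Step 2: for each level of the current image, find the first target level
// whose cumulative frequency is at least as large.
std::array<std::uint8_t, 256> buildLookup(const std::array<double, 256>& currentCdf,
                                          const std::array<double, 256>& targetCdf) {
    std::array<std::uint8_t, 256> lut{};
    int t = 0;
    for (int s = 0; s < 256; ++s) {
        while (t < 255 && targetCdf[t] < currentCdf[s]) ++t;
        lut[s] = static_cast<std::uint8_t>(t);
    }
    return lut;
}

// Step 3: apply the lookup table to every pixel of the current image.
Channel matchHistogram(const Channel& current, const Channel& target) {
    const auto lut = buildLookup(cumulativeHistogram(current),
                                 cumulativeHistogram(target));
    Channel out(current.size());
    for (std::size_t i = 0; i < current.size(); ++i) out[i] = lut[current[i]];
    return out;
}
```

Because both cumulative histograms are non-decreasing, the lookup table is monotonic: darker pixels stay darker than brighter ones, and only the overall tonal distribution shifts toward the target.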

Edge Fusion

After the color transfer, the extracted face is ready to be transferred. However, if we copy the face directly onto the other, the edge may look abrupt. This demo therefore applies Laplacian pyramid fusion, which blends the two images separately at each level of detail, to make the edges more coherent.
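The idea behind Laplacian pyramid fusion can be sketched on a one-dimensional signal, which keeps the code short while following the same steps as the 2-D case. The demo's own implementation is not shown in this article, so everything below is an assumption made for illustration: the Signal alias, the function names, the small [1, 2, 1] / 4 smoothing filter used in place of the usual 5-tap Gaussian kernel, and the requirement that the signal length be divisible by two at every level.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Signal = std::vector<double>;  // hypothetical 1-D "image"

// Smooth with [1, 2, 1] / 4 (borders clamped) and keep every second sample.
Signal down(const Signal& x) {
    assert(x.size() >= 2 && x.size() % 2 == 0);
    Signal y(x.size() / 2);
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double left = x[i == 0 ? 0 : 2 * i - 1];
        y[i] = 0.25 * left + 0.5 * x[2 * i] + 0.25 * x[2 * i + 1];
    }
    return y;
}

// Double the length using linear interpolation between neighbouring samples.
Signal up(const Signal& x) {
    assert(!x.empty());
    Signal y(2 * x.size());
    for (std::size_t j = 0; j < y.size(); ++j) {
        const std::size_t a = j / 2;
        const std::size_t b = std::min(a + j % 2, x.size() - 1);
        y[j] = 0.5 * (x[a] + x[b]);
    }
    return y;
}

// Blend a and b: mask == 1 selects a, mask == 0 selects b.
Signal blendLaplacian(Signal a, Signal b, Signal mask, int levels) {
    assert(a.size() == b.size() && a.size() == mask.size());
    assert(levels >= 0);

    std::vector<Signal> bands;  // blended Laplacian (detail) bands, fine to coarse
    for (int l = 0; l < levels; ++l) {
        const Signal a2 = down(a), b2 = down(b);
        const Signal ua = up(a2), ub = up(b2);
        Signal band(a.size());
        for (std::size_t i = 0; i < band.size(); ++i)
            band[i] = mask[i] * (a[i] - ua[i]) + (1.0 - mask[i]) * (b[i] - ub[i]);
        bands.push_back(band);
        a = a2;
        b = b2;
        mask = down(mask);  // the mask gets softer at each coarser level
    }

    // Blend the coarsest level, then add the detail bands back, coarse to fine.
    Signal out(a.size());
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = mask[i] * a[i] + (1.0 - mask[i]) * b[i];
    for (int l = levels - 1; l >= 0; --l) {
        out = up(out);
        for (std::size_t i = 0; i < out.size(); ++i) out[i] += bands[l][i];
    }
    return out;
}
```

Because the mask is smoothed a little more at each coarser level, broad features such as overall skin tone are mixed over a wide band, while fine detail switches over a narrow one, which is what hides the seam. In OpenCV, cv::pyrDown and cv::pyrUp could play the roles of down and up for 2-D images.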

Conclusion

Emerging facial trends, namely facial recognition and swapping, continue to draw attention on social media due to the ease with which individuals can manipulate photos. However, these technologies are not only applicable to fun and social apps but also useful for more critical applications.

With advances in deep learning, the accuracy of face recognition has improved greatly. Many startups are taking advantage of these improvements to build products for a wide range of applications. One example, Megvii's Face++, provides high-accuracy recognition solutions that rank among the best globally. The company aims to expand into industries such as finance, smart cities, and robotics in the near future.

Some links

Face2Face: Real-time Face Capture and Reenactment of RGB Videos
A highly controversial new technology at CVPR - Face2Face
Switching Eds: Face swapping with Python, dlib, and OpenCV
https://github.com/mc-jesus/FaceSwap
