Project 3: Face Morphing and Modeling a Photo Collection

In this project, we used the concept of transformations in geometric space, specifically affine transformations of images, to compute variations on image pairs, including midway hybrids of two faces, morph animations, mean faces of a population, and caricatures.

Defining Correspondences

To warp one image toward another, we first need a set of triangles that partitions each image around its key features. We select corresponding points in the two images (37 points in this case) and compute a Delaunay triangulation over them. Computing the triangulation on the average of the two point sets gives the triangulation below, which roughly follows the facial features. We can also see the triangulation overlaid on a face. Note that the image corners must be included among the points so that the background is triangulated as well.
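As a rough illustration of this step, the sketch below averages two point sets and triangulates the result with scipy.spatial.Delaunay. The function name triangulate_mean_shape and the toy five-point example are assumptions made for illustration; they are not taken from the project code, which used 37 points.

```python
import numpy as np
from scipy.spatial import Delaunay

def triangulate_mean_shape(pts_a, pts_b):
    """Average two corresponding point sets and triangulate the average.

    Returns the mean points (K, 2) and an (M, 3) array of vertex indices.
    The same index triples can be applied to pts_a and pts_b, which keeps
    the triangles in correspondence across both images.
    """
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    mean_pts = (pts_a + pts_b) / 2.0
    tri = Delaunay(mean_pts)
    return mean_pts, tri.simplices

# Toy example: the four image corners plus one interior feature point.
pts_a = [(0, 0), (99, 0), (0, 99), (99, 99), (40, 50)]
pts_b = [(0, 0), (99, 0), (0, 99), (99, 99), (60, 50)]
mean_pts, triangles = triangulate_mean_shape(pts_a, pts_b)
```

Triangulating the average shape, rather than either image's own points, is one way to reduce the chance of very thin triangles in either image.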

Computing the "Mid-way Face"

The "mean shape" of the two images is just the pointwise mean of the two point sets, and the triangulation computed on it above gives a concrete set of triangles that match up in both images. Then, for each image and each triangle in it, we compute the affine transformation matrix that maps the triangle in the original image to the corresponding triangle in the halfway shape, and take its inverse. We get the pixel positions inside each halfway triangle using a polygon mask, then multiply those positions by the inverse affine matrix to get the corresponding positions in the original image. Since these positions may not fall on exact pixels, we use SciPy's griddata interpolation function (scipy.interpolate.griddata) to correctly assign values to the integer pixel positions. Once each image has been warped into the halfway shape, we average the two warped images to get the "mean face" below.
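The per-triangle computation can be sketched as follows, using NumPy only. The function names compute_affine and inverse_map are hypothetical, and the sketch stops at finding source coordinates; the polygon mask and the interpolation step described above are left out.

```python
import numpy as np

def compute_affine(src_tri, dst_tri):
    """Return the 3x3 homogeneous matrix A with A @ [x, y, 1] mapping
    each vertex of src_tri onto the matching vertex of dst_tri."""
    # Columns are the triangle vertices in homogeneous coordinates.
    src = np.vstack([np.asarray(src_tri, dtype=float).T, np.ones(3)])
    dst = np.vstack([np.asarray(dst_tri, dtype=float).T, np.ones(3)])
    # A @ src = dst, so A = dst @ inv(src).
    return dst @ np.linalg.inv(src)

def inverse_map(points_in_dst, src_tri, dst_tri):
    """Map points inside dst_tri back to their positions in src_tri."""
    a_inv = np.linalg.inv(compute_affine(src_tri, dst_tri))
    pts = np.asarray(points_in_dst, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (a_inv @ homog.T).T[:, :2]

# Toy example: the destination triangle is the source scaled by 2
# and shifted by (5, 5).
src_tri = [(0, 0), (10, 0), (0, 10)]
dst_tri = [(5, 5), (25, 5), (5, 25)]
src_positions = inverse_map([(15, 15), (5, 5)], src_tri, dst_tri)
```

Working backwards from the destination pixels means every pixel in the halfway image receives a value, which avoids the holes a forward mapping can leave.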

The Morph Sequence

To visualize the full morphing sequence from one image to another, we can generalize the previous part: instead of morphing to 50%, we morph at uniform intervals from 0% (the original image) to 100% (the target). This means that instead of the mean of the two point sets, we use a weighted average of them to determine the common shape and its triangulation. We then run the morph process from before for each step, with the weight on the target increasing and the weight on the original decreasing each time, to generate the frames of the morph sequence below.
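The weighting scheme can be sketched as below. The names morph_shapes and cross_dissolve are assumptions for illustration, and the warp of each image into the intermediate shape is assumed to have happened already before cross_dissolve is called.

```python
import numpy as np

def morph_shapes(pts_a, pts_b, num_frames):
    """Yield (t, shape_t) pairs with t spaced uniformly from 0 to 1.

    shape_t is the weighted average of the two point sets: t = 0 gives
    the original shape and t = 1 gives the target shape.
    """
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    for t in np.linspace(0.0, 1.0, num_frames):
        yield t, (1.0 - t) * pts_a + t * pts_b

def cross_dissolve(warped_a, warped_b, t):
    """Blend two images that were already warped to the same shape."""
    warped_a = np.asarray(warped_a, dtype=float)
    warped_b = np.asarray(warped_b, dtype=float)
    return (1.0 - t) * warped_a + t * warped_b

# Toy example: two corresponding points per shape, five frames.
frames = list(morph_shapes([(0, 0), (10, 0)], [(4, 2), (10, 8)], num_frames=5))
```

The mid-way face from the previous section is the special case t = 0.5.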

The "mean face" of a population

We can visualize what the average member of a population looks like by averaging the pixel values of images of its members. However, to ensure the facial features align correctly, we first morph each member to the "average shape" of the population, determined by the average of the facial key points, and then take the pixelwise mean over all the warped images. The result below shows the average face of the FEI database, part 1, as well as some of the intermediate results of members warped to the average shape. The dataset came with predefined facial key points, so for any new image it was necessary to select points that correspond to that same set.
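The two averaging steps can be sketched as follows. The function names are hypothetical, and the images passed to mean_image are assumed to have been warped to the average shape beforehand using the per-triangle warp from the earlier sections.

```python
import numpy as np

def population_mean_shape(all_points):
    """Average key points over a population.

    all_points has shape (N, K, 2): N faces, K key points each.
    """
    return np.mean(np.asarray(all_points, dtype=float), axis=0)

def mean_image(warped_images):
    """Pixelwise mean over images already warped to the mean shape."""
    stack = np.stack([np.asarray(im, dtype=float) for im in warped_images])
    return stack.mean(axis=0)

# Toy example: three "faces" with two key points each.
shapes = [
    [(10, 10), (20, 10)],
    [(12, 10), (22, 14)],
    [(14, 13), (24, 12)],
]
avg_shape = population_mean_shape(shapes)

# Toy example: three constant 2x2 "images".
avg_img = mean_image([np.full((2, 2), v) for v in (0.0, 3.0, 6.0)])
```

Casting to float before averaging avoids overflow when the inputs are 8-bit images.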

With this mean face, we can also experiment by warping my own face into the mean shape, as well as warping the mean face into my own shape. Due to inconsistencies in key point selection, my warped face does not exactly match the population mean, and the warped mean face does not exactly match mine.

Caricatures: Extrapolating from the mean

Another fun experiment with face warping is to exaggerate the differences between one's own face and the population mean face. Specifically, we scale the difference (self - mean) by some factor alpha and add it back to one's own points to get the destination shape, so that the differences between oneself and the mean become cartoonishly large. This is the same effect as choosing a morph value outside [0, 1] when warping from the mean to oneself, effectively shooting past the target image.
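A minimal sketch of the extrapolated shape is shown below. The function name caricature_shape and the sample numbers are made up for illustration. The example also checks the equivalence with an out-of-range morph value: morphing from the mean to oneself with t = 1 + alpha gives the same points.

```python
import numpy as np

def caricature_shape(self_pts, mean_pts, alpha):
    """Push one's own key points further away from the population mean.

    alpha = 0 returns self_pts unchanged; larger alpha exaggerates more.
    """
    self_pts = np.asarray(self_pts, dtype=float)
    mean_pts = np.asarray(mean_pts, dtype=float)
    return self_pts + alpha * (self_pts - mean_pts)

# Toy example with a single key point.
self_pts = np.array([[10.0, 10.0]])
mean_pts = np.array([[8.0, 12.0]])
alpha = 0.5
exaggerated = caricature_shape(self_pts, mean_pts, alpha)

# Same result written as a morph from the mean (t = 0) to oneself (t = 1),
# evaluated at t = 1 + alpha, which lies outside [0, 1].
t = 1.0 + alpha
as_morph = (1.0 - t) * mean_pts + t * self_pts
```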

Bells and Whistles: Changing Ethnicity

We can also experiment with changing a face's apparent ethnicity using the average shape and skin tones of a population. Below we have the average Burmese person. We can define correspondences on this average face, then warp another face to its shape and appearance to see the effects. Below is an example with Kenny: first warping the shape alone, then changing the appearance alone, then doing both by also averaging the pixel values for a combined effect.