FaceFit · SUTD ADL 2026

We built a model that reads face shape from a single photo.

Then it recommends eyewear and hats that suit it. It runs live in your browser.


The numbers

The fashion industry has a shape problem.

Every year, billions of items travel back and forth between homes and warehouses. The root cause is rarely quality. It's fit, and fit starts with shape.

30% Return rate
Returns & refunds

1 in 3 fashion purchases never stays home.

Online fashion return rates average 30–40%, compared to ~9% in-store. The dominant driver? Products don't match the buyer's body, and face shape is the least-addressed dimension of all.

Source: Statista, 2023 Global E-commerce Return Study

70% Fit mismatch
Fit & sizing

Fit & style mismatch drives 70% of returns.

Nearly 70% of returned items are sent back for fit or style mismatch, not defects. For eyewear and hats, products shaped around the face, this number climbs further, because shoppers have no frame of reference for their own face shape.

Source: Narvar Consumer Report, 2022

24M Tonnes CO₂ / yr
Environmental cost

Returns generate 24 million tonnes of CO₂.

Return logistics emit roughly 24 million tonnes of CO₂ annually, equivalent to millions of transatlantic flights. Better fit prediction doesn't just save money; it reduces unnecessary transport at scale.

Source: Optoro Returns Impact Report, 2023

Most people don't know their face shape.
Most apps don't try to figure it out.

Five common shapes. Most faces fall cleanly into one, but many sit between two, so the model returns probabilities across all five and always keeps the second-place class.

Heart: Wider forehead, narrower chin.
Oblong: Length exceeds width.
Oval: Balanced proportions, soft jaw.
Round: Equal width and length, soft angles.
Square: Strong jaw, parallel sides.
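
A minimal sketch of how per-class probabilities and a second-place class can be read from a classifier's raw scores. It assumes the model emits five logits in the order listed above and applies a standard softmax; the logit values and the helper names (`softmax`, `top_two`) are hypothetical, not taken from the FaceFit code.

```python
import math

# Class order is an assumption; it follows the list above.
SHAPES = ["Heart", "Oblong", "Oval", "Round", "Square"]

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_two(logits):
    """Return the two most likely (shape, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(SHAPES, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:2]

# Hypothetical logits for a face that sits between Oval and Round.
example_logits = [0.2, -0.5, 2.1, 1.8, -1.0]
best, runner_up = top_two(example_logits)
print(f"{best[0]}: {best[1]:.2f}, then {runner_up[0]}: {runner_up[1]:.2f}")
```

Keeping the runner-up lets a recommender blend suggestions for both shapes when the two probabilities are close.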

How we built it

v1 · 42.3% · 108K params

Started from scratch. A 3-block CNN + landmark MLP. Capacity-limited. The Round class barely worked.

v2 · 54.4% · 1.98M params

Went deeper. Attention-gated fusion, 5-block CNN. Round recovered. But we were overfitting.

v3 · 70.8% · 1.40M params, vs ResNet-18 · 70.1% · 11.2M params

Swapped to a pretrained MobileNetV3-Small backbone, added face alignment and MixUp. Fewer params than v2. Matches ResNet-18.

v3 matches a fine-tuned ResNet-18 at 8× fewer parameters.
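
MixUp, one of the v3 additions, trains on blends of two samples and their labels. The sketch below shows the standard formulation with a Beta-distributed mixing weight. The `alpha=0.2` value, the function name, and the tiny 4-value "images" are illustrative assumptions; the source does not state the settings actually used.

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

rng = random.Random(0)  # seeded so the example is repeatable

# Hypothetical flattened inputs and one-hot labels
# (order: Heart, Oblong, Oval, Round, Square).
img_round, label_round = [0.9, 0.8, 0.7, 0.6], [0, 0, 0, 1, 0]
img_oval, label_oval = [0.1, 0.2, 0.3, 0.4], [0, 0, 1, 0, 0]

mixed_img, mixed_label, lam = mixup(img_round, label_round, img_oval, label_oval, rng=rng)
print(round(lam, 3), [round(v, 3) for v in mixed_label])
```

Because the target becomes a soft label split between two classes, the model is discouraged from memorising individual training faces, which is one common way to counter overfitting of the kind seen in v2.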

Architecture

Dual-branch fusion. CNN image features plus landmark geometry, gated by attention.

Branch A: MobileNetV3-Small → 576-dim
Branch B: Landmark MLP (4 ratios) → 128-dim
Concat + gate: 704-dim attention gate
Output: 5 classes (Heart · Oblong · Oval · Round · Square)
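
The data flow above can be sketched in plain Python to check the dimensions. This is a rough illustration, not the FaceFit implementation: the gate is assumed to be an element-wise sigmoid over the 704-dim concatenation (one common form of attention gating), the weights are random placeholders, and all function names are hypothetical.

```python
import math
import random

IMG_DIM, LMK_DIM, NUM_CLASSES = 576, 128, 5
FUSED_DIM = IMG_DIM + LMK_DIM  # 576 + 128 = 704

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def linear(weights, bias, x):
    """Dense layer: one dot product plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def random_layer(rng, out_dim, in_dim, scale=0.01):
    """Untrained placeholder weights, for shape-checking only."""
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def fuse_and_classify(img_feat, lmk_feat, gate_layer, head_layer):
    fused = list(img_feat) + list(lmk_feat)  # concat: 704-dim
    # Assumed gate form: a per-feature weight in (0, 1) computed from the fused vector.
    gate = [sigmoid(g) for g in linear(gate_layer[0], gate_layer[1], fused)]
    gated = [g * f for g, f in zip(gate, fused)]  # scale each feature by its gate
    return linear(head_layer[0], head_layer[1], gated)  # 5 class logits

rng = random.Random(0)
gate_layer = random_layer(rng, FUSED_DIM, FUSED_DIM)
head_layer = random_layer(rng, NUM_CLASSES, FUSED_DIM)
img_feat = [rng.random() for _ in range(IMG_DIM)]  # stand-in for Branch A output
lmk_feat = [rng.random() for _ in range(LMK_DIM)]  # stand-in for Branch B output

logits = fuse_and_classify(img_feat, lmk_feat, gate_layer, head_layer)
print(len(logits))
```

The gate lets the network down-weight either branch per feature, for example leaning on landmark geometry when the image features are ambiguous.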
Confusion matrix, v3 test set (70.8% accuracy).
Training curves, v3 (86 epochs, MobileNetV3 backbone).

See it work on your face.

Try FaceFit live

Inference runs in your browser. No frames leave your device.