AI Brand Sync now ships in under 600ms.
We rebuilt the palette extractor with a smarter color quantizer. Average extraction time dropped from 2.1s to 580ms. Here's the engineering deep-dive — including the dead end we wasted three weeks on.
When we shipped AI Brand Sync last August, the median palette extraction time was 2.1 seconds. That was good enough for the launch — users were impressed it worked at all — but it sat at the very edge of "feels instant" for a feature people use 50+ times a day.
Last month we shipped a rewrite that drops the median to 580ms, about 3.6× faster, with accuracy 1.3% better on our 4,000-image evaluation set. Here's how we got there, including the embarrassing dead end we wasted three weeks on.
| Metric | Old | New |
|---|---|---|
| Median time | 2.1s | 580ms |
| Speedup | — | 3.6× |
| Accuracy on 4k-image eval | baseline | +1.3% |
The old algorithm
The original AI Brand Sync used a fairly standard k-means color quantizer:
- Resize the input logo to 256×256
- Convert RGB to LAB color space
- Run k-means with k=8 clusters
- Sort clusters by population, return the top 5
This works well for accuracy — LAB space matches human color perception, and k-means at k=8 captures the dominant colors most logos have. But it's slow because k-means has to do 50+ iterations to converge, and each iteration recomputes distances against all 65,536 pixels.
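To make the cost structure concrete, here is a minimal pure-Python sketch of that kind of quantizer. It is not the production code: the function names (`sq_dist`, `kmeans_palette`), the seeding from distinct colors, and the assumption that `pixels` is already a list of LAB tuples (resize and RGB-to-LAB conversion omitted) are all illustrative choices.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two 3-component colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_palette(pixels, k=8, top=5, max_iters=50, seed=0):
    """Cluster LAB pixels with Lloyd's k-means; return the `top` most populous centroids."""
    rng = random.Random(seed)
    # Assumption: seed from distinct colors so no two starting centroids
    # coincide (requires at least k distinct colors in the input).
    centroids = rng.sample(sorted(set(pixels)), k)
    assignment = None
    for _ in range(max_iters):
        # Assignment step: every pixel is compared against every centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in pixels
        ]
        if new_assignment == assignment:
            break  # converged: no pixel changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        counts = [0] * k
        for p, c in zip(pixels, assignment):
            counts[c] += 1
            for i in range(3):
                sums[c][i] += p[i]
        centroids = [
            tuple(v / n for v in s) if n else old  # keep empty clusters in place
            for s, n, old in zip(sums, counts, centroids)
        ]
    # Sort clusters by population and keep the largest ones.
    population = Counter(assignment)
    ranked = sorted(range(k), key=lambda c: population[c], reverse=True)
    return [centroids[c] for c in ranked[:top]]
```

The assignment step is where the time goes: it touches every pixel once per centroid on every iteration, so nothing in the loop gets cheaper as the centroids settle.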
The 2.1-second median was dominated by clustering, and clustering cost scales with the iterations × pixels product. Reducing pixels means losing accuracy, so we focused on iterations.
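As a rough back-of-the-envelope check, the number of pixel-to-centroid distance computations can be counted directly. The helper name `distance_evals` is made up for illustration, 50 iterations is the lower bound quoted above, and the 5-iteration row is a hypothetical comparison point.

```python
PIXELS = 256 * 256  # 65,536 pixels after the resize
K = 8               # clusters


def distance_evals(iterations, pixels=PIXELS, k=K):
    """Pixel-to-centroid distance computations for a full k-means run."""
    return iterations * pixels * k


print(distance_evals(50))  # 26214400
print(distance_evals(5))   # 2621440
```

The count is linear in both factors, so a 10× cut in iterations is worth exactly as much as a 10× cut in pixels, without throwing image detail away.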
The dead end (3 weeks)
Our first instinct was to pre-cluster. Use a perceptual hashing technique to seed k-means with already-near-final centroids, so it converges in 5 iterations instead of 50.
We built it. It worked — but only on simple logos. On logos with a lot of color variation (gradients, photographs, anything that wasn't a flat-color icon), the perceptual hash often clustered too aggressively, and the final palette missed important secondary colors. The accuracy regression on our 4,000-image eval set was 11% — meaning users would frequently get a "wrong-looking" palette.
We sat on this for three weeks trying to fix the hash function. We added various weighting schemes, tried HSL space instead of LAB, and even considered abandoning hashing entirely.
"The accuracy regression isn't a bug. It's that we're trying to make a slow algorithm faster by skipping work. There's no free lunch."
— Tom Larsen, QA · QRBliss
Tom was right. We dropped the perceptual-hash approach and started over.
The actual fix
The breakthrough came from a paper on median-cut quantization, a technique long used to build palettes for indexed-color images (such as indexed PNGs). It is roughly 10× faster than k-means but had been considered "not as accurate" for general-purpose color quantization. We rebuilt our extractor around it, with two key adaptations:
- Convert to OKLab space first. OKLab is a newer perceptual color space that's slightly better for human-color matching than LAB. We get a small accuracy win for free.
- Run median-cut to k=12, then merge. The straightforward median-cut would give us 8 colors directly, but with somewhat noisy boundaries. Running it to k=12 and then merging the closest pairs (using OKLab distance) gives us a much smoother final palette of 8 colors.
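The two steps above can be sketched in a few dozen lines of pure Python. This is a simplified illustration rather than the shipped extractor: it assumes `pixels` is a list of colors already converted to OKLab, the function names are made up, and the population-weighted averaging used when merging is our assumption about a reasonable merge rule.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 3-component colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def median_cut(pixels, k):
    """Repeatedly split the box with the widest channel range at its median."""
    boxes = [list(pixels)]
    while len(boxes) < k:
        best = None  # (spread, box index, channel)
        for i, box in enumerate(boxes):
            if len(box) < 2:
                continue
            for ch in range(3):
                values = [p[ch] for p in box]
                spread = max(values) - min(values)
                if best is None or spread > best[0]:
                    best = (spread, i, ch)
        if best is None or best[0] == 0:
            break  # every remaining box is a single color
        _, i, ch = best
        box = sorted(boxes.pop(i), key=lambda p: p[ch])
        mid = len(box) // 2
        boxes += [box[:mid], box[mid:]]
    # Represent each box by its mean color and its pixel count.
    return [
        (tuple(sum(p[ch] for p in box) / len(box) for ch in range(3)), len(box))
        for box in boxes
    ]


def merge_closest(clusters, target):
    """Merge the closest pair of colors until only `target` clusters remain."""
    clusters = list(clusters)
    while len(clusters) > target:
        pairs = (
            (a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ab: sq_dist(clusters[ab[0]][0], clusters[ab[1]][0]))
        (c1, n1), (c2, n2) = clusters[i], clusters[j]
        # Assumption: the merged color is the population-weighted mean.
        merged = (tuple((x * n1 + y * n2) / (n1 + n2) for x, y in zip(c1, c2)), n1 + n2)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return sorted(clusters, key=lambda c: c[1], reverse=True)


def extract_palette(pixels, over=12, final=8):
    """Over-segment with median-cut, then merge down to the final palette size."""
    return merge_closest(median_cut(pixels, over), final)
```

Unlike k-means, there is no convergence loop here: producing 12 boxes takes 11 splits, and going from 12 to 8 takes four merges, each over a handful of colors.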
Total runtime: about 200ms for the quantization itself, plus 380ms for image decoding, resize, and color-space conversion. That's the 580ms median.
Accuracy: 1.3% better than the old k-means on our evaluation set. The OKLab conversion plus the median-cut-to-12 merge does a slightly better job of preserving low-frequency colors (those covering only a small share of pixels), which k-means tends to lose.
What's next
The 580ms median is good. The 95th percentile is still around 1.4s, mostly because of large images (2000×2000+ logos). We're working on a streaming version that begins quantizing while the image is still uploading — early experiments show this could push the 95th percentile down to ~700ms.
We're also exploring on-device extraction for the mobile apps. The library is small enough (~80KB compiled) that we could ship it client-side and skip the server roundtrip entirely. Privacy bonus: the logo never leaves the user's phone.
If you're curious about the implementation, the OKLab conversion is the most interesting part. The color space is described in Björn Ottosson's blog post and is now built into modern CSS via the oklab() and oklch() color functions.
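For reference, a per-pixel sRGB-to-OKLab conversion looks roughly like this. The matrix coefficients are the ones given in Ottosson's post (worth double-checking against the source before relying on them); the function names and the 8-bit input convention are our own choices for this sketch.

```python
def srgb_to_linear(c8):
    """Undo the sRGB transfer curve for one 8-bit channel (0-255)."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def srgb_to_oklab(r8, g8, b8):
    """Convert an 8-bit sRGB color to OKLab (L, a, b)."""
    r, g, b = (srgb_to_linear(v) for v in (r8, g8, b8))
    # Linear sRGB to an LMS-like cone response.
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b
    # Cube-root nonlinearity (inputs are non-negative for in-gamut sRGB).
    l_, m_, s_ = l ** (1 / 3), m ** (1 / 3), s ** (1 / 3)
    return (
        0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_,
        1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_,
        0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_,
    )


def oklab_distance(c1, c2):
    """Euclidean distance in OKLab, used as a perceptual difference."""
    return sum((x - y) ** 2 for x, y in zip(c1, c2)) ** 0.5
```

White comes out at roughly L = 1 with a and b near zero, and black at exactly (0, 0, 0), which makes for a quick sanity check on any port of the matrices.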
Postscript: the "embarrassing" part
The three weeks we spent on perceptual hashing? We had a working median-cut implementation in a separate branch from a previous experiment. We just didn't think to try it. Dropping out of the local maximum (k-means optimization) to try a fundamentally different algorithm took embarrassingly long.
Lesson: when you're stuck optimizing, ask whether you're optimizing the right algorithm. Often the answer is no.