DEEP MIST
AI

Representative engagement, E-commerce

Automating Product Enrichment for 50,000 SKUs

Fifty thousand SKUs, enriched and standardized in five days.

Series B e-commerce brand, 50,000+ SKUs, merchandising-led catalog team

50,000

SKUs enriched and standardized

34%

improvement in on-site search relevance

5 days

kickoff to production

23%

increase in filter-to-cart conversion

THE SITUATION

Why this work mattered

The brand had grown through acquisition and rapid catalog expansion, leaving 50,000 SKUs with wildly inconsistent product data. The same attribute appeared as "Color: Navy", "Colour: Dark Blue", or not at all, depending on which team had entered it; descriptions ranged from full paragraphs to a single line. The downstream impact hit revenue directly: on-site search returned irrelevant results and category filters were unreliable. The published linkage between search quality and conversion is large and well documented: Amazon's conversion rate reportedly rises roughly six-fold when visitors engage search, and after an unsuccessful on-site search about eight in ten shoppers say they are more likely to buy elsewhere. The commercial stake was concrete: support tickets about not finding products had doubled in six months, and search is how most retail shoppers say they find products.

THE FAILURE MODE

What was breaking before us

The brand had tried to clean the catalog with contractor data-entry passes and rule-based scripts. Manual passes could not keep pace with 50,000 SKUs and reintroduced their own inconsistencies; rule-based scripts handled the fields someone anticipated and silently skipped the attributes that were never entered in the first place. The core problem, missing attributes that no rule can extract because the source text never stated them, is exactly what those approaches could not address, and that is why each attempt left the search and filter quality roughly where it started.
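The gap described above can be shown with a minimal sketch of a rule-based extractor. The regular expression, the synonym table, and the function name are illustrative assumptions, not the brand's actual scripts. The rule handles the two spellings someone anticipated and returns nothing when the description never states the attribute.

```python
import re
from typing import Optional

# Hypothetical rule: matches "Color: X" or "Colour: X" in free text.
COLOR_RULE = re.compile(r"\bcolou?r\s*:\s*([A-Za-z ]+)", re.IGNORECASE)

# Hypothetical hand-maintained synonym table (illustrative values only).
CANONICAL = {"navy": "Navy", "dark blue": "Navy"}


def extract_color(description: str) -> Optional[str]:
    """Return a canonical color if the text states one, otherwise None."""
    match = COLOR_RULE.search(description)
    if match is None:
        # The attribute was never written down, so the rule has nothing
        # to extract and the SKU is skipped without any error.
        return None
    raw = match.group(1).strip().lower()
    return CANONICAL.get(raw, raw.title())


if __name__ == "__main__":
    samples = [
        "Color: Navy. Cotton crew-neck tee.",
        "Colour: Dark Blue",
        "Classic crew-neck tee in a deep ocean shade.",  # color only implied
    ]
    for text in samples:
        print(repr(text), "->", extract_color(text))
```

The third sample is the case that matters: a reader can infer a color from it, but no pattern can extract a value the text never states.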

THE BUILD

What we built

We built a batch pipeline that reads the existing catalog, enriches descriptions, extracts and standardizes attributes, and generates the missing taxonomy labels. Claude Opus 4.8 handles semantic understanding, reading descriptions and images to infer attributes that were never explicitly entered; GPT-5.5 performs structured attribute extraction, mapping free text to a standardized schema. Cross-validation between the two models catches inconsistencies: when they disagree on a product, the item is flagged for human review. The pipeline processes roughly 10,000 SKUs an hour and outputs clean structured data ready for direct import. A lightweight review dashboard lets the merchandising team approve batched changes before anything goes live.
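The cross-validation step can be sketched as a field-by-field comparison of the two models' outputs. This is a minimal illustration under stated assumptions: the model calls are replaced with plain dictionaries, and the `Verdict` class, the `cross_validate` function, and the normalization rule (trim whitespace, ignore case) are hypothetical names and choices rather than the production implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class Verdict:
    sku: str
    agreed: Dict[str, Any]
    conflicts: Dict[str, Tuple[Any, Any]] = field(default_factory=dict)

    @property
    def needs_review(self) -> bool:
        # Any disagreement sends the whole SKU to a human.
        return bool(self.conflicts)


def _norm(value: Any) -> Optional[str]:
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return None if value is None else str(value).strip().casefold()


def cross_validate(sku: str, semantic: Dict[str, Any],
                   structured: Dict[str, Any]) -> Verdict:
    """Compare two attribute dicts for one SKU, one key at a time."""
    agreed: Dict[str, Any] = {}
    conflicts: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(semantic) | set(structured)):
        a, b = semantic.get(key), structured.get(key)
        if _norm(a) == _norm(b):
            agreed[key] = a
        else:
            # Includes the case where only one model produced a value.
            conflicts[key] = (a, b)
    return Verdict(sku, agreed, conflicts)


if __name__ == "__main__":
    ok = cross_validate("SKU-1", {"color": "Navy"}, {"color": "navy "})
    flagged = cross_validate("SKU-2", {"color": "Navy"}, {"color": "Black"})
    print(ok.needs_review, flagged.needs_review, flagged.conflicts)
```

Flagging the whole SKU on any single conflict is a conservative choice made for this sketch; a real pipeline might instead hold only the disputed fields.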

Catalog review dashboard showing enriched SKUs, standardized attributes, and items flagged where the two models disagreed

HOW IT WORKS

How it actually works

Dataflow diagram: catalog ingest, semantic inference and structured extraction in a two-model pass, cross-validation flag for human review, merchandising approval, formatted export

The pipeline ingests the catalog, runs each SKU through a two-model pass (Claude Opus 4.8 for semantic inference from text and images, GPT-5.5 for structured extraction to the target schema), and stages results for the merchandising dashboard. Disagreement between the two models is the trigger for the human boundary: matching outputs flow through and divergent ones are held for a person, so the team keeps control without reviewing every SKU by hand. The stack is Python and Pandas with PostgreSQL, AWS Lambda, and S3; throughput is around 10,000 SKUs an hour, and output is formatted for direct import into the e-commerce platform.
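The routing and throughput described above can be sketched in a few lines. The catalog size and hourly rate come from the figures in this case study; the approval batch size, the function names, and the synthetic 10% disagreement rate in the demo are assumptions made for illustration only.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

CATALOG_SIZE = 50_000      # from the case study
SKUS_PER_HOUR = 10_000     # approximate throughput from the case study
APPROVAL_BATCH_SIZE = 500  # hypothetical; the source does not state one


def processing_hours(total_skus: int, skus_per_hour: int) -> float:
    """Rough machine time for one full pass over the catalog."""
    return total_skus / skus_per_hour


def route(results: Iterable[Tuple[str, bool]]) -> Tuple[List[str], List[str]]:
    """Split (sku, models_agree) pairs into pass-through and held lists."""
    passed: List[str] = []
    held: List[str] = []
    for sku, models_agree in results:
        (passed if models_agree else held).append(sku)
    return passed, held


def approval_batches(skus: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group pass-through SKUs into batches for merchandising sign-off."""
    iterator = iter(skus)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Synthetic data: every tenth SKU is marked as a model disagreement.
    results = [(f"SKU-{i:05d}", i % 10 != 0) for i in range(1_200)]
    passed, held = route(results)
    sizes = [len(b) for b in approval_batches(passed, APPROVAL_BATCH_SIZE)]
    print(len(held), sizes)
    print(processing_hours(CATALOG_SIZE, SKUS_PER_HOUR))
```

At the stated rate, one pass over the full catalog takes about five hours of processing time, so most of the five-day engagement falls outside the run itself.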

The system carries the volume. A person carries every judgement call.

THE OUTCOMES

The outcomes that held

Every number below carries its denominator, window, and scope. No claim a buyer with a calculator can break.

50,000

SKUs enriched and standardized

Denominator: the full catalog of roughly 50,000 SKUs
Window: the single 5-day enrichment run
Scope: description enrichment, attribute extraction, taxonomy normalization; merchandising approves batched changes before they go live

34%

improvement in on-site search relevance

Denominator: relevance on the post-enrichment catalog versus the pre-enrichment baseline
Window: measured after the enrichment run landed
Scope: on-site search relevance metric; representative outcome, methodology on request

5 days

kickoff to production

Denominator: single engagement
Window: kickoff to completed catalog run
Scope: batch pipeline build plus the merchandising review dashboard

23%

increase in filter-to-cart conversion

Denominator: filter-driven sessions, post-enrichment versus pre-enrichment baseline
Window: measured after enrichment landed
Scope: filter-to-cart conversion; representative outcome, methodology on request

SECOND-ORDER EFFECTS

On-site search and category filters started returning the right products because the attributes behind them were finally complete and consistent, which is the documented upstream driver of search relevance and filter quality. The recommendation engine inherited cleaner signal from the same enrichment. The merchandising team kept editorial control through the approval dashboard rather than ceding it to a black box, and the "cannot find product" support burden had a structural cause removed rather than a symptom patched.

Our search finally works. Customers are finding products they did not know we carried, and merchandising still signs off on every batch before it goes live.

VP of Product, Series B e-commerce brand, 50,000+ SKUs
