50,000
SKUs enriched and standardized
Denominator: the full catalog of roughly 50,000 SKUs
Window: the single 5-day enrichment run
Scope: description enrichment, attribute extraction, taxonomy normalization; merchandising approves batched changes before they go live
THE SITUATION
The brand had grown through acquisition and rapid catalog expansion, leaving 50,000 SKUs with wildly inconsistent product data. The same attribute appeared as "Color: Navy", "Colour: Dark Blue", or not at all depending on which team had entered it; descriptions ranged from full paragraphs to one line. The downstream impact hit revenue directly: on-site search returned irrelevant results and category filters were unreliable. The published linkage between search quality and conversion is large and well-documented (Amazon's conversion rate rises roughly six-fold when visitors engage search, and after an unsuccessful on-site search about eight in ten shoppers say they are more likely to buy elsewhere). The commercial stake was concrete: support tickets about not finding products had doubled in six months, and search is how most retail shoppers say they find products.
THE FAILURE MODE
The brand had tried to clean the catalog with contractor data-entry passes and rule-based scripts. Manual passes could not keep pace with 50,000 SKUs and reintroduced their own inconsistencies; rule-based scripts handled the fields someone anticipated and silently skipped the attributes that were never entered in the first place. The core problem, missing attributes that no rule can extract because the source text never stated them, is exactly what those approaches could not address, and that is why each attempt left the search and filter quality roughly where it started.
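The gap in the rule-based approach can be shown with a minimal sketch. The regular expression and the sample descriptions below are hypothetical, not the brand's actual scripts; the point is only that a pattern match has nothing to return when the source text never names the attribute.

```python
import re

# Hypothetical rule: capture the value that follows a "Color:" or "Colour:" label.
COLOR_RULE = re.compile(r"\bcolou?r\s*:\s*([A-Za-z]+(?: [A-Za-z]+)*)", re.IGNORECASE)


def extract_color(description):
    """Return the labelled color value, or None when the text never states one."""
    match = COLOR_RULE.search(description)
    return match.group(1).strip().lower() if match else None


# Illustrative descriptions (assumed, not taken from the real catalog).
samples = {
    "SKU-1": "Crew neck sweater. Color: Navy",
    "SKU-2": "Crew neck sweater. Colour: Dark Blue",
    "SKU-3": "Soft merino crew neck sweater.",
}

extracted = {sku: extract_color(text) for sku, text in samples.items()}
# SKUs the rule silently skips because no color was ever written down.
missing = [sku for sku, value in extracted.items() if value is None]
```

In this sketch the rule finds a value for the first two SKUs but does nothing to reconcile "navy" with "dark blue", and it returns nothing at all for the third.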
THE BUILD
We built a batch pipeline that reads the existing catalog, enriches descriptions, extracts and standardizes attributes, and generates the missing taxonomy labels. Claude Opus 4.8 handles semantic understanding, reading descriptions and images to infer attributes that were never explicitly entered; GPT-5.5 performs structured attribute extraction, mapping free text to a standardized schema. Cross-validation between the two models catches inconsistencies: when they disagree on a product, the item is flagged for human review. The pipeline processes roughly 10,000 SKUs an hour and outputs clean structured data ready for direct import. A lightweight review dashboard lets the merchandising team approve batched changes before anything goes live.
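The cross-validation step might look something like the sketch below. It assumes each model's output for a SKU has already been parsed into a plain attribute dictionary; the function names and sample attributes are illustrative, and treating an attribute reported by only one model as a disagreement is an assumption rather than a documented rule of the pipeline.

```python
def normalize(attrs):
    """Lower-case and trim keys and values so formatting alone is not a disagreement."""
    return {k.strip().lower(): str(v).strip().lower() for k, v in attrs.items()}


def cross_validate(semantic, structured):
    """Compare two models' attribute dicts for one SKU.

    Returns (agreed, conflicts): attributes both models report identically,
    and attributes where they differ or only one model supplied a value.
    """
    a, b = normalize(semantic), normalize(structured)
    agreed, conflicts = {}, {}
    for key in sorted(a.keys() | b.keys()):
        if key in a and key in b and a[key] == b[key]:
            agreed[key] = a[key]
        else:
            conflicts[key] = (a.get(key), b.get(key))
    return agreed, conflicts


def needs_review(semantic, structured):
    """Flag a SKU for human review when any attribute is in conflict."""
    _, conflicts = cross_validate(semantic, structured)
    return bool(conflicts)


# Hypothetical outputs for one SKU from the two models.
semantic_out = {"Color": "Navy", "Material": "Merino Wool"}
structured_out = {"color": "navy ", "material": "wool"}

agreed, conflicts = cross_validate(semantic_out, structured_out)
flagged = needs_review(semantic_out, structured_out)
```

Here the color agrees once case and whitespace are normalized, while the material differs, so the SKU would be held for a reviewer.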

HOW IT WORKS
The pipeline ingests the catalog, runs each SKU through a two-model pass (Claude Opus 4.8 for semantic inference from text and images, GPT-5.5 for structured extraction to the target schema), and stages results for the merchandising dashboard. Disagreement between the two models is the trigger for the human boundary: matching outputs flow through to batched approval, divergent ones are held for a person, so the team keeps control without reviewing every SKU by hand. The pipeline is built on Python and pandas with PostgreSQL, AWS Lambda, and S3; it runs at around 10,000 SKUs an hour and formats its output for direct import into the e-commerce platform.
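The routing and staging described above can be sketched as follows. This is a simplified illustration in plain Python, not the production code: the function names, the `(sku, outputs_match)` input shape, and the batch size are all assumptions made for the example.

```python
from itertools import islice


def route(results):
    """Split per-SKU results into matching and divergent lists.

    `results` is an iterable of (sku, outputs_match) pairs, where
    outputs_match is True when the two models agreed on that SKU.
    """
    matched, held = [], []
    for sku, outputs_match in results:
        (matched if outputs_match else held).append(sku)
    return matched, held


def batches(skus, size):
    """Yield consecutive batches of at most `size` SKUs, one approval per batch."""
    it = iter(skus)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Hypothetical results for four SKUs; batch size of 2 chosen only for illustration.
results = [("SKU-1", True), ("SKU-2", False), ("SKU-3", True), ("SKU-4", True)]
matched, held = route(results)
staged = list(batches(matched, 2))
```

Batching the matching SKUs means merchandising signs off on groups of changes, while the held list stays small enough for a person to inspect item by item.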
The system carries the volume. A person carries every judgement call.
THE OUTCOMES
Every number below carries its denominator, window, and scope. No claim a buyer with a calculator can break.
50,000
SKUs enriched and standardized
Denominator: the full catalog of roughly 50,000 SKUs
Window: the single 5-day enrichment run
Scope: description enrichment, attribute extraction, taxonomy normalization; merchandising approves batched changes before they go live
34%
improvement in on-site search relevance
Denominator: relevance on the post-enrichment catalog versus the pre-enrichment baseline
Window: measured after the enrichment run landed
Scope: on-site search relevance metric; representative outcome, methodology on request
5 days
kickoff to production
Denominator: single engagement
Window: kickoff to completed catalog run
Scope: batch pipeline build plus the merchandising review dashboard
23%
increase in filter-to-cart conversion
Denominator: filter-driven sessions, post-enrichment versus pre-enrichment baseline
Window: measured after enrichment landed
Scope: filter-to-cart conversion; representative outcome, methodology on request
SECOND-ORDER EFFECTS
On-site search and category filters started returning the right products because the attributes behind them were finally complete and consistent, which is the documented upstream driver of search relevance and filter quality. The recommendation engine inherited cleaner signal from the same enrichment. The merchandising team kept editorial control through the approval dashboard rather than ceding it to a black box, and the "cannot find product" support burden had a structural cause removed rather than a symptom patched.
Our search finally works. Customers are finding products they did not know we carried, and merchandising still signs off on every batch before it goes live.
VP of Product, Series B e-commerce brand, 50,000+ SKUs