DEEP MIST
AI

Representative engagement · Legal · Updated

Cutting Contract Review Time by 74%

40% of associate time, reclaimed in two weeks.

Global commercial law firm, roughly 120 lawyers, partner-led contract practice

74%

reduction in initial review time

200+

contracts reviewed per month

0

fabricated clause citations

2 weeks

kickoff to production

THE SITUATION

Why this work mattered

The firm runs 200 or more commercial contracts a month through junior associates for the first review pass. Each contract takes three to four hours of manual work: flagging non-standard clauses, summarising key terms, and routing risk to a partner. That is at least 600 to 800 associate hours a month on the first pass alone. Independent benchmarks put even a routine manual contract review near 92 minutes and put routine review at 60 to 80% of a legal team's time, so at this firm's volume the first pass consumed a large, fixed slice of the associate base. Partners were paying for associate hours (mid-market associate billing runs $300 to $625 an hour, and far higher at the top of the market) on work whose consistency they did not trust. The commercial stake was direct: slower turnaround on client contracts, uneven flagging between associates, and senior time spent re-checking junior output.

THE FAILURE MODE

What was breaking before us

The firm had already tried two off-the-shelf legal-AI tools and rejected both. One was too US-centric for a global practice. The other produced clause citations that did not exist or did not say what the tool claimed. That failure is the industry norm, not bad luck. The Stanford RegLab and HAI study of leading legal-AI research tools found that Lexis+ AI and Ask Practical Law AI produced incorrect information more than 17% of the time, and Westlaw AI-Assisted Research more than 34%; the study counts a real citation that does not support the stated claim as a hallucination. A tool that misgrounds one flag in three is worse than no tool, because a partner cannot tell which third is wrong without redoing the review. That is why the prior attempts failed and why grounding, not speed, was the problem to solve.

THE BUILD

What we built

We spent the first two days reading 50 of the firm's contracts to learn their clause taxonomy and risk criteria from their own paper, not a generic template. The system splits each contract into segments, embeds them, matches them against an indexed library of curated standard commercial clauses, and classifies and flags against that library. Every flag carries the retrieved clause it matched, so an associate sees the basis, not just a verdict. High-risk clauses are routed to a human reviewer rather than auto-accepted. The interface lives inside the firm's existing document-management system: associates see flagged sections with citations, and partners get a daily digest of the high-risk contracts.
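The flag-and-route behaviour described above can be sketched in a few lines. This is a minimal illustration, not the production code: the type names, field names, risk levels, and the `dispositionFor` and `dailyDigest` helpers are all assumptions made for the example.

```typescript
// Illustrative sketch only: type names, field names, and risk levels are
// assumptions, not the firm's actual schema.
type Risk = "low" | "medium" | "high";

interface ReferenceClause {
  id: string;
  text: string;
}

interface Flag {
  contractId: string;
  sectionId: string;
  risk: Risk;
  matched: ReferenceClause; // the basis shown beside the verdict
}

type Disposition = "human_review" | "auto_accept_eligible";

// High-risk flags always go to a person; nothing high-risk is auto-accepted.
function dispositionFor(flag: Flag): Disposition {
  return flag.risk === "high" ? "human_review" : "auto_accept_eligible";
}

// Partner digest: the distinct contracts that carry at least one high-risk flag.
function dailyDigest(flags: Flag[]): string[] {
  const contractIds = new Set<string>();
  for (const flag of flags) {
    if (flag.risk === "high") contractIds.add(flag.contractId);
  }
  return Array.from(contractIds).sort();
}
```

Making `matched` a required field is the design point: in this sketch a flag cannot be constructed without the reference clause it rests on.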

Contract-review console showing flagged clauses with the matched reference clause beside each flag

HOW IT WORKS

How it actually works

Dataflow diagram: contract ingest, segment and embed, retrieve against curated clause library, classify, human-in-the-loop on high-risk clauses, digest output

Each contract is segmented and embedded, then retrieved against a pgvector index of the curated clause library, so classification is grounded in a real reference clause rather than free generation. GPT-5.5 runs the classification and flagging pass; Claude Opus 4.8/Sonnet 4.6 and Gemini 3.5 Flash cross-check on the harder clause types. The grounding contract is strict: a flag only stands if it traces to a retrieved clause, which is what holds the fabricated-citation rate at zero against the 17 to 34% industry bar. The human boundary is explicit and non-negotiable: high-risk clauses are always human-reviewed and never auto-accepted, so the system carries volume and a lawyer carries judgement. Built on Next.js, PostgreSQL, pgvector, and AWS Lambda, integrated into the existing document-management system.
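The grounding contract can be illustrated with a small in-memory stand-in for the pgvector lookup. This is a sketch under stated assumptions, not the deployed pipeline: the toy embeddings, the `minSimilarity` threshold of 0.8, and the names `cosine`, `retrieveNearest`, and `groundFlag` are hypothetical, and in production the nearest-clause lookup would be a query against the pgvector index rather than a loop.

```typescript
// Illustrative stand-in for the pgvector lookup. All names and the
// similarity threshold are assumptions made for this example.
interface LibraryClause {
  id: string;
  text: string;
  embedding: number[];
}

interface GroundedFlag {
  segment: string;
  label: string;
  basis: LibraryClause; // the retrieved clause the flag traces to
  similarity: number;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Nearest library clause to the query embedding, or null for an empty library.
function retrieveNearest(
  query: number[],
  library: LibraryClause[],
): { clause: LibraryClause; similarity: number } | null {
  let best: { clause: LibraryClause; similarity: number } | null = null;
  for (const clause of library) {
    const similarity = cosine(query, clause.embedding);
    if (best === null || similarity > best.similarity) {
      best = { clause, similarity };
    }
  }
  return best;
}

// A proposed flag only stands if it traces to a retrieved clause.
function groundFlag(
  segment: string,
  segmentEmbedding: number[],
  proposedLabel: string,
  library: LibraryClause[],
  minSimilarity = 0.8, // assumed cut-off, not a figure from the engagement
): GroundedFlag | null {
  const hit = retrieveNearest(segmentEmbedding, library);
  if (hit === null || hit.similarity < minSimilarity) return null;
  return {
    segment,
    label: proposedLabel,
    basis: hit.clause,
    similarity: hit.similarity,
  };
}
```

Returning `null` instead of an ungrounded flag is the point of the sketch: a classification with no retrieved clause behind it is dropped rather than shown to an associate.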

The system carries the volume. A person carries every judgement call.

THE OUTCOMES

The outcomes that held

Every number below carries its denominator, window, and scope. No claim a buyer with a calculator can break.

74%

reduction in initial review time

Denominator: per-contract first-pass review, across 200+ contracts per month
Window: sustained across the production window to date
Scope: initial associate review pass; partner sign-off and human review of high-risk clauses unchanged

200+

contracts reviewed per month

Denominator: commercial contracts entering the review queue
Window: steady-state monthly throughput
Scope: commercial contracts in scope; bespoke one-off instruments routed to manual review

0

fabricated clause citations

Denominator: across 200+ contracts reviewed per month
Window: production window to date
Scope: every flag traces to a retrieved clause; high-risk clauses are human-reviewed and excluded from auto-accept

2 weeks

kickoff to production

Denominator: single engagement
Window: kickoff to first production cutover
Scope: build and integration into the existing document-management system
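For a reader who wants to run the calculator check on the first figure, a back-of-envelope sketch follows. It assumes a volume of exactly 200 contracts a month (the case study says 200 or more) and that the 74% reduction applies uniformly to the three-to-four-hour manual baseline; both are simplifying assumptions, and `firstPassLoad` is a hypothetical helper.

```typescript
// Back-of-envelope arithmetic only. The uniform 74% reduction and the
// exact volume of 200 are assumptions for illustration.
interface FirstPassLoad {
  hoursBefore: number;
  hoursAfter: number;
  hoursSaved: number;
}

function firstPassLoad(
  contractsPerMonth: number,
  hoursPerContract: number,
  reduction: number, // fraction between 0 and 1, e.g. 0.74
): FirstPassLoad {
  if (reduction < 0 || reduction > 1) {
    throw new RangeError("reduction must be between 0 and 1");
  }
  const hoursBefore = contractsPerMonth * hoursPerContract;
  const hoursSaved = hoursBefore * reduction;
  return { hoursBefore, hoursAfter: hoursBefore - hoursSaved, hoursSaved };
}

const lowEnd = firstPassLoad(200, 3, 0.74); // three-hour baseline
const highEnd = firstPassLoad(200, 4, 0.74); // four-hour baseline
```

On those assumptions the monthly first-pass load falls from 600 to 800 associate hours to roughly 156 to 208.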

SECOND-ORDER EFFECTS

Associates moved off mechanical first-pass review onto the analysis partners actually wanted from them. Flagging became consistent between associates because every flag traces to the same clause library, which removed a recurring source of partner re-checking. The daily high-risk digest gave partners a portfolio view of contract risk they did not have before. None of this replaced a lawyer; it changed what the lawyers spent their hours on.

The system caught clause conflicts our senior associates were missing, and every flag pointed at a real clause we could check. It is not replacing lawyers, it is making them faster.

Senior Partner, global commercial law firm, roughly 120 lawyers

RELATED WORK

More of this work

The same shared system, applied to four other regulated and high-volume problems.

Tell us the problem. We'll scope the path.

Tell us the problem, the constraint, and what success looks like. We'll tell you whether there's a credible path to production.