Automating web tasks at scale, when doing it wrong is not an option.

SHORA captures public web pages deterministically and replays the capture on every visit — even when the page is redesigned. No language model in the data path. No human reviewer in the data path. Built on a record-replay scheme from PhD research at INRIA.

The deterministic web capture engine. Recording captures a page's structural intent, not its surface selectors, so the same recording reads every variant of that page deterministically, including the ones a conventional record-replay tool would break on. SaaS with usage-based pricing: replay is billed per task at roughly one-tenth the per-task cost of LLM agents, and the re-recording labor that conventional tools depend on is eliminated.

The two usual ways to run automated web tasks at scale both fail. LLM agents top out around 9% end-to-end on the open web. Human reviewers can't hold attention past thirty minutes on repetitive monitoring. CROspector is the third option: deterministic record-replay, at roughly one-tenth the per-task cost of LLM agents.

In production today.

Used by teams whose decisions depend on the same web page being read the same way, every time.

Visit crospector.com

We work with teams who meet three conditions.

  1. The same pages have to be read correctly at least tens of thousands of times per month.
  2. Reading a field wrong costs you revenue, compliance, or reputation — not convenience.
  3. There is one person inside your organization who owns the data quality outcome and can sign for it.

If all three are true, a thirty-minute conversation is worth having. If they are not, we are probably not the right vendor, and we would rather tell you now.

About SHORA

SHORA is a deep-tech company spun out of INRIA. We build the deterministic web capture infrastructure that audit, intelligence, monitoring, and compliance teams use when reading a field wrong costs them revenue, regulatory exposure, or reputation.

Our focus is the unsexy half of web data — the part where the same page has to be read correctly tens of thousands of times, without a language model and without a human in the loop.

Questions, answered

What is CROspector?

CROspector is SHORA's deterministic web capture engine. It records the structure of a retailer's page once (under human review, on a live page exactly as a shopper sees it) and then replays that recording to read every page of the same kind, the same way, every time. There is no language model and no human reviewer in the read path. It is built on record-replay research from a PhD at INRIA.

How is it different from a web scraper or an LLM-based extractor?

A scraper binds to surface selectors and breaks when the page is restyled. An LLM extractor re-guesses what each element means on every page from frozen weights — expensive and silently drifting. CROspector binds to the structural intent of the page, not its surface appearance, so it reads through content changes and redesigns without re-recording, and it fails loudly with evidence rather than returning confidently wrong data.

What does CROspector measure for a retailer that buys traffic?

Three numbers your existing tooling does not produce: (1) your wasted-spend number — the euros of paid traffic landing on products shoppers cannot actually buy; (2) your competitors' outage windows — the hours a rival's promoted bestseller cannot be bought, when your listing captures a higher share of redistributed demand; and (3) feed-vs-page-vs-checkout agreement, surfaced before Google's 28-day disapproval clock starts.
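
As a rough illustration of the first number, the wasted-spend figure can be thought of as paid clicks multiplied by cost per click, summed over the landing pages where the product cannot be bought. The sketch below is only an assumption about how such a sum might be computed; the `LandingPage` type, its field names, and the sample figures are hypothetical and do not describe CROspector's actual data model.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LandingPage:
    """Hypothetical record for one promoted product page."""
    sku: str
    paid_clicks: int             # paid clicks that landed on this page
    cost_per_click_eur: Decimal  # average cost per click, in euros
    purchasable: bool            # could a shopper actually buy the product?


def wasted_spend_eur(pages):
    """Sum the ad spend that landed on pages where the product could not be bought."""
    return sum(
        (p.paid_clicks * p.cost_per_click_eur for p in pages if not p.purchasable),
        Decimal("0"),
    )


# Made-up sample data, for illustration only.
pages = [
    LandingPage("SKU-A", 1000, Decimal("0.40"), purchasable=True),
    LandingPage("SKU-B", 250, Decimal("0.80"), purchasable=False),
    LandingPage("SKU-C", 100, Decimal("1.50"), purchasable=False),
]

print(wasted_spend_eur(pages))  # 350.00
```

Only SKU-B and SKU-C count toward the total, because SKU-A could still be bought when the traffic arrived.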

How does CROspector solve the Google Merchant Center disapproval problem?

Google's crawler reads the raw HTML your server returns before JavaScript runs; if your final price is injected by JavaScript a moment later, the crawler records the wrong price and can preemptively disapprove the item — and Google keeps no copy of the page state that triggered it. CROspector reads each page both ways — as the crawler reads it and as a shopper reads it — and surfaces the SKUs where those diverge, with the page captured as evidence, before the 28-day clock starts.
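
The comparison described above can be sketched in a few lines, assuming the two readings of each page have already been reduced to a price per SKU. How those prices are obtained is out of scope here; the function name, the dictionaries, and the sample values are hypothetical and are not CROspector's API.

```python
from decimal import Decimal


def find_price_divergence(raw_prices, rendered_prices):
    """Return the SKUs whose raw-HTML price differs from the rendered price.

    raw_prices:      price per SKU as read from the HTML the server returns
    rendered_prices: price per SKU as read after JavaScript has run
    A SKU missing from one side is reported with None for that side.
    """
    divergent = {}
    for sku in sorted(raw_prices.keys() | rendered_prices.keys()):
        raw = raw_prices.get(sku)
        rendered = rendered_prices.get(sku)
        if raw != rendered:
            divergent[sku] = (raw, rendered)
    return divergent


# Made-up sample data: SKU-2's final price is injected by JavaScript.
raw = {"SKU-1": Decimal("19.99"), "SKU-2": Decimal("49.00")}
rendered = {"SKU-1": Decimal("19.99"), "SKU-2": Decimal("39.00")}

for sku, (raw_price, rendered_price) in find_price_divergence(raw, rendered).items():
    print(f"{sku}: crawler sees {raw_price}, shopper sees {rendered_price}")
```

In this sample only SKU-2 is flagged, since its raw-HTML price and rendered price disagree.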

Why can't my analytics (GA4) already tell me this?

Tag-based analytics watch behavior from inside your own site and see only the sessions that consented to tracking — in the EU, when "Reject All" sits on equal footing with "Accept All," roughly 60% of users reject. CROspector is an outside-in instrument: it visits your store the way a shopper does, meets the consent banner, and records what actually happens, including the difference between the accepted and rejected paths.

How does it compare to running an AI agent?

It differs in two ways. First, it costs about one-tenth as much per completed web task: there is no language model in the read path, so there is no per-page inference bill. Second, unlike an AI agent, it cannot be silently, confidently wrong: when it can read a page it reads it the same way every time, and when it cannot, it stops and shows you the page instead of inventing an answer. Lower cost is the easy half; never feeding a wrong value into a decision that costs you money is the half that matters.

What kind of retailer is CROspector for?

Retailers that buy meaningful paid traffic against a catalog they want to sell — teams running paid media, e-commerce, or pricing where the same pages have to be read correctly tens of thousands of times a month, and where reading a field wrong costs revenue, compliance, or reputation. If your paid traffic never lands on a dead end and your competitors never stock out, you do not need it.

Get in Touch

If you have ten URLs you need read correctly every day, we can show you a working capture in 48 hours.

Visit Us

172 Av. de Bretagne
59000 Lille, France