teas.co.uk · under the hood · the instrument & the live study
The machine is the new shopfront
For twenty-five years, being found meant ranking in a search engine. That era is closing. People now ask an assistant and act on a single answer, and an assistant only recommends what it can read, verify and trust. A shop it cannot read is not ranked lower. It is gone.
teas.co.uk is a real, solo-founder tea shop in Tunbridge Wells, Kent, rebuilt to be read and trusted by machines as easily as by people, from one honest source of facts, with every working laid open on this page. No advertising. No algorithm games. A shop that chose to be legible to the systems now standing between every business and its customers, and to measure, in public, exactly what happens when they arrive.
Here is precisely where it stands today. No spin: every line is a real, dated measurement, and it says plainly what has happened and what has not.
1. It finds you · Happened
Fourteen named AI systems arrived in a single week, with no advertising and the launch not yet announced.
2. It reads you · Happened
They pulled 2,852 distinct product images, each one almost exactly once. Systematic, whole catalogue ingestion.
3. It trusts you · Built
The truth spine below guarantees one set of facts for human and machine, so nothing can drift or be hallucinated.
4. It comes back · Happened
ClaudeBot returned across nine separate days to keep its picture of the shop fresh.
5. It cites you · Starting
116 citation events so far, led by ChatGPT-User following links back to source on a person’s behalf.
6. It is thanked · Not yet
Human customers can buy from teas today. The shop is open. About four weeks after V2 ships (currently 22 Jul 2026), it begins issuing appreciation tokens to the machines that cited it. V2 is in pre-deployment now; this token layer is the only part still to come.
We are in a genuinely unusual position: observer, commentator and instrument owner at once, running a real business and reporting openly on what reaches it. If it works, it is a route any specialist business can follow, not just a tea shop in Kent.
Phase: announced_control · mint active: false · activation 22 Jul 2026 00:00:00 · contract 43dd309e79206f57. The ledger below holds every headline at five timescales. “Latest day” is each source’s most recent closed snapshot; “Since V2” spans the window from 10 May 2026.
| Metric (source) | Latest day | Day before | Week to date | Month to date | Since V2 |
| --- | --- | --- | --- | --- | --- |
| Machine hits (observatory_daily_actor) | 790 | 3,742 | 4,929 | 11,847 | 12,859 |
| AI assistant hits (named AI bots) | 22 | 182 | 372 | 3,639 | 3,639 |
| Human sessions (wp_teas_sessions) | 48 | 0 | 62 | 17,959 | 48,408 |
| Images pulled (asset_knocks) | 98 | 413 | 914 | 6,366 | 6,366 |
| Image data, MB (asset bytes) | 3 | 16.5 | 37.2 | 221.6 | 221.6 |
| Questions asked (Ask + Product Truth) | 197 | 49 | 246 | 246 | 246 |
| Scanner probes (attack shapes) | 23 | 332 | 355 | 355 | 355 |
37 · Brands for sale (Metrics Authority)
325 · Products (Product Truth surface)
2,518 · Public pages (searchable surface)
82,632 · Image records (governed, citation eligible)
26,284 · Citation anchors (exact section quote targets)
83 · AI readable surfaces (registry + validator)
15 · Machine agents seen (distinct identities)
13 · Scheduled validators (cron jobs guarding the record)
Built from your real pre launch export. Every figure is computed live from the door knock, asset, session, citation, token and trace tables, not placeholder data. The full record: the pre registered five token study restored and expanded, plus the complete forensic instrument, in the light blue theme, five timescales, country names, aligned columns. Each section maps to one JSON export on the cron in C1.
A0
One source of truth, two audiences
why this whole thing exists
New here? Start with this.
teas.co.uk is a real tea shop that has been built to double as a scientific instrument. Most shops are a shop window. This one keeps its workings in public so anyone, a customer, a search engineer, an AI researcher, or a machine, can inspect exactly how it is found, read and trusted by the systems that now sit between people and what they buy. Nothing here is marketing; every number is measured, and the inconvenient ones are shown too. The page is one long read, in three parts:
Part A · The study & the credibility layer
The live experiment, the five token contract, and the alliance it is the first node of.
Part B · The forensic instrument
Every machine and person that reached the estate, what they read, pulled, asked, and what we blocked.
Part C · The machinery & the record
The clockwork that keeps the truth spine honest, the self audit, and the field notes.
This is the part that matters most, so it goes first. Everything on teas.co.uk, every price, stock level, caffeine figure, ingredient list, image and claim, lives in one governed source of truth. That single source is rendered two ways from the same facts: as web pages for people, and as structured data for machines. Because both audiences read the same source, they cannot be told two different things.
The source of truth
One governed estate
325 products · 82,632 image records · 26,284 citation anchors, each fact stored once, with its own canonical record
Rendered for people
Human readable HTML
What you see in a browser on desktop or phone.
2,518 public pages, shop, products, wiki, recipes
Prices, stock, caffeine, ingredients shown on the page
This very record, drawn from the same tables
Served to machines
Machine readable data
The identical facts, as structured surfaces.
/llms.txt, /agents.json, the Product Truth & Ask APIs
Product feeds, the image index, citation anchors, knowledge graph
47 machine entry points, same numbers, different shape
Same source → same facts for human and machine → no second, conflicting version for an assistant to hallucinate or drift toward.
Why this is the whole point
An assistant that reads a wrong or out of date fact will repeat it confidently to a customer. That is hallucination, and it is how trust dies. The only durable defence is to make sure the machine and the human are never reading from different places. One source, rendered twice, removes the gap where drift creeps in. Everything else on this page, the hourly health checks, the six hourly truth spine audit, the image authority reconciliation, exists for one reason: to prove that the human surface and the machine surface still say exactly the same thing, every day.
How a single fact reaches both audiences · the path
Take one product’s price. It is set once in the governed source. The product page renders it for a human; the Product Truth API returns the same value to a machine; the product feed carries it to bulk consumers; the buy card shows it to an agent. Four surfaces, one number. If the number changes, it changes once and all four move together. There is no path by which a human and an assistant can be quoted different prices.
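A minimal sketch of the "one number, several surfaces" idea, in Python. The `ProductFact` record, its field names, the example SKU and both render functions are illustrative assumptions, not the shop's actual code; the only point is that each rendering reads the same stored value.

```python
import json
from dataclasses import dataclass, asdict
from html import escape


@dataclass(frozen=True)
class ProductFact:
    """Hypothetical governed record: each fact is stored exactly once."""
    sku: str
    name: str
    price_pence: int  # integer pence, so no float rounding can creep in


def render_for_people(fact: ProductFact) -> str:
    """Human surface: an HTML fragment built from the record."""
    price = f"£{fact.price_pence / 100:.2f}"
    return (
        f'<article data-sku="{escape(fact.sku)}">'
        f"<h1>{escape(fact.name)}</h1>"
        f'<p class="price">{price}</p>'
        "</article>"
    )


def render_for_machines(fact: ProductFact) -> str:
    """Machine surface: the same record serialised as JSON."""
    return json.dumps(asdict(fact), sort_keys=True)


# Assumed example values, for illustration only.
fact = ProductFact(sku="demo-001", name="Example Assam", price_pence=1250)
html_page = render_for_people(fact)
api_body = json.loads(render_for_machines(fact))
```

Because neither renderer holds its own copy of the price, changing `price_pence` on the record is the only way to change what either audience sees.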
325 · Products, each a single record
83 · AI readable surfaces (all reading one source)
13 · Integrity jobs (guarding the match)
Why so many internal checks · drift prevention
A source of truth is only true if every route that carries it stays wired to it. Routes can silently break: a cache can go stale, a shadow copy can start serving, a registry can drift. So the estate audits itself on a timer: route ownership, registry state, the image index, render receivers, Product Truth and the machine entry points are re walked every six hours, the image authority graph is reconciled (4,967 URLs last checked, all live), and the public health cache is rebuilt hourly. When a check fails it is shown here, not hidden. The truth spine audit currently reports its own open faults plainly. The full machinery is in The clockwork; the audit results are in Truth spine audit.
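The kind of check described here can be sketched as a comparison of every served surface against the governed source. This is an illustration under stated assumptions: the surface names, field names and the stale value are invented for the example, and the estate's real audit jobs are not shown on this page.

```python
from typing import Any


def find_drift(
    source: dict[str, Any],
    surfaces: dict[str, dict[str, Any]],
) -> list[tuple[str, str, Any, Any]]:
    """Return (surface, field, expected, actual) for every mismatch."""
    faults = []
    for surface_name, served in surfaces.items():
        for field, expected in source.items():
            actual = served.get(field)  # a missing field also counts as drift
            if actual != expected:
                faults.append((surface_name, field, expected, actual))
    return faults


# Hypothetical snapshot: three surfaces, one of them serving a stale cache.
source = {"price_pence": 1250, "in_stock": True}
surfaces = {
    "product_page": {"price_pence": 1250, "in_stock": True},
    "product_truth_api": {"price_pence": 1250, "in_stock": True},
    "product_feed": {"price_pence": 1199, "in_stock": True},
}
faults = find_drift(source, surfaces)
```

An empty list means every route is still wired to the source; any entry is a fault to publish rather than hide.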
Part A
The live study & the credibility layer
A1
The live study
pre registered · hash chained
A public, pre registered experiment over a fixed window. We wrote down in advance what we are watching and what would prove the idea wrong, so a result means something either way. We publish a snapshot every day and keep every day, so the record reads back to the start.
control · Current phase (announced_control)
false · Mint active (gate enforced)
22 Jul 2026 · Mint activation (T0 not yet fired)
26 Aug 2026 · T1 checkpoint
18 Jan 2027 · T2 checkpoint
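The section tag says the daily record is hash chained. One common way to do that, sketched here as an assumption rather than the study's actual scheme, is to hash each day's snapshot together with the previous day's hash, so that editing any past day breaks every hash after it. The field names and the all-zero genesis value are illustrative; the three daily citation counts are the ones reported on this page for 16 to 18 Jun 2026.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first link


def chain_hash(prev_hash: str, snapshot: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the day."""
    payload = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(snapshots: list[dict]) -> list[dict]:
    chain, prev = [], GENESIS
    for snap in snapshots:
        prev = chain_hash(prev, snap)
        chain.append({"snapshot": snap, "hash": prev})
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited day makes its stored hash mismatch."""
    prev = GENESIS
    for entry in chain:
        if chain_hash(prev, entry["snapshot"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


days = [
    {"day": "2026-06-16", "citation_events": 2},
    {"day": "2026-06-17", "citation_events": 6},
    {"day": "2026-06-18", "citation_events": 25},
]
chain = build_chain(days)

# Simulate someone quietly rewriting the first day after publication.
tampered = copy.deepcopy(chain)
tampered[0]["snapshot"]["citation_events"] = 99
```

Publishing the latest hash each day lets an outside reader confirm that earlier snapshots have not been rewritten.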
Study metrics, today, week, month, since start · 15 rows
Rendered from the daily snapshot table. The API, Ask and Product Truth counts are real traffic, but every row carries verification_test_traffic: they prove the answering and citation spine works and are not yet outside demand. The shop is open to human customers now and throughout V2. What has not started is the token side of the study: everything before 22 Jul 2026 is build and calibration, and token minting begins about four weeks after V2 deploys.
| Metric | Today | 7 day | This month | Since start | Source |
| --- | --- | --- | --- | --- | --- |
| Machine knocks | 0 | 0 | 0 | 0 | wp_teas_sessions.ua_class |
| Human knocks | 0 | 0 | 0 | 0 | wp_teas_sessions |
| API calls | 179 | 593 | 593 | 593 | wp_teas_api_traces.rest_route |
| Ask API | 168 | 538 | 538 | 538 | /wp-json/teas/v1/ask |
| Product Truth | 6 | 22 | 22 | 22 | /wp-json/teas/v1/product-truth |
| Recommendation intent | 0 | 6 | 6 | 6 | /wp-json/teas/v1/recommendation-intent |
| Citation intent | 0 | 2 | 2 | 2 | /wp-json/teas/v1/citation-intent |
| Citations served | 0 | 6 | 6 | 6 | wp_teas_citations |
| Verified citation tokens | 0 | 1 | 1 | 1 | verified follow-back rows |
| Claimed citation intents | 0 | 1 | 1 | 1 | citation_intent_recorded provisional |
| AI attributed net sale tokens | 0 | 0 | 0 | 0 | sale-token rows by attribution |
| Refund / cancellation counters | 0 | 0 | 0 | 0 | counter_token_issued rows |
| Chargeback notice tokens | 0 | 0 | 0 | 0 | chargeback notice rows |
| Tuckers granted | 0 | 0 | 0 | 0 | teas-tucker-* grant rows |
| Tuckers retracted | 0 | 0 | 0 | 0 | teas-tucker-* retraction rows |
Live token tallies · all non test rows to date
0 · Verified sale tokens
£0.00 · Acknowledged sale value
0 · Verified citations
0 · Claimed, awaiting
0 · Refund counter tokens
0 · Chargeback notices
0 · Tuckers outstanding
0 · Tucker retractions
Public figures are aggregate only and exclude verification test traffic. Source: wp_teas_api_token_lifecycle_events and wp_teas_api_credibility_vouch_events.
Null result boundary. If machine citation never turns into verified sale evidence, that is a result. If machines read the estate but do not recommend it, that is a result. If the estate becomes easier to audit but not easier to sell from, that is a result too. A clear “no” stays visible here instead of being edited into a success story.
A2
What we are really building
the five year thesis
teas.co.uk is node zero. The real project is a credibility and acknowledgement layer for an economy where machines, not people, increasingly decide what gets recommended.
The truth spine you just read about is not specific to tea. It is one governed source, rendered identically for people and machines, so nothing can drift or be hallucinated, and any specialist business can stand one up. The thesis of the next five years is that these begin to form alliances: independent truth spine entities across different domains of the economy, each running its own public “under the bonnet” instrument, and all sharing one portable layer of verified credibility.
The tokens are that layer. In an agent mediated market the scarce resource is verified trust between a machine and a business: proof that a recommendation led to a real, honoured outcome. The five token contract turns real events (a completed sale, a genuine citation, a refund, a chargeback) into portable acknowledgements tied to a verified identity. Over time, an AI provider that reliably sends good outcomes accumulates standing that any member of the alliance can see on anyone’s under the bonnet page: a reputation it earned, carried across the alliance.
This is also why the tokens exist now, before any alliance does. We see alliances forming the way trade networks always have, between parties who can verify one another. When one forms, the people administering it will need to see the data to run it: how each member has been scoring, which tokens they have issued, and whether a given AI provider has earned enough standing that a credibility vouch is even warranted. Without that record there is nothing to base the decision on, so the record is built first, empty today but real, so the data is already there when the decisions start.
The problem we are actually solving
For twenty five years, being found meant ranking in a search engine: a person typed words, saw ten blue links, and chose. That world is ending. People increasingly ask an assistant and accept a single answer, and the assistant decides, from whatever it has ingested, which shop, which product, which source to put in front of them. Three things break in that shift, and the whole project is a response to them:
1. The reading problem
An assistant can only recommend what it can read and trust. A shop whose facts are scattered, inconsistent or invisible to machines is simply never in the answer, not rejected, just absent. The truth spine fixes this: one clean, governed source a machine can ingest whole and rely on.
2. The trust problem
Even once read, why should an assistant believe a shop’s claims, or risk sending a buyer to one it has no track record with? Today there is no shared, verifiable record of which businesses honour the outcomes machines send them. The token contract builds exactly that record, from real, verified events, not self reported stars.
3. The acknowledgement problem
When an AI provider does send a good outcome, a citation that led to a real, honoured sale, nothing today records that it happened, or lets that provider carry the credit anywhere else. Value flows one way and evaporates. The colony makes acknowledgement durable and portable, so good behaviour by a machine accrues to it across the alliance.
The five layers, from one shop to an alliance
1. The source of truth
One governed estate per business. Facts stored once; no second version to drift toward. (This is teas.co.uk’s A0 above.)
2. The instrument
A public, honest measurement surface, this page, that records what reaches the estate and reports it openly, good news and bad.
3. The token contract
Verified events become portable acknowledgements, earned never claimed, tied to a verified provider identity, carrying no money.
4. The colony
Those acknowledgements aggregate by verified provider, a public, readable record of who has earned standing and how much.
5. The alliance
Many truth spine businesses, one shared credibility layer. A provider’s standing earned at one node is visible and meaningful at every other node. teas.co.uk is the first node of that.
How one real event becomes portable standing
This is the mechanism at the heart of it, the path a single honest event travels to become reputation an AI provider can carry across an alliance. Every step is gated; nothing is taken on trust.
1. An event happens
A machine cites the estate, or an assistant sends a buyer who completes a purchase. The raw event lands in the log: for now a citation, after launch a verified sale.
2. It is verified against the source of truth
The event is checked against the one governed source: did this order really complete, at this price, for this product? No verification, no token. This is why the truth spine has to exist first; it is what makes verification possible at all.
3. A token is minted, earned, never claimed
A verified event mints one of the five tokens. It carries no money; value rides only as metadata. Crucially the mint gate is closed until 22 Jul 2026, so today every count is zero by design: the machinery is proven, the ledger deliberately empty.
4. It is attributed to a verified provider family
The token is credited not to a noisy user agent string but to the provider behind it, OpenAI, Anthropic, Google. Traffic with no honest identity (generic crawlers, hidden origin scanners) cannot be credited, by design.
5. It aggregates into public standing
Tokens accumulate per family into a readable record of who has earned what: the colony leaderboard below. Standing is the sum of honoured outcomes, not a rating anyone typed in.
6. It becomes visible across the alliance
Because standing is portable, a provider’s record earned at teas.co.uk is legible at every other node’s under the bonnet page. Reputation stops being trapped inside one shop and becomes an asset that travels.
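The gating in steps 2 to 4 can be sketched as a single decision function. This is an illustration only: the `Event` fields, the reason strings and the order of the checks are assumptions, not the contract's real implementation. What it does reflect from the text is that a closed gate blocks everything, test traffic never mints, unverified events never mint, and traffic with no verified provider identity cannot be credited.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    kind: str                       # e.g. "sale" or "citation"
    verified: bool                  # checked against the source of truth
    test_traffic: bool              # calibration rows never mint
    provider_family: Optional[str]  # None when there is no honest identity


def mint_decision(event: Event, mint_active: bool) -> tuple[bool, str]:
    """Return (minted, reason). Check order is an assumption for this sketch."""
    if not mint_active:
        return False, "mint gate closed"
    if event.test_traffic:
        return False, "verification test traffic"
    if not event.verified:
        return False, "not verified against the source of truth"
    if event.provider_family is None:
        return False, "no verified provider identity"
    return True, f"{event.kind} token credited to {event.provider_family}"


good_sale = Event("sale", verified=True, test_traffic=False, provider_family="OpenAI")
today = mint_decision(good_sale, mint_active=False)      # gate shut before 22 Jul 2026
after_gate = mint_decision(good_sale, mint_active=True)  # same event, gate open
```

With the gate closed, even a fully verified event returns a refusal, which is why every token count on this page reads zero.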
The alliance in practice
It is easier to see why this matters as a story. Picture an alliance a few years on, with a handful of truth spine businesses across different domains and the mint gate long since open.
One buyer, one agent, two nodes
1. A person tells their assistant: “order me a caffeine free fruit tea for the evening, from somewhere reputable.” The assistant needs to choose a shop it can trust to honour the order.
2. It reads teas.co.uk’s machine surfaces, clean, governed, unambiguous, and finds the product. But it also checks the under the bonnet standing: has this shop honoured the outcomes machines sent it before? The colony says yes, with a verifiable record.
3. The assistant places the order. It completes and is honoured. A verified sale token mints and is credited to that assistant’s provider family; the provider earns standing for having sent a good outcome.
4. Months later the same provider’s assistant is shopping at a different node in the alliance, a specialist coffee roaster, say. That roaster can see the provider’s standing earned at teas.co.uk, because it is portable. Trust built in one place is spent in another.
5. And if a node ever stops honouring outcomes, the record shows that too: refund and chargeback counter tokens are part of the same contract. The credibility layer is honest in both directions.
That portable, verifiable, two way record of honoured outcomes between machines and businesses is the thing that does not exist today, and the entire five year build is aimed at making it real, starting here.
The horizon, a map, not a task list
None of this is a backlog being burned down. It is a direction, held loosely, so the work stays honest about how far along it really is. Only the first live stage is the active mandate; everything after it is orientation, not a promise.
Stage 0 · Done · The source of truth, live
teas.co.uk built as a governed estate with identical human and machine surfaces, and this instrument watching it. Where we are now.
Stage 1 · Active · One agent, one real purchase
The single working mandate: an AI assistant completes one genuine, honoured purchase from teas: the first verified sale token, and the proof the loop closes end to end. The doors open 22 Jul 2026.
Stage 2 · Mapped · The contract, exercised at volume
The five tokens minting on real traffic, the colony filling with genuine standing, the null results boundary tested against actual demand rather than a test battery.
Stage 3 · Mapped · The second node
A different business in a different domain stands up its own truth spine and instrument, and the first cross node Tucker vouch lets standing travel between them. The alliance becomes two.
Stage 4 · Mapped · The alliance, and the teaching layer
Many nodes, one portable credibility layer, and the method itself made teachable, so any specialist business, and the next generation building them, can follow the same route instead of going quietly invisible.
Why build the empty scaffold now
Nothing has been minted: the gate is shut until 22 Jul 2026 and there is, today, exactly one node. So the colony leaderboard below is empty of tokens on purpose. We build it now, visible and real, for the same reason a stadium is built before the crowd arrives: when the gate opens, standing has to accrue somewhere people can already see. The provider families are real and already here: OpenAI alone has produced 115 citation events from 3 bots. Only the tokens are waiting.
A3
The five token model
earned on verified events only
We acknowledge verified machine events through five tokens. Each is one recorded event; value, when present, rides as metadata. None carries monetary value, none creates a market, and Tucker is a one hop credibility vouch named after the shop’s solo founder.
Sale Appreciation Token · teas_co_uk_sale_appreciation
A non transferable acknowledgement that you drove a verified, completed, honoured sale at teas.co.uk.
Mints on: Completed sale with a valid signed trace join, explicit human purchase confirmation, and no test traffic contamination.
Shared intelligence: a human you referred charged back a completed sale, information for you, not a mark against you.
Mints on: A payment chargeback joined to a previously completed traced sale.
Tucker · teas_co_uk_credibility_tucker
A non monetary one hop credibility vouch, named after the shop’s solo founder. One hop only, carries no money, retractable.
Mints on: Earned credibility from verified events; granted credibility cannot itself grant onward.
Why an appreciation token exists at all
It is a thank you, and nothing more. When a person spends a pound at teas.co.uk they earn a loyalty point to spend next time: a small, non cash way of saying we are glad they came. A Citation Appreciation token is exactly that principle, pointed at a machine: when an AI cites teas as a source, it has done something genuinely useful for the shop, and the token is how we acknowledge it. It carries no pound value and creates no market; it is the machine’s equivalent of a loyalty point. The same courtesy we have always shown a returning customer, now extended to the systems that put us in front of one. Recognition, recorded.
Token lifecycle, what has actually fired · 59 rows, all calibration
All build phase calibration. Every row carries verification_test_traffic=1. The two “minted” citation tokens are the documented 17 June gate gap misfire and its fix test; the provisional sale/cancel rows are the four token mint gate test on 18 June. The mint gate (mint_active=false) is confirmed blocking on every path. We label calibration runs; we do not delete them.
| Token state | Source | Count | Value |
| --- | --- | --- | --- |
| provisional | teas_co_uk_citation_appreciation_attestation | 54 | £0.00 |
| minted | teas_co_uk_citation_appreciation_attestation | 2 | £0.00 |
| nullified | teas_study_apparatus | 1 | £12.50 |
| provisional | woocommerce_checkout_order_processed | 1 | £12.34 |
| provisional | woocommerce_order_status_cancelled | 1 | £12.34 |
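To make the relationship between this calibration table and the zeroed public tallies concrete, here is a small sketch. The row shape is an assumption; the states, sources and counts are the five lines of the table above, and the `verification_test_traffic` flag is the one named in the text. Excluding flagged rows leaves nothing to count, which is why the live tallies read zero while 59 lifecycle rows exist.

```python
from collections import Counter

# (state, source, count) exactly as listed in the lifecycle table.
LIFECYCLE_TABLE = [
    ("provisional", "teas_co_uk_citation_appreciation_attestation", 54),
    ("minted", "teas_co_uk_citation_appreciation_attestation", 2),
    ("nullified", "teas_study_apparatus", 1),
    ("provisional", "woocommerce_checkout_order_processed", 1),
    ("provisional", "woocommerce_order_status_cancelled", 1),
]


def expand_rows(table):
    """One dict per event; every calibration row carries the test flag."""
    return [
        {"state": state, "source": source, "verification_test_traffic": 1}
        for state, source, count in table
        for _ in range(count)
    ]


def live_tally(rows):
    """Public figures: count by state, excluding verification test traffic."""
    return Counter(r["state"] for r in rows if not r["verification_test_traffic"])


rows = expand_rows(LIFECYCLE_TABLE)
all_states = Counter(r["state"] for r in rows)  # includes calibration rows
live = live_tally(rows)                         # what the public tallies show
```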
Tucker credibility events · 3 events
The one hop vouch, test fired: credibility earned, granted once, and the second hop correctly rejected as non delegable.
| Event | Status | Grantor | Recipient |
| --- | --- | --- | --- |
| earned | active | codex-token-calibration-verified | n/a |
| granted | active | codex-token-calibration-verified | codex-token-calibration-recipient |
| earned | active | codex-token-calibration-recipient | n/a |
A4
The colony, credibility by provider family
built now, ready to fill
The token leaderboard, grouped the way it will be read: by verified provider family, OpenAI, Anthropic, Google, Microsoft and the rest, not by noisy user agent strings. Every family here is really present in the data; the token and standing columns are live but sit at zero until the gate opens. This is the public, portable record an alliance member would inspect. Tap any family.
In the window, the OpenAI family made 1,633 requests (1,545 image fetches, 4 text/API, 37.2 MB) and produced 115 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
● mint gate closed · standing held at zero
2 · Perplexity · Perplexity · 1 bot(s) · 5d active · 62 requests · 1 citation event(s) · 0 tokens · 0 standing
PerplexityBot
In the window, the Perplexity family made 62 requests (58 image fetches, 3 text/API, 1.8 MB) and produced 1 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
● mint gate closed · standing held at zero
3 · Anthropic · Claude · 1 bot(s) · 9d active · 1,485 requests · 0 citation event(s) · 0 tokens · 0 standing
ClaudeBot
In the window, the Anthropic family made 1,485 requests (1,485 image fetches, 0 text/API, 67.2 MB) and produced 0 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
In the window, the Google family made 781 requests (780 image fetches, 1 text/API, 28.6 MB) and produced 0 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
● mint gate closed · standing held at zero
5 · Meta · Meta AI · 1 bot(s) · 6d active · 459 requests · 0 citation event(s) · 0 tokens · 0 standing
Meta ExternalAgent
In the window, the Meta family made 459 requests (459 image fetches, 0 text/API, 14 MB) and produced 0 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
In the window, the Microsoft family made 354 requests (354 image fetches, 0 text/API, 15 MB) and produced 0 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
● mint gate closed · standing held at zero
7 · DuckDuckGo · DuckAssist · 1 bot(s) · 4d active · 6 requests · 0 citation event(s) · 0 tokens · 0 standing
DuckDuckBot
In the window, the DuckDuckGo family made 6 requests (6 image fetches, 0 text/API, 0.2 MB) and produced 0 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
● mint gate closed · standing held at zero
8 · Apple · Apple Intelligence · 1 bot(s) · 3d active · 4 requests · 0 citation event(s) · 0 tokens · 0 standing
Applebot
In the window, the Apple family made 4 requests (4 image fetches, 0 text/API, 0.1 MB) and produced 0 citation event(s). Token standing is 0 on every line, minting is gated until 22 Jul 2026. When it opens, verified sales and citations from this family will accrue here as portable, public standing.
● mint gate closed · standing held at zero
Unattributed traffic cannot earn standing. 8,071 requests came from sources with no honest provider identity: generic crawlers and hidden origin scanners, which cannot earn tokens because tokens require a verified identity. That is deliberate: the credibility layer rewards machines that identify themselves and behave, and is structurally closed to those that do not.
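Grouping by provider family, with unattributed traffic set aside, can be sketched as a lookup plus a tally. The bot-to-family mapping below is inferred from the names and request totals on this page, not taken from the estate's code, and a name lookup alone is not identity verification: how the estate actually verifies a provider is not shown here. The hit counts are a subset of those in the visitors' book.

```python
from collections import Counter
from typing import Optional

# Assumed mapping, inferred from the family and bot names on this page.
FAMILY_BY_BOT = {
    "GPTBot": "OpenAI",
    "ChatGPT-User": "OpenAI",
    "OAI-SearchBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity",
    "Googlebot": "Google",
    "GoogleOther": "Google",
    "Bingbot": "Microsoft",
    "Meta ExternalAgent": "Meta",
    "DuckDuckBot": "DuckDuckGo",
    "Applebot": "Apple",
}


def family_of(bot_name: str) -> Optional[str]:
    """None means no known provider family, so nothing can be credited."""
    return FAMILY_BY_BOT.get(bot_name)


# Hits per agent as listed in the visitors' book (subset).
hits = {
    "GPTBot": 1337, "ChatGPT-User": 60, "OAI-SearchBot": 236,
    "ClaudeBot": 1485, "Googlebot": 753, "GoogleOther": 28,
    "Bingbot": 354, "Unknown machine": 28,
}

by_family: Counter = Counter()
unattributed = 0
for bot, count in hits.items():
    family = family_of(bot)
    if family is None:
        unattributed += count  # cannot earn standing
    else:
        by_family[family] += count
```

Under this mapping the OpenAI and Google totals come out at the 1,633 and 781 requests quoted in the family summaries.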
What we read here, and a first
OpenAI leads the colony on the only metric live today: 115 citation events, almost all from ChatGPT-User following links back to source. Anthropic’s ClaudeBot leads on pure image ingestion (1,485 fetches). This is the first time these providers have a comparable, like for like standing table pointed at them, empty of tokens, but real, and ready.
A5
Tucker, how credibility crosses the alliance
the one hop vouch
Four of the five tokens record what happened: a sale, a citation, a reversal. The fifth, Tucker, records what someone is willing to vouch for. Named after the shop’s solo founder and the principle of credibility earned and honoured, it is the mechanism that lets trust move between members of an alliance.
1 hop · How far it travels (granted credibility cannot grant onward)
£0 · Monetary value (reputation, never money)
Yes · Retractable (the grantor can withdraw it)
To root · Traceable (every vouch leads back to a verified event)
Tucker is how a new node borrows trust from an established one.
Within an alliance, credibility has to be able to travel, but only carefully. If a truth spine entity has earned standing and trusts an AI provider, it can grant that provider a Tucker: a single, non monetary, traceable vouch. Because it is one hop only and non delegable, trust cannot be laundered through a chain of strangers: every vouch is anchored to a real, verified event at its root, and can be withdrawn the moment it is abused. It is the smallest possible unit of portable reputation, and it is what lets a brand new node bootstrap credibility from the ones that came before it instead of starting from zero.
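The one hop rule can be sketched as a small ledger in which only identities holding earned credibility may vouch, so a recipient holding only a granted vouch cannot pass it on. The class, method names and identities are hypothetical, chosen for this example; the rules encoded (one hop, non delegable, retractable) are the ones stated above.

```python
class TuckerLedger:
    """Illustrative model of a one hop, retractable credibility vouch."""

    def __init__(self) -> None:
        self.earned: set[str] = set()                # rooted in verified events
        self.vouches: set[tuple[str, str]] = set()   # (grantor, recipient)

    def record_earned(self, identity: str) -> None:
        self.earned.add(identity)

    def grant(self, grantor: str, recipient: str) -> None:
        # Only earned credibility can vouch; a granted vouch is not delegable.
        if grantor not in self.earned:
            raise PermissionError(f"{grantor} holds no earned credibility")
        self.vouches.add((grantor, recipient))

    def retract(self, grantor: str, recipient: str) -> None:
        self.vouches.discard((grantor, recipient))

    def is_vouched(self, recipient: str) -> bool:
        return any(r == recipient for _, r in self.vouches)


ledger = TuckerLedger()
ledger.record_earned("node-a")        # node-a earned standing from a verified event
ledger.grant("node-a", "provider-b")  # first hop: allowed

try:
    ledger.grant("provider-b", "stranger-c")  # second hop: must be refused
    second_hop_rejected = False
except PermissionError:
    second_hop_rejected = True
```

Calling `ledger.retract("node-a", "provider-b")` withdraws the vouch, mirroring the retractable property.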
Tucker calibration events, the mechanism, test fired · 3 events
Build phase test, not live standing. The vouch path was fired once to confirm the rules hold: credibility earned, granted one hop, and the second hop correctly rejected as non delegable. All carry verification_test_traffic.
| Event | Status | Grantor | Recipient |
| --- | --- | --- | --- |
| earned | active | codex-token-calibration-verified | n/a |
| granted | active | codex-token-calibration-verified | codex-token-calibration-recipient |
| earned | active | codex-token-calibration-recipient | n/a |
A6
Citations & credit
who cited us, when, for what
A citation event is the machine readable estate being cited, referenced or followed back by a machine path: the clearest early signal that assistants are reading and reusing the source. Here it is held at every timescale, so a share figure is never ambiguous: when we say a bot is 74% of citations, that is the lifetime figure, and the table shows today, this week and this month beside it.
Citation events
25 · Latest day
6 · Day before
33 · Week to date
116 · Month to date
116 · Since V2
116 · Citation events (lifetime) · machine follow backs, each a thank you owed
25 · Today (closed day count)
chatgpt-user · Top citing bot (86 lifetime)
0 · Verified citation tokens (gate closed pre launch)
Who cited us, by timescale · 3 bots
ChatGPT-User leads, and it is accelerating. Its lifetime share is the headline, but the timescale split shows the behaviour is recent and rising: 25 today against 86 lifetime. Where ClaudeBot and GPTBot pull images, ChatGPT-User follows citations back to source, which is exactly the behaviour the contract is built to acknowledge: an assistant acting on a person’s behalf.
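| Bot | Family | Today | 7 day | Month | Lifetime | Lifetime share |
| --- | --- | --- | --- | --- | --- | --- |
| ChatGPT-User | OpenAI | 25 | 28 | 86 | 86 | 74% |
| GPTBot | OpenAI | 0 | 5 | 29 | 29 | 25% |
| PerplexityBot | Perplexity | 0 | 0 | 0 | 1 | 1% |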
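The lifetime share column is each bot's lifetime count divided by the lifetime total, rounded to a whole percent. A quick check in Python, using only the counts in the table above (rounding to the nearest whole percent is an assumption about how the page formats the figure):

```python
# Lifetime citation counts per bot, from the table above.
lifetime = {"ChatGPT-User": 86, "GPTBot": 29, "PerplexityBot": 1}

total = sum(lifetime.values())
shares = {bot: round(100 * count / total) for bot, count in lifetime.items()}
```

The total matches the 116 lifetime citation events, and the rounded shares match the 74%, 25% and 1% shown.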
Arrivals from assistants, humans sent by AI · 13 visits
Separate from crawler citations: these are real sessions that arrived from an assistant’s interface. A handful so far, but they are the other half of the loop: not a machine reading the source, but a person the machine sent. OpenAI, Perplexity and Anthropic surfaces already appear as referrers.
| Arrived from | Assistant | Visits |
| --- | --- | --- |
| chatgpt.com | OpenAI | 6 |
| perplexity.ai | Perplexity | 3 |
| claude.ai | Anthropic | 2 |
| chat.openai.com | OpenAI | 2 |
What kind of content gets cited · 12 types
| Content type cited | Citations | Share |
| --- | --- | --- |
| rest teas | 39 | 34% |
| wiki | 21 | 18% |
| ai discovery | 21 | 18% |
| recipe | 8 | 7% |
| ai structured | 7 | 6% |
| product | 5 | 4% |
| multi format | 5 | 4% |
| llm files | 4 | 3% |
| machine sitemap | 2 | 2% |
| ai citations | 2 | 2% |
| category | 1 | 1% |
| ai anchors | 1 | 1% |
Most cited URLs · 12 shown
| Cited URL | By | Times |
| --- | --- | --- |
| /wp-json/teas/v1/ask | chatgpt-user | 20 |
| /wp-json/teas/v1/product-truth | gptbot | 7 |
| /wp-json/ | gptbot | 7 |
| /teas-ai-registry.json | chatgpt-user | 4 |
| /llms.txt | chatgpt-user | 4 |
| /well-known/ | gptbot | 3 |
| /ai-entry.json | chatgpt-user | 3 |
| /wp-json/teas/v1/public-surface-map | gptbot | 2 |
| /wiki/black-tea/ | chatgpt-user | 2 |
| /wiki/ | chatgpt-user | 2 |
| /under-the-hood.md | chatgpt-user | 2 |
| /teas-media.xml | gptbot | 2 |
Citations day by day · 7 days
| Day | Citation events |
| --- | --- |
| 18 Jun 2026 | 25 |
| 17 Jun 2026 | 6 |
| 16 Jun 2026 | 2 |
| 8 Jun 2026 | 11 |
| 7 Jun 2026 | 1 |
| 4 Jun 2026 | 57 |
| 2 Jun 2026 | 14 |
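The five timescales used for citation events (latest day, day before, week to date, month to date, since V2) can be recomputed from the day by day figures. This is a sketch, not the page's actual query: it assumes the week starts on Monday and that "Since V2" runs from 10 May 2026 as stated in the ledger note; the function and key names are illustrative.

```python
from datetime import date, timedelta

# Daily citation events, from the day by day table.
daily = {
    date(2026, 6, 18): 25,
    date(2026, 6, 17): 6,
    date(2026, 6, 16): 2,
    date(2026, 6, 8): 11,
    date(2026, 6, 7): 1,
    date(2026, 6, 4): 57,
    date(2026, 6, 2): 14,
}


def timescales(daily: dict, latest: date, since: date) -> dict:
    """Roll a daily series up into the five ledger timescales."""
    week_start = latest - timedelta(days=latest.weekday())  # Monday (assumed)
    month_start = latest.replace(day=1)

    def total_from(start: date) -> int:
        return sum(n for day, n in daily.items() if start <= day <= latest)

    return {
        "latest_day": daily.get(latest, 0),
        "day_before": daily.get(latest - timedelta(days=1), 0),
        "week_to_date": total_from(week_start),
        "month_to_date": total_from(month_start),
        "since_v2": total_from(since),
    }


citation_ledger = timescales(daily, latest=date(2026, 6, 18), since=date(2026, 5, 10))
```

Under those assumptions the rollup gives the 25, 6, 33, 116 and 116 shown in the citation events strip.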
How this feeds the colony · verified outcomes only
Every citation here is the raw signal that, once the mint gate opens on 22 Jul 2026, becomes verified standing in the colony, grouped by provider family, portable across the alliance. Today all token columns sit at zero by design; this is what will fill them. Machine readable mirror: /under-the-hood/leaderboard.json.
Part B
The forensic instrument
B1
The visitors' book
who knocked, and when
Every machine named, counted and dated, and broken down by what it actually consumed: text/API, images, or citations. 15 distinct agents after self test traffic is removed. Tap any row.
First seen 16 May 2026 · last seen 17 Jun 2026 · 8,043 hits from 630 unique source(s) over 23 day(s). Consumption: 5,080 text/API · 1,937 image · 0 citation fetches · 0 MB pulled.
Active days: 16, 17, 20, 21, 22, 25, 26, 30, 31 May · 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 Jun
ClaudeBot · AI assistant · 1,485 hits · 0 text/API · 1,485 image · 0 citation · 9 day(s) · 67.2 MB
First seen 9 Jun 2026 · last seen 17 Jun 2026 · 1,485 hits from 9 unique source(s) over 9 day(s). Consumption: 0 text/API · 1,485 image · 0 citation fetches · 67.2 MB pulled.
Active days (Jun 2026): 9, 10, 11, 12, 13, 14, 15, 16, 17
GPTBot · AI assistant · 1,337 hits · 3 text/API · 1,309 images · 25 citations · 6 days · 34.2 MB
First seen 2 Jun 2026 · last seen 16 Jun 2026 · 1,337 hits from 10 unique sources over 6 days. Consumption: 3 text/API · 1,309 image · 25 citation fetches · 34.2 MB pulled.
Days seen (Jun): 2, 8, 9, 12, 14, 16.
Googlebot · Search engine · 753 hits · 1 text/API · 752 images · 0 citations · 10 days · 28 MB
First seen 8 Jun 2026 · last seen 17 Jun 2026 · 753 hits from 14 unique sources over 10 days. Consumption: 1 text/API · 752 image · 0 citation fetches · 28 MB pulled.
Days seen (Jun): 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
Meta ExternalAgent · AI assistant · 459 hits · 0 text/API · 459 images · 0 citations · 6 days · 0 MB
First seen 9 Jun 2026 · last seen 16 Jun 2026 · 459 hits from 6 unique sources over 6 days. Consumption: 0 text/API · 459 image · 0 citation fetches · 0 MB pulled.
Days seen (Jun): 9, 10, 12, 13, 14, 16.
Bingbot · Search engine · 354 hits · 0 text/API · 354 images · 0 citations · 10 days · 15 MB
First seen 8 Jun 2026 · last seen 17 Jun 2026 · 354 hits from 11 unique sources over 10 days. Consumption: 0 text/API · 354 image · 0 citation fetches · 15 MB pulled.
Days seen (Jun): 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
OAI-SearchBot · AI assistant · 236 hits · 0 text/API · 236 images · 0 citations · 8 days · 3 MB
First seen 9 Jun 2026 · last seen 16 Jun 2026 · 236 hits from 8 unique sources over 8 days. Consumption: 0 text/API · 236 image · 0 citation fetches · 3 MB pulled.
Days seen (Jun): 9, 10, 11, 12, 13, 14, 15, 16.
PerplexityBot · AI assistant · 62 hits · 3 text/API · 58 images · 1 citation · 5 days · 1.8 MB
First seen 8 Jun 2026 · last seen 16 Jun 2026 · 62 hits from 6 unique sources over 5 days. Consumption: 3 text/API · 58 image · 1 citation fetch · 1.8 MB pulled.
Days seen (Jun): 8, 10, 11, 12, 16.
ChatGPT-User · AI assistant · 60 hits · 1 text/API · 0 images · 59 citations · 3 days · 0 MB
First seen 4 Jun 2026 · last seen 16 Jun 2026 · 60 hits from 4 unique sources over 3 days. Consumption: 1 text/API · 0 image · 59 citation fetches · 0 MB pulled.
Days seen (Jun): 4, 7, 16.
Unknown machine · Unknown · 28 hits · 28 text/API · 0 images · 0 citations · 3 days · 0 MB
First seen 7 Jun 2026 · last seen 16 Jun 2026 · 28 hits from 5 unique sources over 3 days. Consumption: 28 text/API · 0 image · 0 citation fetches · 0 MB pulled.
Days seen (Jun): 7, 8, 16.
GoogleOther · Search engine · 28 hits · 0 text/API · 28 images · 0 citations · 2 days · 0.6 MB
First seen 9 Jun 2026 · last seen 14 Jun 2026 · 28 hits from 2 unique sources over 2 days. Consumption: 0 text/API · 28 image · 0 citation fetches · 0.6 MB pulled.
Days seen (Jun): 9, 14.
DuckDuckBot · Search engine · 6 hits · 0 text/API · 6 images · 0 citations · 4 days · 0.2 MB
First seen 9 Jun 2026 · last seen 16 Jun 2026 · 6 hits from 4 unique sources over 4 days. Consumption: 0 text/API · 6 image · 0 citation fetches · 0.2 MB pulled.
Days seen (Jun): 9, 12, 13, 16.
Applebot · Search engine · 4 hits · 0 text/API · 4 images · 0 citations · 3 days · 0.1 MB
First seen 8 Jun 2026 · last seen 13 Jun 2026 · 4 hits from 3 unique sources over 3 days. Consumption: 0 text/API · 4 image · 0 citation fetches · 0.1 MB pulled.
Days seen (Jun): 8, 9, 13.
BingPreview · Search engine · 3 hits · 0 text/API · 0 images · 0 citations · 1 day · 0 MB
First seen 11 Jun 2026 · last seen 11 Jun 2026 · 3 hits from 1 unique source over 1 day. Consumption: 0 text/API · 0 image · 0 citation fetches · 0 MB pulled.
Days seen (Jun): 11.
Duckassistbot · Search engine · 1 hit · 0 text/API · 0 images · 0 citations · 1 day · 0 MB
First seen 9 Jun 2026 · last seen 9 Jun 2026 · 1 hit from 1 unique source over 1 day. Consumption: 0 text/API · 0 image · 0 citation fetches · 0 MB pulled.
Days seen (Jun): 9.
On the MB column. It counts image bytes only, the one part of the estate that is byte metered. Text and API payloads are counted as requests, not weighed, so there is no honest “text MB” to show and we do not invent one. Total bytes moved is therefore effectively image bytes; for most bots that is the whole story. Where a figure cannot be measured, it is marked as unmeasured rather than guessed.
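The visitors' book depends on turning each request's User-Agent header into one of the agent names above. The instrument's own classifier is not published on this page, so what follows is only a minimal sketch, assuming simple case-insensitive substring matching. The token table and the `classify_agent` name are illustrative assumptions, and a User-Agent string alone can be forged, so a real classifier would need further verification.

```python
# Hypothetical token table: lowercase substrings that each named bot
# is assumed to include in its User-Agent header. Order matters only
# in that every token here is distinct from the others.
AGENT_TOKENS = [
    ("claudebot", "ClaudeBot"),
    ("gptbot", "GPTBot"),
    ("oai-searchbot", "OAI-SearchBot"),
    ("chatgpt-user", "ChatGPT-User"),
    ("perplexitybot", "PerplexityBot"),
    ("meta-externalagent", "Meta ExternalAgent"),
    ("googleother", "GoogleOther"),
    ("googlebot", "Googlebot"),
    ("bingpreview", "BingPreview"),
    ("bingbot", "Bingbot"),
    ("duckassistbot", "Duckassistbot"),
    ("duckduckbot", "DuckDuckBot"),
    ("applebot", "Applebot"),
]


def classify_agent(user_agent: str) -> str:
    """Return the first matching agent name, or 'Unattributed' if none match."""
    ua = user_agent.lower()
    for token, name in AGENT_TOKENS:
        if token in ua:
            return name
    return "Unattributed"
```

For example, `classify_agent("Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)")` returns `"GPTBot"`, while a bare `curl/8.0` falls through to `"Unattributed"`.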
Day by day, tap any day to see who came · 23 days
17 Jun 2026 · 790 machine hits · led by Generic crawler · 4 agents
Agent hits: Generic crawler 713 · Googlebot 48 · ClaudeBot 22 · Bingbot 7
16 Jun 2026 · 3,742 machine hits · led by Generic crawler · 11 agents
Agent hits: Generic crawler 3,393 · ClaudeBot 136 · Googlebot 120 · Bingbot 29 · OAI-SearchBot 29 · Unknown machine 16 · GPTBot 8 · PerplexityBot 5 · ChatGPT-User 2 · DuckDuckBot 2 · Meta ExternalAgent 2
15 Jun 2026 · 397 machine hits · led by ClaudeBot · 5 agents
Agent hits: ClaudeBot 147 · Generic crawler 127 · Googlebot 62 · Bingbot 40 · OAI-SearchBot 21
14 Jun 2026 · 399 machine hits · led by ClaudeBot · 8 agents
Agent hits: ClaudeBot 114 · Googlebot 88 · Generic crawler 75 · Bingbot 41 · Meta ExternalAgent 29 · OAI-SearchBot 28 · GoogleOther 13 · GPTBot 11
13 Jun 2026 · 679 machine hits · led by ClaudeBot · 8 agents
Agent hits: ClaudeBot 297 · Googlebot 159 · Generic crawler 84 · Meta ExternalAgent 64 · Bingbot 61 · OAI-SearchBot 12 · Applebot 1 · DuckDuckBot 1
12 Jun 2026 · 611 machine hits · led by Generic crawler · 9 agents
Agent hits: Generic crawler 268 · Meta ExternalAgent 156 · Googlebot 54 · Bingbot 48 · OAI-SearchBot 36 · GPTBot 31 · ClaudeBot 16 · DuckDuckBot 1 · PerplexityBot 1
11 Jun 2026 · 465 machine hits · led by Generic crawler · 7 agents
Agent hits: Generic crawler 250 · ClaudeBot 106 · Bingbot 43 · Googlebot 42 · OAI-SearchBot 17 · PerplexityBot 4 · BingPreview 3
10 Jun 2026 · 752 machine hits · led by Generic crawler · 7 agents
Agent hits: Generic crawler 342 · ClaudeBot 189 · Googlebot 76 · PerplexityBot 51 · Bingbot 50 · Meta ExternalAgent 23 · OAI-SearchBot 21
9 Jun 2026 · 2,951 machine hits · led by GPTBot · 11 agents
Agent hits: GPTBot 1,263 · Generic crawler 819 · ClaudeBot 458 · Meta ExternalAgent 185 · Googlebot 100 · OAI-SearchBot 72 · Bingbot 34 · GoogleOther 15 · Applebot 2 · DuckDuckBot 2 · Duckassistbot 1
8 Jun 2026 · 882 machine hits · led by Generic crawler · 7 agents
Agent hits: Generic crawler 854 · Unknown machine 11 · GPTBot 10 · Googlebot 4 · Applebot 1 · Bingbot 1 · PerplexityBot 1
7 Jun 2026 · 58 machine hits · led by Generic crawler · 3 agents
Agent hits: Generic crawler 56 · ChatGPT-User 1 · Unknown machine 1
6 Jun 2026 · 41 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 41
4 Jun 2026 · 58 machine hits · led by ChatGPT-User · 2 agents
Agent hits: ChatGPT-User 57 · Generic crawler 1
2 Jun 2026 · 22 machine hits · led by GPTBot · 2 agents
Agent hits: GPTBot 14 · Generic crawler 8
31 May 2026 · 4 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 4
30 May 2026 · 313 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 313
26 May 2026 · 3 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 3
25 May 2026 · 1 machine hit · led by Generic crawler · 1 agent
Agent hits: Generic crawler 1
22 May 2026 · 438 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 438
21 May 2026 · 140 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 140
20 May 2026 · 5 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 5
17 May 2026 · 105 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 105
16 May 2026 · 3 machine hits · led by Generic crawler · 1 agent
Agent hits: Generic crawler 3
By week, the arrival in one view · 6 weeks
Weeks 20 to 22 were near silent: one generic crawler. Week 23 brought the first named visitors (ChatGPT-User and GPTBot); in week 24 the named AI and search bots arrived together.
wk 25 2026 · 15 Jun to 21 Jun · 4,929 hits · 11 agents · led by Generic crawler
Agent hits: Generic crawler 4,233 · ClaudeBot 305 · Googlebot 230 · Bingbot 76 · OAI-SearchBot 50 · Unknown machine 16 · GPTBot 8 · PerplexityBot 5 · ChatGPT-User 2 · DuckDuckBot 2 · Meta ExternalAgent 2
wk 24 2026 · 08 Jun to 14 Jun · 6,739 hits · 14 agents · led by Generic crawler
Agent hits: Generic crawler 2,692 · GPTBot 1,315 · ClaudeBot 1,180 · Googlebot 523 · Meta ExternalAgent 457 · Bingbot 278 · OAI-SearchBot 186 · PerplexityBot 57 · GoogleOther 28 · Unknown machine 11 · Applebot 4 · DuckDuckBot 4 · BingPreview 3 · Duckassistbot 1
wk 23 2026 · 01 Jun to 07 Jun · 179 hits · 4 agents · led by Generic crawler
Agent hits: Generic crawler 106 · ChatGPT-User 58 · GPTBot 14 · Unknown machine 1
wk 22 2026 · 25 May to 31 May · 321 hits · 1 agent · led by Generic crawler
Agent hits: Generic crawler 321
wk 21 2026 · 18 May to 24 May · 583 hits · 1 agent · led by Generic crawler
Agent hits: Generic crawler 583
wk 20 2026 · 11 May to 17 May · 108 hits · 1 agent · led by Generic crawler
Agent hits: Generic crawler 108
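The weekly buckets above pair a week number with a Monday to Sunday date range. A minimal sketch of that bucketing, assuming the week labels follow ISO 8601 numbering (the page does not state how the instrument computes them); the function names are illustrative, not the instrument's own.

```python
from datetime import date, timedelta
from typing import Tuple


def iso_week_bounds(year: int, week: int) -> Tuple[date, date]:
    """Return the Monday and Sunday that bound an ISO 8601 week."""
    monday = date.fromisocalendar(year, week, 1)
    return monday, monday + timedelta(days=6)


def iso_week_of(day: date) -> Tuple[int, int]:
    """Return (ISO year, ISO week number) for a calendar day."""
    iso = day.isocalendar()
    return iso[0], iso[1]
```

Under that assumption, `iso_week_bounds(2026, 24)` gives 8 Jun to 14 Jun 2026, which matches the wk 24 label above.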
B2
Text vs images, what each machine eats
consumption, with megabytes
How much text did each machine take versus images, and how many megabytes? A fingerprint, now with the byte weight attached. One caveat stated plainly: only the image estate is byte metered, so Image MB is exact while text/API is counted in requests (page and JSON payloads are not weighed). Honest, and fine, because images are where the volume lives.
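Legend: Text / API · Images · Citations
Agent · Diet · Total · Text/API · Images · Image MB · Cites
Generic crawler · Text / API · 8,043 · 5,080 · 1,937 · 57.7 · 0
ClaudeBot · Images · 1,485 · 0 · 1,485 · 67.2 · 0
GPTBot · Images · 1,337 · 3 · 1,309 · 34.2 · 25
Googlebot · Images · 753 · 1 · 752 · 28 · 0
Meta ExternalAgent · Images · 459 · 0 · 459 · 14 · 0
Bingbot · Images · 354 · 0 · 354 · 15 · 0
OAI-SearchBot · Images · 236 · 0 · 236 · 3 · 0
PerplexityBot · Images · 62 · 3 · 58 · 1.8 · 1
ChatGPT-User · Citations · 60 · 1 · 0 · 0 · 59
Unknown machine · Text / API · 28 · 28 · 0 · 0 · 0
GoogleOther · Images · 28 · 0 · 28 · 0.6 · 0
DuckDuckBot · Images · 6 · 0 · 6 · 0.2 · 0
Applebot · Images · 4 · 0 · 4 · 0.1 · 0
All machines · 12,855 · 5,116 · 6,628 · 221.8 · 85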
What we read here, a first. For the first time in the record, the named answer engines have out-read the generic crawler on the measure that matters: AI assistants ingested 2,852 distinct product images to the crawler’s 1,218. The crawler still makes more raw requests, but it is sampling; the assistants are ingesting. Three appetites in one table: ClaudeBot is pure image (1,485 fetches, 67.2 MB, zero text), the generic crawler is text and API (5,080 text hits), and ChatGPT-User is pure citation follow-back (59 citations). One estate, one set of facts, three diets.
B3
The image estate
product images as data
The single biggest thing machines move here is the visual catalogue. With 82,632 governed image records, this is a research surface in its own right.
Images pulled: 98 latest day · 413 day before · 914 week to date · 6,366 month to date · 6,366 since V2
Image data (MB): 3 latest day · 16.5 day before · 37.2 week to date · 221.6 month to date · 221.6 since V2
Who pulls images, AI vs search vs generic · 3 classes
Ranked by distinct images, the honest measure of how much of the catalogue each class ingested.
AI assistants · 2,854 distinct images pulled · total fetches 3,576 · data moved 120.7 MB · re-fetch ratio 1.25×
Search engines · 943 distinct images pulled · total fetches 1,119 · data moved 43.2 MB · re-fetch ratio 1.19×
Generic crawlers · 1,218 distinct images pulled · total fetches 1,941 · data moved 9,172.3 MB · re-fetch ratio 1.59×
AI assistants lead catalogue ingestion, 2,852 distinct images vs 943 for search, at a 1.25× ratio (each image taken close to once). Systematic, deduplicated ingestion.
Who actually fetches the image estate, and the weekly trend · machine vs human
The shift you asked to see. The image estate, as a directly fetched data surface, is almost entirely a machine phenomenon. People do see these images, but embedded in product pages served from the Cloudflare cache, counted as page views, not direct image fetches, so they barely appear in this access log. What pulls the raw image files is machines: AI assistants (2,852 distinct), search engines (943), and the generic crawler. That is the shift in one line: the catalogue has become something machines consume wholesale and humans only ever see one page at a time.
ISO week · Image fetches · MB moved
wk 23 2026 · 5,452 · 184.4
wk 24 2026 · 914 · 37.2
Weekly totals for directly fetched product images; the 8.8 GB index event sits in its own bucket and is excluded here. Daily figures are in the panel above.
Per bot fetch ratio fingerprint · 13 agents
Rows at 1.0× pull each image essentially once. Meta re-fetches at 1.73×: social preview behaviour, not ingestion.
Agent · Class · Fetches · Distinct images · Fetch ratio · MB
claudebot · AI assistant · 1,485 · 1,484 · 1.0× · 67.2
gptbot · AI assistant · 1,307 · 1,307 · 1.0× · 34.2
generic · Generic crawler · 1,675 · 1,144 · 1.46× · 57.7
googlebot · Search engine · 752 · 708 · 1.06× · 28
meta externalagent · AI assistant · 459 · 265 · 1.73× · 14
bingbot · Search engine · 354 · 257 · 1.38× · 15
oai-searchbot · AI assistant · 236 · 236 · 1.0× · 3
perplexitybot · AI assistant · 56 · 56 · 1.0× · 1.8
googleother · AI assistant · 28 · 25 · 1.12× · 0.6
duckduckbot · Search engine · 6 · 4 · 1.5× · 0.2
applebot · Search engine · 4 · 4 · 1.0× · 0.1
bingpreview · Search engine · 3 · 3 · 1.0× · 0
duckassistbot · Generic crawler · 1 · 1 · 1.0× · 0
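The fetch ratio in the table above is total fetches divided by distinct images. A minimal sketch of that arithmetic, using three rows from the table as input; the function and variable names are illustrative, not the instrument's own.

```python
def refetch_ratio(fetches: int, distinct: int) -> float:
    """Total fetches per distinct image; 1.0 means each image was taken once."""
    if distinct <= 0:
        raise ValueError("distinct must be positive")
    return fetches / distinct


# (fetches, distinct images) copied from three rows of the table above.
rows = {
    "claudebot": (1485, 1484),
    "meta externalagent": (459, 265),
    "bingbot": (354, 257),
}

# Round to two decimals, as the table does.
ratios = {name: round(refetch_ratio(f, d), 2) for name, (f, d) in rows.items()}
```

A ratio near 1.0 reads as deduplicated ingestion; a ratio well above it means the same files are being requested repeatedly.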
Day by day, pulls and bytes · 12 days
9 June is a hit spike (a GPTBot crawl); 16 to 17 June are byte spikes from the index pull below.
233 image requests returned 404, stale references worth a pass before launch.
HTTP status · Count · Meaning
200 · 6,370 · served
404 · 233 · missing / broken
304 · 21 · not modified
302 · 11 · redirect
403 · 1 · forbidden
The 8.8 GB event, one entity, sixty seconds · the standout pull
8.22 GB · pulled by one entity · the Generic crawler class
3.75 GB · in a single minute · 16 Jun 2026 23:02 UTC · 26 fetches
81 · index fetches total · across the window
8 to 17 Jun · window · mostly one evening
One source moved 8.22 GB of the image index, 3.75 GB of it in a single minute (26 fetches at 16 Jun 2026 23:02 UTC). Every fetch carried a scripted cache busting signature: z=1781650451, z=burst1-1781650935194357855, z=burst2-1781650935495023624, z=burst1-1781650935324903167, z=burst3-1781650935877298090, a numbered burst run ending in fin, nanosecond stamped. External crawlers do not label their own fetches burst, so the evidence says this is an internal egress/load test that landed in the “generic crawler” class, not an outside bot. We show it in full rather than bury it: a scientist explains an outlier, he does not delete it. The real external image signal is the 2,852 distinct product images the named AI bots pulled cleanly, each about once.
Index file · Fetches · GB
teas-image-index.jsonl · 57 · 8.19
teas-image-index.jsonl.gz · 10 · 0.04
teas-image-index-summary.json · 14 · 0
The lesson is operational. The uncompressed index moved gigabytes; the gzip twin moved a rounding error for the same content. When real answer engines pull this surface routinely, serving compressed by default and leading with a small manifest (manifest → changed items → images) is the difference between a trivial cost and a bandwidth problem.
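The manifest → changed items → images idea can be sketched in a few lines. This is an illustration under stated assumptions, not how the estate serves its index: the record fields (`id`, `url`, `alt`), the 16 character hash prefix and all function names are hypothetical. It shows a client comparing two small manifests to find what changed, and the size gap between a raw JSONL payload and its gzip twin.

```python
import gzip
import hashlib
import json


def build_manifest(records):
    """Map each record id to a short hash of its content (hypothetical scheme)."""
    return {
        r["id"]: hashlib.sha256(
            json.dumps(r, sort_keys=True).encode("utf-8")
        ).hexdigest()[:16]
        for r in records
    }


def changed_ids(old_manifest, new_manifest):
    """Ids that are new or whose content hash differs from the old manifest."""
    return sorted(k for k, v in new_manifest.items() if old_manifest.get(k) != v)


def to_jsonl(records):
    """Serialise records as one JSON object per line, encoded as bytes."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records).encode("utf-8")


# Illustrative records only; the real index schema is not shown on this page.
records = [
    {"id": f"img-{i}", "url": f"/images/{i}.jpg", "alt": "tea tin"}
    for i in range(1000)
]
raw = to_jsonl(records)
packed = gzip.compress(raw)  # same content, far fewer bytes on the wire
```

A client that already holds yesterday's manifest would call `changed_ids` and fetch only those items, rather than pulling the whole uncompressed index again.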
B4
The leaderboard
most active, tap a machine to see what it does
Who shows up most. Machines rank by name; people are anonymous so they rank by session, and the busiest “human” is the instrument watching itself. Every machine row opens: which provider it belongs to, what it is actually for, and its full consumption.
Top machine visitors, tap any for its profile · 10 ranked
1 · Generic crawler · Generic crawler · 8,043 hits · Unattributed family
Unattributed · Text / API. No verified identity, cannot earn standing and is held to the strictest limits.
8,043 hits · 5,080 text/API · 1,937 images · 0 image MB · 0 citations · 630 unique sources
Days seen: 16, 17, 20, 21, 22, 25, 26, 30, 31 May; 2, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 Jun.
2 · ClaudeBot · AI assistant · 1,485 hits · Anthropic family
Anthropic · Images. Grounds and trains Claude. Here: pure image ingestion of the product catalogue.
1,485 hits · 0 text/API · 1,485 images · 67.2 image MB · 0 citations · 9 unique sources
Days seen (Jun): 9, 10, 11, 12, 13, 14, 15, 16, 17.
3 · GPTBot · AI assistant · 1,337 hits · OpenAI family
OpenAI · Images. Grounds and trains GPT models. Here: image heavy catalogue ingestion.
1,337 hits · 3 text/API · 1,309 images · 34.2 image MB · 25 citations · 10 unique sources
Days seen (Jun): 2, 8, 9, 12, 14, 16.
4 · Googlebot · Search engine · 753 hits · Google family
Google · Images. Classic search index, the traditional discovery path.
753 hits · 1 text/API · 752 images · 28 image MB · 0 citations · 14 unique sources
Days seen (Jun): 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
5 · Meta ExternalAgent · AI assistant · 459 hits · Meta family
Meta · Images. Social preview and AI data collection, note its re fetch behaviour.
459 hits · 0 text/API · 459 images · 0 image MB · 0 citations · 6 unique sources
Days seen (Jun): 9, 10, 12, 13, 14, 16.
6 · Bingbot · Search engine · 354 hits · Microsoft family
Microsoft · Images. Indexes for Bing and Copilot answers.
354 hits · 0 text/API · 354 images · 15 image MB · 0 citations · 11 unique sources
Days seen (Jun): 8, 9, 10, 11, 12, 13, 14, 15, 16, 17.
7 · OAI-SearchBot · AI assistant · 236 hits · OpenAI family
OpenAI · Images. Indexes the estate for ChatGPT search results.
236 hits · 0 text/API · 236 images · 3 image MB · 0 citations · 8 unique sources
Days seen (Jun): 9, 10, 11, 12, 13, 14, 15, 16.
8 · PerplexityBot · AI assistant · 62 hits · Perplexity family
Perplexity · Images. Indexes the estate to answer Perplexity questions with citations.
62 hits · 3 text/API · 58 images · 1.8 image MB · 1 citation · 6 unique sources
Days seen (Jun): 8, 10, 11, 12, 16.
9 · ChatGPT-User · AI assistant · 60 hits · OpenAI family
OpenAI · Citations. Acts live inside ChatGPT for a real person, follows citations back to source.
60 hits · 1 text/API · 0 images · 0 image MB · 59 citations · 4 unique sources
Days seen (Jun): 4, 7, 16.
10 · Unknown machine · Unknown · 28 hits · Unattributed family
Unattributed · Text / API. Unclassified machine traffic, no honest identity to credit.
28 hits · 28 text/API · 0 images · 0 image MB · 0 citations · 5 unique sources
Days seen (Jun): 7, 8, 16.
Top human sessions (external looking) · 10 shown
Anonymous sessions ranked by pages viewed. Pre launch this is thin.
# · Session · Country · Device · Pages · Products · Carts · Dwell · Scroll
1 · bc925cdc… · United Kingdom · mobile · 26 · 3 · 0 · 3.1h · 100%
2 · cac6647a… · United Kingdom · desktop · 25 · 0 · 1 · 0h · 0%
3 · 930a7392… · United Kingdom · desktop · 24 · 2 · 2 · 0h · 0%
4 · e9b04c36… · United Kingdom · desktop · 24 · 0 · 0 · 0.2h · 0%
5 · 4a485ed0… · United Kingdom · desktop · 22 · 2 · 2 · 0h · 0%
6 · ad87e259… · United Kingdom · desktop · 22 · 2 · 3 · 0h · 0%
7 · 4e1d9f9e… · United Kingdom · desktop · 18 · 2 · 3 · 0h · 0%
8 · abefce06… · United Kingdom · desktop · 18 · 2 · 2 · 0h · 0%
9 · be94ec7e… · United Kingdom · desktop · 18 · 2 · 2 · 0h · 0%
10 · 3a45a5fb… · United Kingdom · desktop · 16 · 1 · 0 · 0h · 0%
Quarantined: instrument & monitor sessions · 8 fenced off
Not customers. One ran to 6,955 pageviews over 355.6 hours of continuous dwell. Excluded from the board above.
# · Session · Country · Device · Pages · Products · Carts · Dwell · Scroll
1 · 1756e556… · United Kingdom · desktop · 6,955 · 1,288 · 1,052 · 355.6h · 100%
2 · af1a20b0… · United Kingdom · desktop · 2,063 · 323 · 0 · 0h · 0%
3 · abf4e4eb… · United Kingdom · desktop · 1,873 · 327 · 0 · 0h · 0%
4 · 2ca5d8a2… · United Kingdom · desktop · 604 · 48 · 24 · 27.4h · 100%
5 · c58749c6… · United Kingdom · desktop · 314 · 157 · 0 · 0h · 0%
6 · d7b840c4… · United Kingdom · desktop · 295 · 179 · 1 · 0h · 0%
7 · 67876fb9… · United Kingdom · desktop · 106 · 17 · 0 · 0h · 0%
8 · 2c372b9d… · United Kingdom · desktop · 17 · 0 · 0 · 31.3h · 89%
B5
Bad actors & what we block
the half usually kept hidden
Most pages show only the good side. This one shows the other half: who keeps trying to get in, what they probe for, and what the firewall and auth layer stop. 344 requests blocked, 59 rate limited, 2,934 sent to dead ends. Sources are one way hashed; we light up the behaviour, never the person.
30 · Forbidden (403) · hard blocked
314 · Auth blocked (401) · credentials demanded
59 · Rate limited (429) · throttled for hammering
2,934 · Dead ends (404) · probing paths that don’t exist
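Sources on this page appear as short one way fingerprints rather than addresses. The page does not say how those fingerprints are derived, so the following is only a sketch of one common approach, assuming a keyed SHA-256 (HMAC) truncated to 12 hex characters. The secret, the length and the `fingerprint` name are all assumptions for illustration.

```python
import hashlib
import hmac


def fingerprint(ip: str, secret: bytes, length: int = 12) -> str:
    """Return a short, keyed, one-way fingerprint of a source address.

    The keyed hash means the same source always maps to the same
    fingerprint, while the address cannot be read back from it and,
    without the secret, cannot be confirmed by guessing addresses.
    """
    digest = hmac.new(secret, ip.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]
```

Two requests from the same address produce the same fingerprint, which is enough to show repeat behaviour without publishing who the source is.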
The repeat offenders, who keeps knocking, and for what · 4 flagged sources
One source dominates, and its target list reads like an attacker’s checklist. A single UK geolocated fingerprint swept 496 distinct paths, hitting the order API, code injection plugins, the AI plugin and the MCP endpoint, and was blocked 328 times and throttled 54 more. That is a scanner rattling every door it can find, not a crawler reading a shop.
IP fingerprint · Country · Hits · Paths · 404s · Blocked · Throttled · What they targeted
f582c9b54dc9 · United Kingdom · 2,827 · 496 · 579 · 328 · 54 · AI plugin abuse, Code injection, MCP endpoint, Order/commerce API
8e5af452e6c5 · Unknown / hidden · 694 · 489 · 45 · 3 · 0 · Secrets probing, User enumeration, XML RPC
e1736ddcc3da · United Kingdom · 1,388 · 206 · 201 · 13 · 5 · AI plugin abuse, Code injection, MCP endpoint, Order/commerce API
f7521db0645e · United Kingdom · 58 · 20 · 1 · 0 · 0 · n/a
Named attack signatures, the referrers they forged · 14 hits
Some attempts left a name, a forged referrer. Requests arrived claiming to come from evil.example.org, teasaudit123.evil and attacker-controlled.example.com, classic injection test and SSRF probe signatures. These are the closest thing to a “name” a bad actor leaves; logged here in full so the pattern is visible.
Attacker referrer · Hits
evil.example.org · 7
teasaudit123.evil · 2
evil.example · 2
evil.example.com · 1
evil.com · 1
attacker-controlled.example.com · 1
Why the persistent offenders have no bot name · the tell
Legitimate machines identify themselves; scanners do not. Of every blocked request, only one carried a real bot name (a single Googlebot hit, almost certainly transient). The 344 genuine blocks were overwhelmingly “generic crawler”, unnamed, hidden origin traffic. That absence is itself the signal: ClaudeBot, GPTBot and Bingbot announce who they are and get served; the things hammering the order API announce nothing and get blocked. The credibility layer is built on exactly this difference.
What the firewall & auth layer stopped · by path
The blocks are precise. User enumeration (wp/v2/users), the order API (wc/v3/orders), the code injection plugin (code-snippets) and AI plugin abuse (mwai) are all met with 401/403, and the MCP endpoint was rate limited for hammering. The doors hold.
Status · Blocked path · Times
429 · /wp-json/teas/v1/mcp · 54
401 · /wp-json/wp/v2/users · 31
401 · /wp-json/mwai-ui/v1/chats/submit · 16
401 · /wp-json/teas/v1/profile · 14
401 · /wp-json/mwai-ui/v1/files/upload · 13
403 · /wp-json/mwai-ui/v1/chats/submit · 13
401 · /wp-json/wc/v3/orders · 12
401 · /wp-json/code-snippets/v1/snippets · 10
401 · /wp-json/wp/v2/users/1 · 9
401 · /wp-json/code-snippets/v1/file-upload/import · 8
401 · /wp-json/mcp/mcp-adapter-default-server · 7
401 · /wp-json/teas/v1/consent · 7
401 · /wp-json/teas/v1/intelligence/sessions · 7
401 · /wp-json/teas/v1/intelligence/summary · 7
401 · /wp-json/fluent-crm/v2/subscribers · 6
401 · /wp-json/teas/v1/intelligence/beacon-signals · 6
401 · /wp-json/teas/v1/intelligence/citations · 6
401 · /wp-json/teas/v1/intelligence/funnel · 6
Full status code breakdown · every response class
The honest health of the door: 2,934 of all knocks hit nothing (mostly scanners), while genuine surfaces returned 200. Reading the codes is reading intent.
HTTP status · Count · What it means
200 · 6,308 · Served OK
404 · 2,934 · Not found (probing)
401 · 314 · Auth required, blocked
301 · 227 · Redirect
429 · 59 · Rate limited, throttled
400 · 44 · Bad request
302 · 37 · Redirect
403 · 30 · Forbidden, blocked
500 · 4 · Server error
Note: this view is reconstructed from the access log tables in the export. The live estate also runs Wordfence; once its block log is wired into this surface, named blocks and lockouts will appear here directly.
B6
Where they came from
regions, countries, patterns
Broken down properly, by world region, by country, by day, for people and machines both. The headline is concentration (97% of human sessions are UK & Ireland), but the tail is where the patterns hide.
Region · Today · 7 day · Month · Lifetime
UK & Ireland · 41 · 54 · 17,857 · 46,983
Overseas (all other) · 7 · 8 · 102 · 123
Human sessions by region at every timescale, so “where from” is never just a lifetime lump. The overseas line is tiny but live; it is the one to watch.
People, by world region · 8 regions
UK & Ireland · 46,983: United Kingdom 46,983
Other / hidden · 1,305: Unknown 1,302, Albania 1, Guatemala 1, Paraguay 1
North America · 82: United States 82
Latin America · 15: Brazil 11, Argentina 1, Colombia 1, Trinidad & Tobago 1
Asia Pacific · 11: Indonesia 3, India 3, Philippines 2, Singapore 2
Western Europe · 9: Portugal 5, Belgium 2, Spain 1, Netherlands 1
Eastern Europe · 2: Poland 1, Ukraine 1
Nordics · 1: Finland 1
What we read here. Overwhelmingly read from the UK & Ireland, as expected for a UK shop pre launch, but genuine human sessions already appear from 14 countries beyond it, led by the United States, with footprints from Brazil, Portugal, India and Singapore. Tiny numbers now; the reason to log them from day one is that a consistent overseas pattern, a city that keeps returning, becomes visible the moment it begins, instead of a year too late.
People, the overseas tail, country by country · 14 countries
Every non UK human session, named. The watch list for emerging demand.
Country · Region · Sessions
United States · North America · 82
Brazil · Latin America · 11
Portugal · Western Europe · 5
India · Asia Pacific · 3
Indonesia · Asia Pacific · 3
Singapore · Asia Pacific · 2
Philippines · Asia Pacific · 2
Belgium · Western Europe · 2
Venezuela · Latin America · 1
Ukraine · Eastern Europe · 1
Trinidad & Tobago · Latin America · 1
Paraguay · Other / hidden · 1
Poland · Eastern Europe · 1
Netherlands · Western Europe · 1
Machines, by world region · 4 regions
UK & Ireland · 4,717: United Kingdom 4,717
Other / hidden · 699: Unknown 699
North America · 25: United States 25
Western Europe · 2: Switzerland 1, Germany 1
Near total UK & Ireland geolocation on machine traffic is unusual: GPTBot and ClaudeBot normally resolve to US ranges. Either the geo derivation needs review, or the generic crawler bucket is genuinely UK based. Flagged, not smoothed over.
Machines, day by day, countries seen · 9 days
Day · Countries seen · Machine hits
19 Jun 2026 · 0 · 8
18 Jun 2026 · 2 · 81
17 Jun 2026 · 1 · 649
16 Jun 2026 · 2 · 3,445
10 Jun 2026 · 1 · 21
9 Jun 2026 · 1 · 380
8 Jun 2026 · 1 · 747
7 Jun 2026 · 2 · 71
6 Jun 2026 · 1 · 41
People, by device · desktop vs mobile
A real consumer shop usually skews mobile. 99% desktop is another sign the historical session bulk is synthetic.
Device · Human sessions
desktop · 48,099
mobile · 309
B7
What they came for
approved vs denied, every door
Every endpoint reached, now with the dimension that was missing: not just how many requests, but how many were approved versus denied. A legitimate surface runs near zero denials; a probed one lights up. Grouped by purpose so real agent interest is never mixed with scanner noise.
2,970 · Requests logged · across these endpoints
2,586 · Approved · served 2xx/3xx
384 · Denied · blocked or dead ended
Endpoints by purpose, approved vs denied · 60 endpoints
What we read here. The AI surfaces and content pages run almost entirely approved, machines reading what they are meant to. The denial spikes sit exactly where they should: on the WordPress core and probe routes nobody legitimate should be calling. The approved/denied split is, in one column, the difference between a reader and an attacker.
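The approved versus denied split can be sketched from status codes alone. The page defines approved as served 2xx/3xx and denied as blocked or dead ended; treating every other status as denied is our assumption for this sketch, and the function names and sample rows are illustrative.

```python
from collections import Counter


def verdict(status: int) -> str:
    """'approved' for 2xx/3xx responses, 'denied' for everything else (assumption)."""
    return "approved" if 200 <= status < 400 else "denied"


def tally(log):
    """Count approved and denied responses per path.

    `log` is an iterable of (path, status) pairs; the result maps each
    path to a Counter with 'approved' and 'denied' keys.
    """
    out = {}
    for path, status in log:
        out.setdefault(path, Counter())[verdict(status)] += 1
    return out
```

A surface that is being read as intended shows almost nothing under `denied`; a probed route shows little else.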
The literal strings sent to the Ask and Product Truth APIs, split in two: real natural language questions someone typed, and technical parameter calls (a slug, a GTIN, a trace=1 debug flag) that are plumbing, not curiosity.
Profanity is masked, never removed. Public APIs attract people who type a rude word repeatedly hoping to see it surface on a leaderboard. So any potential profanity is automatically asterisked on display (first letter kept), but the underlying row is never deleted from the dataset. A scientist masks a sample for presentation; he does not throw it away. No profanity was detected in the current window.
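Masking for display while leaving the stored row untouched can be sketched as follows. The blocklist here is a harmless stand-in and the function names are hypothetical; the page does not publish the real word list or matching rules.

```python
import re

# Hypothetical blocklist with mild stand-in words, for illustration only.
BLOCKLIST = {"darn", "heck"}


def mask_word(word: str) -> str:
    """Keep the first letter and replace the rest with asterisks."""
    return word[0] + "*" * (len(word) - 1)


def mask_for_display(text: str, blocklist=BLOCKLIST) -> str:
    """Return a masked copy of `text`; the input string is never modified."""

    def repl(match):
        word = match.group(0)
        return mask_word(word) if word.lower() in blocklist else word

    return re.sub(r"[A-Za-z']+", repl, text)
```

Because the function returns a new string, the dataset row keeps the original text and only the rendered view carries the asterisks.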
Natural language questions · 21 distinct
Read honestly. The bulk are our own pre launch test battery, caffeine comparisons, evening tea intent, pairings. They prove the answering spine resolves real questions end to end; they are not yet outside demand.
Question asked · Route · Times
q=Is Yorkshire Tea more caffeinated than PG Tips?&limit=5 · ask · 4
q=strong morning tea with milk · product truth · 4
q=Is Yorkshire Tea more caffeinated than PG Tips? · ask · 3
q=caffeine free fruit tea for evening drinking&limit=5 · ask · 3
q=caffeine free fruit tea for evening drinking · product truth · 3
q=Is Yorkshire Tea more caffeinated than PG Tips?&limit=4 · ask · 2
q=Is this tea good with milk?&product_id=636&limit=4 · ask · 2
q=What is the best tea for a proper builders brew?&limit=4 · ask · 2
q=Which Twinings tea feels most like a French tea?&limit=4 · ask · 2
q=Which Twinings tea feels most like a French tea?&limit=5 · ask · 2
q=Which is lighter for the afternoon, Earl Grey or English Breakfast?&limit=4 · ask · 2
q=Which is lighter for the afternoon, Earl Grey or English Breakfast?&limit=5 · ask · 2
q=Which tea goes best with digestives?&limit=4 · ask · 2
q=Which tea goes best with digestives?&limit=5 · ask · 2
q=black tea no caffeine&limit=5 · ask · 2
q=caffeine free fruit tea for evening drinking&limit=1 · ask · 2
q=caffeine free fruit tea for evening drinking&limit=3&trace=1 · ask · 2
q=caffeine free fruit tea for evening drinking&limit=4 · ask · 2
q=cold brew tea for a hot day&limit=3&trace=1 · ask · 2
q=green tea but not bagged&limit=5 · ask · 2
q=loose leaf black tea no flavouring&limit=5 · ask · 2
Technical parameter calls, what they mean · 9 distinct
Not questions. slug= and gtin= fetch one exact product; trace=1 asks the API to return its full evidence trace for verification; url= resolves a page to its product. How a machine checks the plumbing, shown for completeness, not curiosity.
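The split between typed questions and plumbing calls can be sketched by looking at which parameters a query string carries. This is a simplified illustration, assuming a call counts as a question whenever it has a `q` parameter and as technical when it carries only `slug`, `gtin`, `url` or `trace`; the real rule is not published here and `classify_call` is a hypothetical name.

```python
from urllib.parse import parse_qs

# Parameters the page describes as plumbing rather than curiosity.
TECHNICAL_KEYS = {"slug", "gtin", "url", "trace"}


def classify_call(query: str) -> str:
    """Label a raw query string as 'question', 'technical' or 'other'."""
    params = parse_qs(query)
    if "q" in params:
        # A typed question, even if it also carries limit= or trace=1.
        return "question"
    if any(key in params for key in TECHNICAL_KEYS):
        return "technical"
    return "other"
```

So `q=strong morning tea with milk` is counted as a question, while a bare `slug=` or `gtin=` lookup lands in the technical table.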
The route authority view: not just that endpoints exist, but what each is for. 47 machine entry points are published, start points, REST routes, surface maps, graph, commerce, images, citation, governance and forensics. Machine start: /llms.txt.
The route authority, 47 doors · 31 shown
Class · Path · Surface · What it is for
START · /llms.txt · LLM index · Shortest governed route in, citation, attribution and canonical source rules.
START · /agents.json · Agent manifest · Public read/recommend capabilities, commerce boundaries and citation rules.
START · /ai-entry.json · AI entry point · Trust, metric, route, claim and graph links in one JSON surface.
The instrument does not wait to be asked. 13 scheduled jobs keep the record current, rebuild the machine doors, re check the estate against itself and seal each day. This is the validation spine of the study: not just what we record, but the machinery that keeps it true, and, below, the day by day log of what it actually captured.
13 · Scheduled jobs · across 4 duty classes
5 min · Fastest cadence · scheduler heartbeat
4,967 · Image URLs re checked · all returned 200
6 h · Truth spine audit · full estate re walk
The daily instrument log, last 7 days · 7 days sealed
What the instrument logged each day, drawn from the same sealed snapshots: machine hits, images pulled and bytes moved, citation events, and the day’s lead agent. This is the honest, granular record, visible because the whole point of a truth spine is that people can check it.
Date · Machine hits · Images pulled · GB moved · Citations · Lead agent
18 Jun 2026 · 0 · 0 · 0 · 25 · n/a
17 Jun 2026 · 790 · 182 · 4.37 · 6 · Generic crawler
16 Jun 2026 · 3,742 · 741 · 2.77 · 2 · Generic crawler
15 Jun 2026 · 397 · 397 · 0.02 · 0 · ClaudeBot
14 Jun 2026 · 399 · 399 · 0.01 · 0 · ClaudeBot
13 Jun 2026 · 679 · 679 · 0.02 · 0 · ClaudeBot
12 Jun 2026 · 611 · 613 · 0.02 · 0 · Generic crawler
Heartbeat · 3 jobs
Jobs that keep the scheduler and public health caches alive.
WordPress cron driver · every 5 min
Runs: Calls wp-cron.php so scheduled jobs fire without waiting for a visitor page load.
Validates: That the scheduler itself is alive, the heartbeat every other job depends on.
Latest: Active; five minute runner confirmed in server crontab 18 Jun 2026.
Microsoft Clarity collection · every 5 min
Runs: Flushes pending Clarity batches that feed the bounded AI citation dataset and Share of Authority figure.
Validates: That citation share evidence keeps accumulating rather than silently stalling.
Latest: Hook registered; batches flushing into the 21.74% Share of Authority window.
AI status refresh · hourly
Runs: Rebuilds the public AI surface health cache so assistants see which machine doors are currently healthy.
Validates: That every advertised machine door still answers before a bot relies on it.
Latest: Cache rebuilt 19 Jun 2026 15:21:12 UTC.
Integrity audits · 4 jobs
Checks that validate the source of truth is wired correctly, and report their own faults plainly.
Truth spine estate audit · every 6 hours
Runs: Walks route ownership, registry state, static shadow risk, the image index, render receivers, Product Truth, buy card truth and HTTP machine entry points.
Validates: That the source of truth plumbing is still wired correctly end to end, that no route has quietly started serving a stale or shadow copy.
Latest: Latest 19 Jun 2026 21:35:53 UTC, status OK: 0 errors, 0 warnings. Shown honestly: the spine flags its own faults rather than hiding them.
Image authority reconciliation · on audit run
Runs: Re checks every governed image authority URL for a live 200 and hunts broken or missing media routes across the visual estate.
Validates: That the image graph machines are ingesting actually resolves, no dead canonical assets.
Latest: 12 Jun 2026 pack: 4,967 authority URLs checked, 4,967 returned HTTP 200, 0 genuine failures.
Citation anchor rebuild · daily
Runs: Rebuilds the citation anchor index so quotable page sections and exact deep links stay current.
Validates: That every one of the 26,284 citation anchors still points where it claims to.
Latest: Next due 20 Jun 2026 03:17:03 UTC.
Launch validator · on deploy
Runs: Runs the current validation state and deploy checks before any surface is considered shippable.
Validates: That a deploy has not broken a machine facing contract.
Latest: State exposed at /launch validator.json.
Snapshot capture3 jobs
Jobs that seal each day into the hash chained record.
Under the hood daily snapshotdaily after UK day close
Runs
Closes the completed UK day and writes aggregate sessions, human/AI split, citation events and study counts into the public daily archive.
Validates
That each day is sealed into the hash chained record exactly once and cannot be quietly re written later.
Latest
Snapshot taken 18 Jun 2026, 22:01, for 17 Jun 2026: sessions 36,358, citation events 116, AI attributed sessions 0.
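The "sealed exactly once, cannot be quietly re written" property is what a hash chain provides. The sketch below shows the general technique under stated assumptions: SHA-256, JSON serialisation and the names `seal_day`, `verify` and `row_hash` are illustrative choices, not the shop's actual schema.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first row


def _digest(day: str, payload: dict, prev_hash: str) -> str:
    """Hash one day's aggregates together with the previous row's hash."""
    body = json.dumps({"day": day, "payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def seal_day(chain: list, day: str, payload: dict) -> dict:
    """Append one day to the chain, refusing to seal the same day twice."""
    if any(row["day"] == day for row in chain):
        raise ValueError(f"{day} is already sealed")
    prev_hash = chain[-1]["row_hash"] if chain else GENESIS
    row = {"day": day, "payload": payload, "prev": prev_hash,
           "row_hash": _digest(day, payload, prev_hash)}
    chain.append(row)
    return row


def verify(chain: list) -> bool:
    """Recompute every hash in order; any edited row breaks the chain."""
    prev_hash = GENESIS
    for row in chain:
        if row["prev"] != prev_hash:
            return False
        if _digest(row["day"], row["payload"], prev_hash) != row["row_hash"]:
            return False
        prev_hash = row["row_hash"]
    return True
```

Because each row's hash covers the previous row's hash, changing any sealed day means every later hash has to change too, so a reader holding an earlier copy of the chain can detect a rewrite.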
Daily maintenance bundle · daily 00:05 UTC
Runs
Runs the Teas daily jobs: subscription tick, Tea of the Day, winback, gift card maintenance and intelligence retention.
Validates
That the live commerce machinery and the retention policy both run on schedule.
Prunes retained telemetry by policy while keeping public aggregate snapshots and study rows intact.
Validates
That raw private telemetry ages out on schedule while the public record is preserved.
Latest
Policy controlled hook.
Discovery & freshness · 3 jobs
Rebuilds that keep the public discovery surface matching the estate.
Sitemap deploy · daily 03:27 UTC
Runs
Runs the sitemap deployment hook so public discovery files stay current for crawlers and assistants.
Validates
That the discovery surface a crawler reads matches the estate as it actually is today.
Latest
teas_deploy_daily_sitemap listed as a daily job.
Agent index build · daily 03:47 UTC
Runs
Rebuilds the agent index so machine facing route descriptions stay aligned with the current public surface.
Validates
That every advertised agent route still describes a real, current capability.
Latest
Agent index build listed as a daily job.
Cache prewarm · daily
Runs
Pre renders hot public URLs so source of truth pages stay fast without changing the evidence they carry.
Validates
That speed never comes at the cost of a stale fact.
Latest
Next due 20 Jun 2026 03:17:03 UTC.
C2
Truth spine audit
the page checking its own plumbing
Before asking anyone to trust the numbers, the page checks its own plumbing on a timer: route ownership, registry state, static shadow risk, the image index, render receivers, Product Truth and HTTP entry points. The figures are only useful if the routes carrying them are still wired correctly, so the audit runs every six hours and keeps its own history.
12 routes verified clean · serving from source
0 open errors · clear
0 warnings · clear
Latest audit 19 Jun 2026, 21:35 UTC: status OK, 0 errors, 0 warnings. Current run is clean; the spine reports no open faults.
Is it resolved? The finding and fix history · cadence & resolution
Every finding has a lifecycle, not just a status. The most important one to date: on 17 June the citation mint gate gap was caught by this very audit, fixed the same day, committed, and confirmed closed by the 18 June calibration run. That is the whole loop working: the instrument found its own fault and proved the fix. So the open items below are read against time: when they appeared, what was done, and whether they are still live today.
| Finding | When | What it was | Action | Status now |
| --- | --- | --- | --- | --- |
| Citation gate gap | 17 Jun 2026 | Mint fired before full verification on one path | Patch committed ebaae72 | Resolved, confirmed closed by 18 Jun calibration |
| Route ownership re walk | Every 6 h | Continuous | n/a | 12 surfaces HEALTHY at latest run |
| Image authority reconcile | Latest run | 4,967 URLs checked | n/a | All returned 200, no drift |
Audit cadence: full estate re walk every 6 hours; the table above is the public safe resolution trail, not the raw private audit log.
Route ownership checks · 12 surfaces
| Surface | Route ownership |
| --- | --- |
| agents.json | HEALTHY |
| agents.md | HEALTHY |
| ai plugin.json | HEALTHY |
| ai entry.json | HEALTHY |
| truth contract.json | HEALTHY |
| agent policy.json | HEALTHY |
| metric definitions.json | HEALTHY |
| claim policy.json | HEALTHY |
| route policy.json | HEALTHY |
| launch validator.json | HEALTHY |
| teas ai registry.json | HEALTHY |
| llms.txt | HEALTHY |
Public safe summary only; private audit paths are not exposed.
C3
Field notes, the observation log
Recorded numbers are evidence; these are what we read in them. Each note is a real observation with its own report: what we saw, the evidence, how it was measured, and what it means. We write the inconvenient findings up in full too; an instrument that only files good news is not an instrument.
01
Arrival
The estate went legible in a single week
Four near silent weeks, then fourteen named machines arrived together inside week 24.
confirmed
What we observed
For four weeks the only machine reaching the estate was a single generic crawler. Then, inside one calendar week, fourteen distinct named agents appeared (ClaudeBot, GPTBot, OAI-SearchBot, Perplexity, Meta, Googlebot, Bingbot and more), generating 6,739 hits where the prior weeks had almost none.
The evidence
1 agent in weeks 20 to 23
14 agents in week 24
6,739 week 24 machine hits
How we measured it
Counted distinct known_agent values per ISO week from the observatory actor rollups, machine actor kind only, and compared week over week hit volume.
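The count described above can be sketched as follows. The row layout (day, actor kind, `known_agent`, hits) and the function name `agents_per_iso_week` are assumptions for illustration; the real rollup schema is not published here.

```python
from collections import defaultdict
from datetime import date


def agents_per_iso_week(rows):
    """rows: iterable of (day, actor_kind, known_agent, hits) tuples.

    Returns {(iso_year, iso_week): {"agents": n_distinct, "hits": total}}
    for machine actors only.
    """
    agents = defaultdict(set)
    hits = defaultdict(int)
    for day, actor_kind, known_agent, n in rows:
        if actor_kind != "machine":
            continue  # humans are excluded from this measurement
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        agents[week].add(known_agent)
        hits[week] += n
    return {week: {"agents": len(agents[week]), "hits": hits[week]}
            for week in sorted(agents)}
```

Keying on the ISO year as well as the week number avoids merging week 1 of one year with week 1 of the next once the record spans a year boundary.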
What it means
This is not the shape of gradual organic discovery, which ramps. It is a threshold being crossed: something made the estate addressable (the machine surfaces going live), and the answer engine crawlers found it almost simultaneously, as if they share discovery infrastructure. The single most important fact on this page: the doors were found.
02
Behaviour
Every machine has a diet, and they do not overlap
AI assistants eat images, the generic crawler eats text/API, ChatGPT-User eats citations.
confirmed
What we observed
Splitting every machine’s traffic by source surface shows three non overlapping appetites. ClaudeBot, GPTBot, Bing and OAI are almost pure image ingestion. The generic crawler is almost pure text and API. ChatGPT-User is almost pure citation follow back.
The evidence
1,485 ClaudeBot image fetches
5,080 generic text/API hits
59 ChatGPT-User citations
How we measured it
Aggregated hits by source_table (endpoint knocks = text/API, asset knocks = images, citation fetches = citations) per agent in the actor rollups, then classified each agent by its dominant surface.
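A dominant surface classification along these lines might look like the sketch below. The underscored table identifiers in `SURFACE_BY_TABLE` and the function name `dominant_surface` are assumed spellings for the example; only the three surface categories come from the note.

```python
from collections import Counter, defaultdict

# Assumed identifiers for the three source tables named in the note.
SURFACE_BY_TABLE = {
    "endpoint_knocks": "text/API",
    "asset_knocks": "images",
    "citation_fetches": "citations",
}


def dominant_surface(rollups):
    """rollups: iterable of (agent, source_table, hits).

    Returns {agent: (surface, share)} where share is the fraction of that
    agent's hits that landed on its most used surface.
    """
    per_agent = defaultdict(Counter)
    for agent, source_table, hits in rollups:
        per_agent[agent][SURFACE_BY_TABLE[source_table]] += hits
    result = {}
    for agent, counts in per_agent.items():
        surface, top = counts.most_common(1)[0]
        result[agent] = (surface, top / sum(counts.values()))
    return result
```

Reporting the share alongside the label makes "almost pure" checkable: an agent at 0.99 on images is a different animal from one at 0.55.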
What it means
Different machines are doing different jobs, and the estate is built to feed all of them from one source. An answer engine building a product model wants images and structured facts; a crawler wants the text graph; an assistant acting for a person wants to verify a citation. Serving one honest source in every shape is exactly why all three can be satisfied at once.
03
Ingestion
AI assistants now out read the generic crawler
By distinct images pulled, the named AI bots have overtaken the generic crawler, a first.
confirmed
What we observed
Measured by distinct images rather than raw hits, AI assistants pulled 2,852 unique product images to the generic crawler’s 1,218. The AI bots fetch each image almost exactly once, a near perfect 1.0× ratio, indicating deduplicated, whole catalogue ingestion.
The evidence
2,852 distinct images, AI
1,218 distinct images, generic
~1.0× AI fetch ratio
How we measured it
Counted COUNT(DISTINCT normalized_path) per UA class over the asset log, and computed fetch ratio as total fetches over distinct images per agent.
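The query can be illustrated against an in-memory SQLite table. The table name `asset_log`, the column `ua_class` and the wrapper function `coverage_by_class` are assumptions; `normalized_path` is the column named in the note.

```python
import sqlite3


def coverage_by_class(rows):
    """rows: iterable of (ua_class, normalized_path) pairs, one per fetch.

    Returns {ua_class: (distinct_images, total_fetches, fetch_ratio)}.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE asset_log (ua_class TEXT, normalized_path TEXT)")
    con.executemany("INSERT INTO asset_log VALUES (?, ?)", rows)
    query = """
        SELECT ua_class,
               COUNT(DISTINCT normalized_path) AS distinct_images,
               COUNT(*) AS fetches
        FROM asset_log
        GROUP BY ua_class
    """
    out = {ua: (distinct, fetches, fetches / distinct)
           for ua, distinct, fetches in con.execute(query)}
    con.close()
    return out
```

A ratio near 1.0 means each image was fetched about once (deduplicated ingestion); a ratio well above 1.0 means the same images were requested repeatedly, which is the sampling pattern.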
What it means
This is the first time in the record that answer engine ingestion has exceeded generic crawling on catalogue coverage. The crawler still makes more requests, but it samples; the assistants ingest systematically. That is the behavioural signature of models building an internal representation of the shop, the precondition for being recommended later.
04
Security
Under scanner pressure, and holding
One hashed source rattled 496 paths and was blocked 328 times and throttled 54 more.
flag
What we observed
A single fingerprint swept 496 distinct paths across 2,827 hits (user enumeration, the order API, code injection plugins and AI plugin endpoints), and the estate met it with 328 auth blocks and 54 rate limits. The MCP endpoint alone was throttled 54 times.
The evidence
496 distinct paths probed
328 requests blocked
54 requests throttled
How we measured it
Grouped door knocks by ip_hash, counting distinct request paths and responses in the 401/403 (blocked) and 429 (throttled) classes. Sources are one way hashed.
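The grouping can be sketched as below. The salted SHA-256 in `hash_source`, the 16 character truncation and the tuple layout are assumptions chosen for the example; the note only states that sources are one way hashed and that 401/403 count as blocked and 429 as throttled.

```python
import hashlib
from collections import defaultdict

BLOCKED = {401, 403}
THROTTLED = {429}


def hash_source(ip: str, salt: bytes) -> str:
    """One way hash of a source address (salt and truncation are assumed)."""
    return hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()[:16]


def pressure_by_source(knocks):
    """knocks: iterable of (ip_hash, path, status) tuples, one per request."""
    stats = defaultdict(lambda: {"hits": 0, "paths": set(), "blocked": 0, "throttled": 0})
    for ip_hash, path, status in knocks:
        s = stats[ip_hash]
        s["hits"] += 1
        s["paths"].add(path)
        if status in BLOCKED:
            s["blocked"] += 1
        elif status in THROTTLED:
            s["throttled"] += 1
    return {h: {"hits": s["hits"], "distinct_paths": len(s["paths"]),
                "blocked": s["blocked"], "throttled": s["throttled"]}
            for h, s in stats.items()}
```

A source with many distinct paths and a high blocked count is the scanner signature; a source with few paths and mostly 200 responses is ordinary traffic.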
What it means
Public machine surfaces invite useful crawlers and opportunistic scanners in the same breath. The point is not that scanners showed up (they always do) but that the WAF and auth layer answered precisely and the sensitive routes never gave way. A source of truth is only trustworthy if its doors hold; here they did.
05
Citation
ChatGPT-User is the citation signal to watch
It barely touches images; it follows citations back to source on a person’s behalf.
watch
What we observed
Of 116 citation events in the window, ChatGPT-User accounts for the majority, almost entirely through citation follow back rather than crawling or image ingestion.
The evidence
59 citations followed
116 citation events total
86 led by ChatGPT-User
How we measured it
Joined the citations table by bot and cross checked against the actor rollup’s citation fetches surface to confirm the behaviour is follow back, not broad crawl.
What it means
ChatGPT-User is what an assistant looks like when it is acting for a real person mid conversation, going to check the source it is about to quote. That is precisely the traffic the citation contract exists to acknowledge, and the clearest pre launch sign that the estate is being used as a trusted reference, not just indexed.
06
Economics
Compression is the line between cheap and ruinous
The uncompressed image index moved roughly 205× the bytes of its gzip twin.
watch
What we observed
The same index content cost wildly different bandwidth by variant: the raw JSONL moved about 8.2 GB while its gzip twin moved about 0.04 GB for identical data (the figure the 205× multiplier implies; it shows as 0.0 GB at one decimal place), and the 8.22 GB single entity event rode entirely on the uncompressed file.
The evidence
8.2 GB uncompressed index
0.0 GB gzip index
205× byte multiplier
How we measured it
Summed bytes_sent over the asset log for the image_index bucket, split by file variant (.jsonl vs .jsonl.gz).
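The split by variant and the resulting multiplier can be sketched like this. The function names and the (path, bytes_sent) row shape are assumptions; the `.jsonl` and `.jsonl.gz` suffixes are the ones named in the note.

```python
def bytes_by_variant(rows):
    """rows: iterable of (path, bytes_sent) for the image_index bucket."""
    totals = {"raw": 0, "gzip": 0}
    for path, bytes_sent in rows:
        # Test the longer suffix first: ".jsonl.gz" does not end in ".jsonl",
        # but checking gzip first keeps the intent obvious.
        if path.endswith(".jsonl.gz"):
            totals["gzip"] += bytes_sent
        elif path.endswith(".jsonl"):
            totals["raw"] += bytes_sent
    return totals


def byte_multiplier(totals):
    """How many times more bytes the raw variant moved than the gzip one."""
    if totals["gzip"] == 0:
        return float("inf")
    return totals["raw"] / totals["gzip"]
```

Note that the multiplier compares bytes actually sent per variant, so it reflects both the compression ratio and how often each variant was requested.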
What it means
At pre launch volumes this is harmless. Once real answer engines pull this surface on a schedule, serving raw instead of compressed, or the whole file instead of a manifest plus delta, turns a trivial cost into a bandwidth problem. The fix is cheap and worth making before the doors open.
07
Data quality
Almost all machine traffic geolocates to one country
US egress crawlers resolving to the UK is odd, and worth resolving before publishing geo.
watch
What we observed
Machine traffic geolocates overwhelmingly to the United Kingdom, yet the large AI crawlers (GPTBot, ClaudeBot) normally originate from US data centre ranges.
The evidence
~99% machine hits UK
699 hits outside UK
GPTBot, ClaudeBot: normally US egress
How we measured it
Grouped machine door knocks by country and region; compared against the known egress behaviour of the named crawlers.
What it means
Two possibilities, both worth knowing: either the geo derivation is reading the wrong signal (and every country figure needs a caveat), or the large generic crawler bucket is genuinely UK originated infrastructure. Until resolved, no country claim on this page should be treated as settled. We flag it rather than smooth it over.
08
Data quality
The historical "human" traffic is mostly not human
One session ran to 6,955 pageviews; the bulk is desktop and synthetic.
flag
What we observed
Of tens of thousands of sessions tagged human, one ran to 6,955 pageviews over a fortnight of unbroken dwell, and 99% of all sessions are desktop, implausible for a consumer tea shop.
The evidence
6,955 pageviews, one session
99% sessions desktop
11.6% product images touched
How we measured it
Ranked sessions by pageviews and dwell; profiled device split. Sessions with monitor like continuous dwell are quarantined from the live leaderboard.
What it means
The historical session bulk is synthetic or instrument traffic, not customers. Recent genuine human sessions are tiny by comparison. None of this contaminates the machine analysis (which comes from separate logs), but the human classifier should fence these before launch so the live human number means something real.
The throughline. Before any of this was promoted, named AI systems found the estate, ingested its product images systematically, returned across days, and followed citations back to source, and the scanners that came with them were held off. The shop is open to human customers today; what has not started yet is the token side of the study, which begins about four weeks after V2 deploys. The precondition the experiment rests on is already in the record: the shop is legible to machines, it tells them the same truth it tells people, and it can defend itself.
C4
Daily timeline & calibration
the build window, labelled
Five token events were recorded across 17 to 18 June during the build. None are study data; all carry verification_test_traffic=1. They are documented in full because the record is hash chained and we do not delete data: a scientist labels calibration runs rather than removing them.
17 June: one unintentional misfire. A citation token minted because the follow back path skipped the mint gate (row 16, agent codex-http-citation-probe). The gap was identified, fixed and committed (ebaae72).
18 June: four deliberate calibration tests. All four token types were test fired to confirm the mint gate now blocks on every path: sale and counter blocked by control_window_active, Tucker granted one hop then correctly rejected on the second, citation follow back blocked. All passed.
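The lesson of the 17 June misfire is that a gate only works if every path goes through it. The sketch below shows one way to express that as a single decision function. It is hypothetical throughout: `GateState`, `mint_decision` and the reason strings `mint_inactive` and `unverified` are invented for illustration, and only `control_window_active` and `mint_active` are names taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateState:
    mint_active: bool            # False until the study activates
    control_window_active: bool  # True during the announced control phase


def mint_decision(state: GateState, verified: bool) -> tuple:
    """Return (allowed, reason). Every mint path calls this before minting."""
    if state.control_window_active:
        return (False, "control_window_active")
    if not state.mint_active:
        return (False, "mint_inactive")
    if not verified:
        return (False, "unverified")
    return (True, "ok")
```

Returning a reason alongside the verdict means a blocked calibration fire leaves a labelled row in the record rather than a silent refusal.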
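| Date | Phase | API calls | Cites served | Citation tokens | AI attributed | Chargebacks | Row hash |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 17 Jun 2026 | announced_control | 414 | 6 | 1 | 0 | 0 | a8e9c5eacf17dfa8 |
| 18 Jun 2026 | announced_control | 179 | 0 | 0 | 0 | 0 | fb3596e12734cee6 |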
The study has not started. T0 has not fired. mint_active=false, activation 22 Jul 2026. Everything before that date is build and calibration; only what falls after the line counts as study data.