tickfoundry · pre-launch · v0.2
collectors online · capture running
◉ INCOMING · Q3 2026
for traders · quants · researchers · funds · market-structure teams

Backtest prediction-market strategies on full historical tick data.

Research-grade Polymarket data for traders, quants, and researchers. Raw websocket capture, normalized L1 and L2 order books, executed trades, and reference data — built for backtesting, replay, and market microstructure research.

Lock in 25% off for life. 49 of 50 founder seats remaining.
view pricing → · see the data →
ticks captured so far · live: 5,751,320,246
raw archive: 786 GB (compressed · since 2026-05-11)
markets in universe: 55,923 (open + resolved)
  • capture live: recording the full universe right now
  • redundant collectors: multi-region, multi-IP, no rate-limit ceiling
  • raw + L1 + L2 + trades: every layer, every market-day
  • research-grade archives: manifest-tracked, hash-signed, immutable
  • settled-market workflows: resolved outcomes joined to every event
§ why this exists

Why not just use the public API or sniff the websocket yourself?

You can — and people do. Then they spend months on uptime, storage, normalization, and outcome joins instead of strategy. Reliable backtests need complete capture, normalized layers, and outcome-aware joins by default. tickfoundry removes that infrastructure burden so you can spend your time on the model, not the plumbing.

historical completeness
  • DIY / public API: Public endpoints serve a thin tail of recent data, incomplete on resolved markets and silent on the rest.
  • tickfoundry: Full archived capture from day one, every frame from every market, joined to settled outcomes.
reliability
  • DIY / public API: You manage the collector. Rate limits, dropped connections, gaps, restarts, and you find out after the fact.
  • tickfoundry: Redundant collectors across regions and egress IPs. Auto-reconnect, gap detection, manifest audit on every shard.
data prep
  • DIY / public API: Raw websocket frames only: fragmented across event types, no order-book state machine, no trades atom.
  • tickfoundry: Raw plus normalized L1 / L2 / trades as parquet, dedup contract, reference data joined in.
research readiness
  • DIY / public API: You write the joins: token ↔ market ↔ event ↔ outcome ↔ resolution. Then you debug them.
  • tickfoundry: Backtest-ready: resolved outcomes already joined to every event, every market, every series.
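
To make the join chain concrete, here is a minimal sketch of token → market → event → outcome using plain dictionaries. The table names, field names, and IDs are hypothetical illustrations, not the tickfoundry schema; the only market fact assumed is that a binary share settles at 1.0 if its side wins and 0.0 otherwise.

```python
# Hypothetical reference tables; the real field names may differ.
TOKENS = {
    "tok_yes": {"market_id": "m1", "side": "YES"},
    "tok_no": {"market_id": "m1", "side": "NO"},
}
MARKETS = {
    "m1": {"event_id": "e1", "winning_side": "YES",
           "resolved_at_ns": 1_714_522_500_000_000_000},
}
EVENTS = {"e1": {"series": "btc-15m"}}


def resolve(token_id: str) -> dict:
    """Follow token -> market -> event and attach the settled outcome."""
    token = TOKENS[token_id]
    market = MARKETS[token["market_id"]]
    event = EVENTS[market["event_id"]]
    return {
        "token_id": token_id,
        "series": event["series"],
        "resolved_at_ns": market["resolved_at_ns"],
        # Assumed binary payout: 1.0 if this token's side won, else 0.0.
        "payout": 1.0 if token["side"] == market["winning_side"] else 0.0,
    }


def settled_pnl(token_id: str, entry_price: float, size: float) -> float:
    """PnL of buying `size` shares at `entry_price` and holding to resolution."""
    return (resolve(token_id)["payout"] - entry_price) * size


print(resolve("tok_yes"))
print(settled_pnl("tok_yes", 0.62, 100))
```

With outcomes pre-joined, the `resolve` step becomes a column lookup rather than code you maintain.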
§ what you can do with it

Four jobs the data finishes on day one.

01 /

Backtest on resolved markets

Replay a strategy against settled outcomes: every series, every recurrence, both sides of every market. The 'did this work' answer in one query.

02 /

Replay order books, tick by tick

Walk the book around CPI prints, Fed decisions, debates, earnings. Reconstruct exactly what a trader would have seen at any moment, at L1 or L2 depth.

03 /

Study liquidity and price formation

Spread, depth, imbalance, and reprice latency through time. Microstructure research on a venue most people can only observe live.

04 /

Train models on historical behavior

Outcome-aware features for every settled event. Frame the prediction problem the way the market actually resolved it, not the way you guessed.

Ready to put the data to work?
join the waitlist → · view pricing →
§ worked example

Replay BTC 15M markets around CPI releases.

One series, three days, every CPI print in the window. Four steps from the moment you download the tarball to a backtest you can point at a settled outcome. Same shape works for Fed decisions, earnings, debates, or any event-anchored study you can name.

  1. pull
    One tarball, four datasets (raw, L1, L2, trades) for every recurrence of the BTC 15M series across the date window.
    > GET /atoms/series=btc-15m/date=2026-04-10..2026-04-12
  2. join
    Reference data joins every token back to its market, event, condition ID, and the timestamp the outcome resolved.
    > JOIN markets USING (token_id)  →  outcome_resolved_at
  3. window
    Anchor each replay on the CPI print timestamp; pull ±30 minutes of L2 depth around the event.
    > WHERE ts_recv BETWEEN cpi - 30m AND cpi + 30m
  4. study
    Walk the book tick by tick. Compute spread, depth imbalance, and reprice lag. Backtest a strategy against the settled outcome.
    > replay(l2) → features(spread, imbalance, lag) → pnl
Same recipe, different anchors: Fed decisions, earnings, debates, resolutions.
join the waitlist →
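
The window and study steps can be sketched in plain Python. This is an illustration only, not tickfoundry's replay engine: the update layout `(recv_ns, side, price, size)`, the convention that size 0 deletes a level, the anchor timestamp, and the sample values are all assumptions.

```python
from typing import Iterable, Iterator

NS_PER_MIN = 60 * 1_000_000_000


def in_window(ts_ns: int, anchor_ns: int, minutes: int = 30) -> bool:
    """True if ts_ns lies within +/- `minutes` of the anchor timestamp."""
    return abs(ts_ns - anchor_ns) <= minutes * NS_PER_MIN


def replay(updates: Iterable[tuple[int, str, float, float]],
           anchor_ns: int) -> Iterator[tuple[int, float, float]]:
    """Apply L2 level updates in time order and yield (ts, best_bid, best_ask)
    for each in-window update, once both sides of the book are populated."""
    book = {"bid": {}, "ask": {}}
    for ts_ns, side, price, size in sorted(updates):
        if size == 0:
            book[side].pop(price, None)   # assumed: size 0 removes the level
        else:
            book[side][price] = size      # assumed: size is the new level total
        if in_window(ts_ns, anchor_ns) and book["bid"] and book["ask"]:
            yield ts_ns, max(book["bid"]), min(book["ask"])


cpi_ns = 1_714_521_600_000_000_000  # hypothetical event anchor
updates = [
    (cpi_ns - 45 * NS_PER_MIN, "bid", 0.62, 100.0),  # before the window, still builds state
    (cpi_ns - 40 * NS_PER_MIN, "ask", 0.64, 100.0),
    (cpi_ns - 1 * NS_PER_MIN, "bid", 0.63, 50.0),
    (cpi_ns + 2 * NS_PER_MIN, "ask", 0.64, 0.0),     # ask level pulled
    (cpi_ns + 3 * NS_PER_MIN, "ask", 0.66, 80.0),
]
rows = list(replay(updates, cpi_ns))
for ts, bid, ask in rows:
    print(ts, bid, ask, round(ask - bid, 4))
```

Updates that arrive before the window are still applied so the book state is correct when the window opens; only in-window states are emitted.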
§ what you'll actually pull

A peek at the schema.

L1 · top of book + trades · .parquet
recv_ns               mkt            bid    ask    last
1714521608142003821   BTC-100k-2025  .624   .626   .624
1714521608388119044   TRUMP-2024     .517   .519   .518
1714521608412881290   FED-25BPS-MAR  .708   .712   .710
1714521608501277013   ETH-5k-EOY     .411   .414   .412
1714521608611442009   OSCAR-BEST     .891   .894   .892
L2 · top-N depth · .parquet
BTC-100k-2025 @ 14:32:08.142
─────────────────────────
ASK  .629  ░░░░░░░  1,200
ASK  .626  ▓▓▓░░░░  3,100
──── spread ─────────────
BID  .624  ▓▓▓▓▓░░  4,200  ← best
BID  .621  ▓▓░░░░░  1,850
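
The snapshot above is enough to compute the usual top-of-book features. A small sketch using the four levels shown; the imbalance formula (bid size minus ask size, over their sum) is one common definition and is an assumption here, not necessarily how tickfoundry defines it.

```python
# Levels copied from the sample snapshot above: (price, size).
bids = [(0.624, 4200), (0.621, 1850)]
asks = [(0.626, 3100), (0.629, 1200)]

best_bid = max(bids)   # highest bid price
best_ask = min(asks)   # lowest ask price

# Round to the three decimals shown in the sample to avoid float noise.
spread = round(best_ask[0] - best_bid[0], 3)
mid = round((best_ask[0] + best_bid[0]) / 2, 3)


def imbalance(bid_size: float, ask_size: float) -> float:
    """Signed depth imbalance in [-1, 1]; positive means more resting bids."""
    return (bid_size - ask_size) / (bid_size + ask_size)


top_imbalance = imbalance(best_bid[1], best_ask[1])
book_imbalance = imbalance(sum(s for _, s in bids), sum(s for _, s in asks))

print(f"spread={spread} mid={mid}")
print(f"top-of-book imbalance={top_imbalance:.3f}")
print(f"all-levels imbalance={book_imbalance:.3f}")
```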

Waitlist subscribers get a free 5 MB sample tarball — one series × one day, full schema — once we open the gates. Inspect the atoms before you spend a dollar. Need just one series for one day for a backtest? À la carte: $9 gets you all four datasets in one tarball, no subscription.

§ founder rate · 50 seats · first-come first-served

First 50 signups lock in 25% off for life.

Every signup inside the first 50 automatically qualifies for the founder rate — no invites, no waiting. At launch we email you payment details. Pay 25% off year one and the discounted rate locks for the life of your subscription — same tier, same scope, no expiry, no year-two price hike. After the 50th signup the founder window closes and new signups go on the standard waitlist at launch prices. Cancel later and the rate is forfeit.

seats claimed: 1/50 · 49 open
tier · basic
  • 7 days of history included on signup
  • L1 top-of-book .parquet
  • trades .parquet
  • reference data (markets, events, condition IDs)
  • after-midnight delivery (T+1)
  • shared SFTP + email support
tier · pro
  • 7 days of history included on signup
  • everything in Basic
  • L2 depth .parquet (top-N levels)
  • priority email support
  • subscription dashboard + download history
tier · enterprise
  • 7 days of history included on signup
  • everything in Pro
  • raw .jsonl.zst feed capture
  • intraday snapshots (negotiable cadence)
  • dedicated SFTP + slack support
  • bespoke retention, scope, SLAs
  • first-mover venue access (Kalshi, Hyperliquid)
§ pricing matrix · monthly

Pick your scope. Pick your tier.

Scope is per series — a market line like BTC Up/Down 15M or Fed rate decision. One series covers every recurrence (so BTC 15M = every 15-minute window, every day) and both sides of every market underneath it. Buy the whole line, not the individual windows.

series scope   Basic (founder / launch)   Pro (founder / launch)   Enterprise
1 series       $29/mo / $39/mo            $59/mo / $79/mo          Contact us
3 series       $74/mo / $99/mo            $149/mo / $199/mo        Contact us
all series     $750/mo / $1,000/mo        $1,500/mo / $2,000/mo    Contact us
Enterprise: bespoke scope · SLAs · raw feed.

First price = founder rate. You pay this amount every month for the life of your subscription: no year-two step-up, no expiry. Second price = standard launch rate, what everyone else pays from day one. The founder window closes once 50 signups are reached; new signups after that join the standard waitlist. Enterprise contracts are bespoke and scoped 1:1; the founder discount does not apply to Enterprise.
§ history bundle
Every subscription ships with the most recent 7 days of history on signup — across whichever series and tier you pick. Need older data? Email us with the series and date range — extended backfill is priced individually per series-day.
7d included
§ à la carte
One-off purchase, no subscription. Pick a series, pick a day, pay $9 per (series × day) bundle — every recurrence of that series for that day, both sides of every market underneath, all four datasets (raw, L1, L2, trades) in one tarball.
$9 / series-day
§ roadmap · public · updated as we ship

What's done, what's left.

[ ✓ ] Live capture · the full Polymarket universe
    Multiple redundant collectors recording every market and event, multi-region, multi-IP egress. No rate-limit ceiling. 5.7B+ ticks archived and growing.
[ ✓ ] Raw archive · immutable + audit-trailed
    Every inbound frame stored exactly as received: compressed, hash-signed, manifest-tracked. The system of record. Everything downstream is reproducible from it.
[ ✓ ] Reference data + observability
    Full Polymarket taxonomy in our database: any token ID maps back to a market, event, outcome, and lifecycle state. Auto-reconnect, gap detection, sequence-audit, per-shard freshness monitoring. Loud failures, no silent drops.
[ wip ] Normalized datasets · L1, L2, trades as .parquet
    Daily per-market files: top-of-book, orderbook depth, executed trades. The atoms you would actually query. In active development; backfill from raw on roll-out.
[ wip ] Customer delivery · catalog, checkout, downloads
    Storefront, Stripe checkout + subscriptions, entitled SFTP and presigned-URL downloads, sample-data preview. Subscription and à la carte from launch.
[ ⌖ ] Public launch + multi-venue expansion
    Polymarket: Q3 2026. Waitlist invites go out in order. Kalshi beta: Q1 2027. Hyperliquid: Q2 2027. PredictIt and Manifold to follow.
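
For readers wondering what a manifest check on a delivered shard could look like, here is a minimal sketch. It assumes a manifest that maps shard names to SHA-256 hex digests; the hash algorithm, manifest layout, and shard names are hypothetical, and verifying a signature on the manifest itself is out of scope.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a shard's bytes."""
    return hashlib.sha256(data).hexdigest()


def audit(manifest: dict[str, str], shards: dict[str, bytes]) -> list[str]:
    """Return names of shards that are missing or whose digest differs."""
    bad = []
    for name, expected in manifest.items():
        blob = shards.get(name)
        if blob is None or sha256_hex(blob) != expected:
            bad.append(name)
    return bad


# Hypothetical shard names and contents, for illustration only.
shards = {"raw-000.jsonl.zst": b"frame-1\nframe-2\n"}
manifest = {name: sha256_hex(blob) for name, blob in shards.items()}
manifest["raw-001.jsonl.zst"] = sha256_hex(b"never delivered")

print(audit(manifest, shards))
```

Here the second shard is listed in the manifest but absent from the delivery, so it is the one reported.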
§ common questions

What people ask before signing up.

When do you launch?

Q3 2026. The first 50 signups lock in the founder rate; we'll email you at launch with payment details.

How do founder seats actually work?

Just sign up — that's it. The first 50 signups automatically qualify for the founder rate (25% off year one, locked for the life of the subscription). No invites, no waiting in line. At launch we email you payment details. After the 50th signup the founder window closes and new signups join the standard waitlist at launch prices.

Is the founder rate really for life?

Yes — you pay the 25%-off price every month for the life of your subscription. Not just year one: month 13, month 60, the rate is the same. Same tier, same scope, no expiry, no year-two step-up to launch pricing. Cancel and the rate is forfeit; re-subscribing later puts you at launch prices.

What if I sign up after the 50th person?

You're on the standard waitlist for launch. Same data, same tiers — just at launch rates instead of the founder discount. If a founder cancels we may offer the seat to the next person on the list at our discretion. There are only 50 founder seats by design.

What do people actually use this data for?

Backtesting trading strategies against fully-resolved outcomes (you know who won), studying how a market repriced around a debate / poll / weather event / Fed print, training models on settled probabilities, building liquidity and microstructure features, replaying a book tick-by-tick to debug an execution algo, or just answering "what was the market doing the hour Trump tweeted X." Historical L2 + raw is rare for prediction markets — most third-party feeds only cache last-trade. That gap is the product.

How are you different from a free WS sniffer?

Four redundant collectors, multi-IP egress so Cloudflare rate-limits don't cap us, sequence-gap audit on every file, full backfill from day one, normalized columnar atoms, signed manifests, deterministic L2 replay. The plumbing is the product.
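
As a rough idea of what a sequence-gap audit checks, the sketch below reports missing ranges in a list of frame sequence numbers. It assumes each captured frame carries a monotonically increasing integer sequence number, which is an illustrative assumption rather than a description of tickfoundry's audit.

```python
def find_gaps(seqs: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) ranges between
    consecutive sequence numbers. Duplicates are ignored."""
    gaps = []
    ordered = sorted(set(seqs))
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


print(find_gaps([1, 2, 3, 7, 8, 8, 10]))  # [(4, 6), (9, 9)]
```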

What's the difference between the tiers?

Basic = L1 top-of-book + trades + reference data. Pro adds L2 depth (top-N levels). Enterprise adds the raw .jsonl.zst feed, intraday snapshots, dedicated SFTP, and bespoke retention/scope/SLAs — priced 1:1, contact info@tickfoundry.com. All tiers deliver after midnight UTC.

Will my email be sold or spammed?

No. The waitlist gets one launch email plus one short build update per month at most. One-click unsubscribe on every send.

What's a "series"? Why is pricing per series instead of per market?

A series is a market line — e.g. "BTC Up/Down 15M" or "Fed rate decision." One series can resolve to one market (a single named question that runs once) or to many recurring instances ("BTC 15M" spawns a new 15-minute up/down market every quarter hour, ~96 markets/day). Pricing the per-window instance would mean $864/day for BTC 15M, which isn't what anyone wants. Subscriptions and à la carte are both priced per series — you buy the whole line and get every recurrence, both sides of every market.
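
The arithmetic behind that figure, assuming the $9 à la carte price were charged per 15-minute window instead of per series-day:

```python
MINUTES_PER_DAY = 24 * 60
WINDOW_MINUTES = 15
A_LA_CARTE_USD = 9  # per (series x day), from the pricing section

windows_per_day = MINUTES_PER_DAY // WINDOW_MINUTES
per_window_cost = windows_per_day * A_LA_CARTE_USD

print(windows_per_day)   # 96 markets per day
print(per_window_cost)   # 864 dollars per day if priced per window
print(A_LA_CARTE_USD)    # 9 dollars per day when priced per series
```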

How is the data delivered?

SFTP at first — shared on Basic/Pro, dedicated on Enterprise. S3-presigned-URL download via a customer portal is coming alongside the storefront. Raw, L1, L2, and trades are separate files per (market, day), bundled into a tarball per (series, day) at delivery.

How much history do I get when I subscribe?

Every subscription ships with the most recent 7 days of history at signup, across whichever series and tier you pick — so you have something to query on day one, not "wait a week for the first nightly drop." Need older data (election day archives, last year's Fed meetings, anything pre-bundle)? Email info@tickfoundry.com with the series and date range — extended backfill is priced individually per series-day. À la carte at $9 is the simplest way to grab single days without subscribing.

Do I have to subscribe? Can I just buy one series-day?

Yes — à la carte at $9 per (series × day). You get every recurrence of that series for that day, both sides of every market underneath, and all four datasets (raw, L1, L2, trades) in a single tarball. Useful for one-off research, backtests, or "I just want to see this one Fed day." Subscriptions make sense once you cross ~10 series-days/month.

Why prediction markets — and what about other venues?

Prediction markets are the cleanest live laboratory for price discovery on real-world events — election odds, monetary policy, sports, weather, anything binary. We start with Polymarket today; Kalshi beta lands Q1 2027, Hyperliquid Q2 2027, PredictIt and Manifold to follow. Founders get free beta access during initial rollout.

Lock the founder rate.

49 of 50 founder seats still open · first-come first-served · 1 signup so far. 25% off, locked for the life of your subscription.