Each 5-minute BTC Up/Down binary market produces a timeseries.jsonl sampled at 1-second intervals. Each sample contains: Polymarket share price (YES token), Chainlink BTC oracle price, Binance Spot & Futures BTC price, ATR (rolling 14-candle), orderbook pressure ratios at ±1%/±2%/±5%/±10%/±20% depth ranges, and (for markets ≥ 2026-04-24) CVD, trade volume, whale score, and whale imbalance. Settlement is determined by the Chainlink oracle price at close vs. strike price from the market slug.
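The settlement rule can be sketched as below. This is a minimal illustration, not the project's code: the JSON field names (`t`, `yesPrice`, `chainlinkPrice`) are hypothetical, and treating a tie as UP is an assumption the source does not confirm.

```python
import json

def settle(close_chainlink: float, strike: float) -> str:
    """Return 'UP' if the oracle closes at or above strike, else 'DOWN'.
    The tie rule (>=) is an assumption; the source does not specify it."""
    return "UP" if close_chainlink >= strike else "DOWN"

# One hypothetical line of timeseries.jsonl; field names are illustrative only.
line = '{"t": 299, "yesPrice": 0.93, "chainlinkPrice": 64012.5}'
sample = json.loads(line)

# The strike would normally be parsed from the market slug; hard-coded here.
outcome = settle(sample["chainlinkPrice"], strike=64000.0)
```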
Fill model: All strategies use limit-chase fills — a trade is confirmed only when lastTradePrice moves at least 1 cent through the limit price. Touch-and-retreat does not fill. This matches realistic CLOB execution on Polymarket's binary prediction markets.
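A minimal sketch of the fill rule, assuming prices are held as integer cents and that `last_trades_c` is the per-second lastTradePrice series observed after the order is placed. The function name and signature are hypothetical.

```python
from typing import Optional, Sequence

def first_fill_index(side: str, limit_c: int, last_trades_c: Sequence[int]) -> Optional[int]:
    """Index of the first sample that trades at least 1 cent through the limit.
    Returns None when price only touches the limit (touch-and-retreat)."""
    for i, px in enumerate(last_trades_c):
        if side == "buy" and px <= limit_c - 1:
            return i
        if side == "sell" and px >= limit_c + 1:
            return i
    return None

# A buy at 7c is not filled by a print at 7c, only by a print at 6c or lower.
touch_only = first_fill_index("buy", 7, [9, 7, 8])
traded_through = first_fill_index("buy", 7, [9, 7, 6])
```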
ATR-expiry strategies are the most reliable directional signal, achieving ~95% win rates by following momentum when ATR aligns with the favorite token direction, entering only when TTE creates irreversible expiry pressure.
Bilateral arb (buy YES cheap + buy NO cheap) only fires in 4–7% of markets at extreme thresholds. When both legs fill, the profit is locked in (86–90 cents per pair, against a guaranteed 100-cent payout). Near-parity thresholds fill often but produce negative EV after single-leg losses.
Backtest shows positive spread capture, but live performance suffers from single-leg inventory exposure. When only one leg fills, the unpaired position holds through settlement at full binary risk — the dominant live failure mode.
Trade flow and whale data covers 437 of 970 markets (~45%). Strategies requiring these signals gracefully skip markets without data. Full evaluation requires continued data collection.
Each family is a class of strategies sharing a core signal. Within each family, a grid search exhausts all parameter combinations. Tier 1 = consistently profitable; Tier 2 = conditional edge; Tier 3 = emerging/data-limited; Tier X = structural loser.
Enters when ATR crosses a threshold AND the favorite token's price momentum aligns with direction, timed within a narrow TTE window. The core thesis: once a binary market commits to a direction near expiry, ATR confirms volatility has priced in the move. Cancels early entries; holds to expiry or TP.
Extends atrExpiry by additionally requiring the "favorite" token (>50c) to be on your side. Reduces fill rate but increases precision. Strong near-expiry when market consensus and volatility agree on direction.
Like atrFavoriteExpiry but exits at a fixed take-profit tick count rather than holding to expiry. Captures faster mean-reversion when a volatility spike is temporary. Best for markets that whipsaw before final settlement.
Bets against current momentum when ATR is extreme, anticipating snap-back. Works in choppy mid-market conditions; loses badly when markets trend directly to expiry. Requires careful ATR threshold tuning.
Enters only in the final minutes when price has moved significantly from open. Follows direction and holds to expiry. Late-market momentum is rarely reversed — remaining time provides insufficient opportunity for correction.
Same as lateMomentumHold but exits at a TP tick target. Useful when the late move overshoots "fair" probability and mean-reverts before settlement. Frequently outperforms hold variant in liquid markets.
Fades late-market moves, betting on snap-back. Profitable in thin markets where price overshoots, but dangerous when Chainlink confirms the move's direction at settlement. Win rate highly sensitive to snap-back threshold.
Looks for divergence between Chainlink BTC price and Polymarket share price. When the oracle says BTC is above strike but YES trades at <40c, there's a dislocation to exploit. Edge exists but fills are infrequent near proper dislocations.
Fades extreme price movements in the settlement window, betting on Chainlink reversion. Works when BTC makes a temporary spike or dip near settlement. Edge disappears in markets where Chainlink and Binance align cleanly.
Enters when bid/ask ratio at ±1–5% depth shows strong imbalance. Effective in thick markets but degrades in thin Polymarket orderbooks where a single large order temporarily dominates the sentiment ratio.
Follows OB pressure across a wider ±10–20% depth range. Aggregates more of the true liquidity picture. More stable than narrow-range OB pressure but slower to signal — may enter after the optimal moment.
Buys YES/NO when share price crosses a static threshold. Fundamentally flawed: a low price is often low for a reason. No dynamic signal; fills frequently in the wrong direction. Structural loser in trending binary markets.
Like thresholdTakeProfit but holds to expiry. Adds TTE as a filter but fails to address the root issue: static price thresholds have no predictive value in binary markets with continuous Chainlink settlement.
Follows direction when cumulative volume delta AND whale imbalance agree. Requires both tradeFlowReady and whaleFlowReady flags. Only applicable to ~437/970 markets with full CVD/whale data. Requires more data for conclusive evaluation.
Simpler variant using only the CVD ratio (net signed volume / total absolute volume). Enters when |CVD ratio| exceeds threshold after minimum market age. No whale confirmation required. Broader market coverage than whaleMomentum within CVD-enabled markets.
Combines multiple signal families via genetic algorithm optimization. Searches for parameter sets that maximize risk-adjusted return across the full dataset. Risk: overfitting to historical data. Serves as an upper-bound benchmark.
In 5-minute BTC binary markets, price at T-60s or less is a strong predictor of settlement outcome. The probability of reversal drops sharply as TTE approaches zero. Strategies entering in the final 60–90 seconds with trend alignment consistently achieve 90%+ win rates. Implication: enter late, not early.
When ATR exceeds a market-specific threshold AND aligns with price direction, the market has "committed" — remaining time is insufficient to absorb the volatility and reverse. ATR below threshold = choppy/undecided. ATR above threshold = directional conviction. This is the most statistically significant single signal in the dataset.
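One way to express the "committed" test is sketched below. Several choices are assumptions rather than the report's method: ATR is taken as a simple average of the last 14 true ranges (not Wilder smoothing), "alignment" is read as signed YES-price momentum agreeing with the favorite token, and all function names are hypothetical.

```python
from typing import List, Optional, Tuple

Candle = Tuple[float, float, float]  # (high, low, close)

def atr(candles: List[Candle], period: int = 14) -> Optional[float]:
    """Simple-average ATR over the last `period` true ranges; None if too few candles."""
    if len(candles) < period + 1:
        return None
    trs = []
    for (h, l, _), (_, _, prev_c) in zip(candles[1:], candles[:-1]):
        trs.append(max(h - l, abs(h - prev_c), abs(l - prev_c)))
    return sum(trs[-period:]) / period

def committed_side(atr_value: Optional[float], atr_threshold: float,
                   yes_price: float, momentum: float) -> Optional[str]:
    """'YES' or 'NO' when ATR is above threshold and momentum agrees with the favorite."""
    if atr_value is None or atr_value <= atr_threshold or yes_price == 0.5:
        return None  # choppy/undecided, or no favorite
    favorite = "YES" if yes_price > 0.5 else "NO"
    if favorite == "YES" and momentum > 0:
        return "YES"
    if favorite == "NO" and momentum < 0:
        return "NO"
    return None
```

The threshold is described as market-specific, so `atr_threshold` is passed in rather than fixed.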
Buying YES because it's "cheap" (below 30c) has negative expected value. Low prices reflect accurate market consensus, not mispricing. thresholdTakeProfit and thresholdExpiryHold families confirm this empirically — structural losers across all parameter settings.
OB bid/ask imbalance at ±1–2% depth leads price movement by approximately 15–30 seconds in liquid 5m markets. Useful early signal when combined with ATR confirmation, but unreliable in isolation due to Polymarket's thin orderbook structure (single large orders dominate imbalance ratios).
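The depth-range imbalance can be sketched as follows. The normalization `(bid − ask) / (bid + ask)` is an assumption (the report only says "pressure ratio"), and the book representation as `(price, size)` tuples is hypothetical.

```python
from typing import List, Tuple

Level = Tuple[float, float]  # (price, size)

def depth_imbalance(bids: List[Level], asks: List[Level], mid: float, pct: float) -> float:
    """Signed imbalance in [-1, 1] using only levels within ±pct of mid."""
    lo, hi = mid * (1 - pct), mid * (1 + pct)
    bid_sz = sum(sz for px, sz in bids if lo <= px <= mid)
    ask_sz = sum(sz for px, sz in asks if mid <= px <= hi)
    total = bid_sz + ask_sz
    return 0.0 if total == 0 else (bid_sz - ask_sz) / total

bids = [(0.50, 100), (0.49, 300)]
asks = [(0.51, 100), (0.52, 100)]
narrow = depth_imbalance(bids, asks, mid=0.505, pct=0.02)
wide = depth_imbalance(bids, asks, mid=0.505, pct=0.05)

# A single 1000-share bid at the touch swamps the narrow-range reading.
skewed = depth_imbalance([(0.50, 1100), (0.49, 300)], asks, mid=0.505, pct=0.02)
```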
In any 5m market, the probability that YES trades at ≤7c AND NO trades at ≤7c (i.e. YES ≥93c) is ~6.5%. When both legs fill, the 86c spread is locked in regardless of settlement outcome. However, single-leg exposure (the remaining ~93% of markets) creates directional binary risk, which is the dominant P&L driver in spread harvest strategies.
Counter-intuitively, the highest EV combinations are at extreme thresholds (buy YES at ≤7c / buy NO when YES ≥93c, EV=1.28¢/market), not near parity (buy YES at ≤49c / buy NO when YES ≥51c, EV=−9.05¢/market). The 78% fill rate at parity cannot compensate for the near-zero 2c spread and massive single-leg losses when one side doesn't fill.
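The spread figures follow from simple pair-cost arithmetic, sketched here under the assumption that each leg fills exactly at its threshold and that one YES plus one NO share always pays 100 cents. The function name is hypothetical.

```python
def pair_profit_cents(low: int, high: int) -> int:
    """Locked-in profit when both legs fill.
    YES is bought at `low`; NO is bought when YES >= `high`, i.e. at 100 - high."""
    pair_cost = low + (100 - high)
    return 100 - pair_cost

extreme = pair_profit_cents(7, 93)   # pair cost 14c
parity = pair_profit_cents(49, 51)   # pair cost 98c
```

With so little locked-in profit at parity, a single unpaired loss can erase many paired wins, which is consistent with the negative EV reported there.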
CVD and whale signals need minimum market age (60–120 seconds) to stabilize. Volume in the first 60 seconds is dominated by market-open noise — market makers placing initial quotes, bots reacting to Chainlink price initialization. Signal quality improves measurably after the first 90 seconds.
When YinYang's YES leg fills but NO leg does not, the system holds a directional position at full binary risk. If the hedge arrives late at a worse price, it may cost more than the spread earned. Current paper results show inventory imbalance is the primary driver of losses — not the pair cost model itself.
| # | Strategy | Family | Win Rate | Fill Rate | Trades | PnL % |
|---|---|---|---|---|---|---|
Monte Carlo sweep across all cent-step combinations (high=51–99, low=1–49, step=2). For each combination: buy YES when price ≤ low cents AND buy NO when YES ≥ high cents. If both fill in the same market, the pair cost is low + (100 − high) cents, which is always below the 100-cent payout. If only one leg fills, hold to settlement at full binary risk.
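A sketch of how such a sweep could be evaluated, under simplifying assumptions: each market is reduced to its lowest and highest fill-confirmed YES price plus the settlement side, each leg fills exactly at its threshold, and size is one share per leg. The tuple layout and function names are hypothetical.

```python
from itertools import product

def market_pnl_cents(low, high, min_yes, max_yes, settled_up):
    """PnL in cents for one market, one share per leg."""
    pnl = 0
    if min_yes <= low:    # YES leg filled at `low`
        pnl += (100 if settled_up else 0) - low
    if max_yes >= high:   # NO leg filled at 100 - high
        pnl += (0 if settled_up else 100) - (100 - high)
    return pnl

def sweep(markets, lows=range(1, 50, 2), highs=range(51, 100, 2)):
    """Mean PnL per market (EV in cents) for every (low, high) combination."""
    return {
        (lo, hi): sum(market_pnl_cents(lo, hi, *m) for m in markets) / len(markets)
        for lo, hi in product(lows, highs)
    }

# Toy data only: (min_yes, max_yes, settled_up) per market.
markets = [(5, 95, True), (5, 60, False), (40, 97, True)]
ev = sweep(markets)
```

In the toy data the first market pairs both legs, while the other two are single-leg fills that settle against the position.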
| # | Combination | Spread | Both Fill Rate | EV / Market | Total PnL | Both Fills |
|---|---|---|---|---|---|---|
Each cell = expected value per market (cents). Green = positive EV. Red = negative.
Wide-spread combos (7c/93c) have the best EV per market but fire in only 6.5% of markets — over 93% of the time you hold a single-leg position at full binary risk. Analysis of single-leg settlements shows YES-only fills slightly favor settlement in the UP direction when YES is at extreme lows, and NO-only fills favor DOWN — suggesting extreme thresholds correlate with correct directional bets, explaining positive EV at extremes.
| # | Strategy | Win Rate | Fill Rate | Trades | PnL % |
|---|---|---|---|---|---|
YinYang shows positive PnL in backtest but struggles in live paper trading. The key difference: in backtest, fill confirmation is symmetric. In live markets, one leg often fills (e.g., YES at 7c when BTC dips) but YES never climbs back to 93c in the same market, so the NO leg never fills. The system holds a directional YES position, winning if BTC recovers above strike but losing if it stays below. The same BTC move that triggered YES entry often prevents NO from filling: correlated leg failure.
cvdRatio = tradeCvd / tradeTotalAbsVolume. Range: −1.0 (all sells) to +1.0 (all buys). Threshold typically ±0.15 to ±0.50. Positive ratio = net buy pressure = follow UP. Computed since market open, stabilizes after ~90 seconds.
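The ratio and its threshold test can be sketched as below. The 0.15 threshold and 90-second minimum age are taken from the ranges quoted in this report; the function names and the handling of zero volume are assumptions.

```python
from typing import Optional

def cvd_ratio(trade_cvd: float, trade_total_abs_volume: float) -> Optional[float]:
    """tradeCvd / tradeTotalAbsVolume, in [-1, 1]; None when no volume has traded."""
    if trade_total_abs_volume <= 0:
        return None
    return trade_cvd / trade_total_abs_volume

def cvd_direction(ratio: Optional[float], market_age_s: float,
                  threshold: float = 0.15, min_age_s: float = 90) -> Optional[str]:
    """'UP' / 'DOWN' once the market is old enough and |ratio| clears the threshold."""
    if ratio is None or market_age_s < min_age_s:
        return None
    if ratio >= threshold:
        return "UP"
    if ratio <= -threshold:
        return "DOWN"
    return None
```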
whaleScore aggregates large-trade directional imbalance. Score > 0 = whale buying. The whaleMomentum family requires CVD and whale to agree on direction before entry — dual confirmation reduces false positives at the cost of lower fill rate.
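Dual confirmation might look like the following sketch. It assumes each sample is a dict carrying `tradeFlowReady`, `whaleFlowReady`, `cvdRatio`, and `whaleScore` under those exact keys, which the source does not confirm, and the function name is hypothetical.

```python
from typing import Optional

def whale_momentum_side(sample: dict, cvd_threshold: float = 0.15) -> Optional[str]:
    """'UP' or 'DOWN' only when CVD and whale score agree; None otherwise."""
    if not (sample.get("tradeFlowReady") and sample.get("whaleFlowReady")):
        return None  # market lacks CVD/whale data: skip rather than guess
    cvd, whale = sample["cvdRatio"], sample["whaleScore"]
    if cvd >= cvd_threshold and whale > 0:
        return "UP"
    if cvd <= -cvd_threshold and whale < 0:
        return "DOWN"
    return None  # signals disagree or CVD is inside the dead band
```

Returning None for markets without the ready flags mirrors the skip behaviour described for pre-2026-04-24 markets.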
CVD/whale fields are collected by the Data Collector (App A, port 3444). Markets before 2026-04-24 do not have these fields. To properly evaluate CVD/whale strategies: minimum 2–3 more weeks of data collection (target: 2000+ CVD-enabled markets), backtest segmentation by CVD coverage, and cross-validation to detect overfitting to the 437-market subset.
Entering a directional trade too early (T-300s+) exposes the position to the full volatility of the 5m window. Price can whipsaw 20–30c before settling. Early entries have poor fill quality and high stop-loss rate. Mitigation: Enforce minimum TTE window (e.g., only enter T-120s to T-30s).
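The TTE mitigation reduces to a simple gate, sketched here with the example window from the text (T-120s to T-30s). The `min_move` parameter and both function names are hypothetical, added only to show how the gate would wrap a late-momentum entry.

```python
from typing import Optional

def in_entry_window(tte_s: float, earliest_s: float = 120, latest_s: float = 30) -> bool:
    """True when time-to-expiry lies inside the allowed entry window."""
    return latest_s <= tte_s <= earliest_s

def late_entry_side(tte_s: float, yes_open: float, yes_now: float,
                    min_move: float = 0.15) -> Optional[str]:
    """Follow the move from open, but only inside the TTE window."""
    if not in_entry_window(tte_s):
        return None
    move = yes_now - yes_open
    if move >= min_move:
        return "YES"
    if move <= -min_move:
        return "NO"
    return None
```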
In Polymarket's CLOB, a single 1000-share market order can dominate the ±1% OB pressure ratio. This creates spurious signals where one participant's activity looks like market consensus. Mitigation: Use ±10–20% OB ranges or combine OB with ATR confirmation.
Polymarket share prices can lag or anticipate Chainlink oracle updates. A YES price at 80c doesn't guarantee BTC is above strike — Chainlink price used for settlement may differ from Binance spot by seconds. Mitigation: Use Chainlink directly as the settlement signal, not Binance price.
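Reading the settlement side from Chainlink rather than from the share price can be sketched as below. The 40c cutoff comes from the oracle-dislocation description earlier in this report; the mirrored NO case, the action labels, and the function name are assumptions.

```python
from typing import Optional

def oracle_dislocation(chainlink: float, strike: float, yes_price: float,
                       max_price: float = 0.40) -> Optional[str]:
    """Flag markets where the share price disagrees with the oracle's current side."""
    if chainlink > strike and yes_price < max_price:
        return "BUY_YES"   # oracle says UP, YES is still cheap
    if chainlink < strike and yes_price > 1 - max_price:
        return "BUY_NO"    # oracle says DOWN, NO is still cheap (assumed mirror case)
    return None
```

Because the oracle value can still change before close, a flag here is a candidate entry rather than a settled outcome.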
The event that causes YES to reach 7c (BTC dips sharply below strike) is the same event that makes NO unlikely to reach 7c (BTC would need to also trade far above strike). Extreme threshold combos are structurally limited by BTC's volatility profile. Mitigation: Accept low bilateral fill rate as inherent; focus EV on single-leg directional value.
With 46,000+ strategies tested on 970 markets, the risk of finding spurious correlations is high. Mitigation: Walk-forward validation (train on days 1–4, test on days 5–6), hold-out test sets, and cross-family correlation checks.
Split dataset into training (2026-04-23 to 2026-04-26) and test (2026-04-27 to 2026-04-28). Train strategy selection on 4 days, evaluate selected top-16 on held-out 2 days. Quantify overfitting risk for atrExpiry and lateMomentum families.
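The split and selection step could be sketched as follows. The market record layout (`day` key) and the `score` callable are hypothetical; `score(strategy, markets)` stands in for whatever backtest metric is used to rank strategies.

```python
from datetime import date

TRAIN = (date(2026, 4, 23), date(2026, 4, 26))
TEST = (date(2026, 4, 27), date(2026, 4, 28))

def split_markets(markets):
    """Partition markets by calendar day into train and held-out test sets."""
    train = [m for m in markets if TRAIN[0] <= m["day"] <= TRAIN[1]]
    test = [m for m in markets if TEST[0] <= m["day"] <= TEST[1]]
    return train, test

def walk_forward(strategies, markets, score, k=16):
    """Pick the top-k strategies on train only, then report (name, train, test) scores."""
    train, test = split_markets(markets)
    top = sorted(strategies, key=lambda s: score(s, train), reverse=True)[:k]
    return [(s, score(s, train), score(s, test)) for s in top]
```

A large gap between the train and test columns for a family is the overfitting signal this step is meant to quantify.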
Continue data collection until 2,000+ markets with CVD/whale data are available. Re-run whaleMomentum and cvdMomentum backtests on the CVD-only subset and compare family ranking with and without CVD access.
Build per-leg P&L attribution: how much PnL comes from paired legs (true arb) vs. single-leg settlement (directional bet). If single-leg P&L is negative and dominates, the strategy is directional, not arb. If positive, single-leg exposure is a profitable side effect.
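A sketch of the attribution, assuming each trade record carries a `paired` flag and a `pnl_cents` value (both hypothetical field names). The label for the case the text leaves open is also an assumption.

```python
def attribute_pnl(trades):
    """Sum PnL separately for paired legs and unpaired single legs."""
    out = {"paired": 0.0, "single_leg": 0.0}
    for t in trades:
        out["paired" if t["paired"] else "single_leg"] += t["pnl_cents"]
    return out

def classify(attr):
    """Apply the rule from the text to the attributed totals."""
    if attr["single_leg"] < 0 and abs(attr["single_leg"]) > abs(attr["paired"]):
        return "directional"
    if attr["single_leg"] > 0:
        return "profitable side effect"
    return "paired-dominated"  # case not covered in the text; label is an assumption

# Toy records only, not results from the dataset.
trades = [
    {"paired": True, "pnl_cents": 86},
    {"paired": False, "pnl_cents": -7},
    {"paired": False, "pnl_cents": -93},
]
attr = attribute_pnl(trades)
```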
Compare live paper trading fill rates (from YinYang session logs) with backtest-predicted fill rates. If live rate is materially lower, the fill confirmation model needs adjustment for queue position and market impact.
Measure the exact latency between Binance BTC price moves and Chainlink oracle updates. If Chainlink lags Binance by a predictable window (3–8 seconds), near-settlement Binance moves can predict Chainlink settlement, creating a short latency edge for the atrExpiry signal. This is not risk-free: the lag itself can vary from update to update.
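One crude way to estimate the lag from the 1-second samples is to find the shift at which Binance price changes best match later Chainlink price changes. This sketch uses an uncentered mean product of first differences, which is a simplification chosen for brevity, and the function name is hypothetical.

```python
from typing import List

def best_lag(binance: List[float], chainlink: List[float], max_lag: int = 10) -> int:
    """Lag in samples (seconds at 1 Hz) maximizing the match between
    Binance changes and Chainlink changes that occur `lag` samples later."""
    def diffs(xs):
        return [b - a for a, b in zip(xs, xs[1:])]

    rb, rc = diffs(binance), diffs(chainlink)

    def score(lag: int) -> float:
        pairs = list(zip(rb, rc[lag:]))  # rb[i] against rc[i + lag]
        return sum(x * y for x, y in pairs) / len(pairs)

    return max(range(max_lag + 1), key=score)
```

Both series need to be comfortably longer than `max_lag`. On real data the estimate should be repeated across many markets to see whether the lag is stable enough to rely on.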
Wire top Sentinel candidates into authenticated Polymarket CLOB order placement. Implement: click-to-place limit orders, automatic cancellation on signal reversal, partial fill tracking, real PnL accounting with 7.2% taker fee, and position size management (1% bankroll per trade).