Participation Over Permission
Compression-amplified provenance signal detection for digital media.
The signal is not the perturbation. The signal is the system's response to the perturbation over time.
What this is
A statistical perturbation scheme that embeds provenance signal into digital media. The signal survives aggressive JPEG compression — and strengthens under it. Marked corpus mean: 0.9988. Clean baseline mean: 0.0000. The distributions do not touch.
Not a watermark. Not steganography. Not DRM. Not a certificate authority. The signal proves participation. A matching service proves identity. These are different claims, and the difference matters.
At save time, force inter-channel distances at selected pixel positions to prime-valued gaps. Cost: ~0.01 joules. Imperceptible to the human eye.
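The save-time step can be sketched as follows. This is a minimal illustration, not the shipped embedder: `embed_prime_gaps`, the choice of the R/G channel pair, the small-prime set, and the decision to nudge only the green channel are all assumptions made for the example.

```python
import numpy as np

# Illustrative small-prime set for target gaps (assumption, not from the spec).
SMALL_PRIMES = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29])

def nearest_prime(value: int) -> int:
    """Return the small prime closest to `value` (ties break toward the first)."""
    return int(SMALL_PRIMES[np.argmin(np.abs(SMALL_PRIMES - value))])

def embed_prime_gaps(img: np.ndarray, positions) -> np.ndarray:
    """Force the R-G distance at each (y, x) position to a prime value.

    `img` is an HxWx3 uint8 array. Only the green channel is moved, so the
    perturbation stays within a few intensity levels. Clipping at the 0/255
    boundary can break a gap; a real embedder would handle that edge case.
    """
    out = img.astype(np.int16).copy()
    for y, x in positions:
        gap = abs(int(out[y, x, 0]) - int(out[y, x, 1]))
        target = nearest_prime(gap)
        if out[y, x, 0] >= out[y, x, 1]:
            out[y, x, 1] = out[y, x, 0] - target
        else:
            out[y, x, 1] = out[y, x, 0] + target
    return np.clip(out, 0, 255).astype(np.uint8)
```

All untouched pixels pass through unchanged; the marked positions carry the prime-valued gap.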
The codec's block-based quantization penalizes the perturbation's local complexity. Each generation amplifies the variance anomaly at marker positions.
Measure statistical distributions at candidate vs. control positions. The image is its own control group. No reference image needed.
Positions derived from a 256-bit key via HMAC-SHA512. The position pattern is the fingerprint. C(4000, 200) ≈ 10^400 possible patterns.
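A hedged sketch of the position derivation: HMAC-SHA512 in counter mode over the 256-bit key, reduced to distinct pixel indices. The function name, counter encoding, and modulo reduction are illustrative choices, not the documented scheme (modulo bias is negligible when the pixel count is far below 2^32).

```python
import hmac
import hashlib

def derive_positions(key: bytes, n_pixels: int, n_marks: int) -> list:
    """Derive `n_marks` distinct pixel indices from a 256-bit key.

    HMAC-SHA512 keyed with `key` over an incrementing counter yields a
    reproducible pseudorandom stream; each 4-byte chunk becomes one
    candidate index. Deterministic: same key, same pattern.
    """
    chosen, counter = set(), 0
    while len(chosen) < n_marks:
        digest = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha512).digest()
        for i in range(0, len(digest), 4):
            idx = int.from_bytes(digest[i:i + 4], "big") % n_pixels
            chosen.add(idx)
            if len(chosen) == n_marks:
                break
        counter += 1
    return sorted(chosen)
```

With 4000 candidate positions and 200 marks, the pattern space is C(4000, 200), matching the fingerprint claim above.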
The prime values. Destroyed by the first compression. Engineering: format-specific, replaceable.
The variance anomaly. Persists and amplifies. Physics: universal to any partition-transform-quantize system.
Defense in depth. Each layer operates independently in a different domain, with a validated operational range and honestly documented failure modes.
Layer A — prime quantization tables. Replace JPEG quantization-table entries with nearest primes. The table itself is the provenance signal: O(1) detection — scan 128 bytes without decoding. Operational: G0 only. Dies on any re-encode, which is expected: the absence of Layer A after processing is evidence of processing, not failure.
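The table replacement and the constant-time scan might look like this. An illustrative sketch over a flat list of DQT entries; `primeify_dqt` and `looks_marked` are hypothetical names, and a real detector would parse the DQT segment out of the JPEG byte stream first.

```python
def _is_prime(k: int) -> bool:
    """Trial-division primality test; adequate for 8-bit DQT entries."""
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def nearest_prime(n: int) -> int:
    """Prime with minimum distance to n (n >= 1); ties resolve downward."""
    for delta in range(n + 2):
        for cand in (n - delta, n + delta):
            if _is_prime(cand):
                return cand

def primeify_dqt(table: list) -> list:
    """Replace each quantization-table entry with its nearest prime."""
    return [nearest_prime(q) for q in table]

def looks_marked(table: list) -> bool:
    """O(1) detection: an all-prime quantization table is the signal.

    Chance of every entry of a stock table landing on a prime is
    vanishingly small, so a full-prime table flags Layer A directly.
    """
    return all(_is_prime(q) for q in table)
```

Because quantization steps shift by at most a couple of units, the visual cost of snapping to the nearest prime is negligible.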
Layer BC — twin-prime frequency markers. Twin-prime markers at known positions with AND logic. Operational: G0–G2. The frequency signal degrades under aggressive compression — at G4, detection is near zero. This is an honest result: Layer BC is a near-field detector. OR logic was tested and rejected; it raised the false-positive rate on controls to ~22%.
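The AND logic can be made concrete: a marker position counts only when both values are prime and twin-spaced. A sketch; the detection threshold and how pairs are extracted from the image are assumptions for the example.

```python
def is_prime(k: int) -> bool:
    """Trial division, adequate for 8-bit pixel values."""
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def twin_prime_hit(a: int, b: int) -> bool:
    """AND logic: both values must be prime AND form a twin pair (gap 2).

    OR logic (either member alone) admits far more chance matches,
    which is why it was rejected for its false-positive cost.
    """
    return is_prime(a) and is_prime(b) and abs(a - b) == 2

def detect_bc(pairs: list, threshold: float = 0.5) -> bool:
    """Declare a hit when at least `threshold` of marker pairs pass."""
    hits = sum(twin_prime_hit(a, b) for a, b in pairs)
    return hits / len(pairs) >= threshold
```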
Layer D — blind variance statistics. KS test on local-variance distributions. Blind — no reference image needed. Two prior versions failed: v1 measured JPEG's own blocking artifact (0.89 FP rate on clean images); v2 measured natural chromatic asymmetry (0.98 FP rate). Architectural constraint: Layer D enters the combined score only when a manifest-mode layer (E or F) has a non-zero score. With this constraint the false-positive rate collapses to 0%; without it, Layer D cannot distinguish marked images from naturally asymmetric ones.
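The core comparison is a two-sample Kolmogorov–Smirnov statistic between local-variance samples drawn at candidate positions and at control positions of the same image, which is what makes the detector blind. A pure-NumPy sketch; the window size, sampling scheme, and decision threshold of the real harness are not specified here.

```python
import numpy as np

def ks_statistic(a: np.ndarray, b: np.ndarray) -> float:
    """Two-sample KS statistic: maximum gap between empirical CDFs."""
    allv = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), allv, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), allv, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def local_variance(img: np.ndarray, positions, radius: int = 2) -> np.ndarray:
    """Variance of the patch around each (y, x) position (2D image)."""
    return np.array([
        img[max(0, y - radius): y + radius + 1,
            max(0, x - radius): x + radius + 1].var()
        for y, x in positions
    ])

def blind_score(img, candidate_pos, control_pos) -> float:
    """The image is its own control group: candidate vs. control variance."""
    return ks_statistic(local_variance(img, candidate_pos),
                        local_variance(img, control_pos))
```

A large statistic says the two position sets were drawn from different variance distributions; the E/F gating described above decides whether that evidence is allowed to count.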
Layer E — sentinel anchors. Mersenne-prime anchors (M=31 entry, M=7 exit) with correlated flanking pixels within 8×8 DCT blocks. The differential between positions survives when absolute values do not (intra-block correlation r=0.96 at Q40). Three detection tiers (T24/T16/T8) degrade gracefully. M=127 is permanently excluded — a 97.6% both-catastrophic rate due to JPEG's chroma gravity well at mid-range values. Operational: G0–G4. 98.6% T24 survival at Q40, 28.4% tier demotion, >99% effective at any tier. Validated on 500 images; zero false State D across 250 image–generation combinations.
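The relational idea, storing the mark in a differential between intra-block positions rather than in absolute values, can be sketched on a single 8×8 block. Pixel coordinates, flanking layout, and tolerance are illustrative assumptions; only the Mersenne values (31 and 7) come from the text.

```python
import numpy as np

def embed_anchor(block: np.ndarray, entry=(1, 1), exit_=(6, 6)) -> np.ndarray:
    """Write Mersenne-valued differentials between paired pixels of a block.

    Only the difference carries signal: JPEG shifts correlated intra-block
    neighbors together, so the differential survives quantization that
    destroys absolute values. Layout here is hypothetical.
    """
    out = block.astype(np.int16).copy()
    out[entry] = out[entry[0], entry[1] + 1] + 31   # entry anchor, M=31
    out[exit_] = out[exit_[0], exit_[1] + 1] + 7    # exit anchor, M=7
    return np.clip(out, 0, 255).astype(np.uint8)

def read_anchor(block: np.ndarray, entry=(1, 1), exit_=(6, 6),
                tol: int = 4) -> bool:
    """Detect by differential, with tolerance for quantization drift."""
    b = block.astype(np.int16)
    d_entry = int(b[entry] - b[entry[0], entry[1] + 1])
    d_exit = int(b[exit_] - b[exit_[0], exit_[1] + 1])
    return abs(d_entry - 31) <= tol and abs(d_exit - 7) <= tol
```

A uniform brightness shift moves both members of each pair equally and leaves the differential intact, which is the property the tiered detector relies on.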
Layer F — positional payload. A 24-bit payload (creator ID, hash fragment, protocol version, flags) encoded in positional offsets from natural section boundaries: 2 bits per section, ~108 sections, ~9 votes per bit position via majority vote. Positions survive JPEG; values do not. Operational: G0–G4. 800/800 images: 100% recovery on all fields, mean bit margin 1.000 (unanimous), zero uncertain bits at Q40. This is not robustness — it is invariance.
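Majority-vote recovery can be sketched as follows: each section contributes two bits, each of the 24 payload positions collects about nine votes, and the margin measures how far each vote is from a tie (1.0 is unanimous). The bit-to-section mapping used here is an assumption for the example.

```python
def decode_payload(section_bits: list, payload_len: int = 24):
    """Majority-vote decode: section i contributes payload bit positions
    (2i) mod 24 and (2i+1) mod 24, so ~108 sections give ~9 votes per bit.

    Returns (decoded_bits, minimum_margin); margin 1.0 means every vote
    for that bit agreed. Illustrative mapping, not the documented one.
    """
    votes = [[] for _ in range(payload_len)]
    for i, (b0, b1) in enumerate(section_bits):
        votes[(2 * i) % payload_len].append(b0)
        votes[(2 * i + 1) % payload_len].append(b1)
    bits, min_margin = [], 1.0
    for v in votes:
        ones = sum(v)
        bits.append(1 if ones * 2 >= len(v) else 0)
        margin = abs(2 * (ones / len(v)) - 1)  # 0 at a tie, 1 unanimous
        min_margin = min(min_margin, margin)
    return bits, min_margin
```

Because the payload lives in *where* sections sit rather than in pixel values, a clean read returns margin 1.000 even after the full compression cascade.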
All numbers come from the DIV2K corpus run through the Q95 → Q85 → Q75 → Q60 → Q40 recompression cascade. Codec: Pillow's libjpeg. No other implementations tested.
| Metric | Result | Corpus | Layer |
|---|---|---|---|
| Combined score — marked | 0.9988 | 50 images | All (combined harness) |
| Combined score — clean | 0.0000 | 50 images | All (combined harness) |
| False positives | 0 / 50 | 50 images | All (combined harness) |
| State B rate (800-image) | 99.5% (796/800) | 800 images | All |
| FP rate (800-image) | 0 / 800 | 800 images | All |
| Sentinel T24 intact at Q40 | 98.6% | 500 images | E |
| Effective detection (any tier) | >99% | 500 images | E |
| False State D | 0 / 250 combinations | 500 images | E |
| CID recovery at Q40 | 100% (800/800) | 800 images | F |
| Payload bit margin at Q40 | 1.000 (unanimous) | 800 images | F |
| Uncertain bits | 0 | 800 images | F |
| DQT detection at G0 | 100% | All | A |
| Layer BC at G4 | ~0% (near chance) | 500 images | BC |
Not validated: geometric transforms (rotation, non-integer scaling, affine); neural codecs (HEIC, AVIF); quality below Q10; codec implementations beyond Pillow's libjpeg (MozJPEG, libjpeg-turbo, hardware encoders).
The 50-image false-positive confidence interval has an upper bound of 5.8%; the 800-image bound is 0.36%. Neither is sufficient for production at scale without further validation. Layer D cannot stand alone. Layer BC is not effective past Q75.
The full technical documentation, design history, and proofs behind every claim on this page.
Six wrong assumptions, the M=127 gravity well, relational encoding, and how every failure became a feature.
Layer E: from flat-line failure to 98.6% survival. The relational encoding breakthrough.
Formal definitions, novelty assessment, and why compression is the amplifier, not the enemy.
Rotation, slice-and-stitch, scale, Ship of Theseus. What survived, what broke, and the economics of suppression.
Seven failure modes and open problems. Seed compromise, no revocation, no forward secrecy — by design.
Part I: Discovery. Part II: Validation. Built in a single sustained conversation between a human and an AI.