JEPA · Phase 1 sanity check

Does context↔target signal exist on OnlyData company data? Run on Stu (Mac Studio · Apple MPS · all-MiniLM-L6-v2 · 2,000 companies).

Research sandbox only. JEPA output does not flow into the production Agent Readiness score — see /ar100 for the signature AR scorer.
✓ Signal present — Phase 1 passes
Cross-view cosine, mean: 0.1420
Cross-view cosine, std: 0.0928 (good variance)
Cross-view cosine, min / max: −0.1136 / 0.4856
Pearson r vs AR score: 0.228 · p=5.3e−25
Spearman r vs AR score: 0.242 · p=5.7e−28
n: 2,000

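Interpretation: the cross-view cosine between context and target embeddings shows a modest but highly significant positive correlation with the heuristic AR score (Pearson r = 0.228, Spearman r = 0.242, n = 2,000). The cosine values are well spread (std 0.0928, range −0.1136 to 0.4856) rather than collapsed at 0 or 1, so a learned predictor has room to pick up structure that the heuristic misses. Green light for Phase 2 (a learned projection head instead of frozen MiniLM).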
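
For readers who want to reproduce this kind of check, the following is a minimal sketch of the statistics reported above: per-company cosine between a context embedding and a target embedding, then Pearson and Spearman correlation against a score. It is not the script that produced these numbers. The embeddings and the `ar_score` values are synthetic placeholders, every function name is hypothetical, and p-values are left out to keep the sketch standard-library only.

```python
import math
import random
from statistics import mean, pstdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic stand-ins (assumption): in a real run, `context` and `target`
# would be sentence embeddings of two views of the same company.
rng = random.Random(0)
dim, n = 16, 200
context = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
target = [[c + rng.gauss(0, 2) for c in row] for row in context]

# One cross-view cosine per company.
cos = [cosine(c, t) for c, t in zip(context, target)]

# Placeholder score loosely tied to the cosine, for illustration only.
ar_score = [50 + 100 * s + rng.gauss(0, 10) for s in cos]

summary = {
    "mean": mean(cos),
    "std": pstdev(cos),
    "min": min(cos),
    "max": max(cos),
    "pearson_r": pearson(cos, ar_score),
    "spearman_r": spearman(cos, ar_score),
    "n": len(cos),
}
print(summary)
```

A standard deviation near zero, or a mean pinned at 0 or 1, would be the collapse case that the interpretation rules out. With SciPy available, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` return the coefficient together with a p-value.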

Plots

sanity_check_correlation.png — cross-view cosine vs AR score
sanity_check_umap.png — UMAP of context embeddings colored by industry + readiness

Full run log

sanity.log ↗