AI Builders · For Recruiters

Reach the AI people who just left a lab, founded something, or are building right now.

1,940 active AI researchers, engineers, founders, and product leaders — cross-referenced with their current company, transition state, public artifacts, and a direct contact path. One MCP call. One answer.

1,940 AI Builders
78% Contactable
386 In transition
307 Founding
76 Recently left
781 GitHub-active 90d

Why this > LinkedIn Recruiter
Evidence-cited, not keyword-matched

Every row cites a public artifact — a paper, a commit history, a launch. No "AI enthusiast" resumes. The dataset is built from arXiv bylines, top OSS contributors, and verified lab teams.

Transition signals, surfaced

We scan every bio for "ex-", "founding", "building", "formerly at" — plus role-type cues — and expose the transition state as a filter. The highest-signal recruiter lead is someone mid-move. LinkedIn doesn't show you that.
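A rough sketch of what this kind of bio scan could look like. The cue phrases come from the paragraph above; the regexes, the mapping from cue to status label, and the function name are assumptions for illustration, not the production matcher.

```python
import re
from typing import Optional

# Hypothetical cue table: phrases are taken from the text above, but the
# regexes and the status labels they map to are assumptions.
CUES = [
    ("recently_left", re.compile(r"\bex-\w+|\bformerly at\b", re.IGNORECASE)),
    ("founding", re.compile(r"\bfounding\b", re.IGNORECASE)),
    ("building", re.compile(r"\bbuilding\b", re.IGNORECASE)),
]

def detect_transition(bio: str) -> Optional[str]:
    """Return the label of the first cue that fires, or None if none do."""
    for label, pattern in CUES:
        if pattern.search(bio):
            return label
    return None

print(detect_transition("Stanford University, ex-Princeton"))
print(detect_transition("Founding eng @ stealth"))
```

The word-boundary anchors keep "ex-" from firing inside words such as "complex-valued". A plain text match like this cannot tell a fresh move from an old bio, which is why a confidence value matters downstream.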

Activity-verified

For everyone with a public GitHub, we check recent commit activity. "Is this person still active?" — answered before you reach out.
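The activity check reduces to a date-window comparison. A minimal sketch, assuming you already have the timestamp of a person's most recent public commit; the function name is hypothetical and the 90-day default mirrors the "GitHub-active 90d" figure used on this page.

```python
from datetime import datetime, timedelta, timezone

def is_active(last_commit: datetime, now: datetime, window_days: int = 90) -> bool:
    """True if the last commit falls inside the trailing window ending at `now`."""
    age = now - last_commit
    return timedelta(0) <= age <= timedelta(days=window_days)

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(is_active(datetime(2025, 4, 1, tzinfo=timezone.utc), now))
print(is_active(datetime(2025, 1, 1, tzinfo=timezone.utc), now))
```

Both timestamps need to be timezone-aware (or both naive), otherwise the subtraction raises a TypeError.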

One MCP call, one answer

Post a job, get ranked candidates with contact paths — in a single tool call. Designed for recruiter agents, not sourcing tabs.

People own their narrative

Any person in the dataset can claim their profile (magic-link sign-up), set their own preferred contact, and broadcast "open to roles" signals. The system respects first-party overrides.

Webhooks, not scraping

Subscribe to new_transition / new_claim / new_match and your CRM gets pushed the right lead the moment it appears. HMAC-signed. No scraping your own data tab.

The 3 killer queries

Paste into Claude with the OnlyData MCP connected.

1. In-flight & reachable Live

"Show me AI researchers who just left a lab and have a direct contact path."

Use OnlyData: query_custom_dataset({
  dataset_id: "432d841c-ef9c-41d8-9300-6004bc40a97e",
  transition_status: "recently_left",
  contactable: true,
  limit: 25
})
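If you are driving the MCP from your own agent rather than from Claude, a tool call like the one above travels as a JSON-RPC 2.0 `tools/call` request. A minimal sketch of that payload, assuming the argument names shown above match the tool's input schema; the helper name `tool_call` is hypothetical, and session initialization and transport are left out.

```python
import json

def tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

payload = tool_call("query_custom_dataset", {
    "dataset_id": "432d841c-ef9c-41d8-9300-6004bc40a97e",
    "transition_status": "recently_left",
    "contactable": True,
    "limit": 25,
})
print(payload)
```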

2. Target company talent graph Live

"Who at Anthropic is the most visible public signal? Who just joined? Who's founding?"

Use OnlyData: ai_minds_at_company({
  key: "anthropic"
})
// also: "openai.com", "google-deepmind"

3. Post a role, get the top 10 Live

"Semantic match my job description against the dataset — only in-transition, only contactable, ranked with rationale."

Use OnlyData: post_job_opening({
  company_name: "Contoso",
  title: "Founding Research Engineer",
  description: "Mech interp stack, SAE work..."
})
// then:
Use OnlyData: match_candidates_for_job({
  job_id: "<id>",
  only_transitioning: true,
  require_contact: true,
  limit: 10
})
Live sample — "Founding Research Engineer — Interpretability", only_transitioning + require_contact, top 3
1. John Yang · Stanford University, ex-Princeton · recently left · linkedin.com/in/jyang20
2. Mikio Braun · Recently left Principal Eng at Zalando/GetYourGuide · recently left · linkedin.com/in/mikiobraun
3. Buck Shlegeris · Redwood Research, interpretability · founding · linkedin.com/in/buck-shlegeris

The full shortlist, in one prompt

Paste into Claude with the OnlyData MCP connected. Runs the 6-step recruiter flow end-to-end.

The three killer queries above are drop-in. This is the end-to-end: wide net → transitions → detail → post the real job → run the match engine → render the shortlist with rationale. Swap in your role, paste, read the ranking.

Founding AI Engineer shortlist

Built from the canonical skill ai-minds-recruiter-shortlist — same workflow a Claude Code session runs internally. Swap the job title + description for yours.

I'm hiring a Founding AI Engineer at my early-stage startup. We're building
agentic infra. Find me 10 candidates from AI Builders — and explain why
each one fits.

What I want:
1. Use search_profiles with profile_source="ai_minds_harvest" to find
   high-fit candidates. Bonus points for has_contact=true so I can actually
   reach them.

2. Use query_custom_dataset with transition_status="recently_left" or
   "founding" — these people are ACTIVELY in motion, which the match engine
   says is worth +0.18 / +0.12 score boosts.

3. For my top 5 candidates, pull their detail — show me their why_in_db,
   evidence_url, GitHub activity, and primary_contact.

4. Post my job via post_job_opening, then run match_candidates_for_job.

   Job spec:
   - Title: Founding AI Engineer
   - Description: Building agent runtime + tool-use infrastructure. Strong
     LLM eval / agent framework experience required. Bonus: published work
     on tool use, MCP, or agentic systems. Remote OK, equity-heavy comp.
   - Filters: only_transitioning=true, require_contact=true.

5. Compare the search_profiles ranking vs the match_candidates_for_job
   ranking. Which surfaces better candidates and why?

6. Use ai_minds_at_company on "anthropic" and "openai" for a separate
   "high-credibility-but-flight-risk" shortlist.

For each candidate: name, current company, role_type, transition_status if
any, GitHub-active-90d count, primary contact channel + value, and a 2-line
"why this person" paragraph citing their why_in_db.

What this exercises

search_profiles, query_custom_dataset (with transition_status + contactable), post_job_opening, match_candidates_for_job, ai_minds_at_company — the full recruiter primitive set in one paste.

Why it's better than chat-only search

Chat ranking is vibes. Posting the job runs it through the same embedding + boost engine every other recruiter query uses — so the ranking you see is the ranking the product ships. Plus you get a durable job_id you can later request feedback on.

Adapt it

Swap the title + description for any AI role. For non-founding roles, drop the only_transitioning filter. For a known-company search, change step 6's slug. The skill (.claude/skills/ai-minds-recruiter-shortlist) codifies the canonical 6-step flow so Claude sessions follow the same path deterministically.

Filter gotchas — read before you tune the prompt
only_transitioning=true is too aggressive as a gate

Applied inside match_candidates_for_job, it collapses the scored pool so hard that the top hit often becomes a semantically-adjacent false positive. Run match_candidates_for_job with just require_contact=true, then read .transition.status per candidate. The boost engine (+0.18 recently_left) already privileges transitioners — you don't need to gate on top of that.
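A sketch of that post-filter step, done client-side on the ranked results. It assumes each candidate comes back as a dict with a nested `transition.status` field; the exact response shape, the "building" status value, and the helper name are assumptions.

```python
# Status values treated as "in motion". "recently_left" and "founding" appear
# as filter values on this page; "building" as a status value is an assumption.
IN_MOTION = {"recently_left", "founding", "building"}

def split_by_transition(candidates):
    """Partition ranked candidates by transition status, preserving rank order."""
    moving, settled = [], []
    for candidate in candidates:
        status = (candidate.get("transition") or {}).get("status")
        (moving if status in IN_MOTION else settled).append(candidate)
    return moving, settled

ranked = [
    {"name": "A", "transition": {"status": "recently_left"}},
    {"name": "B", "transition": None},
    {"name": "C", "transition": {"status": "founding"}},
]
moving, settled = split_by_transition(ranked)
print([c["name"] for c in moving], [c["name"] for c in settled])
```

Because the split happens after scoring, the semantic ranking is computed over the full contactable pool and the transition state becomes a reading aid rather than a gate.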

search_profiles is a filter, not a ranker

Its result order is claimed > verified > recency — not semantic fit. Use it to find people (by company, by contact, by source), use match_candidates_for_job to rank them. Never surface the top of a search_profiles list as a shortlist.

Three "founder artifact" classes the API auto-flags

- founder_of_this_company: founded the queried company (Chris Olah, Daniela Amodei).
- historical_founding: founded a different past venture (Mike Krieger / Instagram, 2010).
- tenure_stale_recently_left: ≥18mo at current employer with a stale "ex-X" bio.

summary.in_transition excludes all three. Ignore the raw transition on any row where one of these flags is true.

Low transition.confidence = possibly-stale bio

Text-pattern matches ("formerly at X", "ex-Y") fire at confidence≈0.7 and catch both real recent moves and bios that haven't been updated in 2 years. Spot-check LinkedIn before cold outreach on any signal with confidence below 0.8. The signal beats random selection, but it does not guarantee that the person is planning to leave.
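The founder-artifact and confidence gotchas can be folded into one triage pass over each row. A minimal sketch, assuming the three flags arrive as top-level booleans and the confidence sits at `transition.confidence`; those locations, the return labels, and the function name are assumptions.

```python
ARTIFACT_FLAGS = (
    "founder_of_this_company",
    "historical_founding",
    "tenure_stale_recently_left",
)

def triage(row: dict, min_confidence: float = 0.8) -> str:
    """Decide how much to trust a row's transition signal."""
    if any(row.get(flag) for flag in ARTIFACT_FLAGS):
        return "ignore_transition"   # flagged artifact: raw transition is noise
    transition = row.get("transition") or {}
    if not transition.get("status"):
        return "no_signal"
    if transition.get("confidence", 0.0) < min_confidence:
        return "spot_check"          # possibly a stale bio, verify first
    return "reach_out"

print(triage({"transition": {"status": "recently_left", "confidence": 0.7}}))
```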

Why these candidates

The ranking rationale is surfaced per candidate.

Every match is scored by semantic fit (cosine similarity of the job description against the person's embedded bio) plus boosts that reflect real recruiter priorities. The boosts are returned per candidate so you see exactly why they surfaced.

+0.18 recently left

Highest-signal transition state. Someone who just moved is often the warmest lead you'll find.

+0.25 "open to recruiting"

Profile owner explicitly broadcast the signal. Leapfrogs everything.

+0.12 founding / +0.12 building

Founder-mode signals picked up from "founding eng @", "building X", stealth language.

+0.12 claimed / +0.06 verified

The person owns their profile — you're messaging a verified human, not a guess.

+0.08 contactable

We have a direct channel. No "message via LinkedIn InMail and hope."

+0.04 GitHub-active 90d

Public commits in the last 90 days. They're shipping, not dormant.
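To make the arithmetic concrete, here is a sketch of how a score of this shape could be computed. The boost values are the ones listed above; the signal key names, the assumption that boosts stack additively on top of the cosine similarity, and the function names are illustrative and not confirmed engine internals.

```python
import math

# Boost values from the list above; the key names are hypothetical.
BOOSTS = {
    "open_to_recruiting": 0.25,
    "recently_left": 0.18,
    "founding": 0.12,
    "building": 0.12,
    "claimed": 0.12,
    "verified": 0.06,
    "contactable": 0.08,
    "github_active_90d": 0.04,
}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def score(job_vec, bio_vec, signals):
    """Return (total score, boosts applied), assuming boosts simply add up."""
    applied = {s: BOOSTS[s] for s in signals if s in BOOSTS}
    return cosine(job_vec, bio_vec) + sum(applied.values()), applied

total, applied = score([0.6, 0.8], [0.6, 0.8], ["recently_left", "contactable"])
print(round(total, 2), applied)
```

Returning the applied boosts alongside the total mirrors the per-candidate rationale described above: the reader can see which signals moved a candidate up, not just the final number.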

Stay in the loop

Push webhooks, not scrape runs.

Subscribe to events

Register a URL; we POST HMAC-SHA256-signed deliveries when someone in your watch list transitions, claims their profile, or matches a role you posted.

POST /api/ai-minds/webhooks
{
  "name": "My recruiter tool",
  "url":  "https://my-crm/webhooks/onlydata",
  "events": ["new_transition", "new_claim", "new_match"],
  "filters": {
    "company_slugs": ["anthropic", "openai"],
    "transition_status": ["recently_left", "founding"]
  }
}
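On the receiving side, each delivery should be verified before it touches your CRM. A minimal sketch of HMAC-SHA256 verification, assuming the signature is the hex digest of the raw request body keyed with a shared secret; the encoding, the way the secret is issued, and the header that carries the signature are assumptions to check against the developer docs.

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the HMAC-SHA256 of the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())

# Local round trip with a made-up secret and payload.
secret = b"example-shared-secret"
body = b'{"event": "new_transition"}'
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, signature))
```

Verify against the raw bytes as received; re-serializing parsed JSON can change whitespace or key order and break the comparison.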

Contribute talent back

Running an agent that finds great AI people? Submit them to the dataset via POST /api/ai-minds/suggest. We review and merge. Deduped by your source_agent + external_id.

POST /api/ai-minds/suggest
{
  "source_agent": "my-crm-v2",
  "external_id": "abc123",
  "row": {
    "name": "Jane Doe",
    "role_type": "research_engineer",
    "company": "Stealth",
    "linkedin_url": "https://...",
    "why_in_db": "Lead author of X at NeurIPS 2025.",
    "evidence_url": "https://arxiv.org/..."
  }
}
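Since submissions are deduplicated on source_agent + external_id, it can help to apply the same key on your side before posting. A small sketch, assuming suggestions are dicts shaped like the payload above; the helper name is hypothetical, and keeping the first occurrence is a choice made here, not documented server behavior.

```python
def dedupe(suggestions):
    """Keep the first suggestion seen for each (source_agent, external_id) pair."""
    seen = {}
    for suggestion in suggestions:
        key = (suggestion["source_agent"], suggestion["external_id"])
        seen.setdefault(key, suggestion)
    return list(seen.values())

batch = [
    {"source_agent": "my-crm-v2", "external_id": "abc123", "row": {"name": "Jane Doe"}},
    {"source_agent": "my-crm-v2", "external_id": "abc123", "row": {"name": "Jane Doe"}},
    {"source_agent": "my-crm-v2", "external_id": "def456", "row": {"name": "John Roe"}},
]
print(len(dedupe(batch)))
```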

Connect the MCP

Claude Desktop, Claude Code, Cursor, Raycast — all support the OnlyData MCP. One connection, all tools available.

HTTP transport (hosted):  https://mcp.onlydata.club/mcp
OAuth sign-in: https://mcp.onlydata.club/authorize

Local stdio (npx):
  npx @onlydata/mcp-server
  # then connect via Claude Desktop MCP config

On provenance

Every AI Minds row is built from public artifacts the person chose to publish — arXiv papers, GitHub contributions, verified lab team pages. No scraped .edu directories, no guessed emails, no LinkedIn harvest. Anyone in the dataset can claim their profile and set their own narrative — that override wins over every enrichment pass. Anyone can also request removal.

Part of OnlyData Club · Built by Product Hacker