Codequiry Libraries · Powered by Zoekt

Every scan checks your code against 47M+ indexed projects.

The code-search corpus that powers every Codequiry scan — public GitHub repositories, Stack Overflow snippets, and four years of anonymized cross-account submissions. Try the search yourself.

Live search across the corpus. Limited to 10 queries / hour per IP — this is a demo, not an API.
Indexed documents: 47M+
Public-code sources: GitHub · Stack Overflow
Languages covered: 65+
Median query latency: < 1s

What's in the corpus

Code your students copy from — indexed before they submit.

Web search misses code in Stack Overflow answers, archived repos, and reposted snippets. The corpus catches what Google doesn't.

Public GitHub repositories

Continuously ingested from active public projects — code that lives in starred repos, course solutions, and homework helpers shows up in your match report.

Stack Overflow answers

Top-voted code snippets and accepted answers, indexed at the chunk level — paraphrased copies still match because the corpus is searched by token, not URL.

Cross-account submissions

Anonymized submissions from across the Codequiry network — the only way to catch contract-cheating rings and recycled coursework that never appeared on the public web.
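
To illustrate why token-level search can still match renamed or lightly edited copies, here is a minimal sketch. It is not Codequiry's actual pipeline: the keyword list, the `ID`/`NUM` placeholders, the 4-token window, and the function names are all assumptions chosen for the example. The idea is to replace every identifier and number with a placeholder, then compare overlapping token windows.

```python
import re

# Hypothetical keyword list; a real tokenizer would be language-aware.
KEYWORDS = {"def", "return", "for", "in", "if", "else", "while"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source: str) -> list[str]:
    """Turn source text into tokens with identifiers and numbers masked."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)          # keep language keywords as-is
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")         # any identifier looks the same
        elif tok[0].isdigit():
            tokens.append("NUM")        # any number looks the same
        else:
            tokens.append(tok)          # punctuation and operators
    return tokens


def shingles(tokens: list[str], k: int = 4) -> set[tuple[str, ...]]:
    """All overlapping windows of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity of the two sources' token windows."""
    sa, sb = shingles(normalize(a), k), shingles(normalize(b), k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s = s + x\n    return s"
renamed = "def add_all(values):\n    acc = 0\n    for v in values:\n        acc = acc + v\n    return acc"
score = similarity(original, renamed)
```

Because both snippets reduce to the same token sequence, renaming variables alone does not lower the score in this sketch.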

Why a corpus beats web search alone

Three different problems, three different scan engines.

Codequiry Libraries fills the gap web scanners can't reach — it sees code the open web doesn't index.

What it catches | Web Scan | Libraries
Code on public GitHub repos | Partial | Yes
Stack Overflow snippets & paraphrased copies | Limited | Yes
Archived / deleted repositories | No | Yes
Cross-account / contract-cheating rings | No | Yes
Token-level matching (rename-resistant) | No | Yes

How it works

Built on Zoekt — the same engine Sourcegraph uses.

Trigram-indexed code search at sub-second latency. Every scan you run hits the same corpus you can query right above.
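
For readers unfamiliar with trigram indexing, the sketch below shows the general idea in a few lines. It is a toy illustration, not Zoekt's implementation or API: the `TrigramIndex` class and its methods are hypothetical names. Each document is split into three-character substrings, a lookup table maps every trigram to the documents that contain it, and a query only needs to verify the documents that contain all of its trigrams.

```python
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """All three-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy inverted index from trigram to document ids (hypothetical class)."""

    def __init__(self) -> None:
        self.postings: dict[str, set[str]] = defaultdict(set)
        self.docs: dict[str, str] = {}

    def add(self, doc_id: str, text: str) -> None:
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, query: str) -> list[str]:
        grams = trigrams(query)
        if not grams:
            # Queries shorter than three characters have no trigrams,
            # so every document is a candidate.
            candidates = set(self.docs)
        else:
            # Only documents containing every query trigram can match.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Trigram overlap is necessary but not sufficient, so confirm
        # each candidate with a real substring check.
        return sorted(d for d in candidates if query in self.docs[d])


index = TrigramIndex()
index.add("a.py", "def quicksort(items): return sorted(items)")
index.add("b.py", "def bubble_sort(items): pass")
index.add("c.py", "print('hello world')")
hits = index.search("sort(items)")
```

The intersection step is what keeps lookups fast: most documents are ruled out by the index before any text is scanned.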

Continuously indexed

New repositories and answers are ingested daily. The corpus grows by hundreds of thousands of documents a week, so your scans get smarter without any work on your end.

Sub-second matching

Trigram-indexed search returns top matches in under a second even across tens of millions of documents. Your submission gets graded faster.

Noise-filtered

Boilerplate (Supabase clients, framework templates, common stdlib idioms) is filtered out automatically. You see real matches, not "this file imports React."
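
The page does not say how boilerplate is detected. One common approach, shown here purely as an assumption, is to treat any line that appears in a large share of corpus documents as boilerplate and drop it before matching. The function names and the 50% threshold below are hypothetical.

```python
from collections import Counter


def build_line_frequencies(corpus: list[str]) -> Counter:
    """Count, for each distinct line, how many documents contain it."""
    freq: Counter = Counter()
    for doc in corpus:
        # A set ensures each line counts once per document.
        freq.update({line.strip() for line in doc.splitlines() if line.strip()})
    return freq


def strip_boilerplate(
    source: str, freq: Counter, corpus_size: int, max_share: float = 0.5
) -> list[str]:
    """Keep only lines that appear in at most max_share of the corpus."""
    kept = []
    for line in source.splitlines():
        key = line.strip()
        if key and freq[key] / corpus_size > max_share:
            continue  # too common to count as evidence of copying
        kept.append(line)
    return kept


corpus = [
    "import React from 'react'\nconst a = 1",
    "import React from 'react'\nconst b = 2",
    "import React from 'react'\nfunction f() {}",
    "print('x')",
]
freq = build_line_frequencies(corpus)
submission = "import React from 'react'\nconst secret = 42"
remaining = strip_boilerplate(submission, freq, corpus_size=len(corpus))
```

In this sketch the React import appears in three of four documents, so it is dropped and only the distinctive line is left for matching.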

Run a check against the full corpus.

Codequiry Libraries is enabled by default on every paid scan. Free Mode users get unlimited peer-similarity checks, and one credit unlocks a full Libraries scan.