David Kim

Platform Engineer at Codequiry

David works on the scanning pipeline and API that power Codequiry checks at scale, from ingestion to results delivery.

Articles by David Kim

One Community College's Web Code Plagiarism Strategy
Case Studies | 2 min read | David Kim | 4 days ago

When intro programming students at a mid-sized community college were copying entire code snippets from Stack Overflow and GitHub, the department needed a scalable detection solution. By integrating Codequiry’s web-source matching into their grading pipeline, they reduced surface-level copy-paste incidents by 40% in a single semester while cutting manual review time by 60%.

How Winnowing Fingerprints Resist Variable Renaming
General | 8 min read | David Kim | 1 month ago

Winnowing fingerprinting is a powerful technique for detecting code plagiarism that survives variable renaming, refactoring, and cosmetic changes. This case study examines how the algorithm works, where it succeeds, and where it falls short compared to AST-based approaches.

What Code Similarity Metrics Actually Measure in Student Work
General | 9 min read | David Kim | 1 month ago

Not all code similarity is plagiarism, and not all plagiarism is caught by string matching. This article breaks down the three major detection techniques—AST comparison, token-based analysis, and algorithmic fingerprinting—and explains what each one actually reveals about student submissions.

An OSPO Lead's Map Through the GNU License Compliance Maze
General | 12 min read | David Kim | 1 month ago

Navigating the tangled web of GNU license compliance across thousands of repositories isn't an academic exercise—it's a daily operational challenge. This profile of a senior OSPO lead reveals the tools, triage workflows, and legal nuance that keep enterprise products out of litigation.

Teaching Students to Write Attribution Comments in Group Work
Academic Integrity | 10 min read | David Kim | 1 month ago

Attribution comments are a simple but powerful tool for teaching code integrity in collaborative programming projects. This article explains how to implement them effectively, what to include, and how they transform group work from a plagiarism minefield into a learning opportunity.

How Open Source License Auditing Actually Works
General | 7 min read | David Kim | 2 months ago

Open source license compliance is more than a legal checkbox; it's a critical engineering workflow. This guide walks through the concrete steps of a codebase audit, from initial inventory to resolving conflicts. You'll learn how to map dependencies, interpret license obligations, and build a sustainable compliance practice.

Your Website's HTML Was Stolen Yesterday
General | 5 min read | David Kim | 2 months ago

The code that makes your website unique is a prime target for theft. From entire HTML templates to critical JavaScript functions, web plagiarism is rampant and often invisible. This guide shows you where to look and how to fight back, protecting your intellectual property and your competitive edge.

The Hidden Plagiarism Your Static Analyzer Is Missing
General | 7 min read | David Kim | 2 months ago

Static analysis tools scan for bugs and smells, but they are blind to a pervasive form of intellectual property theft. Our analysis of 1,200 codebases reveals that 41% contain code plagiarized directly from Stack Overflow, GitHub gists, and commercial tutorials—code often carrying restrictive licenses. This is a legal and integrity blind spot that traditional scanners cannot see.

The 83% Illusion in Your Open Source Compliance
General | 7 min read | David Kim | 3 months ago

A 2025 audit of 500 enterprise codebases revealed that 83% contained open-source components with undetected license violations or security flaws. This isn't just a legal problem—it's a direct threat to product viability and company valuation. We analyzed the data to show where compliance tools fail and what effective scanning actually looks like.

Your Static Analysis Tool Is Missing the Real Security Flaws
General | 9 min read | David Kim | 3 months ago

Most static analysis tools generate hundreds of low-priority warnings while missing critical, exploitable vulnerabilities. This guide shows you how to reconfigure your scanning pipeline to prioritize the flaws that attackers actually use. We'll move beyond syntax checks to data flow analysis and taint tracking.

Your Codebase Is Full of Stolen Web Snippets
General | 7 min read | David Kim | 3 months ago

A developer copies a slick animation from Stack Overflow. Another pulls a "helper function" from a random GitHub repo. This is how technical debt and legal liability silently enter your codebase. We map the seven most common—and dangerous—patterns of web code plagiarism in professional software.

Your Open Source Dependencies Are a Legal Minefield
General | 5 min read | David Kim | 3 months ago

Your application is built on a mountain of open source code, each piece with its own legal requirements. Ignoring those requirements leaves a ticking time bomb in your product. This guide shows you how to map your dependencies, understand their licenses, and build a working compliance process before a cease-and-desist letter arrives.