Category: Code Plagiarism

How to detect copied, reused and refactored source code across student and developer submissions.

Cross-Language Code Plagiarism Detection Methods Tested
General · 8 min · James Okafor · 1 week ago

A rigorous head-to-head comparison of three cross-language code plagiarism detection approaches—tokenization, AST matching, and semantic fingerprinting—tested on 100 student-style assignments translated between Java, Python, and C++. We reveal which method catches translated loops, renamed variables, and switched control flow, and which one drowns in false positives.

Can AST Comparison Survive Student Code Obfuscation
General · 3 min · Alex Petrov · 1 week ago

Students often try to hide copied code by renaming variables, restructuring loops, or inserting dead code. AST-based comparison resists many of these tricks, but deliberate obfuscation, such as flattening control flow or converting recursion to iteration, can still produce a false negative. This article examines where AST engines excel, where they fall short, and how combining structural matching with token signatures catches even the cleverest attempts.

K-gram Fingerprinting for Source Code Similarity Analysis
General · 9 min · Emily Watson · 1 week ago

K-gram fingerprinting is the backbone of modern code plagiarism detection. This step-by-step guide walks through tokenization, k-gram generation, hashing, winnowing, and comparison — the exact pipeline used by MOSS and Codequiry. Includes Python code examples, algorithmic tradeoffs, and real-world scaling numbers.
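As a rough illustration of the pipeline named above (tokenization, k-gram generation, hashing, winnowing, comparison), here is a minimal Python sketch. It is not the implementation used by MOSS or Codequiry. The function names, the choice of `k=5` and `window=4`, the truncated SHA-1 hash, and the Jaccard score are all assumptions made for the example, and it keeps only hash values where a fuller winnowing implementation would also record positions.

```python
import hashlib
import io
import keyword
import tokenize


def normalize_tokens(source: str) -> list[str]:
    """Tokenize Python source; map identifiers and literals to placeholders."""
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")       # every identifier looks the same
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        else:
            tokens.append(tok.string)  # keywords, operators, punctuation
    return tokens


def kgram_hashes(tokens: list[str], k: int) -> list[int]:
    """Hash every run of k consecutive tokens (illustrative hash choice)."""
    hashes = []
    for i in range(len(tokens) - k + 1):
        gram = " ".join(tokens[i:i + k])
        digest = hashlib.sha1(gram.encode("utf-8")).digest()
        hashes.append(int.from_bytes(digest[:8], "big"))
    return hashes


def winnow(hashes: list[int], window: int) -> set[int]:
    """Keep the minimum hash of each sliding window as a fingerprint."""
    fingerprints = set()
    for i in range(len(hashes) - window + 1):
        fingerprints.add(min(hashes[i:i + window]))
    if not fingerprints and hashes:
        # Fewer hashes than one full window: fall back to the overall minimum.
        fingerprints.add(min(hashes))
    return fingerprints


def similarity(a: str, b: str, k: int = 5, window: int = 4) -> float:
    """Jaccard overlap of the two fingerprint sets (0.0 if either is empty)."""
    fa = winnow(kgram_hashes(normalize_tokens(a), k), window)
    fb = winnow(kgram_hashes(normalize_tokens(b), k), window)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_all(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"
print(similarity(original, renamed))
```

Because identifiers are replaced with a placeholder before hashing, the two snippets above produce identical fingerprint sets and score 1.0 even though every name differs.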

How Abstract Syntax Tree Comparison Detects Restructured Code
General · 1 min · Emily Watson · 2 weeks ago

Abstract syntax tree (AST) comparison detects plagiarised code that has been disguised by variable renaming, method reordering, and whitespace changes. This article explains how AST comparison works, its strengths and limitations, and when to combine it with token-based methods for best results.
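To make the idea concrete, the sketch below uses Python's standard `ast` module to compare two snippets by structure alone. It is only an illustration under stated assumptions: the helper names (`structural_signature`, `same_structure`) are hypothetical, names are collapsed to a single placeholder, and top-level statements are sorted so that reordered functions still match. Production tools use far more tolerant tree matching than this exact-equality check.

```python
import ast


class _Normalizer(ast.NodeTransformer):
    """Erase identifier names so only the tree shape remains."""

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = "_"
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = "_"
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        self.generic_visit(node)  # normalize parameters and body first
        node.name = "_"
        return node


def structural_signature(source: str) -> list[str]:
    """Return a sorted list of dumps, one per top-level statement.

    Parsing discards whitespace and comments; ast.dump omits line numbers
    by default; sorting makes the signature independent of function order.
    """
    tree = _Normalizer().visit(ast.parse(source))
    return sorted(ast.dump(node) for node in tree.body)


def same_structure(a: str, b: str) -> bool:
    return structural_signature(a) == structural_signature(b)


first = "def f(x):\n    return x + 1\n\ndef g(y):\n    return y * 2\n"
# Same logic with renamed and reordered functions and different spacing.
second = "def double(n):\n    return n * 2\n\n\ndef inc(value):\n    return value   +   1\n"
print(same_structure(first, second))
```

Renaming, reordering, and spacing changes leave the signature untouched, while changing an operator (for example `+` to `-`) alters the tree and breaks the match. That sensitivity is one reason such a check is usually paired with token-based methods.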

Why Some CS Departments Are Moving Beyond Moss for Plagiarism Detection
General · 8 min · Dr. Sarah Chen · 2 weeks ago

Riverdale State University’s computer science department spent years relying on Moss to catch plagiarised assignments. But as student work grew more sophisticated — combining copied web code, heavy refactoring, and AI-generated fragments — the department realised token-based similarity alone was no longer sufficient. This case study covers how they transitioned to a multi-tool detection pipeline.

What Code Fingerprinting Is and How It Catches Plagiarism
General · 10 min · Marcus Rodriguez · 2 weeks ago

Source-code fingerprinting is the core technique behind every major plagiarism detection tool, from MOSS to Codequiry. This guide explains how it works at the algorithm level, shows you how to interpret its output, and offers practical strategies for designing assignments that account for its limitations.

How Static Analysis Catches Plagiarized Code Before It Ships
General · 11 min · Emily Watson · 1 month ago

Plagiarism isn't just a classroom problem. When code from Stack Overflow, GitHub repos, or contractor deliverables enters your production codebase without proper attribution, you risk license violations, IP disputes, and technical debt. This guide shows how static analysis tools detect copied code before it ships, using token matching, AST comparison, and dependency scanning.

How Winnowing Fingerprints Resist Variable Renaming
General · 8 min · David Kim · 1 month ago

Winnowing is a fingerprinting technique that still detects plagiarised code after variable renaming, refactoring, and cosmetic changes. This case study examines how the algorithm works, where it succeeds, and where it falls short compared to AST-based approaches.

What Code Similarity Metrics Actually Measure in Student Work
General · 9 min · David Kim · 1 month ago

Not all code similarity is plagiarism, and not all plagiarism is caught by string matching. This article breaks down the three major detection techniques—AST comparison, token-based analysis, and algorithmic fingerprinting—and explains what each one actually reveals about student submissions.

What Pair Programming Looks Like in a Plagiarism Detector
General · 8 min · Marcus Rodriguez · 1 month ago

Pair programming and plagiarism can look identical to automated detectors. This article explains the technical signals that distinguish collaborative work from unauthorized code sharing, and how educators can design assignments and detection workflows that respect both academic integrity and modern development practices.

How Cross-Language Code Plagiarism Detection Actually Works
General · 10 min · Rachel Foster · 1 month ago

Cross-language code plagiarism presents a growing challenge for programming educators as students discover they can translate solutions between languages to evade detection. This article explains the techniques—AST normalization, semantic fingerprinting, and intermediate representation comparison—that modern tools use to catch these sophisticated cases.

From Paper Traces to Abstract Syntax Trees: Code Similarity Then and Now
General · 9 min · Rachel Foster · 2 months ago

The history of code similarity detection is a story of escalating arms races. What started with professors reading printouts has evolved through Unix diffs, token-based fingerprinting, and into modern abstract syntax tree analysis. This retrospective traces the key technical shifts that shaped how we detect code plagiarism in programming courses today.