Code Plagiarism

How to detect copied, reused and refactored source code across student and developer submissions.

General 8 min

James Okafor • 3 weeks ago

Cross-Language Code Plagiarism Detection Methods Tested

A rigorous head-to-head comparison of three cross-language code plagiarism detection approaches—tokenization, AST matching, and semantic fingerprinting—tested on 100 student-style assignments translated between Java, Python, and C++. We reveal which method catches translated loops, renamed variables, and switched control flow, and which one drowns in false positives.

Can AST Comparison Survive Student Code Obfuscation

General 3 min

Alex Petrov • 4 weeks ago

Students often try to hide copied code by renaming variables, restructuring loops, or inserting dead code. AST-based comparison resists many of these tricks, but some deliberate obfuscation, such as flattening control flow or converting recursion to iteration, can still produce a false negative. This article examines where AST engines excel, where they fall short, and how combining structural matching with token signatures catches even the cleverest attempts.

K-gram Fingerprinting for Source Code Similarity Analysis

General 9 min

Emily Watson • 1 month ago

K-gram fingerprinting is the backbone of modern code plagiarism detection. This step-by-step guide walks through tokenization, k-gram generation, hashing, winnowing, and comparison — the exact pipeline used by MOSS and Codequiry. Includes Python code examples, algorithmic tradeoffs, and real-world scaling numbers.

How Abstract Syntax Tree Comparison Detects Restructured Code

General 1 min

Emily Watson • 1 month ago

Abstract syntax tree (AST) comparison is a powerful technique for detecting plagiarized code that has been disguised through variable renaming, method reordering, and whitespace changes. This article explains how AST comparison works, its strengths and limitations, and when to combine it with token-based methods for best results.

Why Some CS Departments Are Moving Beyond Moss for Plagiarism Detection

General 8 min

Dr. Sarah Chen • 1 month ago

Riverdale State University’s computer science department spent years relying on Moss to catch plagiarised assignments. But as student work grew more sophisticated — combining copied web code, heavy refactoring, and AI-generated fragments — the department realised token-based similarity alone was no longer sufficient. This case study covers how they transitioned to a multi-tool detection pipeline.

What Code Fingerprinting Is and How It Catches Plagiarism

General 10 min

Marcus Rodriguez • 1 month ago

Source-code fingerprinting is a core technique behind major plagiarism detection tools, including MOSS and Codequiry. This guide explains how it works at the algorithm level, shows you how to interpret its output, and offers practical strategies for designing assignments that account for its limitations.

How Static Analysis Catches Plagiarized Code Before It Ships

General 11 min

Emily Watson • 2 months ago

Plagiarism isn't just a classroom problem. When code from Stack Overflow, GitHub repos, or contractor deliverables enters your production codebase without proper attribution, you risk license violations, IP disputes, and technical debt. This guide shows how static analysis tools detect copied code before it ships, using token matching, AST comparison, and dependency scanning.

How Winnowing Fingerprints Resist Variable Renaming

General 8 min

David Kim • 2 months ago

Winnowing fingerprinting is a powerful technique for detecting code plagiarism, and one that survives variable renaming, refactoring, and cosmetic changes. This case study examines how the algorithm works, where it succeeds, and where it falls short compared to AST-based approaches.

What Code Similarity Metrics Actually Measure in Student Work

General 9 min

David Kim • 2 months ago

Not all code similarity is plagiarism, and not all plagiarism is caught by string matching. This article breaks down the three major detection techniques—AST comparison, token-based analysis, and algorithmic fingerprinting—and explains what each one actually reveals about student submissions.

What Pair Programming Looks Like in a Plagiarism Detector

General 8 min

Marcus Rodriguez • 2 months ago

Pair programming and plagiarism can look identical to automated detectors. This article explains the technical signals that distinguish collaborative work from unauthorized code sharing, and how educators can design assignments and detection workflows that respect both academic integrity and modern development practices.

How Cross-Language Code Plagiarism Detection Actually Works

General 10 min

Rachel Foster • 2 months ago

Cross-language code plagiarism presents a growing challenge for programming educators as students discover they can translate solutions between languages to evade detection. This article explains the techniques—AST normalization, semantic fingerprinting, and intermediate representation comparison—that modern tools use to catch these sophisticated cases.

From Paper Traces to Abstract Syntax Trees: Code Similarity Then and Now

General 9 min

Rachel Foster • 2 months ago

The history of code similarity detection is a story of escalating arms races. What started with professors reading printouts has evolved through Unix diffs and token-based fingerprinting into modern abstract syntax tree analysis. This retrospective traces the key technical shifts that shaped how we detect code plagiarism in programming courses today.
