Alex Petrov

Detection Systems Engineer at Codequiry

Alex focuses on refactoring-resistant similarity detection and benchmarking Codequiry against tools like MOSS, JPlag, and Dolos.

Articles by Alex Petrov

How Winnowing Fingerprints Resist Variable Renaming

General 12 min

Alex Petrov • 3 days ago

Winnowing fingerprinting is the backbone of tools like MOSS that spot copied code even after students rename every variable and shuffle blocks. This deep dive unpacks the algorithm, its thresholds, and why a multimodal approach (token, AST, and web-source checking) covers the gaps that fingerprinting alone leaves open.

How Much Copied Stack Overflow Code Do Plagiarism Tools Actually Catch

General 10 min

Alex Petrov • 3 weeks ago

Traditional similarity tools like MOSS and JPlag compare student submissions against each other but leave a massive blind spot: code copied directly from Stack Overflow, GitHub repositories, and online tutorials. This article examines how web source detection works, what it catches that peer comparison misses, and why both approaches together give you the real picture of code originality.

How Code Similarity Checks Catch Open Source License Violations

General 9 min

Alex Petrov • 3 weeks ago

Code similarity analysis isn't just for catching student plagiarism. Organizations use the same techniques to identify GPL and other open source license violations in their proprietary codebases. This article walks through the algorithms, real-world cases, and practical workflows for automated license compliance auditing.

Can AST Comparison Survive Student Code Obfuscation

General 3 min

Alex Petrov • 4 weeks ago

Students often try to hide copied code by renaming variables, restructuring loops, or inserting dead code. AST-based comparison resists many of these tricks, but some deliberate obfuscation, such as flattening control flow or converting recursion to iteration, can still produce a false negative. This article examines where AST engines excel, where they fall short, and how combining structural matching with token signatures catches even the cleverest attempts.

How to Design Assignments That Resist Code Plagiarism

Academic Integrity 9 min

Alex Petrov • 1 month ago

Simple changes to assignment design—unique interfaces, randomized test harnesses, and automated similarity checks—drastically reduce code plagiarism. This guide walks through six concrete tactics with real code examples and grading workflows.

What 4,200 Python Submissions Tell Us About Code Reuse

Case Studies 7 min

Alex Petrov • 1 month ago

By aggregating similarity scores across 4,200 student Python submissions over three semesters, we uncovered distinct copy-paste behaviors tied to assignment type, submission deadline, and language features. This practical guide walks through the exact process of running a large-scale code reuse audit using Codequiry’s output and Python data analysis, then shows how to turn those numbers into actionable course design decisions.

Automated Code Similarity Checks in a CI Lab Pipeline

Tutorials 7 min

Alex Petrov • 1 month ago

Setting up automated code plagiarism and similarity checks inside a CI pipeline cuts manual grading time and catches copying that individual reviewers miss. This practical guide walks through the architecture, tooling choices, and honest tradeoffs of running MOSS, JPlag, or Codequiry’s API on every lab push.

How Automatic Grading Evolved From Scripts to Integrity Pipelines

Academic Integrity 9 min

Alex Petrov • 2 months ago

A retrospective on automatic grading in computer science education—from shell scripts comparing output strings to modern platforms combining unit tests, static analysis, and code similarity detection. What we gained, what we lost, and why integrity pipelines matter more than ever.

Building a Source Code Provenance Pipeline for Contractor Deliverables

Tutorials 10 min

Alex Petrov • 2 months ago

When contractors deliver source code, verifying originality and license compliance is critical. This guide walks through building an automated provenance pipeline that checks for code similarity, license violations, and proper attribution before accepting deliverables into your codebase.

How to Build a Source Code Similarity Pipeline for Detection

Tutorials 12 min

Alex Petrov • 2 months ago

A step-by-step guide to building a source code similarity detection pipeline from scratch. Covers tokenization, AST comparison, Winnowing fingerprinting, and heuristic scoring. Includes working Python code and configuration strategies used by universities and enterprises.

Your Open Source License Is a Social Contract, Not a Rulebook

General 6 min

Alex Petrov • 3 months ago

We treat open source licenses like a tax code to be audited, scanning for SPDX tags and copyright headers. This legalistic approach is creating compliant but ethically bankrupt software. True compliance isn't about checking boxes; it's about understanding and honoring the social intent behind the GPL, MIT, or Apache licenses. It's time to scan for the spirit, not just the letter.

Your Static Analysis Tool Is Missing the Real Code Smells

General 8 min

Alex Petrov • 3 months ago

Most static analysis tools flag trivial style issues while missing the architectural rot that cripples productivity. This guide shows you how to detect the five structural code smells that genuinely predict development slowdowns and defect clusters. We'll walk through real code, build custom detection rules, and integrate findings into your CI/CD pipeline.
