Emily Watson

Academic Integrity Specialist at Codequiry

Emily works with CS departments and teaching teams on assignment design and fair, defensible plagiarism workflows.

Articles by Emily Watson

Scanning a Codebase for GPL Violations Before Acquisition

General 11 min

Emily Watson • 3 days ago

When ZephyrCloud faced a pre-acquisition license audit, its engineering team turned to automated code similarity scanning after manual searches proved unreliable. The process uncovered several GPL-licensed snippets copied from Stack Overflow and GitHub, forcing a careful remediation effort that saved the deal. Here’s what they learned about using plagiarism detection for open-source compliance.

Teaching Code Attribution Before Students Write a Single Line

Academic Integrity 11 min

Emily Watson • 3 weeks ago

Too many CS students treat code from Stack Overflow, GitHub, or AI tools as free for the taking. Teaching attribution as a core skill from the first assignment reduces plagiarism and builds professional habits. This article walks through concrete strategies, assignment patterns, and detection workflows that make attribution part of the learning process.

What 1200 Python CS1 Submissions Reveal About AI-Written Code Signatures

Case Studies 9 min

Emily Watson • 3 weeks ago

We analyzed 1200 introductory Python submissions from three semesters, applying perplexity, burstiness, and token-frequency analysis to separate human-written code from AI-generated samples. The results reveal a consistent set of statistical signatures that can catch GPT-generated and Copilot-assisted assignments—with measured false-positive rates at each threshold.

Automating Code Plagiarism Detection in Your Grading Workflow

Tutorials 8 min

Emily Watson • 1 month ago

A practical walkthrough for CS instructors who want to wire code similarity checks directly into their grading workflow. Covers tooling choices, LMS integration, and how to layer in web-source and AI-generated code detection for a complete academic integrity pipeline.

K-gram Fingerprinting for Source Code Similarity Analysis

General 9 min

Emily Watson • 1 month ago

K-gram fingerprinting is the backbone of modern code plagiarism detection. This step-by-step guide walks through tokenization, k-gram generation, hashing, winnowing, and comparison — the exact pipeline used by MOSS and Codequiry. Includes Python code examples, algorithmic tradeoffs, and real-world scaling numbers.

How Abstract Syntax Tree Comparison Detects Restructured Code

General 1 min

Emily Watson • 1 month ago

Abstract syntax tree (AST) comparison is a powerful technique for detecting plagiarized code that has been disguised through variable renaming, method reordering, and whitespace changes. This article explains how AST comparison works, its strengths and limitations, and when to combine it with token-based methods for best results.

How One Bootcamp Built a Code Originality Pipeline

Case Studies 9 min

Emily Watson • 1 month ago

When CareerDevs Academy scaled from 30 to 200 students per cohort, their manual code review process couldn't keep up with plagiarism and improper code reuse. Here's how they built a tiered originality pipeline combining static analysis, similarity detection, and educational intervention — and what other programs can learn from their approach.

How Static Analysis Catches Plagiarized Code Before It Ships

General 11 min

Emily Watson • 2 months ago

Plagiarism isn't just a classroom problem. When code from Stack Overflow, GitHub repos, or contractor deliverables enters your production codebase without proper attribution, you risk license violations, IP disputes, and technical debt. This guide shows how static analysis tools detect copied code before it ships, using token matching, AST comparison, and dependency scanning.

A Checklist for Evaluating AI Code Detection Tools

AI Detection 9 min

Emily Watson • 2 months ago

Not all AI detection tools are created equal, and a single "accuracy" number is dangerously misleading. This article provides a practical, seven-point checklist for evaluating AI-generated code detectors, covering everything from cross-language support and prompt sensitivity to campus-specific deployment constraints.

Your Codebase Is Full of Stolen Web Snippets

General 8 min

Emily Watson • 3 months ago

A developer copies a slick animation from CodePen. Another integrates a jQuery plugin from a blog. These everyday acts are quietly filling your codebase with unlicensed, potentially toxic code. This guide shows you how to find it, assess the risk, and clean it up before it triggers a legal notice.

The Assignment That Taught Students How to Cheat

Academic Integrity 6 min

Emily Watson • 3 months ago

A well-intentioned "cheat-proof" programming project at a top-tier university inadvertently became a masterclass in sophisticated plagiarism. The fallout revealed a critical gap in how we teach and assess code integrity, forcing a department-wide reckoning on what originality really means in software.

AI Detection Is a Distraction From Real Code Integrity

Academic Integrity 5 min

Emily Watson • 4 months ago

The industry's panic over ChatGPT is a shiny object distracting us from the foundational rot in how we assess code quality and originality. We're chasing ghosts while ignoring the rampant, mundane plagiarism and technical debt that's been crippling software projects and student learning for decades. True integrity requires looking beyond the AI hype.
