You’ve just finished glancing at the third submission for the binary search tree implementation. The first two were messy, functional, and clearly student-written. The third is… different. It’s clean, uses advanced language features not yet covered, and has a comment style you’ve never seen in your class. Your gut twinges. You’re right to listen.
Automated code similarity checkers like Codequiry, MOSS, and JPlag are essential forensic tools. But the most effective plagiarism detection pipeline starts with human intuition. Seasoned computer science professors and TAs develop a sixth sense for submissions that don’t “smell” right. Such hunches aren't proof of cheating; they're high-probability signals that warrant a formal scan.
These code smells of plagiarism are less about bugs and more about authenticity. They reveal a disconnect between the student’s demonstrated ability and the submitted artifact. Catching them early saves hours of grading frustration and upholds academic standards. Let’s break down the eight most telling signs.
1. The Architectural Anomaly
This is the single strongest indicator. A student who struggles with basic array manipulation in Week 5 suddenly submits a perfectly abstracted, factory-pattern-using, interface-driven solution in Week 6. The complexity of the design radically exceeds the problem's requirements and the course's current learning objectives.
The giveaway is the use of concepts that haven't been taught. In an introductory Java course, you might see a student using Java Streams, lambda expressions, or custom annotations. In a first-year C++ course, a submission might feature perfect move semantics, template metaprogramming, or the Curiously Recurring Template Pattern (CRTP). The code works flawlessly, but it’s architecturally alien to your syllabus.
"When a student's solution employs design patterns from Chapter 24 of a textbook you're only on Chapter 8 of, you're not looking at a prodigy. You're looking at a copy." – Dr. Elena Rodriguez, CS Department Chair, University of Washington
Example: Your assignment asks for a simple `BankAccount` class with deposit/withdraw methods. A suspicious submission looks like this:
```java
// Suspiciously over-engineered
public interface IFinancialEntity {
    void applyTransaction(Transaction t);
    BigDecimal getBalance(Currency c);
}

public abstract class AbstractAccount implements IFinancialEntity {
    // Template method pattern, dependency injection...
}

public class BankAccount extends AbstractAccount {
    // Uses a Strategy pattern for fee calculation
    private FeeCalculationStrategy feeStrategy;
    // Uses an Observer pattern for balance alerts
    private List<BalanceObserver> observers;
}
```
The code is "better" than it needs to be. That’s the red flag.
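A first pass for this smell can be scripted. The sketch below is a minimal illustration, not a real parser: it assumes a hypothetical watchlist (`UNTAUGHT_PATTERNS`) of constructs you have not yet taught and flags the lines where they appear. The pattern names and regular expressions are assumptions to adapt to your own syllabus.

```python
import re

# Hypothetical watchlist: constructs NOT yet covered in the course.
# These regexes are illustrative assumptions, not a Java grammar.
UNTAUGHT_PATTERNS = {
    "lambda expression": re.compile(r"->"),
    "stream API": re.compile(r"\.stream\(\)"),
    "custom annotation": re.compile(r"@interface\b"),
    "interface declaration": re.compile(r"\binterface\s+\w+"),
    "abstract class": re.compile(r"\babstract\s+class\b"),
}


def flag_untaught_constructs(source: str) -> dict[str, list[int]]:
    """Return {construct name: [1-based line numbers]} for each watchlist hit."""
    hits: dict[str, list[int]] = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # crude: drop trailing line comments
        for name, pattern in UNTAUGHT_PATTERNS.items():
            if pattern.search(code):
                hits.setdefault(name, []).append(lineno)
    return hits


submission = """\
public interface IFinancialEntity {
    void applyTransaction(Transaction t);
}
public abstract class AbstractAccount implements IFinancialEntity {
}
"""
print(flag_untaught_constructs(submission))
# {'interface declaration': [1], 'abstract class': [4]}
```

A hit only means "look closer": plenty of students read ahead, so treat the output as a prompt for a conversation rather than a verdict.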
2. The Stylistic Whiplash
Every programmer has a fingerprint: their style. It shows in variable naming conventions (`camelCase` vs. `snake_case`), brace placement (K&R vs. Allman), comment style and frequency, use of whitespace, and preference for certain constructs (ternary operators vs. if-else).
Plagiarized code often contains stark, abrupt stylistic shifts within a single file. The main function might use `i`, `j`, `k` for loop variables (the student's style), while a helper function uses `idx`, `counter`, `iterator` (the source's style). One block has detailed Javadoc comments; another has none, or only brief inline `//` comments.
This happens when a student copies a core algorithm or function from an external source and wraps it with their own boilerplate to make it compile. The seams are visible.
```c
// Student's own code (messy, no comments)
void main() {
int x[10];
for(int i=0;i<10;i++) x[i]=i*2;
    // COPIED BLOCK STARTS - different style, perfect formatting
    /**
     * Sorts array using optimized quicksort.
     * @param arr The array to sort
     * @param low Starting index
     * @param high Ending index
     */
    void quickSort(int arr[], int low, int high) {
        if (low < high) {
            int pi = partition(arr, low, high);
            quickSort(arr, low, pi - 1);
            quickSort(arr, pi + 1, high);
        }
    }
    // COPIED BLOCK ENDS
print("done"); // Back to student's style
}
```
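One of these fingerprints, the naming convention, is easy to measure. The sketch below assumes you have already split a submission into per-function chunks (the `functions` dictionary is hypothetical sample data) and reports the dominant convention in each. It is a rough heuristic that ignores brace style, whitespace, and comments.

```python
import re
from collections import Counter
from typing import Optional

IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")
CAMEL = re.compile(r"^[a-z]+(?:[A-Z][a-z0-9]*)+$")   # e.g. pivotIndex
SNAKE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)+$")    # e.g. loop_count


def naming_profile(code: str) -> Counter:
    """Count distinct camelCase and snake_case identifiers in a chunk of code."""
    profile = Counter()
    for name in set(IDENTIFIER.findall(code)):
        if CAMEL.match(name):
            profile["camel"] += 1
        elif SNAKE.match(name):
            profile["snake"] += 1
    return profile


def dominant_style(code: str) -> Optional[str]:
    """Return 'camel', 'snake', or None when there is a tie or no multi-word names."""
    profile = naming_profile(code)
    if profile["camel"] == profile["snake"]:
        return None
    return "camel" if profile["camel"] > profile["snake"] else "snake"


# Hypothetical per-function chunks from one submission.
functions = {
    "main": "int loop_count = 0; int max_val = read_input();",
    "quickSort": "int pivotIndex = partition(arr, low, high); quickSort(arr, low, pivotIndex - 1);",
}
styles = {name: dominant_style(body) for name, body in functions.items()}
print(styles)  # {'main': 'snake', 'quickSort': 'camel'}
```

Two functions with different dominant styles in the same file mark a seam worth reading by hand. Starter code you supplied yourself will also trip this check, so exclude it first.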
3. The Dependency Ghost
A student submits code that imports libraries, packages, or header files that you never provided or mentioned, or that are irrelevant to the assignment's core goal. This is a strong sign that the code was developed in a different environment, for a different purpose, and copied over.
In a Python data structures class, you might see `import numpy` for a simple linked list task. In a C++ graphics assignment requiring raw OpenGL, a submission might `#include` a header for a library the assignment never called for.
These "ghost dependencies" often don't break the program if the copied code doesn't actually call them, or if the student removed those calls but forgot the import. They linger as artifacts of the original source.
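For Python submissions, lingering imports can be found with the standard-library `ast` module. The sketch below is a simplified check under stated assumptions: it treats an import as a ghost if the name it binds is never referenced as a plain name anywhere in the file. It does not handle names used only inside strings or re-exported through `__all__`.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return names bound by import statements that are never referenced."""
    tree = ast.parse(source)
    imported: dict[str, int] = {}  # bound name -> line number of the import
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`
                imported[alias.asname or alias.name.split(".")[0]] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names; skip them
                    imported[alias.asname or alias.name] = node.lineno

    # Every plain name reference in the file, e.g. the `np` in `np.array(...)`
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(name for name in imported if name not in used)


submission = '''
import numpy as np
import math

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def half(n):
    return math.floor(n / 2)
'''
print(unused_imports(submission))  # ['np']
```

An unused `numpy` import in a linked list assignment proves nothing on its own, but it is a cheap signal to collect across a whole class before deciding which submissions to read closely.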
4. The Error Message Echo
This is a subtle but telling clue. A student turns in code containing commented-out debugging statements, leftover `print`/`cout`/`console.log` lines, or string literals that reference problems, filenames, or function names completely unrelated to your assignment.
These are the digital equivalent of finding a receipt from a different store in a returned item's box. The student copied code that was part of a larger, different project and didn't fully sanitize it.
```java
// Your assignment: Create a 'Circle' class.
// Suspicious comments found:
// TODO: Fix collision detection with Polygon class
// DEBUG: Texture loading fails here on 'spaceship.png'
// Error: Player score not updating in multiplayer mode
// Original file: 'GameEngine.java', line 143
class Circle {
    // ...
}
```
The context is wrong. The comments tell a story that doesn't match your homework problem.
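Stray filenames are the easiest of these echoes to catch mechanically. The sketch below assumes you know which files the student actually submitted and lists any other filename mentioned in the source. The extension list in `FILENAME` is an arbitrary assumption, and the regex will also match field accesses such as `this.h`, so expect some false positives.

```python
import re

# Assumed set of "interesting" extensions; extend it for your course.
FILENAME = re.compile(r"\b[\w-]+\.(?:java|py|c|cpp|h|hpp|js|png|jpg|txt|csv|json)\b")


def foreign_file_references(source: str, submitted_files: set[str]) -> list[str]:
    """Return filenames mentioned in the source that were not part of the submission."""
    mentioned = set(FILENAME.findall(source))
    return sorted(mentioned - submitted_files)


source = """\
// DEBUG: Texture loading fails here on 'spaceship.png'
// Original file: 'GameEngine.java', line 143
class Circle {
}
"""
print(foreign_file_references(source, {"Circle.java"}))
# ['GameEngine.java', 'spaceship.png']
```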
5. The Performance Paradox
The code solves the problem correctly but in a bizarrely inefficient or oddly specific way. It's as if the student knew the "answer" but not the "why." This often surfaces when code is copied from competitive programming sites, Stack Overflow snippets, or highly specialized tutorials.
For a problem easily solved with an O(n) linear scan, the submission implements a complex O(n log n) divide-and-conquer approach. For a task requiring a simple lookup, the student writes a full Trie or Bloom filter implementation. The solution is "over-indexed" on a specific technique, ignoring simpler, more appropriate tools taught in class.
It demonstrates knowledge of an advanced algorithm but a failure to apply the fundamental principle of choosing the right tool for the job.
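To make the contrast concrete, here is a small illustration with hypothetical function names. Both functions answer "is `target` in this unsorted list?". The first is the linear scan a course would expect; the second is a sort-then-binary-search detour, standing in for the kind of heavier technique a copied solution might bring along.

```python
from bisect import bisect_left


def contains_linear(values, target):
    """Expected approach: one O(n) pass over the data."""
    return any(v == target for v in values)


def contains_sorted_search(values, target):
    """Over-indexed approach: O(n log n) sort just to run one binary search."""
    ordered = sorted(values)
    i = bisect_left(ordered, target)
    return i < len(ordered) and ordered[i] == target


data = [42, 7, 19, 3]
print(contains_linear(data, 19), contains_sorted_search(data, 19))  # True True
```

Both are correct, which is the point: the smell is in the mismatch between the tool and the task, and grading on output alone will never surface it.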
6. The Variable Name Vestige
Variable, function, and class names that are semantically disconnected from your assignment prompt are huge red flags. If the assignment asks for a `Library` class managing `Book` objects, but the submission has a `Store` class managing `Product` objects, something is off.
This is the result of lazy find-and-replace. The student copied code solving a similar but different problem (e.g., an inventory system) and did a superficial rename. They changed `Store` to `Library` and `Product` to `Book`, but missed internal variables, method names, or comments.
```java
// Assignment: Library System
class Library {
    public void addBook(Book b) {
        warehouse.add(b);
        updateShelfCount(); // Makes sense.
        calculateSalesTax(b.getPrice()); // What sales tax? This is a library.
        logTransaction("RETAIL_SALE", b.getId()); // RETAIL_SALE?
    }

    private List<Book> warehouse; // Libraries have stacks, not warehouses.
}
```
The semantics are all wrong. The logic fits a retail store, not a library.
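A vocabulary comparison can surface these vestiges. The sketch below splits identifiers into words and reports those that appear in neither the assignment prompt nor an ignore list you supply. The helper names and the ignore list are assumptions for illustration. The approach is deliberately naive: there is no stemming, so `books` and `book` count as different words, and comments are scanned along with code.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Splits camelCase, PascalCase, and UPPER_SNAKE names into words.
WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+")


def identifier_words(text: str) -> set[str]:
    """Lowercased words that make up every identifier-like token in the text."""
    words = set()
    for ident in IDENTIFIER.findall(text):
        words.update(w.lower() for w in WORD.findall(ident))
    return words


def off_prompt_words(code: str, prompt: str, ignore: set[str]) -> list[str]:
    """Words used in the code that occur in neither the prompt nor the ignore list."""
    return sorted(identifier_words(code) - identifier_words(prompt) - ignore)


prompt = "Write a Library class that can add a Book and list every Book by title."
code = """
class Library {
    public void addBook(Book b) {
        warehouse.add(b);
        calculateSalesTax(b.getPrice());
    }
}
"""
ignore = {"public", "void", "b", "get", "calculate"}  # keywords and generic verbs
print(off_prompt_words(code, prompt, ignore))
# ['price', 'sales', 'tax', 'warehouse']
```

Here the leftover words all point toward retail rather than a library, which is the cue to read the submission more closely.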
7. The Defensive Over-Documentation
The smell here is a sudden, dramatic increase in the quantity and quality of comments, especially from a student who previously submitted sparse or poor documentation. The comments are often excessively formal, use technical jargon beyond the course level, or explain trivial code while glossing over complex logic.
This can be a sign of copying from well-documented open-source projects or tutorials. The student leaves the insightful comments from the original author in place, creating a jarring contrast with their own coding voice. Sometimes, it's a deliberate but clumsy attempt to "make the code look more their own" by adding superfluous explanations.
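The "sudden increase" can be quantified if you keep a student's earlier submissions. The sketch below is a crude measure: it counts the fraction of non-blank lines that start with a comment marker and compares the current submission to the student's own baseline. The default markers assume Java-style comments, and what size of jump deserves attention is left to your judgment.

```python
def comment_density(source: str, markers: tuple = ("//", "/*", "*")) -> float:
    """Fraction of non-blank lines that begin with a comment marker."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    commented = sum(1 for line in lines if line.startswith(markers))
    return commented / len(lines)


def density_jump(previous: list, current: str) -> float:
    """Current comment density minus the student's average on earlier work."""
    baseline = sum(comment_density(s) for s in previous) / len(previous)
    return comment_density(current) - baseline


earlier = ["int a = 1;\nint b = 2;\nint c = 3;\nint d = 4;\n"]
latest = "/**\n * Adds two numbers.\n */\nint add(int a, int b) {\n"
print(density_jump(earlier, latest))  # 0.75
```

Because lines starting with `*` are counted as Javadoc continuations, a C file that begins lines with pointer dereferences will inflate the number. Pass a different `markers` tuple for other languages.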
8. The Collaborative Contradiction
This is a behavioral smell, not a code smell. During office hours or lab sessions, a student cannot explain their own submitted code. They falter on basic questions about how their functions work, why they chose a particular algorithm, or what a specific line does. Their verbal explanation contradicts the implementation.
They may have memorized high-level talking points about the algorithm but cannot trace execution. Ask them to modify their code slightly—add a new feature, handle a new edge case—and they are completely lost, often needing to "start over." This disconnect between submission and comprehension is one of the most reliable human indicators.
What To Do When You Smell Something
Your intuition is a powerful filter, but it's not evidence. When one or more of these smells trigger your suspicion, your next step is systematic.
- Don't Accuse. Treat your suspicion as a hypothesis and test it with an open question: "I noticed some interesting design choices in your code. Can you walk me through your thought process on this function?"
- Run the Scanner. This is where tools like Codequiry move from being a batch processor to a targeted forensic instrument. Take the suspicious submission and run it against the rest of the class, against previous semesters (if you have archives), and against known online repositories. The stylistic and structural anomalies you spotted will often manifest as high similarity scores against an external source.
- Check the Metadata. File creation/modification times, author tags in comments, and IDE project files accidentally included in a ZIP can provide corroborating evidence.
- Follow Institutional Policy. Document your observations, the scan results, and your conversation with the student. Present the facts to your department chair or honor council.
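The metadata check in the list above can be partly scripted for ZIP submissions using Python's standard `zipfile` module. The sketch below lists any bundled IDE project files and the last-modified timestamp recorded for each entry. The `IDE_ARTIFACTS` markers are an assumed, incomplete list, and ZIP timestamps are set by the machine that created the archive, so treat them as corroboration only.

```python
import io
import zipfile
from datetime import datetime

# Assumed markers for IDE project files; extend for the tools your students use.
IDE_ARTIFACTS = (".idea/", ".vscode/", ".project", ".classpath", ".iml")


def inspect_archive(path_or_file):
    """Return (IDE artifact names, {entry name: last-modified datetime}) for a ZIP."""
    artifacts, modified = [], {}
    with zipfile.ZipFile(path_or_file) as archive:
        for info in archive.infolist():
            if any(marker in info.filename for marker in IDE_ARTIFACTS):
                artifacts.append(info.filename)
            # ZipInfo.date_time is a (year, month, day, hour, minute, second) tuple
            modified[info.filename] = datetime(*info.date_time)
    return artifacts, modified


# Build a small in-memory archive to stand in for a student upload.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("src/Circle.java", "class Circle {}")
    archive.writestr(".idea/workspace.xml", "<project/>")
buffer.seek(0)

artifacts, modified = inspect_archive(buffer)
print(artifacts)  # ['.idea/workspace.xml']
```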
Automated detection is crucial for scale and proof. But the human eye—trained to spot the dissonance between a student's journey and their destination—remains the first and most critical line of defense in maintaining true academic integrity. Your gut is part of your toolkit. Learn to trust it.