You have a CI pipeline. It runs tests, builds artifacts, and deploys code. But does it scan your code for quality issues, security vulnerabilities, and integrity violations before shipping to production?
Most teams I've worked with have some form of code review. They rely on pull requests and manual eyeballing. That works for catching logic errors and design disagreements. It fails at scale for detecting copy-pasted code, leaked credentials, known vulnerable dependencies, or subtle security anti-patterns that look correct to a human reviewer.
This article presents a practical, five-stage checklist for building a code scanning pipeline. Each stage is optional but additive. Start where you are, add one stage at a time, and iterate. The goal is not perfection — it's progressive improvement.
"A scanning pipeline is not a replacement for code review. It's a force multiplier that lets reviewers focus on architecture and design, not on catching typos or known CVEs."
Stage 1: Static Application Security Testing (SAST)
SAST tools analyze source code without executing it. They look for security vulnerabilities, coding standard violations, and common bugs. Think SQL injection patterns, buffer overflows, cross-site scripting vectors, and unsafe deserialization calls.
For a Python/JavaScript stack, I recommend starting with Semgrep or Bandit (Python-specific). For Java or C#, consider SpotBugs or SonarQube. These tools are free, actively maintained, and integrate well with GitHub Actions, GitLab CI, and Jenkins.
Configuration Example: Semgrep in GitHub Actions
```yaml
# .github/workflows/semgrep.yml
name: Semgrep SAST Scan
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Semgrep
        uses: semgrep/semgrep-action@v1
        with:
          config: >-
            p/python
            p/owasp-top-ten
            p/secrets
            p/command-injection
          auditOn: push
          publishToken: ${{ secrets.SEMGREP_APP_TOKEN }}
```
This runs four rule packs: p/python for Python-specific patterns, p/owasp-top-ten for OWASP Top 10 vulnerabilities, p/secrets for hardcoded credentials, and p/command-injection for shell and subprocess injection patterns. You can customize further by writing your own Semgrep rules.
Custom Rule Example
Suppose you want to flag any use of eval() in Python code. Semgrep lets you write a custom rule:
```yaml
rules:
  - id: no-eval
    pattern: eval(...)
    message: "Using eval() is dangerous. Consider ast.literal_eval or a safer alternative."
    severity: WARNING
    languages: [python]
```
Store this as a YAML file in your repo under .semgrep/rules/no-eval.yaml and reference it in your workflow config.
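To see why the rule's message points at ast.literal_eval, compare the two in the standard library: literal_eval parses Python literals only, so it cannot execute arbitrary expressions the way eval() can.

```python
import ast

user_input = "[1, 2, 3]"

# ast.literal_eval accepts only literals (lists, dicts, numbers, strings...),
# so it parses data without running code.
safe = ast.literal_eval(user_input)
print(safe)  # [1, 2, 3]

# A malicious payload is rejected with ValueError instead of executing:
try:
    ast.literal_eval("__import__('os').system('echo pwned')")
except ValueError:
    print("rejected")
```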
What to Gate On
- Fail the build on any ERROR-severity finding in target branches (main, release).
- Warn on WARNING-severity findings and flag them for manual review.
- Ignore INFO-level linting suggestions in your scanning gate, but track them in a backlog.
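The gating policy above can be sketched as a small script over Semgrep's JSON output (produced with `semgrep --json`). The flat `{"results": [{"severity": ...}]}` shape here is an assumption kept consistent with the quality-gate script in Stage 5; adjust the field paths to match your Semgrep version's actual report.

```python
import json
import sys

def gate(report: dict) -> int:
    """Return 1 (fail) on any ERROR finding; print WARNINGs for manual review."""
    errors = [r for r in report["results"] if r["severity"] == "ERROR"]
    warnings = [r for r in report["results"] if r["severity"] == "WARNING"]
    for w in warnings:
        print(f"WARN (review manually): {w.get('check_id', 'unknown')}")
    if errors:
        print(f"FAIL: {len(errors)} ERROR-severity findings")
        return 1
    return 0

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        sys.exit(gate(json.load(f)))
```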
No team I've met can fix every finding on day one. Start by failing only on the most critical patterns, then expand scope as your team's maturity grows.
Stage 2: Dependency and Supply Chain Scanning
Modern applications pull in hundreds of transitive dependencies. A vulnerability in a nested package — think log4j, event-stream, or a compromised npm package — can compromise your entire application. Dependency scanning tools inventory your dependencies and cross-reference them against vulnerability databases.
The gold standard here is OWASP Dependency-Check for Java/.NET, or Snyk CLI for multi-language support. GitHub also has its own Dependabot for dependency alerts and auto-fixes.
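If you're on GitHub, Dependabot needs only a small config file checked into the repo. A minimal sketch for a Python project (swap `package-ecosystem` to `npm`, `maven`, etc. for your stack):

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "weekly"
    open-pull-requests-limit: 5
```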
Configuration Example: OWASP Dependency-Check in a Maven Project
```xml
<!-- pom.xml plugin configuration -->
<plugin>
  <groupId>org.owasp</groupId>
  <artifactId>dependency-check-maven</artifactId>
  <version>9.0.0</version>
  <configuration>
    <!-- Fail the build on any vulnerability with CVSS >= 7.0 -->
    <failBuildOnCVSS>7.0</failBuildOnCVSS>
    <suppressionFiles>
      <suppressionFile>dependency-check-suppressions.xml</suppressionFile>
    </suppressionFiles>
  </configuration>
  <executions>
    <execution>
      <goals><goal>check</goal></goals>
    </execution>
  </executions>
</plugin>
```
This configuration does two things: it fails the build if any dependency has a vulnerability with a CVSS score of 7.0 or higher, and it reads a suppression file for known false positives you've already triaged.
Creating a Suppression File
False positives are inevitable. The National Vulnerability Database (NVD) sometimes flags a library version as vulnerable when your actual usage doesn't expose the vulnerable path. Create a suppression file to track these:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<suppressions xmlns="https://jeremylong.github.io/DependencyCheck/dependency-suppression.1.3.xsd">
  <suppress>
    <notes>Suppress CVE-2024-12345 in H2 database — we only use H2 in memory for testing</notes>
    <cve>CVE-2024-12345</cve>
  </suppress>
</suppressions>
```
Commit this file to your repository and keep it under version control. Every suppression needs a comment explaining why it's safe to ignore. This prevents "I'll fix it later" from becoming "I forgot about it forever."
What to Monitor Beyond CVEs
- License compatibility — Does a dependency's license (e.g., GPL-3.0) conflict with your project's license (e.g., MIT)?
- Deprecated or orphaned packages — Packages that haven't been updated in two years or that have been taken over by a different maintainer.
- Malicious package indicators — Tools like Socket flag typosquatting, protestware, and telemetry-collecting packages.
Stage 3: Secrets and Credential Scanning
Hardcoded API keys, database passwords, and SSH private keys end up in repositories more often than teams want to admit. Automated scans of public GitHub repositories routinely surface tens of thousands of newly exposed secrets in a single month. Scanning for secrets post-commit is reactive; scanning pre-commit is proactive.
Setup with GitLeaks and Pre-commit Hooks
Install pre-commit and add GitLeaks:
```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks
        args: ['--verbose', '--redact']
```
Install the hook: pre-commit install. Now every commit scans for secrets before it reaches the remote repository. GitLeaks uses a configurable regex-based detector and also scans for high-entropy strings that look like tokens.
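The high-entropy detection can be made concrete with Shannon entropy: a random-looking token spreads probability across many characters and scores high, while ordinary prose scores lower. A minimal sketch (real detectors like GitLeaks combine entropy with regexes and length filters, which this toy version skips):

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character in the string."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

token = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"  # AWS docs' example secret key
prose = "update the readme file please"
print(round(shannon_entropy(token), 2))  # noticeably higher than the prose
print(round(shannon_entropy(prose), 2))
```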
GitHub Actions Workflow for PR Scanning
```yaml
# .github/workflows/gitleaks.yml
name: Secret Scanning with GitLeaks
on: [pull_request]
jobs:
  scan:
    name: gitleaks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_LICENSE: ${{ secrets.GITLEAKS_LICENSE }}
```
This workflow runs on every PR and scans the diff between the PR branch and its base. It fails the check if any new secrets are introduced. Combine this with a .gitleaks.toml file in your repository root to whitelist known test keys or false-positive patterns.
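A minimal allowlist sketch for that file (the path and regex here are illustrative; check the GitLeaks documentation for the full configuration schema):

```toml
# .gitleaks.toml
[extend]
# Inherit GitLeaks' default rules, then layer allowlist entries on top.
useDefault = true

[allowlist]
description = "Known test fixtures and fake keys"
paths = ['''tests/fixtures/.*''']
regexes = ['''EXAMPLEKEY[A-Z0-9]*''']
```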
Handling a Leaked Secret
If a secret is detected post-commit:
- Rotate the credential immediately — revoke the old key, generate a new one, and update all services that use it.
- Remove the secret from git history — use git filter-repo or BFG Repo-Cleaner to scrub the commit that contained the secret.
- Notify the security team — document the incident and review why the secret made it past pre-commit hooks.
"We once found an AWS access key in a GitHub repository that had been there for three years. The key was for a production account. Rotating it took 45 minutes of coordinated work across six teams. Prevention is cheaper than incident response."
Stage 4: Code Similarity and Originality Checks
This stage is especially relevant for teams that work with external contractors, open-source contributions, or in regulated industries. Code similarity detection flags code that was copied from internal repos, open-source projects, or — in the case of academic settings — between student submissions. For enterprise, it also detects potential intellectual property violations.
Tools like Codequiry's code plagiarism detection platform compare submissions against a reference corpus of code from the web, common open-source repositories, and your organization's internal codebase. They use techniques similar to MOSS (Measure of Software Similarity) — tokenization, winnowing, and AST fingerprinting — but add the ability to match against a larger corpus and across languages.
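To make "tokenization and winnowing" concrete, here is a toy sketch of MOSS-style fingerprinting: hash every k-gram of the text, then keep only the minimum hash in each sliding window, so two files sharing a long-enough passage share fingerprints no matter where the passage sits. Real tools also normalize identifiers and whitespace first; this sketch skips that, and `hash()` is only stable within a single Python process.

```python
def winnow(text: str, k: int = 5, window: int = 4) -> set[int]:
    """Winnowed fingerprints: the min hash of each window of k-gram hashes."""
    hashes = [hash(text[i:i + k]) for i in range(len(text) - k + 1)]
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two fingerprint sets."""
    fa, fb = winnow(a), winnow(b)
    return len(fa & fb) / len(fa | fb)
```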
When to Run Similarity Checks
- For every pull request from external contributors to detect copied code from other open-source projects that may have incompatible licenses.
- On contractor deliverables to verify the work is original and not copy-pasted from Stack Overflow or a previous client project.
- In academic settings — every assignment submission before grading, as part of an automated grading pipeline.
Integration into a Pipeline
Most similarity detection tools expose an API. Here's a conceptual integration pattern:
```yaml
# pseudo-code for a CI job that checks code similarity
steps:
  - name: Checkout code
    run: git checkout ${{ github.event.pull_request.head.sha }}
  - name: Submit for similarity analysis
    run: |
      curl -X POST https://api.codequiry.com/v1/submissions \
        -H "Authorization: Bearer ${{ secrets.CODEQUIRY_API_KEY }}" \
        -F "file=@src/main.py" \
        -F "reference_corpus=internal_2024" \
        -F "threshold=0.7"
  - name: Check results
    run: |
      if [ "$(curl -s https://api.codequiry.com/v1/results/$SUBMISSION_ID | jq '.has_match')" = "true" ]; then
        echo "Code similarity detected above threshold. Manual review required."
        exit 1
      fi
```
Set your threshold based on your organization's tolerance. A 0.7 similarity score (70% match) is a good starting point for investigation. False positives happen — common boilerplate code, standard design patterns, and auto-generated code can trigger matches. Build a triage process, not a hard block.
Stage 5: Custom Policy Enforcement and Quality Gates
The final stage ties everything together. You have SAST results, dependency vulnerability data, secret scan outputs, and similarity reports. Now you need to enforce your organization's policies consistently.
Building a Quality Gate Script
Write a simple shell script that aggregates results from each scanning stage and makes a go/no-go decision:
```bash
#!/bin/bash
# quality-gate.sh — exit 0 if pass, exit 1 if fail
SAST_PASS=true
DEP_PASS=true
SECRET_PASS=true
SIMILARITY_PASS=true

# Stage 1: SAST
if [ -f "sast-report.json" ]; then
  CRITICAL=$(jq '.results | map(select(.severity == "ERROR")) | length' sast-report.json)
  if [ "$CRITICAL" -gt 0 ]; then
    echo "FAIL: $CRITICAL critical SAST findings"
    SAST_PASS=false
  fi
fi

# Stage 2: Dependency scan
if [ -f "dependency-check-report.json" ]; then
  # The '?' guards against dependencies that have no vulnerabilities array
  HIGH_CVES=$(jq '[.dependencies[]?.vulnerabilities[]? | select(.cvssv3.baseScore >= 7.0)] | length' dependency-check-report.json)
  if [ "$HIGH_CVES" -gt 0 ]; then
    echo "FAIL: $HIGH_CVES high-severity dependency vulnerabilities"
    DEP_PASS=false
  fi
fi

# Stage 3: Secrets
if [ -f "gitleaks-report.json" ]; then
  SECRETS=$(jq '. | length' gitleaks-report.json)
  if [ "$SECRETS" -gt 0 ]; then
    echo "FAIL: $SECRETS secrets detected"
    SECRET_PASS=false
  fi
fi

# Stage 4: Similarity
if [ -f "similarity-report.json" ]; then
  MATCHES=$(jq '.matches | length' similarity-report.json)
  if [ "$MATCHES" -gt 0 ]; then
    echo "FAIL: $MATCHES code similarity matches above threshold"
    SIMILARITY_PASS=false
  fi
fi

# Overall decision
if [ "$SAST_PASS" = true ] && [ "$DEP_PASS" = true ] && [ "$SECRET_PASS" = true ] && [ "$SIMILARITY_PASS" = true ]; then
  echo "PASS: All quality gates cleared"
  exit 0
else
  echo "FAIL: One or more quality gates not passed"
  exit 1
fi
```
This script is deliberately simple. In practice, you'll want to parallelize the scanning stages, handle timeouts, and produce a unified summary artifact for the developer. But this pattern gives you a starting point.
Policy Decisions to Make
| Policy | Recommended Default | Rationale |
|---|---|---|
| Block merge on critical SAST findings | Yes | SQL injection or command injection patterns should never reach production |
| Block merge on CVSS ≥ 7.0 in new deps | Yes | Start with high severity; expand to medium as your team matures |
| Block merge on secrets in diff | Always | Secrets in code are the most common initial access vector |
| Block merge on code similarity > 80% | Warn, not block | Boilerplate and standard patterns cause false positives; require manual review |
| Block merge on license compliance failure | Yes | GPL code in a proprietary product is a legal risk |
Operationalizing the Pipeline
You can add all five stages in a day and announce a new "security pipeline" to your team. That's a recipe for frustration and shadow workarounds. Instead, follow this rollout approach:
- Start with stage 1 (SAST) on one target branch. Run it silently — collect results, but don't gate the build. Review findings for a sprint to tune rules and eliminate noise.
- Add stage 2 (dependency scanning) once stage 1 is stable. Again, run silently first. Create your suppression file based on real findings.
- Introduce stage 3 (secrets) as a pre-commit hook before running it in CI. This gives developers immediate feedback and builds buy-in.
- Add stage 4 (similarity) if your organization deals with external code or compliance requirements. Most teams won't need this immediately.
- Implement stage 5 (quality gate) last, after all other stages have run silently for at least one development cycle. Use it as a signal, not a decree.
Document every rule, every suppression, and every workaround. Your pipeline should be transparent — any developer should be able to look at the configuration and understand why a build failed.
Measuring Success
Six months after implementing this pipeline, you should be able to answer:
- How many CVEs were caught before reaching production?
- How many secrets were prevented from being committed?
- What is the median time to triage a scanning finding?
- What is the false-positive rate for each stage?
These metrics tell you where to invest next. If your SAST tool has a 60% false-positive rate, invest in rule tuning or a different tool. If dependency scanning catches dozens of issues but takes 20 minutes per build, optimize or cache the scan results.
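These metrics are straightforward to compute from a log of triaged findings. A sketch, assuming you record each finding's triage time and whether it turned out to be a true positive (the record fields and stage names here are illustrative):

```python
from statistics import median

# Hypothetical triage log: one record per scanning finding.
findings = [
    {"stage": "sast", "triage_minutes": 12, "true_positive": True},
    {"stage": "sast", "triage_minutes": 3,  "true_positive": False},
    {"stage": "deps", "triage_minutes": 25, "true_positive": True},
    {"stage": "sast", "triage_minutes": 8,  "true_positive": False},
]

def false_positive_rate(records: list[dict], stage: str) -> float:
    """Fraction of a stage's findings that were triaged as false positives."""
    subset = [r for r in records if r["stage"] == stage]
    return sum(not r["true_positive"] for r in subset) / len(subset)

def median_triage(records: list[dict]) -> float:
    """Median minutes spent triaging a finding, across all stages."""
    return median(r["triage_minutes"] for r in records)

print(false_positive_rate(findings, "sast"))  # 2 of the 3 SAST findings were FPs
print(median_triage(findings))
```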
Code scanning is not a one-time setup. It's an ongoing discipline that evolves with your codebase, your team's experience, and the threat landscape. Start small, iterate, and let the data guide you.