Your Open Source Dependencies Are a Legal Minefield

You shipped the feature. The deployment was smooth. Then, six months later, legal forwards you an email. It's from the Software Freedom Law Center. Your company is violating the GNU AGPL by running a modified library in your proprietary SaaS product without offering its users the source. You have 30 days to comply. Suddenly, your elegant microservice architecture is a multi-million-dollar liability.

This isn't a scare story. It happens. Companies like VMware, Cisco, and dozens of smaller firms have faced lawsuits and enforced compliance actions. Open source is the bedrock of modern software, but treating it as "free" in the legal sense is professional negligence.

"The most expensive software you'll ever use is the open source you didn't track." – An engineering director who spent 9 months on remediation.

The Licenses You Can't Afford to Ignore

Not all licenses are created equal. They fall into three rough categories:

  • Permissive (MIT, Apache 2.0, BSD): The "good neighbors." Use the code, but keep the copyright and license notices. Minimal obligations.
  • Weak Copyleft (LGPL, MPL): The "share-alike" friends. You can link to their library from proprietary code, but if you modify the library itself and distribute it, you must share those modifications.
  • Strong Copyleft (GPL, AGPL): The "viral" licenses. If you distribute software that incorporates GPL code, your entire application may be a "derivative work," obliging you to release its source code under the GPL. AGPL is even stricter: letting users interact with the software over a network is enough to trigger it.

Mixing a GPL component into a proprietary codebase is like introducing a non-native species into an ecosystem. It takes over.
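
To make the three categories concrete, here is a minimal Python sketch that buckets SPDX license identifiers. The sets and prefixes are illustrative assumptions, not a complete or legally reviewed mapping, and the function name classify is hypothetical.

```python
# Illustrative buckets only: an assumption for this sketch, not legal advice.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"}
WEAK_COPYLEFT_PREFIXES = ("LGPL-", "MPL-")
STRONG_COPYLEFT_PREFIXES = ("GPL-", "AGPL-")


def classify(spdx_id: str) -> str:
    """Return the rough category for an SPDX license identifier."""
    if spdx_id in PERMISSIVE:
        return "permissive"
    if spdx_id.startswith(WEAK_COPYLEFT_PREFIXES):
        return "weak-copyleft"
    if spdx_id.startswith(STRONG_COPYLEFT_PREFIXES):
        return "strong-copyleft"
    # Anything unrecognized should go to a human, not default to "fine".
    return "unknown"
```

Returning "unknown" rather than guessing is deliberate: an unrecognized license is a review item, not a pass.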

Step 1: Generate Your Software Bill of Materials (SBOM)

You can't manage what you can't see. Your first task is to automatically inventory every dependency, transitive or direct.

# Example using Syft to generate an SBOM for a Node.js app
syft dir:./my-app -o cyclonedx-json > sbom.json

# The output (abridged here) reveals the tree of dependencies you never knew you had.
# RED FLAG: the second component below is GPL-3.0-only.
{
  "bomFormat": "CycloneDX",
  "components": [
    {
      "type": "library",
      "name": "left-pad",
      "version": "1.3.0",
      "licenses": [ { "license": { "id": "MIT" } } ]
    },
    {
      "type": "library",
      "name": "some-crypto-lib",
      "version": "2.1.0",
      "licenses": [ { "license": { "id": "GPL-3.0-only" } } ]
    }
  ]
}

Tools for the job: Use syft, tern (for containers), or language-specific scanners like license-maven-plugin for Java or license-checker for Node.js. Integrate this into your CI/CD pipeline on every build.
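
Once you have an SBOM, the license data is easy to pull out programmatically. The sketch below assumes a CycloneDX JSON document shaped like the abridged example above; the function name licenses_by_component and the SAMPLE_SBOM data are hypothetical.

```python
def licenses_by_component(sbom: dict) -> dict[str, list[str]]:
    """Map "name@version" to the license strings recorded in a CycloneDX SBOM."""
    result: dict[str, list[str]] = {}
    for component in sbom.get("components", []):
        key = f'{component["name"]}@{component.get("version", "?")}'
        found = []
        for entry in component.get("licenses", []):
            license_obj = entry.get("license", {})
            # An entry may carry an SPDX "id", a free-form "name",
            # or an SPDX "expression" instead of a license object.
            value = (
                license_obj.get("id")
                or license_obj.get("name")
                or entry.get("expression")
            )
            if value:
                found.append(value)
        result[key] = found
    return result


# Hypothetical data mirroring the abridged SBOM above.
SAMPLE_SBOM = {
    "bomFormat": "CycloneDX",
    "components": [
        {"type": "library", "name": "left-pad", "version": "1.3.0",
         "licenses": [{"license": {"id": "MIT"}}]},
        {"type": "library", "name": "some-crypto-lib", "version": "2.1.0",
         "licenses": [{"license": {"id": "GPL-3.0-only"}}]},
    ],
}
```

In practice you would load sbom.json with json.load and pass the result in. A component that maps to an empty list has no recorded license, which is itself worth flagging.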

Step 2: Scan and Classify Licenses

Finding the packages is half the battle. You must accurately identify their licenses. This is harder than it sounds.

  • Metadata is often wrong. A package's package.json or pom.xml might list "MIT," but its source files contain GPL headers.
  • Multiple licenses apply. A component may be dual-licensed (OR) or multi-licensed (AND), changing your obligations.
  • Transitive dependencies are the killers. Your clean MIT-licensed library might depend on a GPL'd helper. You're now in the chain.
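
The OR/AND distinction matters when you check a license expression against an allow list. The sketch below is a simplified assumption: it handles only flat SPDX-style expressions, with no parentheses and no WITH exceptions, and the function name expression_allowed is hypothetical.

```python
def expression_allowed(expression: str, allowed: set[str]) -> bool:
    """Check a flat license expression against a set of allowed SPDX IDs.

    OR means you may pick any one alternative; AND means every term applies.
    AND binds tighter than OR, so split on OR first.
    """
    for alternative in expression.split(" OR "):
        terms = [term.strip() for term in alternative.split(" AND ")]
        if all(term in allowed for term in terms):
            return True
    return False
```

For example, "MIT OR GPL-3.0-only" passes an allow list containing only MIT because you can choose the MIT terms, while "MIT AND GPL-3.0-only" does not because both licenses apply at once.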

You need a scanner that looks at source files, not just metadata. Tools like FOSSA, Snyk Open Source, Black Duck, and ScanCode Toolkit do this deep inspection.
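
As a rough illustration of looking past metadata, the sketch below compares a package's declared license with the SPDX-License-Identifier tags found in its source files. This is a simplified assumption: the scanners named above also match full license header text, which this does not attempt. The function names are hypothetical.

```python
import re

# Matches tags such as "// SPDX-License-Identifier: MIT" and captures one ID.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([\w.+-]+)")


def detected_licenses(source_text: str) -> set[str]:
    """Collect every SPDX ID tagged in one source file's text."""
    return set(SPDX_TAG.findall(source_text))


def undeclared_licenses(declared: str, file_texts: list[str]) -> set[str]:
    """Return IDs found in source files that differ from the declared license."""
    found: set[str] = set()
    for text in file_texts:
        found |= detected_licenses(text)
    return found - {declared}
```

A non-empty result means the metadata and the source disagree, which is the cue for a human to look at the files involved.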

Step 3: Establish a Compliance Policy & Process

Scanning gives you data. Policy gives you control.

  1. Define a License Allow/Deny List. Many enterprises ban AGPL and restrict GPL use to internal tooling only. MIT and Apache 2.0 are typically green-lit.
  2. Mandate Pre-Approval for Exceptions. If a developer must use a GPL library, require a ticket reviewed by engineering and legal. Document the rationale and the containment plan.
  3. Automate Enforcement in CI/CD. Break the build on policy violations.
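    # Example GitLab CI job (assumes FOSSA_API_KEY is set as a CI/CD variable)
    license_scan:
      stage: test
      script:
        # Analyze the project and upload the results to FOSSA
        - fossa analyze
        # Exits non-zero, failing the job, if the scan has policy issues
        - fossa test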
  4. Maintain Attribution Notices. Even permissive licenses require you to reproduce their copyright notices and license text. Automate this into a NOTICES.txt file that ships with your product.
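
If you are not using a commercial scanner, the allow/deny list and the exception process can be approximated with a small script. This is a minimal sketch under stated assumptions: the policy values, the exception format, and the function names are all hypothetical.

```python
# Hypothetical policy: adjust with your legal team.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}
DENIED_PREFIXES = ("GPL-", "AGPL-")


def evaluate(components: dict[str, list[str]],
             exceptions: dict[str, str]) -> list[str]:
    """Return one violation message per component license that breaks policy.

    `components` maps "name@version" to license IDs.
    `exceptions` maps "name@version" to the approving ticket reference.
    """
    violations = []
    for component, licenses in sorted(components.items()):
        if component in exceptions:
            continue  # Pre-approved by engineering and legal.
        for license_id in licenses:
            if license_id.startswith(DENIED_PREFIXES):
                violations.append(f"{component}: {license_id} is denied")
            elif license_id not in ALLOWED:
                violations.append(f"{component}: {license_id} needs review")
    return violations


def exit_code(violations: list[str]) -> int:
    """Non-zero exit code breaks the CI job."""
    return 1 if violations else 0
```

In a CI step you would print the violations and call sys.exit(exit_code(violations)). Keeping exceptions keyed by exact version means an upgrade forces a fresh review.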

Step 4: Don't Forget Your Own Code

Compliance isn't just about what you consume. It's also about what you publish and what you copy.

Developers often copy snippets from Stack Overflow or GitHub gists into the codebase. That snippet might be from a GPL-licensed project. Tools like Codequiry aren't just for academia; their code similarity scanning can be configured to check new commits against known open source repositories, flagging unattributed code ingestion that could introduce a license contaminant. This is a critical layer for code provenance.

Internal policy: All copied code must have a URL comment and a license check. No exceptions.

// BAD: Just pasted in.
function quicksort(arr) { ... }

// GOOD: Provenance documented.
// Source: https://github.com/someuser/algorithms/blob/main/quicksort.js
// License: MIT (confirmed in repository LICENSE file)
function quicksort(arr) { ... }
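
A convention like this can be partly checked by machine. The sketch below assumes the exact "// Source:" and "// License:" comment format shown above and flags any source comment that is not immediately followed by a license comment. It cannot detect code pasted with no comment at all, and the function name is hypothetical.

```python
import re

SOURCE_COMMENT = re.compile(r"^\s*//\s*Source:\s*\S+")
LICENSE_COMMENT = re.compile(r"^\s*//\s*License:\s*\S+")


def missing_license_lines(code: str) -> list[int]:
    """Return 1-based line numbers of Source comments lacking a License comment."""
    lines = code.splitlines()
    missing = []
    for index, line in enumerate(lines):
        if SOURCE_COMMENT.match(line):
            following = lines[index + 1] if index + 1 < len(lines) else ""
            if not LICENSE_COMMENT.match(following):
                missing.append(index + 1)
    return missing
```

Run against the files changed in a commit, a non-empty result is a reasonable reason to block the merge until the license is confirmed.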

The Bottom Line

Open source license compliance isn't a one-time audit. It's an engineering discipline, as core as writing tests. The cost of retroactive compliance, such as refactoring to remove a core GPL dependency, can dwarf the initial development cost.

  • Start now. Run a scan on your main codebase today. You will be surprised.
  • Automate everything. Manual checks fail under the weight of thousands of dependencies.
  • Educate your team. Make "check the license" part of the code review checklist.

The goal isn't to avoid open source. It's to use it powerfully, respectfully, and safely. Your legal department will thank you. Your investors will sleep better. And you won't be the one explaining why the product's architecture needs a full rewrite.