Semgrep: Find Real Bugs with Pattern-Based Static Analysis

Security 2026-03-04 · 4 min read semgrep static-analysis security code-quality ci-cd devsecops
By DevTools Guide Editorial Team
Software engineers and developer advocates covering tools, workflows, and productivity for modern development teams.

Most linters catch style problems. Semgrep catches real bugs: SQL injection, hardcoded secrets, insecure deserialization, SSRF vulnerabilities, and misuse of cryptographic APIs. It works on source code using pattern matching that understands syntax — not just text — and supports 30+ languages out of the box.

The core idea: write patterns in YAML that look like the code you want to find. Semgrep handles the AST parsing and matching. You describe what you're looking for; Semgrep finds it across your entire codebase.

Why Semgrep vs Other Tools

vs grep/regex: Semgrep understands code structure. The pattern re.compile($VAR) matches every call to re.compile with a single argument, regardless of whitespace, line breaks, or what the argument is named. An equivalent regex would miss many of these cases.
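To make the difference concrete, here is a minimal sketch that uses Python's built-in ast module as a stand-in for syntax-aware matching. It is an analogy only: Semgrep has its own parsers and matching engine, and the sample source, the NAIVE regex, and the helper name find_re_compile_calls are all made up for illustration.

```python
import ast
import re

# Sample code to scan (illustrative only).
SOURCE = '''
import re

a = re.compile(pattern)
b = re.compile(
    build_pattern(),
)
c = "re.compile(x)"  # a string, not a call
'''

def find_re_compile_calls(source: str) -> list[int]:
    """Return the line numbers of every real call to re.compile."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "compile"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "re"
        ):
            lines.append(node.lineno)
    return sorted(lines)

# A naive line-based regex: it misses the multi-line call and flags the string.
NAIVE = re.compile(r"re\.compile\(\w+\)")
regex_hits = [
    number
    for number, line in enumerate(SOURCE.splitlines(), start=1)
    if NAIVE.search(line)
]

ast_hits = find_re_compile_calls(SOURCE)
print("syntax-aware:", ast_hits)
print("regex:", regex_hits)
```

The syntax-aware pass reports the two real calls, including the one split across lines, while the regex reports the single-line call plus the string literal that only looks like a call.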

vs ESLint/Pylint: Language-specific linters are great for style. Semgrep rules use one syntax across every supported language and focus on security-relevant patterns that lint rules rarely address.

vs SonarQube: SonarQube is heavyweight: it runs as a server, and some features are limited to commercial editions. Semgrep is lightweight, open source, and runs in CI without a server.

vs Snyk/Dependabot: Those tools scan dependencies. Semgrep analyzes your actual code.

Installation

# pip
pip install semgrep

# Homebrew (macOS)
brew install semgrep

# Docker (the image is published as semgrep/semgrep; returntocorp/semgrep is the older name)
docker pull semgrep/semgrep

First Run: Using Community Rules

Semgrep Registry has thousands of community-contributed rules. Start with the security audit rules for your language:

# Python security audit
semgrep --config p/python

# JavaScript/TypeScript
semgrep --config p/javascript
semgrep --config p/typescript

# Go
semgrep --config p/go

# Java
semgrep --config p/java

# Run on current directory
semgrep --config p/security-audit .

The p/ prefix fetches from Semgrep Registry. The output shows matching files, line numbers, rule IDs, and explanations.
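For scripting, the same findings can be emitted as JSON with the --json flag and post-processed. The sketch below is a minimal example of summarizing such a report. The sample document is hand-written to resemble the report shape (a top-level results list whose entries carry check_id, path, start.line, and extra.message); the rule ID, path, and message in it are invented, and summarize is a hypothetical helper name.

```python
import json

# Hand-written sample shaped like `semgrep --json` output.
# The rule ID, path, and message are invented for illustration.
SAMPLE = '''
{
  "results": [
    {
      "check_id": "rules.example-rule",
      "path": "app/views.py",
      "start": {"line": 42, "col": 5},
      "extra": {"message": "Example finding message", "severity": "WARNING"}
    }
  ],
  "errors": []
}
'''

def summarize(report_json: str) -> list[str]:
    """Turn a JSON report into one 'path:line [rule] message' string per finding."""
    report = json.loads(report_json)
    return [
        f'{r["path"]}:{r["start"]["line"]} [{r["check_id"]}] {r["extra"]["message"]}'
        for r in report.get("results", [])
    ]

for line in summarize(SAMPLE):
    print(line)
```

In practice the input would come from something like semgrep --config p/python --json > report.json rather than an inline string.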

Writing Custom Rules

Custom rules are YAML files that describe patterns. Here's a rule that detects hardcoded AWS credentials:

# rules/no-hardcoded-aws-keys.yaml
rules:
  - id: hardcoded-aws-access-key
    patterns:
      # "$KEY" captures the string's contents; "AKIA..." would only match that literal text
      - pattern: $VAR = "$KEY"
      - metavariable-regex:
          metavariable: $KEY
          regex: ^AKIA[0-9A-Z]{16}$
    message: "Potential hardcoded AWS access key: $VAR"
    languages: [python, javascript, typescript, java, go]
    severity: ERROR
    metadata:
      category: security
      cwe: "CWE-798: Use of Hard-coded Credentials"

Run it:

semgrep --config rules/no-hardcoded-aws-keys.yaml ./src
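Before relying on a key-matching regular expression in a rule, it can help to sanity-check it in plain Python. The sketch below assumes the commonly used heuristic that long-term AWS access key IDs are "AKIA" followed by 16 uppercase letters or digits. That is a convention, not an official specification, and looks_like_aws_key_id is a hypothetical helper name.

```python
import re

# Heuristic for long-term AWS access key IDs (an assumption, not an official spec).
AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")

def looks_like_aws_key_id(value: str) -> bool:
    """Return True when the whole string has the AKIA + 16 characters shape."""
    return AWS_KEY_ID.fullmatch(value) is not None

# The placeholder key used in AWS documentation has the expected shape.
print(looks_like_aws_key_id("AKIAIOSFODNN7EXAMPLE"))
# Text that merely starts with AKIA does not.
print(looks_like_aws_key_id("AKIA-not-a-key"))
```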

Pattern Syntax

Semgrep's pattern language is powerful but readable:

Metavariables ($VAR): Match any expression and capture it:

pattern: requests.get($URL, verify=False)
# Matches: requests.get(url, verify=False)
# Also: requests.get(user_input, verify=False)

Ellipsis (...): Match any sequence of statements or arguments:

pattern: |
  cursor.execute($QUERY, ...)
# Matches: cursor.execute(query)
# Also: cursor.execute(query, params)

Pattern-not: Exclude patterns that are false positives:

patterns:
  - pattern: cursor.execute($QUERY)
  - pattern-not: cursor.execute("...")  # literal strings are fine

Pattern-either: Match any of several patterns:

pattern-either:
  - pattern: eval($X)
  - pattern: exec($X)
  - pattern: __import__($X)

Real-World Rule Examples

SQL Injection Detection

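rules:
  - id: sql-injection-string-format
    languages: [python]
    # No pattern-not for "..." % ...: a literal query formatted with % is the vulnerable case.
    # Parameterized calls such as execute("... %s", (x,)) do not use the % operator, so they do not match.
    pattern: $DB.execute($QUERY % ...)
    message: "SQL query built with string formatting. Use parameterized queries."
    severity: ERROR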
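As a minimal illustration of why this shape is dangerous, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table, the function names, and the payload are invented for the example. find_user_unsafe has the string-formatting shape a rule like this is meant to catch, and find_user_safe is the parameterized alternative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)])

def find_user_unsafe(name: str) -> list[tuple]:
    # The query text is assembled with % before execute() ever sees it,
    # so quotes inside `name` become part of the SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = '%s'" % name
    ).fetchall()

def find_user_safe(name: str) -> list[tuple]:
    # Parameterized: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "x' OR '1'='1"
unsafe_rows = find_user_unsafe(payload)  # the OR clause matches every row
safe_rows = find_user_safe(payload)      # no user has this literal name

print(len(unsafe_rows), len(safe_rows))
```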

SSRF via User-Controlled URL

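rules:
  - id: ssrf-user-controlled-url
    languages: [python]
    patterns:
      - pattern-either:
          - pattern: requests.get($URL, ...)
          - pattern: requests.post($URL, ...)
          - pattern: urllib.request.urlopen($URL, ...)
      - pattern-inside: |
          @app.route(...)
          def $FUNC(...):
              ...
      # $URL must have been assigned from the Flask request earlier in the function.
      # A plain `pattern` here would have to match the same code as the HTTP call and never could.
      - pattern-inside: |
          $URL = request.$ATTR.get(...)
          ...
    message: "Potential SSRF: HTTP request to user-controlled URL"
    severity: WARNING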

Insecure Deserialization

rules:
  - id: unsafe-pickle-loads
    languages: [python]
    pattern-either:
      - pattern: pickle.loads(...)
      - pattern: pickle.load(...)
    message: "Unsafe deserialization with pickle — never deserialize untrusted data"
    severity: ERROR
    metadata:
      cwe: "CWE-502: Deserialization of Untrusted Data"
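The reason pickle deserves an ERROR severity is that loading a pickle can invoke a callable chosen by whoever produced the bytes. The sketch below shows this with a harmless expression; Payload is an invented class name, and a real attacker would pick something far less benign than arithmetic.

```python
import pickle

class Payload:
    # __reduce__ tells pickle which callable to invoke, and with which
    # arguments, when the bytes are loaded.
    def __reduce__(self):
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload object: it calls eval("6 * 7").
result = pickle.loads(blob)
print(type(result).__name__, result)
```

For data that crosses a trust boundary, a data-only format such as JSON avoids this class of problem.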

Taint Mode: Data Flow Analysis

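Semgrep's taint mode tracks data from sources (user input) to sinks (dangerous operations) through assignments and intermediate variables. Semgrep OSS follows this flow mainly within a single function; full cross-function and cross-file tracking is a Semgrep Pro feature.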

rules:
  - id: taint-sql-injection
    mode: taint
    languages: [python]
    pattern-sources:
      - pattern: request.args.get(...)
      - pattern: request.form.get(...)
    pattern-sinks:
      - pattern: $DB.execute(...)
    message: "User input flows to SQL execution — use parameterized queries"
    severity: ERROR

Taint mode is more powerful but slower. Use it for high-value security rules where data flow matters.
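To show the idea behind source-to-sink tracking, here is a deliberately tiny sketch built on Python's ast module. It is not how Semgrep is implemented. It only handles straight-line, top-level code, never clears taint on reassignment, and has no notion of sanitizers. The names SOURCE_CALLS, SINK_METHOD, and find_tainted_sinks are invented for the example.

```python
import ast

SOURCE_CALLS = {"request.args.get", "request.form.get"}  # plays the role of pattern-sources
SINK_METHOD = "execute"                                   # plays the role of pattern-sinks

def dotted_name(node: ast.AST) -> str:
    """Render an attribute chain such as request.args.get as a dotted string."""
    if isinstance(node, ast.Attribute):
        return f"{dotted_name(node.value)}.{node.attr}"
    if isinstance(node, ast.Name):
        return node.id
    return "?"

def is_tainted(expr: ast.AST, tainted: set[str]) -> bool:
    """An expression is tainted if it contains a source call or a tainted variable."""
    for node in ast.walk(expr):
        if isinstance(node, ast.Call) and dotted_name(node.func) in SOURCE_CALLS:
            return True
        if isinstance(node, ast.Name) and node.id in tainted:
            return True
    return False

def find_tainted_sinks(source: str) -> list[int]:
    """Visit top-level statements in order; return line numbers of tainted sink calls."""
    tainted: set[str] = set()
    findings = []
    for stmt in ast.parse(source).body:
        # Propagate taint through simple `name = expr` assignments.
        if isinstance(stmt, ast.Assign) and is_tainted(stmt.value, tainted):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    tainted.add(target.id)
        # Report sink calls that receive a tainted argument.
        for node in ast.walk(stmt):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == SINK_METHOD
                and any(is_tainted(arg, tainted) for arg in node.args)
            ):
                findings.append(node.lineno)
    return findings

EXAMPLE = '''
user_id = request.args.get("id")
query = "SELECT * FROM users WHERE id = " + user_id
cursor.execute(query)
cursor.execute("SELECT 1")
'''
print(find_tainted_sinks(EXAMPLE))
```

Only the first execute call is reported: the value reaches it through the intermediate variable query, which a purely syntactic pattern on execute(request.args.get(...)) would miss.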

CI/CD Integration

GitHub Actions

# .github/workflows/semgrep.yml
name: Semgrep
on:
  push:
    branches: [main]
  pull_request: {}

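jobs:
  semgrep:
    name: semgrep/ci
    runs-on: ubuntu-latest
    # Run inside the Semgrep image so the CLI is already installed
    # (this replaces the older returntocorp/semgrep-action wrapper)
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
      # --error makes the step exit non-zero when any rule reports a finding
      - run: semgrep scan --error --config p/security-audit --config p/secrets --config rules/

This runs Semgrep on every pull request and on pushes to main, and fails the check when a rule reports a finding.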

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/returntocorp/semgrep
    rev: v1.70.0
    hooks:
      - id: semgrep
        args: ["--config", "p/security-audit", "--error"]

Ignoring False Positives

Add inline comments to suppress specific findings:

# nosemgrep: hardcoded-aws-access-key
TEST_ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"  # This is the example from AWS docs

Or add file-level ignores in .semgrepignore:

tests/
fixtures/
*.test.ts
vendor/

Autofix

Some rules can automatically fix what they find:

rules:
  - id: use-secrets-manager
    pattern: os.environ["$SECRET_NAME"]
    fix: get_secret("$SECRET_NAME")
    message: "Use get_secret() instead of directly reading environment variables"
    languages: [python]
    severity: WARNING

Apply fixes:

semgrep --config rules/ --autofix ./src

Preview changes before applying:

semgrep --config rules/ --autofix --dryrun ./src

Semgrep OSS vs Semgrep Pro

| Feature | Semgrep OSS | Semgrep Pro |
| --- | --- | --- |
| Price | Free | Paid (team tier) |
| Inter-file analysis | No | Yes |
| Cross-function taint | Limited | Full |
| Managed CI integration | Self-configured | Managed |
| SARIF output | Yes | Yes |
| Custom rules | Unlimited | Unlimited |

Semgrep OSS covers the most common use cases: custom rules, CI integration, and the community rule registry. The Pro tier adds inter-file data flow analysis, which is valuable for large codebases with complex call graphs.

Building a Rule Library

Start with these steps:

  1. Run the language security pack (p/python, p/javascript, etc.) and fix real findings
  2. Add secrets detection (p/secrets) to catch hardcoded credentials
  3. Write custom rules for your specific frameworks and patterns (internal APIs, deprecated functions)
  4. Add to pre-commit and CI so rules run on every change

A well-curated rule library catches issues that code review misses: the SQL query built from user input three function calls deep, the API call that skips certificate verification in a utility function used everywhere, the dependency method that returns a coroutine that callers forget to await.

Semgrep doesn't replace code review — it complements it by automating the pattern-matching part of security review, so humans can focus on logic and design.