We Scanned GitHub's Top 50K Repos for CI/CD Vulnerabilities — Here's What We Found

A comprehensive guide from Vigilant's security research team.

50,012 Repos Scanned
20,265 Repos Vulnerable
192,776 Vulnerabilities
590M+ Downstream Forks
Chris Nyhuis
CEO, Vigilant
This is the first in a 9-part research series on CI/CD pipeline security. See the full series below.

Two in five of the most popular open-source repositories on GitHub have exploitable CI/CD pipelines. The repos developers trust most are the most exposed.

The Scan

We pointed Runner Guard - Vigilant’s open-source CI/CD security scanner - at the 50,012 most-starred public repositories on GitHub and let it run. Every workflow file. Every action reference. Every permission block. Fourteen security rules covering injection, supply chain, privilege escalation, and a class of attack that didn’t exist two years ago: AI agent configuration injection.

49,849 repos completed successfully - a 99.7% completion rate. The 211 that didn’t make it had unparseable YAML, which is its own kind of finding.

What came back was 192,776 individual security findings across 20,265 vulnerable repositories. That’s a 40.6% vulnerability rate. Not 40.6% of the internet’s random hobby projects - 40.6% of the most starred, most forked, most depended-on repositories in the open-source ecosystem.

These are the repos that build the frameworks you import, the tools you deploy, and the infrastructure you trust with production credentials every time a CI pipeline runs.

The Numbers

Let’s start with the raw data.

Severity Breakdown

Severity Breakdown - 192,776 Findings Across 50K Repos

Severity | Findings | Share
Critical | 6,790 | 3.5%
High | 39,975 | 20.7%
Medium | 146,011 | 75.7%
Total | 192,776 | -

The medium severity count is dominated by a single rule - RGS-007, unpinned third-party actions - which accounts for 143,616 findings on its own. We’ll get to why that matters. The 6,790 critical findings represent attack chains where untrusted input flows directly to code execution with secrets access. Those are the ones that keep security teams up at night.
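
The shares in the table follow directly from the counts. A minimal Python sketch that recomputes them from the figures quoted above (the variable and function names are illustrative and not part of Runner Guard):

```python
# Finding counts copied from the severity table above.
severity_counts = {"Critical": 6_790, "High": 39_975, "Medium": 146_011}

total = sum(severity_counts.values())


def share(count: int, whole: int) -> float:
    """Return count as a percentage of whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)


shares = {name: share(n, total) for name, n in severity_counts.items()}

# RGS-007 accounts for 143,616 of the medium-severity findings.
rgs007_share_of_medium = share(143_616, severity_counts["Medium"])

print(total)                   # 192776
print(shares)                  # {'Critical': 3.5, 'High': 20.7, 'Medium': 75.7}
print(rgs007_share_of_medium)  # 98.4
```

In other words, almost all of the medium-severity volume comes from one rule, which is why the medium count should be read differently from the critical count.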

Rules That Fired

Rule | Description | Findings | Repos Affected
RGS-007 | Unpinned third-party action (mutable tag) | 143,616 | 19,005
RGS-008 | Overly permissive GITHUB_TOKEN | 11,658 | 7,236
RGS-002 | Expression injection (secret exposure) | 10,921 | 5,453
RGS-014 | Expression injection in action inputs | 7,833 | 4,702
RGS-004 | Comment/issue trigger without auth check | 5,660 | 3,701
RGS-012 | Network exfiltration in privileged context | 4,537 | 2,924
RGS-005 | Excessive permissions on untrusted trigger | 3,069 | 2,180
RGS-001 | Script injection via expression interpolation | 2,555 | 1,777
RGS-009 | Unsafe checkout of fork code | 1,258 | 786
RGS-006 | Curl-pipe-bash remote code execution | 824 | 685
RGS-003 | Filename injection via git diff | 538 | 408
RGS-010 | AI agent config file in fork checkout | 5 | 4
RGS-015 | Debug logging enabled | 302 | 254
RGS-011 | MCP config file in fork checkout | 0 | 0

Rule Distribution - RGS-007 Dominates at 74.5%

RGS-007 dominates because the supply chain problem dominates. Nearly three-quarters of all findings come down to one thing: repositories trusting mutable version tags instead of pinning to immutable SHA hashes. We cover this in depth in The Software Supply Chain Crisis.
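
To make the distinction concrete, the sketch below flags any uses: reference that is not a full 40-character commit SHA. It is an illustration only, not Runner Guard’s actual detection logic: the regular expressions, the function name unpinned_actions, and the sample workflow (including its SHA) are assumptions, and unlike RGS-007 it does not distinguish third-party from first-party actions.

```python
import re

# Captures the reference in a step such as `- uses: actions/checkout@v4`.
USES_RE = re.compile(r"^[ \t]*(?:-[ \t]*)?uses:[ \t]*['\"]?([^\s'\"#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    hits = []
    for ref in USES_RE.findall(workflow_text):
        # Local actions and container images are out of scope for this sketch.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _action, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            hits.append(ref)
    return hits


# Hypothetical workflow; the 40-character SHA is a placeholder, not a real commit.
sample = """\
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@0123456789abcdef0123456789abcdef01234567  # placeholder SHA
      - uses: ./.github/actions/local
      - uses: dtolnay/rust-toolchain@master
"""

print(unpinned_actions(sample))
# ['actions/checkout@v4', 'dtolnay/rust-toolchain@master']
```

A tag such as @v4 and a branch such as @master are treated the same way here, because both can be moved to point at different code after the workflow was written.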

RGS-011 - MCP configuration injection - returned zero findings. That’s not a false negative. MCP in CI/CD simply hasn’t reached mainstream adoption yet. The rule is forward-looking, and given the trajectory of AI tooling in development workflows, it won’t stay at zero.

The Popularity Penalty

Here’s the finding that should unsettle every developer who thinks “popular means secure”:

The Popularity Penalty - More Stars = More Vulnerable

Star Bracket | Repos | Vuln Rate
100K+ | 89 | 65.2%
50K-100K | 288 | 68.1%
10K-50K | 4,080 | 59.4%
5K-10K | 5,280 | 48.2%
1K-5K | 24,060 | 40.5%
100-1K | 16,069 | 33.0%

The 50K-100K star bracket - repos powering frameworks, platforms, and toolchains - has the highest vulnerability rate at 68.1%. The pattern is consistent: more stars means more complex CI/CD pipelines, more action dependencies, more attack surface. The repos developers trust most are, on average, the most exposed.

The 100K+ bracket dips slightly to 65.2% - the very largest projects often have dedicated security teams. But two in three of the most starred repositories on GitHub still have at least one CI/CD security finding.

Organizations vs. Individuals

Owner Type | Repos | Vuln Rate
Organization | ~22,000 | 54.0%
User | ~28,000 | 29.0%

Organizations run more complex CI/CD - multi-stage builds, matrix testing, deployment pipelines, release automation. That complexity creates surface area. An individual developer running npm test in a single workflow is inherently less exposed than an organization building, testing, and deploying across five platforms.

The Trust Paradox

This is the finding that reframes everything: the organizations developers trust most have some of the most exposed CI/CD pipelines in the dataset.

Big Tech's Blind Spot - Org Vulnerability Rates

Organization Type | Vuln Repos | Total Repos | Vuln Rate
Major Cloud Platform | 129 | 349 | 37.0%
Major OSS Foundation | 112 | 172 | 65.1%
Major Search/Cloud Company | 92 | 281 | 32.7%
Major Social Media Company | 43 | 65 | 66.2%
Major Java Framework Org | 26 | 28 | 92.9%
GitHub (the company) | 25 | 48 | 52.1%

A major Java framework organization - the foundation that maintains one of the most widely used server-side frameworks in the world - has a 92.9% vulnerability rate across its repos. 26 out of 28 repositories have CI/CD security findings.

A major open-source software foundation, home to critical data processing and cloud infrastructure projects, sits at 65.1%. A major social media company at 66.2%.

GitHub itself has 25 repos with findings - including its documentation repository, its gitignore templates, and its starter-workflows repo (the templates new users clone to set up CI/CD for the first time).

The brand name on a repository is not a security guarantee.

In many cases, it’s the opposite signal - larger organizations run more complex automation, and more complex automation means more ways in.

The Fork Blast Radius

20,265 vulnerable repositories have a combined 590 million downstream forks.

That number deserves a moment. When a repository has a vulnerability in its workflow files, every fork inherits those files. Even if the parent repository gets fixed, forks carry the vulnerability forward unless individually updated. This is the multiplier that makes CI/CD vulnerabilities a systemic problem, not a collection of isolated issues.

The most forked vulnerable repos:

Category | Forks | Critical Findings
Popular first-contributions tutorial | ~99K | 3
Popular AI chat interface | ~60K | 6
Major tech company’s AI learning repo | ~58K | 15
Popular workflow automation platform | ~56K | 20
Popular UI component library | ~55K | 3

That first-contributions tutorial - the repo beginners use for their first pull request - has 99,000 forks and 3 critical findings. Beginners aren’t just learning Git. They’re learning vulnerable CI/CD patterns from day one. Every one of those forks copied the vulnerable workflow files. This is how insecure defaults propagate through the next generation of developers.

The 2020 Inflection

The 2020 Inflection - GitHub Actions Changed Everything

Year Created | Vuln Rate
2015-2017 | 31.1-31.8%
2018-2019 | 37.9-42.4%
2020 | 50.3%
2021 | 55.2%
2022 | 57.6%
2023 | 53.7%
2024 | 55.3%
2025 | 59.4%

November 2019: GitHub Actions hit general availability. The inflection is immediate and persistent. Repos created in 2020 and later have vulnerability rates between 50% and 60%, compared to roughly 31-42% for repos created from 2015 through 2019. The generation born into GitHub Actions inherited the supply chain problem from day one.

2025 repos are at 59.4% - the newest projects have the highest vulnerability rates, confirming that the ecosystem hasn’t learned. The defaults haven’t changed. The docs haven’t changed. The templates new developers start from haven’t changed.

The Zombie Supply Chain

809 archived repositories in our dataset have a combined 6.5 million forks - and vulnerabilities that will never be fixed. Nobody’s home to accept a pull request.

Repo | Stars | Forks | Findings | Critical
LibreSpark/LibreTV | 13,190 | 26,944 | 30 | 0
solana-labs/solana | 14,803 | 5,557 | 26 | 4
matrix-org/synapse | 12,020 | 2,118 | 134 | 0

LibreSpark/LibreTV has 26,944 forks. Every one inherited 30 vulnerable workflow findings. Solana’s original monorepo is archived with 4 critical findings and 5,557 forks - those vulnerabilities will propagate indefinitely. Matrix migrated from Synapse to a new repo, but the 2,118 forks of the old one still carry 134 findings.

These are the unreachable long tail of supply chain risk. No automated remediation campaign can touch them. They’ll still be forked, cloned, and depended on years from now, carrying vulnerabilities that were never fixed because the maintainers moved on. We explore this further in What’s Next - Fixing 50K Repos.

The Compound Problem

Individual findings tell one story. Compound findings tell a different, more dangerous one.

6,983 repos have vulnerabilities across two or more rule categories. The most common combination is RGS-007 (unpinned action) plus RGS-008 (overly permissive GITHUB_TOKEN), appearing in 3,172 repos. That’s the complete attack chain in a single repo: a compromised action can both steal secrets and push code.

Rule Combination | Repos
RGS-007 + RGS-008 (unpinned + write perms) | 3,172
RGS-002 + RGS-007 (injection + unpinned) | 2,210
RGS-004 + RGS-007 (dangerous trigger + unpinned) | 1,564
RGS-007 + RGS-012 (unpinned + network exfil) | 1,334

611 repos have the triple compound: unpinned action + write permissions + dangerous trigger. These are the most dangerous repositories in the dataset - repos where an attacker can compromise an action, execute code with write access, and trigger it all from a pull request or issue comment.
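
Compound counts like these come from counting which rules co-occur in the same repository. A minimal Python sketch of that tally, assuming per-repo findings are available as a mapping from repo name to the set of rule IDs that fired (the repo names and data below are hypothetical, and the function names are illustrative rather than Runner Guard’s own):

```python
from collections import Counter
from itertools import combinations


def rule_pair_counts(findings_by_repo: dict[str, set[str]]) -> Counter:
    """Count how many repos trigger each unordered pair of rules."""
    pairs: Counter = Counter()
    for rules in findings_by_repo.values():
        for pair in combinations(sorted(rules), 2):
            pairs[pair] += 1
    return pairs


def repos_with_all(findings_by_repo: dict[str, set[str]], required: set[str]) -> list[str]:
    """Return repos whose findings include every rule in `required`."""
    return sorted(repo for repo, rules in findings_by_repo.items() if required <= rules)


# Hypothetical data for illustration only.
findings_by_repo = {
    "example/a": {"RGS-007", "RGS-008"},
    "example/b": {"RGS-004", "RGS-007", "RGS-008"},
    "example/c": {"RGS-007"},
    "example/d": {"RGS-002", "RGS-007"},
}

counts = rule_pair_counts(findings_by_repo)
print(counts.most_common(1))  # [(('RGS-007', 'RGS-008'), 2)]

# The triple compound: unpinned action + write permissions + dangerous trigger.
triple = {"RGS-007", "RGS-008", "RGS-004"}
print(repos_with_all(findings_by_repo, triple))  # ['example/b']
```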

541 repos have taint-to-execution chains - untrusted input from PR titles, issue bodies, or branch names flowing directly to shell execution. We break down exactly how these chains work in Anatomy of a CI/CD Chain Attack.
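
The core of a taint-to-execution chain is an attacker-controlled expression interpolated straight into a shell script. The sketch below checks a single run: script for such expressions. It assumes the script text has already been extracted from the workflow, the list of untrusted contexts is illustrative and incomplete, and the function name is hypothetical; it is not how Runner Guard implements its taint tracking.

```python
import re

# GitHub contexts an outside contributor can control (illustrative, not exhaustive).
UNTRUSTED_CONTEXTS = (
    "github.event.pull_request.title",
    "github.event.pull_request.body",
    "github.event.issue.title",
    "github.event.issue.body",
    "github.event.comment.body",
    "github.head_ref",
)

# Captures the text inside a ${{ ... }} expression.
EXPR_RE = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")


def tainted_expressions(run_script: str) -> list[str]:
    """Return expressions in a run: script that reference untrusted input."""
    return [
        expr
        for expr in EXPR_RE.findall(run_script)
        if any(ctx in expr for ctx in UNTRUSTED_CONTEXTS)
    ]


# The PR title is pasted into the script before the shell ever sees it.
vulnerable = 'echo "PR title: ${{ github.event.pull_request.title }}"'
# Safer pattern: the value arrives through an env: mapping and is read as data.
safer = 'echo "PR title: $PR_TITLE"'

print(tainted_expressions(vulnerable))  # ['github.event.pull_request.title']
print(tainted_expressions(safer))       # []
```

The safer variant works because the shell reads an environment variable as data, whereas the interpolated form lets a crafted PR title become part of the script itself.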

Version Sprawl - The Hidden Attack Surface

The attack surface for a single action isn’t one version tag - it’s dozens.

Action | Unique Refs | Total Repos
dtolnay/rust-toolchain | 44 | 590
codecov/codecov-action | 35 | 1,049
docker/build-push-action | 28 | 1,268

dtolnay/rust-toolchain is referenced through 44 unique mutable refs - @stable, @master, @nightly, @v1, plus dozens of specific version strings. 201 of those repos reference @master, meaning every single push to that branch of the action’s repository ships new code to 201 CI pipelines on their next run. codecov/codecov-action spans 35 versions from @v1.0.0 to @v5.5.2 across 1,049 repos - 35 separate trust points, none of them immutable.

An attacker doesn’t need to compromise the latest version. Any mutable ref is a valid target. Old versions (@v1, @v2) are less monitored but still actively used - 85 repos still run codecov/codecov-action@v1. And refs that look specific, like @v5.5.2, create an illusion of pinning - they’re still mutable tags, not SHA hashes.
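
Version sprawl can be measured by grouping every uses: reference by action and then by ref. A minimal Python sketch, assuming workflow text is available per repo (the repo names and workflow snippets are hypothetical, and the regular expression is a simplification that ignores local actions and container images):

```python
import re
from collections import defaultdict

# Captures (owner/repo[/path], ref) from a step such as `- uses: owner/repo@v1`.
USES_RE = re.compile(
    r"^[ \t]*(?:-[ \t]*)?uses:[ \t]*['\"]?([\w.-]+/[\w./-]+)@([^\s'\"#]+)",
    re.MULTILINE,
)


def ref_sprawl(workflows: dict[str, str]) -> dict[str, dict[str, set[str]]]:
    """Map each action to its refs, and each ref to the repos that use it."""
    sprawl: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for repo, text in workflows.items():
        for action, ref in USES_RE.findall(text):
            sprawl[action][ref].add(repo)
    return sprawl


# Hypothetical repos and workflow fragments.
workflows = {
    "example/a": "steps:\n  - uses: codecov/codecov-action@v1\n  - uses: dtolnay/rust-toolchain@master\n",
    "example/b": "steps:\n  - uses: codecov/codecov-action@v5.5.2\n",
    "example/c": "steps:\n  - uses: codecov/codecov-action@v1\n",
}

sprawl = ref_sprawl(workflows)
codecov = sprawl["codecov/codecov-action"]
print(len(codecov))           # 2 unique refs
print(sorted(codecov["v1"]))  # ['example/a', 'example/c']
```

Each distinct ref in the result is a separate trust point: compromising any one of them reaches every repo listed under it.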

The newest generation of developer tooling is already repeating the pattern. astral-sh/setup-uv - the action for Python’s fastest-growing package manager - has accumulated 276 unpinned references across 12 version tags. oven-sh/setup-bun is at 195 repos. Version sprawl is happening in real time with new tools, not just legacy actions.

AI in the Pipeline

claude-code-action Adoption - More Than Doubled During Scan

AI tools in CI/CD are no longer experimental - they’re mainstream. anthropics/claude-code-action is used unpinned in 175 repos; the count more than doubled from 77 repos at the 40% scan mark to 175 at completion. The breakdown by ref: @v1 in 140 repos, @beta in 35, @eap in 3, @main in 1.

The action itself isn’t the vulnerability - claude-code-action does what it’s designed to do. The vulnerability is that 175 repos trust mutable tags instead of pinning to a SHA. If Anthropic’s GitHub organization were compromised, every one of those repos would run the attacker’s code on their next pull request.

More concerning: RGS-010 - our detection rule for AI agent configuration injection - found 5 findings across 4 repos, including a leading Python AI framework. The attack vector: an attacker submits a PR with a malicious CLAUDE.md or .cursorrules file, the AI agent loads it during automated review on pull_request_target, and follows the attacker’s instructions with full CI secrets access. The AI agent designed to improve code quality becomes the attack vector.
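
The risky combination described here has three ingredients: a privileged trigger, a checkout of fork code, and an agent instruction file that the fork can change. The sketch below expresses that condition in Python. It is a simplified illustration, not the RGS-010 rule itself: the function names and parameters are hypothetical, and the file list contains only the two names mentioned above.

```python
from pathlib import PurePosixPath

# Instruction files an AI agent may load; limited to the names cited in the text.
AGENT_CONFIG_NAMES = {"CLAUDE.md", ".cursorrules"}


def agent_config_changes(changed_files: list[str]) -> list[str]:
    """Return changed paths whose file name is an AI agent instruction file."""
    return [p for p in changed_files if PurePosixPath(p).name in AGENT_CONFIG_NAMES]


def is_config_injection_risk(trigger: str, checks_out_fork: bool, changed_files: list[str]) -> bool:
    """True when fork-controlled agent config is loaded in a privileged run."""
    return (
        trigger == "pull_request_target"
        and checks_out_fork
        and bool(agent_config_changes(changed_files))
    )


# Hypothetical pull request contents.
pr_files = ["src/app.py", "docs/CLAUDE.md", ".cursorrules"]

print(agent_config_changes(pr_files))
# ['docs/CLAUDE.md', '.cursorrules']
print(is_config_injection_risk("pull_request_target", True, pr_files))  # True
print(is_config_injection_risk("pull_request", True, pr_files))         # False
```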

To our knowledge, Runner Guard is the only scanner detecting this class of vulnerability. As AI in CI/CD scales from hundreds to thousands of repos, this attack surface will scale with it. Our deep dive on this is in AI Agents as Force Multipliers.

The High-Profile Findings

Category | Stars | Critical Findings | Attack Vector
Cloud ML examples repo | 1K-5K | 267 | Secret exposure
Autonomous AI agent | 180K+ | 23 | Expression injection
Workflow automation platform | 175K+ | 20 | Expression injection
ML framework | 90K-100K | 12 | Expression injection
AI model hub’s transformers library | 150K+ | 9 | Expression injection
Git version control project | 55K-60K | 9 | Injection + secret exposure

A popular autonomous AI agent - one of the most starred repositories on GitHub - has 23 critical expression injection findings. A major ML framework powering much of the AI industry has 1,217 total findings across 7 rule categories. The Git project itself has 9 critical findings.

The 11-rule repo - a major Kubernetes networking project (20K-25K stars) - triggers 11 of 13 firing Runner Guard rules with 222 findings, including 48 critical. This is core cloud infrastructure powering service meshes in production, and it has the broadest vulnerability profile in our dataset.

The Recursive Supply Chain

The most dangerous finding in this dataset isn’t a single vulnerable repo - it’s a pattern. The tools developers use to build GitHub Actions are themselves running vulnerable CI/CD pipelines. Think of it as turtles all the way down. You pin your action to a SHA, but the action’s own build pipeline uses unpinned dependencies. The chain of trust has no foundation.

We identified 187 repos in our dataset that are themselves GitHub Actions - installable components consumed by other repositories’ workflows. 145 of them - 77.5% - have their own CI/CD vulnerabilities.

GitHub Action Repo | Findings | Impact
Popular self-hosted Git platform | 202 | Builds the platform AND its Actions
nektos/act | 33 | The most popular local Actions runner
aws-actions/configure-aws-credentials | 19 | Major cloud provider’s official Action
peter-evans/create-pull-request | 10 | Widely-used PR automation Action
shivammathur/setup-php | 6 | THE PHP setup Action - 199 repos depend on it

nektos/act - the tool developers use to test GitHub Actions locally - has 33 findings in its own CI. aws-actions/configure-aws-credentials, the standard way thousands of repos authenticate with cloud services, has 19 findings. The action that handles your cloud credentials has vulnerable CI.

These are the highest-priority supply chain risks. If a repo IS a GitHub Action AND has its own vulnerabilities, an attacker could compromise the action itself, cascading into every downstream consumer. The full recursive supply chain analysis is in The Software Supply Chain Crisis.

The Ironic RCE

A popular penetration testing framework (30K-40K stars, 15K forks) - used by security professionals worldwide to test for exactly these kinds of vulnerabilities - has code injection and self-hosted runner vulnerabilities in its own CI/CD pipeline. An attacker could achieve persistent remote code execution on the vendor’s build infrastructure through the very tool designed to test for such weaknesses.

Security scanners that run unpinned add another layer of irony. aquasecurity/trivy-action, a vulnerability scanner, is referenced at @master in 19 repos. trufflesecurity/trufflehog, a secret scanner, is referenced at @main in 8 repos. The tools designed to find security issues are deployed in the most insecure way possible.

Where the Vulnerabilities Live

Workflow file names tell their own story. The most vulnerable workflows cluster around two files:

Workflow File | Repos with Findings
release.yml | 3,143
ci.yml | 3,100

Release pipelines and CI pipelines - the two workflow types with the most sensitive access - are where the findings concentrate. Release workflows handle publishing credentials, signing keys, and registry tokens. CI workflows run on every push and pull request, processing untrusted input. Both are high-value targets, and both are the most commonly misconfigured.

The AGPL Paradox

The AGPL Paradox - License vs. CI/CD Security

Vulnerability rates by license tell an unexpected story:

License | Repos | Vuln Rate
AGPL-3.0 | 1,097 | 65.5%
Apache-2.0 | 8,697 | 49.1%
GPL-3.0 | 3,642 | 46.8%
MIT | 19,855 | 42.0%
BSD-2-Clause | 484 | 35.0%

AGPL-3.0 repos - licensed under the strongest copyleft terms in the table - have the highest vulnerability rate at 65.5%. BSD-2-Clause - the most permissive - has the lowest at 35.0%. The pattern: more complex licensing tends to correlate with more complex projects, which drive more complex CI/CD pipelines, which create more attack surface.

AGPL repos in our dataset tend to be self-hosted platforms - Git hosting, workflow automation, low-code tools - with elaborate multi-stage build and deployment pipelines. Simpler projects under permissive licenses run simpler CI.

The Liability Gap - You’re on Your Own

Here’s what every open-source license in that table has in common - MIT, Apache, GPL, BSD, AGPL, all of them:

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND… IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY…

That’s from the MIT license - the most common in our dataset at 19,855 repos. Apache-2.0, GPL-3.0, and every other license in the table contain equivalent language. When you write uses: softprops/action-gh-release@v2 in your workflow file, you’re running code from a single-maintainer GitHub account inside your CI/CD pipeline - with access to your secrets, your deployment credentials, and your source code - under a license that explicitly says nobody is liable if something goes wrong.

This isn’t a criticism of open source. The disclaimers exist for good reason - maintainers can’t guarantee the security of code used in contexts they’ve never seen. But the practical implication is stark:

You are the last line of defense for your own pipeline. Not the maintainer, not the license, not GitHub.

The same principle applies to commercial software that depends on open-source CI/CD components - and virtually all of it does. When your vendor ships software built through a GitHub Actions pipeline with unpinned actions and overly permissive tokens, the vendor’s liability to you is limited by their terms of service. The open-source maintainer’s liability to the vendor is zero. The chain of accountability has gaps at every link.

This is why “always review code” isn’t optional advice - it’s the only strategy that works when every license in the ecosystem tells you explicitly that you’re on your own. SHA pinning is the mechanical implementation of that strategy: it forces you to review and approve a specific commit before trusting it in your pipeline, rather than implicitly trusting whatever a tag happens to point to today.

The Fix It guide covers how to consume open source safely - not just as a developer using actions directly, but as an organization evaluating the security posture of the technology you purchase.

The Auto-Fix Opportunity

Here’s the finding that matters most for what comes next: 246,496 findings across the dataset are auto-fixable. The vast majority can be resolved by replacing mutable action version tags with immutable SHA pins - a mechanical transformation that Runner Guard’s autofix engine already performs.
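
The mechanical transformation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Runner Guard’s autofix engine: the function name pin_actions is hypothetical, the mapping from tag to commit SHA is assumed to have been resolved and reviewed separately, and the SHA shown is a placeholder rather than a real commit.

```python
import re

# Captures the step prefix, the action, and its ref on a single `uses:` line.
USES_RE = re.compile(
    r"^([ \t]*(?:-[ \t]*)?uses:[ \t]*)([\w.-]+/[\w./-]+)@([^\s#]+).*$",
    re.MULTILINE,
)


def pin_actions(workflow_text: str, sha_for: dict[str, str]) -> str:
    """Replace mutable refs with commit SHAs, keeping the old ref as a comment."""

    def replace(match: re.Match) -> str:
        prefix, action, ref = match.group(1), match.group(2), match.group(3)
        sha = sha_for.get(f"{action}@{ref}")
        if sha is None:
            return match.group(0)  # no reviewed SHA available: leave untouched
        return f"{prefix}{action}@{sha} # {ref}"

    return USES_RE.sub(replace, workflow_text)


before = (
    "steps:\n"
    "  - uses: actions/checkout@v4\n"
    "  - uses: softprops/action-gh-release@v2\n"
    "  - run: make release\n"
)

# Placeholder SHA for illustration; a real one comes from the action's repository.
sha_for = {"softprops/action-gh-release@v2": "0123456789abcdef0123456789abcdef01234567"}

after = pin_actions(before, sha_for)
print(after)
```

Only references with a known SHA are rewritten, so the first step above stays on @v4 until someone has reviewed and recorded a commit for it. Keeping the original tag as a trailing comment preserves the human-readable version next to the immutable pin.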

The fix is simple. The adoption is near zero. GitHub ships Dependabot with built-in support for keeping SHA-pinned GitHub Actions up to date, yet virtually no one configures it. The tooling exists. The awareness doesn’t.

The gap between how easy these fixes are and how rarely they’re applied is the most frustrating finding in the entire dataset. We lay out the exact fixes - step by step, with before-and-after examples - in Fix It - SHA Pinning, Least Privilege, and the 5-Minute Security Upgrade.

Deep Dives

This article presents the 30,000-foot view. Each major finding cluster gets its own deep analysis:

The Software Supply Chain Crisis - 74.5% of Findings Are Unpinned Actions

Nearly three-quarters of all findings come from a single rule: RGS-007. About 19,000 repos trust mutable version tags for third-party actions - tags that can be moved silently to point at different code. The tj-actions/changed-files incident in March 2025 proved this isn’t theoretical. We map the most commonly unpinned actions, the single-maintainer bottlenecks, and the version sprawl that makes each action dozens of separate attack surfaces.

Anatomy of a CI/CD Chain Attack - From Recon to Exfiltration

A chained exploit links multiple small weaknesses into a full attack path. In CI/CD, the chain runs from reconnaissance (scanning public repos) through compromise (one maintainer account) to payload injection, zero-click propagation, and lateral movement into cloud infrastructure. We walk through each step using real data from our scan - showing how the 3,172 repos with compound vulnerabilities map to a complete attack playbook.

The Docker Chokepoint - One Org, Six Actions, Thousands of Pipelines

Docker’s official GitHub Actions - including login, buildx, qemu, build-push, and metadata - are the most commonly unpinned actions in the dataset. docker/login-action alone appears unpinned in 1,848 of GitHub’s top repos. A single compromise of Docker’s GitHub organization would give an attacker code execution inside the container build pipelines of the open-source ecosystem’s most critical projects. One org, six actions, thousands of pipelines.

AI Agents as Force Multipliers

The chain attack from our anatomy piece is currently manual and human-paced. AI agents automate every step and remove the skill barrier. The tj-actions incident was the dress rehearsal - an AI-orchestrated version would be the main event, faster, stealthier, and hitting orders of magnitude more targets simultaneously. Meanwhile, AI tools deployed to improve CI/CD security are themselves creating new attack surfaces through configuration injection vectors our scan detected for the first time.

The Language Risk Matrix - Why Rust Repos Are the Most Vulnerable

Rust - the language famous for eliminating memory bugs at compile time - has the highest CI/CD vulnerability rate in our dataset at 77.9%. Nearly four out of five Rust repos have vulnerable pipelines. The paradox makes sense once you understand the toolchain: Rust’s cross-compilation needs drive complex CI matrices with heavy third-party action dependencies, concentrated on a handful of single-maintainer projects. Your Rust code is memory-safe. Your Rust CI is not.

Fix It - SHA Pinning, Least Privilege, and the 5-Minute Security Upgrade

Every finding class Runner Guard detects has a fix. Most of them take less than five minutes. SHA pinning replaces mutable tags with immutable commit hashes. Permission scoping replaces blanket write access with explicit per-job grants. Input validation prevents untrusted data from reaching shell execution. We walk through every fix, with before-and-after examples and the Dependabot configuration that keeps SHA pins current.

What’s Next - Fixing 50K Repos, One PR at a Time

Finding vulnerabilities is step one. We’re submitting pull requests to fix them - automated remediation at a scale that manual security review can’t match. Each PR includes the finding, the risk, and the exact fix. But 809 archived repos and hundreds of abandoned projects will never accept a PR. This is the responsible disclosure story, and a frank assessment of what can and can’t be fixed at ecosystem scale.

Beyond Snapshots - Why CI/CD Security Needs Continuous Monitoring

This entire research campaign is a single snapshot. By the time you read this, repositories have added new workflows, changed action versions, and introduced new injection sinks. A scan captures a moment - continuous monitoring captures the trajectory. We built Runner Guard as the free, open-source entry point. ThreatCert’s Continuous Evolving Risk Telemetry platform runs every 60 minutes, correlating CI/CD findings with network, DNS, TLS, dark web, and social media intelligence across your entire vendor chain.

Methodology

Runner Guard scans GitHub Actions workflow files (.github/workflows/*.yml) for 14 security rule categories covering injection, supply chain trust, permission misconfigurations, and AI agent configuration risks. The scan targeted the 50,012 most-starred public repositories on GitHub as of March 2026, using GitHub’s Contents API to retrieve workflow files without cloning repositories. 49,849 completed successfully (99.7%). 211 were excluded due to unparseable YAML.

Findings are classified by severity (critical, high, medium) based on exploitability and potential impact. Rule definitions, detection logic, and the scanner itself are open-source at Runner Guard.

This is first-party research. We built the scanner. We ran the scan. We’re submitting the PRs to fix what we found. And the tool is free for anyone to run against their own repos.



Scan your repos today. Runner Guard is Vigilant’s free, open-source CI/CD security scanner - the same tool that powered this research. Install it in under a minute:

brew install Vigilant-LLC/tap/runner-guard
runner-guard scan github.com/owner/repo

14 security rules. Zero configuration. One command.


The detection capabilities described above are active across Vigilant client environments today. If your organization wants to assess its current exposure to this attack chain — or understand how our managed services align to your specific environment — contact your Vigilant account team or reach us at vigilantdefense.com.

This event reinforces what Vigilant has long asserted:

Nation-state adversaries are not probing our networks — they are preparing battlefields.

Stay alert, stay aggressive, stay Vigilant,

Chris Nyhuis

CEO, Vigilant

Background

Chris Nyhuis is Co-Founder and CEO of Vigilant, a cybersecurity services and technology company that specializes in identifying, isolating, and mitigating threats with unprecedented precision. A sought-after speaker and instructor on cyber warfare, Chris draws on extensive experience with today's cybersecurity landscape and the intricacies of the cyber threats facing organizations. Chris holds multiple patents which ensure data integrity for trusted evolving detection systems.
