
    We Have 2,000+ Service Accounts and No One Knows Who Owns Them - The Multi-Cloud IAM Crisis Nobody Wants to Admit

    February 18, 2026
    6 min read
There’s a certain kind of panic that only shows up in large cloud environments. It doesn’t happen on day one. Or even year one.

It happens around year three. After the third acquisition. After the fifth Kubernetes cluster. After someone says, “Wait… how many service accounts do we actually have?”

In this case? 2,000+ machine identities across AWS, Azure, and GCP. No centralized inventory. No consistent rotation policy. A mix of IAM roles, service principals, workload identities, and Kubernetes pod identities.

That’s not just “a little messy.” That’s a governance time bomb.

And the constraints are brutal:

- Automated discovery of machine identities
- Rotation without app downtime
- Least privilege recommendations based on actual usage
- CI/CD integration (Jenkins, GitHub Actions)
- API-first architecture
- No agents
- No code changes
- No six-month science projects

Also: not PAM for humans. This is machine identity lifecycle at scale.

This is where most enterprises quietly stall out. Let’s unpack what’s actually realistic.

---

## The First Hard Truth: There Is No Magic Unified Button

You’ve already seen it in your evaluations.

**CyberArk**? Powerful. Expensive. Feels like bringing a tank to a knife fight.

**HashiCorp Vault**? Solid. But someone has to run it. HA clusters. Storage backends. Secret engines. Policy drift. That’s a team, not a tool.

Cloud-native secrets managers like **AWS Secrets Manager** and **Azure Key Vault**? Functional, but fragmented. You end up managing three different control planes. And fragmentation is exactly what got you here.

The industry still hasn’t produced a clean, vendor-neutral “multi-cloud machine identity governor” that requires zero operational lift.

So what are enterprises actually doing? They’re not solving it with one product. They’re solving it with architecture.


---

## The Pattern That’s Quietly Winning: Kill Static Credentials

Look at the strongest signal in the discussion:

> Workload identity + federation for cross-cloud calls. Avoid storing keys or static credentials.

That’s not a product recommendation. That’s a strategy.

Instead of:

- Rotating long-lived access keys
- Tracking secret sprawl
- Managing password lifecycle

They’re eliminating the problem entirely.

In Kubernetes, this usually means:

- Use OIDC federation
- Let workloads assume roles dynamically
- Exchange short-lived tokens across clouds

AWS supports this with IAM roles for service accounts. GCP supports workload identity federation. Azure supports federated credentials for service principals.

When you wire them together correctly, pods don’t store credentials. They request them. They expire. They refresh automatically.

No rotation downtime. No agent. No code changes if you’re already using SDK-default credential providers.

That’s why this approach scales.

---

## But What About Discovery?

This is where things get real. You don’t have centralized inventory. Before you optimize rotation, you need visibility.

Enterprises are handling this in three main ways:

### 1. Cloud-Native Inventory + Aggregation

Pull identity data from:

- AWS IAM
- Microsoft Entra ID (formerly Azure AD)
- GCP IAM
- Kubernetes API

Feed it into a central data platform (even something as simple as scheduled exports into a warehouse).

Not glamorous. But it gives you:

- A list of machine identities
- Role bindings
- Last used timestamps
- Attached policies

You can build least-privilege recommendations on top of this.

Is it turnkey? No. Is it faster than deploying a massive PAM platform? Yes.

---

### 2. Policy-as-Code + Drift Monitoring

Enterprises leaning into API-first models are treating IAM like infrastructure.

Terraform state + cloud logs + usage metrics = effective permissions map.

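To make the “effective permissions map” idea concrete, here is a minimal sketch in Python. It assumes you have already reduced your declared state (for example, policies parsed out of Terraform state) and your audit logs to plain sets of action names per identity. The function names, the dictionary shapes, and the wildcard handling via `fnmatch` are illustrative assumptions, not any cloud provider’s API.

```python
from fnmatch import fnmatchcase


def unused_grants(granted, used):
    """Granted actions (wildcards allowed) that no observed action matched."""
    used_lower = [action.lower() for action in used]
    return sorted(
        grant for grant in granted
        if not any(fnmatchcase(action, grant.lower()) for action in used_lower)
    )


def uncovered_usage(granted, used):
    """Observed actions that no declared grant covers: a hint of drift."""
    grants_lower = [grant.lower() for grant in granted]
    return sorted(
        action for action in used
        if not any(fnmatchcase(action.lower(), grant) for grant in grants_lower)
    )


def permissions_map(declared, observed):
    """Build {identity: {"unused": [...], "undeclared": [...]}}.

    declared: identity -> set of granted actions (hypothetical export format)
    observed: identity -> set of actions seen in logs (hypothetical format)
    """
    report = {}
    for identity, granted in declared.items():
        used = observed.get(identity, set())
        report[identity] = {
            "unused": unused_grants(granted, used),
            "undeclared": uncovered_usage(granted, used),
        }
    return report


if __name__ == "__main__":
    declared = {"payments-api": {"s3:GetObject", "s3:PutObject", "sqs:*"}}
    observed = {"payments-api": {"s3:GetObject", "sqs:SendMessage"}}
    for identity, findings in permissions_map(declared, observed).items():
        print(identity, findings)
```

The “unused” list is a set of candidates for review, not a safe-to-delete list: an action that only fires during quarterly jobs or incident response will look unused in a short log window.
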
From there:

- Identify unused actions
- Detect over-privileged roles
- Automatically open pull requests with reduced policies

It’s not fully autonomous, but it scales better than manual review. And it fits your “API-first” requirement.

---

### 3. Identity Graph Tools (Emerging Category)

There’s a newer class of tools building identity graphs across clouds.

They ingest:

- Role assignments
- Trust policies
- Federation relationships
- Usage telemetry

Then surface:

- Excess privilege
- Cross-cloud trust misconfigurations
- Dormant service accounts

These are lighter weight than full PAM suites and often agentless.

The tradeoff? They’re governance visibility tools first. Lifecycle automation sometimes comes second.

---

## Why Vault Feels Heavy (And Why It Still Wins Sometimes)

Vault gets dismissed as “operational overhead.” And it is.

But the reason some enterprises still choose it isn’t secrets storage. It’s dynamic credentials.

Vault can:

- Generate short-lived database credentials
- Issue temporary cloud access tokens
- Enforce TTL and renewal policies

If you’re willing to run it properly, it becomes a machine identity broker. But that’s a commitment. If you don’t have staff for it, it will hurt.

That’s the real dividing line. Not features. Staffing appetite.

---

## The CI/CD Integration Reality

You need Jenkins and GitHub Actions integration.

Most enterprises solve this by:

- Using OIDC federation from GitHub Actions into cloud roles
- Letting Jenkins assume cloud roles dynamically
- Removing static CI credentials entirely

This reduces:

- Credential leakage risk
- Manual rotation cycles
- Pipeline secret sprawl

And it aligns with your “no code changes” constraint. Because modern SDKs already support environment-based token injection.

---

## What Enterprises Actually Deploy in 2026

Here’s the honest answer. They combine:

1. Workload identity federation everywhere possible
2. Short-lived credentials instead of rotation
3. Centralized identity inventory reporting
4. Policy analytics tooling
5. Minimal secret managers only where federation isn’t possible

They don’t unify everything under one mega-platform. They standardize patterns.

The shift isn’t “which vendor.” It’s “how do we eliminate static machine credentials.”

---

## What You Probably Shouldn’t Do

Don’t:

- Try to centralize all secrets into one vault in six months
- Force applications to change authentication patterns
- Deploy agents across 2,000 workloads
- Attempt to manually right-size every IAM policy

That’s how timelines explode. Your constraints are clear. So the design has to respect them.

---

## If This Were My Environment

With 2,000+ service accounts across three clouds, I’d prioritize:

**Phase 1 (90 days):**

- Inventory machine identities across clouds
- Enable workload identity federation for new workloads
- Remove new static credentials from CI/CD

**Phase 2 (Next 90 days):**

- Replace long-lived keys with federated tokens where possible
- Introduce usage-based policy trimming
- Build dashboards for identity ownership

**Phase 3:**

- Evaluate whether a lightweight identity governance platform adds value

Notice what’s missing? Massive platform rollout. Because your real enemy isn’t lack of tooling. It’s sprawl.

---

## The Uncomfortable Conclusion

Multi-cloud machine identity governance isn’t a product problem. It’s an architecture problem.

Enterprises that win here:

- Stop rotating secrets and start eliminating them
- Stop centralizing manually and start federating automatically
- Stop thinking “vault everything” and start thinking “trust exchange”

The future isn’t better password rotation. It’s fewer passwords.

And the teams that internalize that early? They’re the ones who stop waking up at 3 a.m. wondering which forgotten service account still has admin access to production.
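
That 3 a.m. question becomes a query once the inventory from the discovery section exists. The sketch below is a rough illustration in Python, not a finished tool. The `MachineIdentity` record, the 90-day idle threshold, and the name-based “looks privileged” heuristic are all assumptions made for the example; a real check would evaluate the actual permissions behind each policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class MachineIdentity:
    """One normalized inventory row (hypothetical schema)."""
    cloud: str                      # e.g. "aws", "azure", "gcp", "k8s"
    name: str
    owner: Optional[str]            # None when nobody has claimed it
    last_used: Optional[datetime]   # None when no usage was ever recorded
    policies: tuple[str, ...]       # attached policy or role names


# Assumption: crude name matching stands in for real permission analysis.
ADMIN_MARKERS = ("admin", "owner")


def is_dormant(identity, now, max_idle=timedelta(days=90)):
    """True if the identity was never used or has been idle past max_idle."""
    return identity.last_used is None or now - identity.last_used > max_idle


def looks_privileged(identity):
    """True if any attached policy name contains an admin-like marker."""
    return any(
        marker in policy.lower()
        for policy in identity.policies
        for marker in ADMIN_MARKERS
    )


def risky_identities(inventory, now, max_idle=timedelta(days=90)):
    """Dormant identities that still hold admin-like access."""
    return [
        identity for identity in inventory
        if is_dormant(identity, now, max_idle) and looks_privileged(identity)
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    inventory = [
        MachineIdentity("aws", "legacy-deployer", None,
                        now - timedelta(days=400), ("AdministratorAccess",)),
        MachineIdentity("gcp", "billing-export", "finance",
                        now - timedelta(days=2), ("roles/owner",)),
    ]
    for identity in risky_identities(inventory, now):
        print(identity.cloud, identity.name, identity.owner or "unowned")
```
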