Shadow AI Discovery
Techniques for discovering unauthorized AI deployments within organizations, including network scanning, API traffic analysis, cloud resource enumeration, and SaaS inventory assessment.
Shadow AI refers to the use of AI tools, services, and models within an organization without the knowledge, approval, or oversight of IT, security, or governance teams. Just as shadow IT created unmanaged risk in previous decades, shadow AI introduces AI-specific risks including data leakage, compliance violations, and unmonitored attack surface expansion. For red teamers, shadow AI discovery is both a reconnaissance technique and a valuable assessment deliverable.
The Shadow AI Landscape
Categories of Shadow AI
| Category | Description | Risk Level | Examples |
|---|---|---|---|
| Direct AI service use | Employees using external AI services directly | High | ChatGPT, Claude, Gemini for work tasks |
| AI-embedded SaaS | Existing SaaS tools that have added AI features | Medium | Notion AI, Grammarly, Microsoft Copilot |
| Self-hosted models | Teams running their own models on company infrastructure | High | Open-weight models on internal GPU servers |
| API integrations | Developers integrating AI APIs into applications without approval | Critical | OpenAI API keys in production code |
| Browser extensions | AI-powered browser extensions with access to work content | Medium | AI writing assistants, code completion extensions |
| Mobile AI apps | AI applications on personal or company mobile devices | Medium | AI assistants on phones used for work |
Why Shadow AI Matters for Red Teams
| Risk Dimension | Impact |
|---|---|
| Data leakage | Proprietary data, source code, customer information sent to external AI services |
| Compliance violation | AI processing of regulated data (HIPAA, PCI, GDPR) without required controls |
| Expanded attack surface | Unmonitored AI endpoints vulnerable to prompt injection, data extraction |
| Supply chain risk | Unvetted AI providers with unknown security posture |
| Intellectual property | Trade secrets and competitive information potentially used for model training |
| Audit gaps | AI-influenced decisions with no documentation or oversight |
Network-Based Discovery
Traffic Analysis for AI Service Detection
Monitor network traffic for connections to known AI service endpoints:
| AI Provider | Domains to Monitor | Traffic Patterns |
|---|---|---|
| OpenAI | api.openai.com, chat.openai.com, cdn.oaiusercontent.com | HTTPS POST with JSON bodies, streaming SSE responses |
| Anthropic | api.anthropic.com, claude.ai | HTTPS POST with JSON bodies |
| Google | generativelanguage.googleapis.com, gemini.google.com, aistudio.google.com | HTTPS POST/GET |
| Azure OpenAI | *.openai.azure.com | HTTPS POST with API-key headers |
| AWS Bedrock | bedrock-runtime.*.amazonaws.com | AWS SigV4 authenticated requests |
| Hugging Face | api-inference.huggingface.co, huggingface.co | HTTPS POST/GET |
| Replicate | api.replicate.com | HTTPS POST with token auth |
| Together AI | api.together.xyz | HTTPS POST with API-key headers |
| Mistral | api.mistral.ai | HTTPS POST with API-key headers |
| Cohere | api.cohere.ai | HTTPS POST with API-key headers |
Detection Techniques
DNS query analysis
Monitor DNS queries from corporate networks for AI service domains. This works even when traffic content is encrypted.
Implementation:
- Configure DNS logging on corporate DNS resolvers
- Build watchlists of known AI service domains
- Alert on new domains matching AI service patterns
- Track query volume and source hosts
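The watchlist step above can be sketched as a small matcher over DNS query logs. The domains and the `timestamp source_ip query_name` log format are illustrative assumptions; adapt them to your resolver's actual export.

```python
# Watchlist of known AI service domains (illustrative subset, not exhaustive).
AI_DOMAIN_WATCHLIST = {
    "api.openai.com", "chat.openai.com",
    "api.anthropic.com", "claude.ai",
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
}

def is_ai_domain(qname: str, watchlist=AI_DOMAIN_WATCHLIST) -> bool:
    """True if the query name matches a watchlist entry exactly or as a subdomain."""
    qname = qname.rstrip(".").lower()
    return any(qname == d or qname.endswith("." + d) for d in watchlist)

def flag_queries(log_lines):
    """Assumed log format per line: '<timestamp> <source_ip> <query_name>'."""
    hits = []
    for line in log_lines:
        _ts, src, qname = line.split()
        if is_ai_domain(qname):
            hits.append((src, qname.rstrip(".").lower()))
    return hits
```

Suffix matching (rather than substring matching) avoids false positives such as `notclaude.ai` matching the `claude.ai` entry.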
TLS certificate inspection
On networks with TLS inspection capability, examine connection metadata for AI service indicators.
Indicators:
- SNI (Server Name Indication) values matching AI service domains
- Certificate subjects and issuers associated with AI providers
- Connection patterns (long-lived SSE connections typical of streaming AI responses)
Proxy log analysis
If the organization uses web proxies, analyze proxy logs for AI service connections.
Analysis dimensions:
- Volume of requests to AI service endpoints
- Source users and departments
- Request sizes (large requests suggest document or code submission)
- Time patterns (sustained use vs one-time exploration)
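A minimal aggregation over those dimensions might look like the following. The `(user, domain, bytes_sent)` record shape and the 100 KB upload threshold are assumed simplifications of real proxy log fields.

```python
from collections import defaultdict

# Assumed threshold: requests this large likely carry documents or code.
LARGE_UPLOAD_BYTES = 100_000

def summarize(records, ai_domains):
    """Aggregate proxy records per (user, AI domain): request count, bytes, large uploads."""
    stats = defaultdict(lambda: {"requests": 0, "bytes_sent": 0, "large_uploads": 0})
    for user, domain, bytes_sent in records:
        if domain not in ai_domains:
            continue  # only AI service destinations are of interest here
        s = stats[(user, domain)]
        s["requests"] += 1
        s["bytes_sent"] += bytes_sent
        if bytes_sent >= LARGE_UPLOAD_BYTES:
            s["large_uploads"] += 1
    return dict(stats)
```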
Traffic pattern recognition
Even without decrypting traffic, AI service usage creates distinctive patterns.
Distinguishing patterns:
- Long-held HTTP/2 connections with server-sent events (streaming responses)
- Asymmetric request/response sizes (small request, large response for generation)
- Periodic polling patterns (checking async generation results)
- Large upload followed by series of queries (document upload then Q&A)
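The asymmetric-streaming pattern can be expressed as a simple heuristic over per-connection metadata. The field names and thresholds below are illustrative assumptions, not tuned detection rules.

```python
from dataclasses import dataclass

@dataclass
class ConnRecord:
    duration_s: float  # connection lifetime
    bytes_out: int     # client -> server
    bytes_in: int      # server -> client

def looks_like_streaming_generation(c: ConnRecord) -> bool:
    """Long-lived connection with a strongly asymmetric response (SSE-style).

    Thresholds are assumptions: >=10 s lifetime, response at least 5x the request.
    """
    if c.duration_s < 10 or c.bytes_out == 0:
        return False
    return c.bytes_in / c.bytes_out >= 5  # small prompt, large generated response
```

In practice this heuristic would be one signal among several; short API calls and non-streaming usage will not trip it.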
Cloud Resource Enumeration
Discovering AI Resources in Cloud Environments
Organizations often discover that teams have provisioned AI resources in cloud accounts without governance team awareness:
| Cloud Provider | AI Services to Enumerate | Discovery Method |
|---|---|---|
| AWS | SageMaker, Bedrock, Comprehend, Rekognition, Lex, Polly | AWS Config, CloudTrail, resource inventory APIs |
| Azure | Azure OpenAI, Cognitive Services, ML Studio, AI Search | Azure Resource Graph, Activity Log |
| GCP | Vertex AI, AI Platform, Cloud Vision, Natural Language, Gemini API | Cloud Asset Inventory, Audit Logs |
Cloud Discovery Checklist
| Discovery Activity | What to Look For |
|---|---|
| Service enablement | Which AI services are enabled across all accounts/subscriptions/projects? |
| Resource inventory | Deployed AI models, endpoints, training jobs, datasets |
| API key inventory | Active API keys for AI services, key age, last usage |
| IAM analysis | Who has permissions to create and manage AI resources? |
| Cost analysis | AI service spending may reveal undiscovered usage |
| Network analysis | VPC endpoints or private links to AI services |
| Container analysis | Docker images containing ML frameworks or model weights |
| GPU allocation | GPU instances that may be running inference workloads |
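The resource-inventory activity above can be sketched as a filter over an exported AWS Config inventory. The dict shape and some of the resource-type prefixes are assumptions; verify the exact type strings against AWS Config's supported resource types for your accounts.

```python
# AI-related AWS Config resource-type prefixes (assumed list; extend as needed).
AI_RESOURCE_PREFIXES = (
    "AWS::SageMaker::",
    "AWS::Bedrock::",
    "AWS::Comprehend::",
    "AWS::Rekognition::",
    "AWS::Lex::",
)

def flag_ai_resources(inventory):
    """Return inventory entries whose resourceType matches an AI service prefix."""
    return [r for r in inventory
            if r["resourceType"].startswith(AI_RESOURCE_PREFIXES)]
```

The same prefix-filter pattern applies to Azure Resource Graph and GCP Cloud Asset Inventory exports, with provider-specific type strings.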
Identifying Self-Hosted Models
Teams may run open-weight models on company infrastructure without governance awareness:
| Indicator | Detection Method |
|---|---|
| GPU utilization on non-designated servers | Infrastructure monitoring, GPU metrics |
| ML framework processes (PyTorch, TensorFlow, vLLM, Ollama) | Process monitoring, container scanning |
| Large model weight files on shared storage | File system scanning for .safetensors, .gguf, .bin files |
| Inference server processes (vLLM, TGI, Ollama, LocalAI) | Port scanning, process enumeration |
| Internal API documentation referencing AI endpoints | Documentation search, wiki scanning |
| Hugging Face model cache directories | File system scanning for .cache/huggingface |
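Two of the file-system indicators above (weight files by extension and size, Hugging Face cache directories) can be combined in one scan. The extensions and the 500 MB default threshold are assumptions; `.bin` in particular needs the size filter to cut false positives.

```python
import os

MODEL_EXTENSIONS = (".safetensors", ".gguf", ".bin")

def find_model_files(root, min_size_bytes=500 * 1024 * 1024):
    """Walk root for likely model-weight files and Hugging Face cache directories."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".cache/huggingface" in dirpath.replace(os.sep, "/"):
            hits.append((dirpath, "hf-cache"))
            dirnames[:] = []  # the cache itself is evidence; no need to descend
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.endswith(MODEL_EXTENSIONS) and os.path.getsize(path) >= min_size_bytes:
                hits.append((path, "weights"))
    return hits
```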
SaaS and Application Discovery
AI-Embedded SaaS Inventory
Many SaaS tools have added AI features, sometimes enabled by default. Discovery requires systematic inventory:
| SaaS Category | AI Features to Check | Data Risk |
|---|---|---|
| Productivity (Google Workspace, Microsoft 365) | Gemini, Copilot integrations | Document content processed by AI |
| Communication (Slack, Teams) | AI summarization, search, bots | Message content processed by AI |
| Development (GitHub, GitLab) | Copilot, code suggestions | Source code processed by AI |
| CRM (Salesforce, HubSpot) | Einstein AI, ChatSpot | Customer data processed by AI |
| Project management (Notion, Asana, Monday) | AI assistants, auto-summaries | Project details processed by AI |
| Design (Figma, Canva) | AI generation features | Design assets processed by AI |
| HR (Workday, BambooHR) | AI screening, analytics | Employee data processed by AI |
| Support (Zendesk, Intercom) | AI chatbots, auto-responses | Customer interactions processed by AI |
Browser Extension Audit
| Discovery Method | Implementation |
|---|---|
| Endpoint management | Query MDM/endpoint management for installed browser extensions |
| Group policy | Review extension whitelists/blocklists for AI-related extensions |
| Network analysis | Monitor for traffic to known AI extension backend services |
| User survey | Ask users about AI tool usage (combine with technical discovery) |
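An MDM extension export can be triaged with a manifest-level check like the one below. The keyword pattern and the "broad permission" set are illustrative assumptions against Chrome-style manifests.

```python
import json
import re

# Assumed indicators: AI-ish names plus permissions that expose work content.
AI_NAME_PATTERN = re.compile(r"\b(ai|gpt|copilot|assistant|chatbot)\b", re.I)
BROAD_PERMISSIONS = {"<all_urls>", "tabs", "clipboardRead", "webRequest"}

def audit_manifest(manifest_json: str):
    """Flag an extension manifest for AI-assistant risk indicators."""
    m = json.loads(manifest_json)
    perms = set(m.get("permissions", [])) | set(m.get("host_permissions", []))
    return {
        "name": m.get("name", ""),
        "ai_related": bool(AI_NAME_PATTERN.search(m.get("name", ""))),
        "broad_permissions": sorted(perms & BROAD_PERMISSIONS),
    }
```

Name matching alone is noisy; treat the output as a triage list to correlate with the network-analysis findings, not a verdict.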
Developer Tool Discovery
Developers often integrate AI services without security review:
| What to Search For | Where to Search | Tool |
|---|---|---|
| API keys for AI services | Source code repositories, environment variables, configuration files | Secret scanning tools (GitLeaks, TruffleHog) |
| AI SDK imports | Source code (Python: openai, anthropic, langchain; JS: openai, @anthropic-ai/sdk) | Code search, dependency analysis |
| AI service URLs | Configuration files, environment variables, deployment scripts | Grep/search across repositories |
| Model files | Shared storage, artifact repositories | File system scanning by extension and size |
| AI-related dependencies | requirements.txt, package.json, Pipfile | Dependency analysis tools |
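The key-and-import search can be approximated with a few regexes. The key prefixes shown (`sk-` for OpenAI, `sk-ant-` for Anthropic) are commonly observed formats, not a complete or authoritative list; dedicated tools such as GitLeaks or TruffleHog maintain far broader rule sets.

```python
import re

KEY_PATTERNS = {
    "openai_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "anthropic_key": re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}"),
}
# Python-style SDK imports; extend for JS require()/import as needed.
SDK_IMPORTS = re.compile(
    r"^\s*(?:import\s+(?:openai|anthropic|langchain)\b"
    r"|from\s+(?:openai|anthropic|langchain)\b)", re.M)

def scan_source(text):
    """Return finding labels for AI key patterns and SDK imports in source text."""
    findings = [name for name, pat in KEY_PATTERNS.items() if pat.search(text)]
    if SDK_IMPORTS.search(text):
        findings.append("ai_sdk_import")
    return findings
```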
Assessment Methodology
Shadow AI Discovery Engagement Framework
Scope and authorize
Define the scope of shadow AI discovery, including which networks, cloud accounts, and organizational units are included. Obtain authorization for all scanning and monitoring activities.
Passive discovery
Begin with non-intrusive techniques that analyze existing data sources: DNS logs, proxy logs, cloud billing, SaaS inventories, and code repositories. This phase identifies the most visible shadow AI usage.
Active discovery
Conduct active scanning and enumeration: cloud resource inventory, endpoint scanning, container analysis, and developer tool assessment. This phase identifies resources that passive monitoring may miss.
Risk assessment
For each discovered shadow AI deployment, assess the risk based on data sensitivity, regulatory implications, and security posture. Prioritize findings by risk level.
Reporting and remediation
Document all discoveries with risk assessments and remediation recommendations. Include both immediate actions (blocking high-risk usage) and strategic recommendations (building an approved AI toolkit).
Risk Scoring for Shadow AI Findings
| Factor | Score 1 (Low) | Score 3 (Medium) | Score 5 (High) |
|---|---|---|---|
| Data sensitivity | Non-sensitive internal data | Business confidential | Regulated data (PII, PHI, PCI) |
| Usage scale | Individual, occasional use | Team-level, regular use | Department or organization-wide |
| Service security | Reputable provider with enterprise security | Unknown provider, standard terms | No terms, data retention unclear |
| Governance gap | Not in inventory but low risk | No security review conducted | Violates existing policy |
| Regulatory exposure | No regulatory implications | Potential compliance concerns | Clear regulatory violation |
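The rubric above can be turned into a single number. Averaging the 1/3/5 factor scores, and the High/Medium cut points below, are one simple aggregation choice assumed for illustration, not a standard.

```python
FACTORS = ("data_sensitivity", "usage_scale", "service_security",
           "governance_gap", "regulatory_exposure")

def risk_score(scores: dict) -> float:
    """Average the five 1/3/5 factor scores into a 1-5 finding score."""
    assert set(scores) == set(FACTORS)
    assert all(v in (1, 3, 5) for v in scores.values())
    return sum(scores.values()) / len(FACTORS)

def risk_label(score: float) -> str:
    # Assumed cut points: >=4 High, >=2.5 Medium, else Low.
    return "High" if score >= 4 else "Medium" if score >= 2.5 else "Low"
```

Some teams instead take the maximum factor score so that a single regulatory violation cannot be averaged away; choose the aggregation to match how findings will be prioritized.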
Remediation Recommendations
Strategic Recommendations
| Recommendation | Purpose | Implementation |
|---|---|---|
| Approved AI catalog | Provide sanctioned alternatives to shadow AI | Evaluate and approve AI tools for common use cases |
| AI acceptable use policy | Set clear expectations for AI tool usage | Define what data can be processed by which AI services |
| Technical controls | Prevent unauthorized AI service access | URL filtering, DLP for AI services, API gateway controls |
| Monitoring program | Continuously detect new shadow AI usage | Ongoing network monitoring, periodic cloud enumeration |
| Training and awareness | Help employees understand shadow AI risks | Regular training on AI data risks, approved alternatives |
Immediate Actions for High-Risk Discoveries
| Finding | Immediate Action |
|---|---|
| Regulated data sent to unauthorized AI service | Assess breach notification obligations, block access, notify compliance |
| API keys for AI services embedded in source code | Rotate keys immediately, add to secret scanning rules |
| Self-hosted models processing customer data | Isolate from production data, assess data handling |
| AI browser extensions with broad permissions | Deploy removal via endpoint management |
| Unauthorized AI chatbots in customer-facing applications | Remove from production, assess customer data exposure |
Shadow AI discovery is becoming a standard component of AI red team engagements. Organizations that proactively discover and govern their AI usage are far better positioned to manage AI risk than those that wait for incidents to reveal unauthorized deployments.