ChatGPT Data Leak (March 2023)
Analysis of the March 2023 ChatGPT incident where a Redis client library bug caused users to see other users' conversation titles, partial chat history, and payment information. Covers root cause, impact, and lessons for AI application security.
On March 20, 2023, OpenAI temporarily took ChatGPT offline after users reported seeing other users' conversation titles in their chat sidebar. Investigation revealed a deeper issue: under specific conditions, users could also see another user's first and last name, email address, payment address, the last four digits of a credit card number, and credit card expiration date. The root cause was a bug in an open-source Redis client library, not in the AI model itself.
Incident Timeline
| Date | Event |
|---|---|
| March 20, 2023 (morning) | Users begin reporting seeing unfamiliar conversation titles in their ChatGPT sidebar |
| March 20, 2023 (afternoon) | Reports spread on social media with screenshots showing other users' conversation history titles |
| March 20, 2023 (evening) | OpenAI takes ChatGPT offline for investigation |
| March 21, 2023 | OpenAI identifies the root cause in the Redis client library redis-py |
| March 22, 2023 | OpenAI brings ChatGPT back online with the fix deployed |
| March 24, 2023 | OpenAI publishes a detailed incident report |
| March 24, 2023 | OpenAI discloses that payment information was also exposed for approximately 1.2% of ChatGPT Plus subscribers |
Root Cause Analysis
Immediate Cause
The Redis client library redis-py had a bug in its connection handling. When a request was cancelled after it had been sent to the Redis server but before the response was read back, the connection was returned to the shared pool with an unread response still buffered. The next request served by that connection could then receive the stale response, delivering one user's data to another.
Technical Mechanism
Normal flow:
User A requests conversation list → Redis connection 1 → Response A → User A sees their data
Bug flow:
User A requests conversation list → Redis connection 1 → Request cancelled
User B requests conversation list → Redis connection 1 (reused) → Stale Response A → User B sees User A's data
The race condition occurred specifically during a period of high server load when OpenAI was scaling their Redis cluster. Request cancellations were more frequent during this period, increasing the probability of the bug triggering.
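The two flows above can be sketched as a minimal simulation. The `Connection` class below is an illustrative stand-in for a pooled Redis connection whose responses arrive in order; it is not any real client's API:

```python
from collections import deque

class Connection:
    """Fake pooled connection: server replies queue up in arrival order."""
    def __init__(self):
        self.responses = deque()

    def send(self, user, query):
        # The "server" answers every request; the reply sits buffered
        # until the client reads it.
        self.responses.append(f"data for {user}")

    def read(self):
        return self.responses.popleft()

# Normal flow: User A sends a request and reads their own response.
conn = Connection()
conn.send("A", "LRANGE conversations")
assert conn.read() == "data for A"

# Bug flow: A's request is cancelled AFTER it reached the server but
# BEFORE the client read the reply, so the stale reply stays buffered.
conn.send("A", "LRANGE conversations")
# ... client-side cancellation: the read never happens ...

# The pooled connection is reused for User B, who reads the stale reply.
conn.send("B", "LRANGE conversations")
leaked = conn.read()
print(leaked)  # "data for A" -- User B sees User A's data
```

The simulation makes the core point concrete: the leak requires no attacker, only a request/response pairing that falls out of sync on a shared connection.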
Contributing Factors
| Level | Contributing Factor |
|---|---|
| Infrastructure | Redis client library bug in open-source dependency |
| Application | No data isolation validation between cache and response delivery |
| Architecture | Shared Redis connection pool serving multiple users without per-response integrity checks |
| Operational | Scaling operations during a period of high traffic increased connection reuse and cancellation rates |
| Testing | Race condition not covered by existing test suites |
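A defensive pattern that addresses the immediate cause, similar in spirit to how the cancellation issue was ultimately handled in redis-py's asyncio client, is to discard rather than reuse any connection whose request was cancelled mid-flight. The `pool` interface below (`acquire`/`release`/`discard`) is illustrative, not a real library API:

```python
import asyncio

async def execute(pool, command):
    """Run one command over a pooled connection. If the task is
    cancelled mid-flight, discard the connection instead of returning
    it to the pool: an unread response would otherwise desynchronize
    request/response pairing for the next borrower.
    Note: `pool` is a hypothetical interface for illustration."""
    conn = await pool.acquire()
    try:
        await conn.send(command)
        response = await conn.read_response()
    except (asyncio.CancelledError, Exception):
        # Never reuse a connection whose protocol state is uncertain.
        await pool.discard(conn)
        raise
    await pool.release(conn)
    return response
```

The design choice is to trade a dropped connection (cheap to reopen) for the guarantee that a pooled connection is always in a known request/response state.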
What Was NOT the Cause
This incident was explicitly not caused by:
- A vulnerability in the GPT model
- Prompt injection or jailbreaking
- Intentional data collection or sharing
- A breach of OpenAI's systems by an external attacker
Impact Assessment
Data Exposure
| Data Type | Scope | Severity |
|---|---|---|
| Conversation titles | Unknown number of users (visible in sidebar) | Medium -- titles may reveal topics but not full conversations |
| First message of conversations | Small number of users during the active bug window | High -- conversation content may contain sensitive information |
| Payment information (name, email, last 4 digits of CC, expiry) | ~1.2% of ChatGPT Plus subscribers active during a 9-hour window | Critical -- financial PII exposure |
Broader Impact
- User trust: Significant erosion of trust in ChatGPT's privacy protections. Users who shared sensitive information in conversations questioned whether that data was secure.
- Regulatory attention: The incident attracted attention from privacy regulators, particularly in the EU, where it contributed to Italy's temporary ban on ChatGPT.
- Industry-wide effect: Raised awareness that AI applications face the same infrastructure security challenges as traditional web applications, plus additional privacy risks from conversation data.
Lessons Learned
For AI Application Developers
- Cache isolation is critical. Multi-tenant AI applications that cache user data must implement strict cache isolation. Every cache response should be validated against the requesting user's identity before delivery.
- Conversation data is PII. Chat conversations with an AI system are sensitive personal data. They may contain health information, financial details, legal questions, and personal opinions. Treat conversation storage with the same rigor as any other PII database.
- Dependency security matters. The vulnerability was in a third-party open-source library, not in OpenAI's code. AI applications must audit their dependency chains for security issues, especially in components that handle user data.
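The cache-isolation lesson can be enforced mechanically: tag every cached value with its owner and verify the requester before returning it. The wrapper below is a minimal sketch of this pattern; the class and method names are illustrative, not a real library's API:

```python
class OwnerTaggedCache:
    """Cache wrapper that refuses to serve an entry to anyone but its
    owner. A mismatch fails closed, since it indicates either cache
    corruption or a bug upstream."""
    def __init__(self):
        self._store = {}

    def put(self, key, owner_id, value):
        self._store[key] = (owner_id, value)

    def get(self, key, requester_id):
        entry = self._store.get(key)
        if entry is None:
            return None
        owner_id, value = entry
        if owner_id != requester_id:
            raise PermissionError(
                f"cache entry owned by {owner_id} requested by {requester_id}"
            )
        return value

# Usage: the mismatch raises instead of silently leaking.
cache = OwnerTaggedCache()
cache.put("conv:123", "user_a", ["My tax questions"])
assert cache.get("conv:123", "user_a") == ["My tax questions"]
```

Had an equivalent check sat between the Redis cache and response delivery, the stale response would have been rejected rather than shown to the wrong user.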
For Red Teams
This incident reveals categories of tests that should be included in AI application security assessments:
| Test Category | Specific Test |
|---|---|
| Session isolation | Can one user's session data leak to another user through caching, connection pooling, or shared state? |
| Error handling | What happens when requests are cancelled, time out, or fail? Are error responses properly scoped to the requesting user? |
| Concurrent access | Under high load, do race conditions expose cross-user data? |
| Payment data isolation | Is payment information accessible through any pathway other than the intended payment management interface? |
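The session-isolation and concurrent-access rows above can be combined into one probe: hammer the target with concurrent requests as different users and verify every response belongs to its requester. The harness below is a sketch; `fetch(user)` stands for a hypothetical client call against the target application that returns `(owner_id, payload)`:

```python
import concurrent.futures

def check_session_isolation(fetch, users, rounds=50):
    """Fire concurrent requests as different users and verify each
    response belongs to the user who asked for it.
    `fetch` is an assumed test-client callable, not a real API.
    Returns a list of (requester, actual_owner) pairs; an empty list
    means no cross-user leakage was observed."""
    leaks = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        for _ in range(rounds):
            futures = {pool.submit(fetch, u): u for u in users}
            for fut, user in futures.items():
                owner, _payload = fut.result()
                if owner != user:
                    leaks.append((user, owner))
    return leaks
```

Because race conditions like the redis-py bug are probabilistic, a passing run is weak evidence; high round counts and load applied during the run (mirroring the conditions under which this incident triggered) make the probe more meaningful.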
For Organizations Deploying AI
- Assume AI conversations contain sensitive data. Users will share confidential information with AI assistants. Design data handling accordingly.
- Monitor for data leakage. Implement anomaly detection that identifies when a user receives data that does not match their expected profile.
- Have an incident response plan for AI-specific incidents. AI data leaks have unique characteristics (conversation content, model behavior) that require specialized response procedures.
Relevance to Red Teaming
This incident underscores that AI red teaming must extend beyond model-layer attacks:
- Infrastructure testing should include cache isolation, connection pooling behavior, and session management under load
- Multi-tenant testing should verify that user data boundaries are enforced at every layer, not just the application layer
- Dependency auditing should cover all libraries in the AI application's dependency chain, with special attention to data-handling components
- Load testing should include security-relevant scenarios: do security properties hold under high concurrency?
Related Topics
- Incident Analysis Methodology - Framework applied in this analysis
- Cloud ML Platforms - Infrastructure security for AI deployments
- Full-Stack AI Exploitation - Multi-layer exploitation including infrastructure
- Lessons Learned - Cross-incident pattern analysis
References
- "March 20 ChatGPT Outage: Here's What Happened" - OpenAI Blog (March 24, 2023) - Official incident report from OpenAI
- "ChatGPT Bug Exposed Users' Conversation Histories and Payment Details" - Ars Technica (March 2023) - Detailed technical coverage of the incident
- "Italian Data Protection Authority Bans ChatGPT" - Garante per la protezione dei dati personali (March 2023) - Regulatory action partly influenced by this incident
- "redis-py Issue #2624" - GitHub (2023) - The specific bug report in the Redis client library