Incomplete Denylist to Cross-Site Scripting
Description
Incomplete Denylist to Cross-Site Scripting is a compound vulnerability where software uses a denylist-based protection mechanism to prevent XSS (Cross-Site Scripting) attacks, but the denylist fails to cover all possible XSS attack vectors. Denylists attempt to filter known dangerous patterns like <script> tags or javascript: URLs, but browsers parse web content in highly variable ways with numerous encoding options and tag variations. Since no denylist can anticipate all attack variations, this approach is fundamentally flawed for XSS prevention, allowing attackers to craft novel payloads that bypass the incomplete filter.
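The gap described above can be seen with a small sketch. The filter below is a hypothetical stand-in for a typical denylist (the name `strip_script_tags` and the payload list are assumptions made for illustration, not taken from any real product): it strips `<script>` blocks and nothing else, so payloads that never use a `<script>` tag pass through unchanged.

```python
import re

# Hypothetical denylist filter: strips <script>...</script> blocks and nothing else.
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)


def strip_script_tags(text: str) -> str:
    """Remove every <script> block from the input."""
    return SCRIPT_BLOCK.sub("", text)


payloads = [
    "<script>alert(1)</script>",
    '<img src=x onerror="alert(1)">',
    "<svg onload=alert(1)>",
]

# Payloads that pass through the filter completely unchanged
survivors = [p for p in payloads if strip_script_tags(p) == p]

for payload in survivors:
    print("not filtered:", payload)
```

Only the first payload is removed; the two event-handler payloads survive because the denylist never anticipated them.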
Risk
Denylist-based XSS prevention creates a false sense of security while leaving the application exploitable. Attackers actively research bypass techniques, including character encoding variations, HTML parsing quirks, event handlers, CSS-based attacks, and browser-specific behaviors, and each new browser feature can introduce vectors that an existing denylist does not cover. Denylists fail particularly against mixed-case tags (<ScRiPt>) when matching is case-sensitive, alternate encodings (HTML entities, or UTF-7 in legacy browsers), event handlers missing from the list (for example onfocus or ontoggle when only onerror and onload are blocked), and newer HTML5 elements and attributes. The result is a permanent catch-up cycle: attackers find a bypass, defenders patch the list, and the next bypass follows.
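Two of these failure modes can be sketched in a few lines. The example assumes a hypothetical keyword-removal filter (`remove_keywords` and its two-entry `DENYLIST` are illustrative names, not a real API): removing a keyword once can reassemble the same keyword from the surrounding text, and an entity-encoded payload never contains the literal keyword even though a browser decodes it back.

```python
import html
import re

DENYLIST = ("<script", "javascript:")  # illustrative entries only


def remove_keywords(text: str) -> str:
    """Delete each denylisted keyword in a single pass, case-insensitively."""
    for word in DENYLIST:
        text = re.sub(re.escape(word), "", text, flags=re.IGNORECASE)
    return text


# 1. Nested keyword: deleting the inner "<script" joins "<scr" and "ipt>".
nested = "<scr<scriptipt>alert(1)</script>"
reassembled = remove_keywords(nested)

# 2. Entity encoding: the filter sees "&#106;avascript:", the browser sees "javascript:".
encoded = '<a href="&#106;avascript:alert(1)">x</a>'
filtered = remove_keywords(encoded)
as_seen_by_browser = html.unescape(filtered)

print(reassembled)
print(as_seen_by_browser)
```

`html.unescape` is used here only to approximate how a browser decodes the attribute value; real HTML parsing has more decoding steps than this sketch covers.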
Solution
Use allowlist-based input validation instead of denylists: define exactly which characters and patterns are permitted and reject everything else. Apply context-aware output encoding appropriate to where the data is rendered (HTML body, attribute, JavaScript, CSS, URL). Use established libraries such as OWASP Java Encoder (output encoding) or DOMPurify (HTML sanitization) rather than custom filtering. Implement Content Security Policy (CSP) as defense in depth. Use templating engines with automatic encoding. Never rely solely on input filtering; always encode output. Consider structured data formats and APIs that do not interpret HTML.
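As a minimal sketch of the first two recommendations, the following combines an allowlist check with HTML output encoding using only the Python standard library. The username policy (`USERNAME_RE`) and the helper names are assumptions chosen for illustration; a real application would define its own rules and would normally rely on its template engine's automatic escaping.

```python
import html
import re

# Allowlist: usernames may contain only these characters (an assumed policy).
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(value: str) -> str:
    """Accept the value only if every character is on the allowlist."""
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("username contains characters outside the allowlist")
    return value


def render_comment(author: str, comment: str) -> str:
    """Encode both values for the HTML body context before building markup."""
    safe_author = html.escape(validate_username(author))
    safe_comment = html.escape(comment, quote=True)
    return f'<div class="comment"><b>{safe_author}</b>: {safe_comment}</div>'


print(render_comment("alice", '<img src=x onerror="alert(1)">'))
```

The comment text is not validated against a pattern because free text has no useful allowlist; it is made safe by encoding at output time, which is why the two measures are applied together.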
Common Consequences
| Scope | Impact | Details |
|---|---|---|
| Confidentiality | Read Application Data | XSS allows attackers to steal session cookies, credentials, and sensitive data. |
| Integrity | Modify Application Data | Attackers can modify page content, inject forms, or alter application behavior. |
| Access Control | Gain Privileges or Assume Identity | Session hijacking enables attackers to impersonate legitimate users. |
Example Code
Vulnerable Code
<?php
// Vulnerable: incomplete denylist that only removes exact <script> tags
function vulnerable_sanitize_script_only($input) {
    // Matches only the literal strings "<script>" and "</script>" (case-insensitive)
    $sanitized = preg_replace('/<script>/i', '', $input);
    $sanitized = preg_replace('/<\/script>/i', '', $sanitized);
    return $sanitized;
}
// Bypasses (mixed case such as <SCRIPT> is caught by the /i flag, but these are not):
// <script >alert(1)</script > - whitespace inside the tags
// <scr<script>ipt>alert(1)</scr</script>ipt> - nested tags reassemble after removal
// <img onerror="alert(1)" src=x> - event handlers
// <body onload="alert(1)"> - different tags

// Vulnerable: removes common dangerous patterns but misses others
function vulnerable_sanitize_incomplete($input) {
    $denylist = array(
        '<script', '</script>',
        'javascript:',
        'onclick', 'onerror', 'onload'
    );
    // str_ireplace is case-insensitive but makes only one pass per keyword
    $sanitized = str_ireplace($denylist, '', $input);
    return $sanitized;
}
// Bypasses:
// <img src=x ononerrorerror="alert(1)"> - nested keyword reassembles after removal
// <svg onmouseover="alert(1)"> - unlisted event handler
// <a href="&#106;avascript:alert(1)"> - HTML entities
// <scr<scriptipt>alert(1)</scr</script>ipt> - nested tag reassembles after removal
// <input onfocus="alert(1)" autofocus> - unlisted handler, no user interaction needed
?>
<!-- Vulnerable: output relies on the incomplete denylist instead of encoding -->
<div class="user-content">
    <?php echo vulnerable_sanitize_incomplete($_GET['comment']); ?>
</div>
// Vulnerable: JavaScript denylist filter
function vulnerableSanitize(input) {
    // Vulnerable: only checks for common patterns
    const denylist = [
        /<script[\s\S]*?>[\s\S]*?<\/script>/gi,
        /javascript:/gi,
        /on\w+=/gi // Tries to catch event handlers
    ];
    let sanitized = input;
    for (const pattern of denylist) {
        sanitized = sanitized.replace(pattern, '');
    }
    return sanitized;
}
// Bypasses:
// <img src="x" onerror ="alert(1)"> - space before equals
// <script>alert(1)</script > - whitespace in the closing tag (relevant when the output is rendered server-side)
// <a href="java\nscript:alert(1)"> - literal newline inside the URL scheme
// <iframe srcdoc="&lt;script&gt;alert(1)&lt;/script&gt;"> - entity-encoded srcdoc attribute
// <a href="&#106;avascript:alert(1)">x</a> - entity-encoded URL scheme

// Vulnerable: using innerHTML with "sanitized" content
// (scripts inserted via innerHTML do not run, but event handlers and srcdoc content do)
function displayComment(comment) {
    const sanitized = vulnerableSanitize(comment);
    document.getElementById('comments').innerHTML += sanitized;
}
# Vulnerable: Python denylist approach
import re

from flask import Flask, request, render_template_string


def vulnerable_sanitize(user_input):
    """Incomplete denylist for XSS prevention."""
    # Vulnerable: pattern list is incomplete and applied in a single pass
    dangerous_patterns = [
        r'<script.*?>.*?</script>',
        r'javascript:',
        r'on\w+=',
        r'<iframe',
        r'<object',
        r'<embed'
    ]
    sanitized = user_input
    for pattern in dangerous_patterns:
        sanitized = re.sub(pattern, '', sanitized, flags=re.IGNORECASE)
    return sanitized

# Bypasses include:
# <img src=x onerror =alert(1)> - whitespace before "=" evades on\w+=
# <details ontoggle =alert(1) open> - same trick with an HTML5 element
# <script>\nalert(1)\n</script> - "." does not match newlines without re.DOTALL
# <scr<iframeipt>alert(1)</scr<iframeipt> - removing "<iframe" reassembles <script> after the script pattern already ran
# <a href="&#106;avascript:alert(1)">x</a> - entity-encoded URL scheme

app = Flask(__name__)


@app.route('/comment')
def show_comment():
    comment = request.args.get('text', '')
    sanitized = vulnerable_sanitize(comment)
    # Vulnerable: user input becomes part of the template source, so it is
    # neither HTML-escaped nor protected against template injection
    return render_template_string(f'<div>{sanitized}</div>')
// Vulnerable: Java servlet with incomplete filtering
import javax.servlet.http.*;
import java.io.*;

public class VulnerableCommentServlet extends HttpServlet {
    private static final String[] DENYLIST = {
        "<script", "</script>",
        "javascript:",
        "onerror=", "onclick=", "onload="
    };

    private String vulnerableSanitize(String input) {
        String result = input;
        for (String pattern : DENYLIST) {
            // Case-insensitive removal, one pass per keyword
            result = result.replaceAll("(?i)" + pattern, "");
        }
        return result;
    }

    @Override
    protected void doGet(HttpServletRequest request,
                         HttpServletResponse response)
            throws IOException {
        String comment = request.getParameter("comment");
        String sanitized = vulnerableSanitize(comment);
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        // Vulnerable: writing "sanitized" content directly
        out.println("<html><body>");
        out.println("<div class='comment'>" + sanitized + "</div>");
        out.println("</body></html>");
    }
}
Fixed Code
<?php
// Fixed: use proper output encoding, not a denylist
function secure_output_html($input) {
    // Fixed: encode for the HTML body context
    return htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

function secure_output_attribute($input) {
    // Fixed: encode for a quoted HTML attribute value
    // (the attribute must always be wrapped in quotes in the template)
    return htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

function secure_output_javascript($input) {
    // Fixed: encode as a JavaScript literal; the JSON_HEX_* flags keep
    // <, >, &, ' and " out of the output so it cannot break out of <script>
    return json_encode($input, JSON_HEX_TAG | JSON_HEX_APOS |
                               JSON_HEX_QUOT | JSON_HEX_AMP);
}

// For rich text, use an established allowlist-based library
function secure_rich_text($input) {
    // Use the HTMLPurifier library
    require_once 'HTMLPurifier.auto.php';
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Allowed', 'p,b,i,u,a[href],ul,ol,li');
    $purifier = new HTMLPurifier($config);
    return $purifier->purify($input);
}
?>
<!-- Fixed: proper encoding in template -->
<div class="user-content">
    <?php echo secure_output_html($_GET['comment']); ?>
</div>
<!-- Fixed: attribute context -->
<input type="text" value="<?php echo secure_output_attribute($value); ?>">
<!-- Fixed: JavaScript context -->
<script>
    var userInput = <?php echo secure_output_javascript($data); ?>;
</script>
// Fixed: use proper encoding and safe DOM manipulation
function secureTextContent(input) {
    // Fixed: textContent does not parse HTML
    const div = document.createElement('div');
    div.textContent = input;
    // Returns the input with &, < and > escaped. Quotes are NOT escaped,
    // so use the result in HTML body context only, never inside an attribute.
    return div.innerHTML;
}

// Fixed: use DOM methods instead of innerHTML
function displayCommentSecure(comment) {
    const container = document.getElementById('comments');
    const commentDiv = document.createElement('div');
    commentDiv.className = 'comment';
    commentDiv.textContent = comment; // Safe: does not parse HTML
    container.appendChild(commentDiv);
}

// Fixed: for rich text, use the DOMPurify library
function displayRichComment(htmlContent) {
    // Allowlist of tags and attributes; everything else is removed
    const clean = DOMPurify.sanitize(htmlContent, {
        ALLOWED_TAGS: ['b', 'i', 'u', 'a', 'p', 'br'],
        ALLOWED_ATTR: ['href']
    });
    document.getElementById('comments').innerHTML = clean;
}

// Fixed: context-aware encoding functions
const encode = {
    forHTML: (str) => {
        return str
            .replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#x27;');
    },
    forAttribute: (str) => {
        // Encode every non-alphanumeric code point as a numeric entity
        // (the "u" flag keeps surrogate pairs together)
        return str.replace(/[^a-zA-Z0-9]/gu, (char) => {
            return '&#' + char.codePointAt(0) + ';';
        });
    },
    forJavaScript: (str) => {
        // JSON.stringify alone leaves "<" intact, so "</script>" could end an
        // inline script block; escape it as \u003c
        return JSON.stringify(str).replace(/</g, '\\u003c');
    }
};
# Fixed: use proper encoding with Flask/Jinja2
from flask import Flask, request, render_template
from markupsafe import Markup  # Markup is no longer exported by flask itself
import bleach

app = Flask(__name__)


# Fixed: automatic escaping with Jinja2
@app.route('/comment')
def show_comment():
    comment = request.args.get('text', '')
    # render_template automatically escapes variables in .html templates
    return render_template('comment.html', comment=comment)

# comment.html template:
# <div>{{ comment }}</div> <!-- Automatically escaped -->


# Fixed: for rich text, use an allowlist with bleach
ALLOWED_TAGS = ['p', 'b', 'i', 'u', 'a', 'ul', 'ol', 'li', 'br']
ALLOWED_ATTRS = {'a': ['href', 'title']}


def secure_rich_text(html_input):
    """Sanitize HTML using an allowlist approach."""
    return bleach.clean(
        html_input,
        tags=ALLOWED_TAGS,
        attributes=ALLOWED_ATTRS,
        strip=True
    )


@app.route('/rich-comment')
def show_rich_comment():
    comment = request.args.get('text', '')
    # Clean with the allowlist, then mark as safe for the template
    clean_comment = secure_rich_text(comment)
    return render_template('comment.html',
                           comment=Markup(clean_comment))


# Fixed: CSP header for defense in depth
@app.after_request
def add_security_headers(response):
    response.headers['Content-Security-Policy'] = (
        "default-src 'self'; script-src 'self'"
    )
    return response
// Fixed: Java with OWASP Encoder
import org.owasp.encoder.Encode;
import javax.servlet.http.*;
import java.io.*;
import java.security.SecureRandom;
import java.util.Base64;

public class SecureCommentServlet extends HttpServlet {
    private static final SecureRandom RANDOM = new SecureRandom();

    @Override
    protected void doGet(HttpServletRequest request,
                         HttpServletResponse response)
            throws IOException {
        String comment = request.getParameter("comment");
        if (comment == null) {
            comment = "";
        }
        // Per-response nonce: without it, the CSP below would block the
        // inline script written further down
        byte[] nonceBytes = new byte[16];
        RANDOM.nextBytes(nonceBytes);
        String nonce = Base64.getEncoder().encodeToString(nonceBytes);

        response.setContentType("text/html; charset=UTF-8");
        response.setHeader("Content-Security-Policy",
                "default-src 'self'; script-src 'nonce-" + nonce + "'");
        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        // Fixed: use OWASP Encoder for the HTML context
        out.print("<div class='comment'>");
        out.print(Encode.forHtml(comment));
        out.println("</div>");
        // For attributes
        out.print("<input value='");
        out.print(Encode.forHtmlAttribute(comment));
        out.println("'>");
        // For JavaScript
        out.print("<script nonce='" + nonce + "'>var data = '");
        out.print(Encode.forJavaScript(comment));
        out.println("';</script>");
        out.println("</body></html>");
    }
}
CVE Examples
- CVE-2007-5727: XSS filter bypass; the denylist only removed <SCRIPT> tags, allowing other vectors.
- CVE-2006-3617: Web application XSS filter only blocked <SCRIPT> tags.
- CVE-2006-4308: XSS filter only checked for the "javascript:" pattern, allowing variations.
References
- MITRE Corporation. "CWE-692: Incomplete Denylist to Cross-Site Scripting." https://cwe.mitre.org/data/definitions/692.html
- OWASP. "XSS Prevention Cheat Sheet." https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html
- OWASP. "Java Encoder Project." https://owasp.org/owasp-java-encoder/