Improper Validation of Unsafe Equivalence in Input
Description
Improper Validation of Unsafe Equivalence in Input occurs when a product accepts an input that is used as a resource identifier or other reference, but does not validate, or incorrectly validates, whether that input is equivalent to a potentially unsafe value. Attackers can sometimes bypass input validation schemes by finding inputs that appear safe to the validator but are interpreted as dangerous values at a lower layer or by a downstream component. A practical example: an XSS filter that matches <script> tags case-sensitively can be bypassed with <ScrIpT>, because HTML tag names are case-insensitive.
Risk
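An unsafe equivalence bypass can have severe security implications because it lets attackers circumvent input filters entirely. Depending on what the filter was protecting, XSS and other injection attacks may succeed, path traversal may become possible, and access controls may be bypassed. Security policies built on denylists become ineffective, and downstream components end up processing input that validation was supposed to stop.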
Solution
Use "accept known good" validation: maintain strict allowlists of acceptable inputs that conform to specifications, and reject non-conforming data. Consider length, type, value ranges, syntax, and business logic. Where appropriate, normalize inputs to a canonical form before validation, and pass the validated canonical form (not the raw input) to downstream components. Use case-insensitive comparison for case-insensitive protocols. Denylists alone are insufficient because they cannot enumerate every equivalent form of a dangerous value.
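The "normalize, then allowlist" advice above can be sketched in a few lines of Python. This is a minimal illustration rather than a complete validator: the names `ALLOWED_ACTIONS`, `_ACTION_SYNTAX`, `canonicalize`, and `is_allowed_action` are hypothetical, and the allowlist contents and syntax rule are assumptions chosen only for the example.

```python
import re
import unicodedata

# Hypothetical allowlist for illustration; a real one comes from the
# application's specification.
ALLOWED_ACTIONS = {"list", "status", "help"}

# Assumed syntax rule: 1 to 32 lowercase ASCII letters, digits, or underscores.
_ACTION_SYNTAX = re.compile(r"[a-z0-9_]{1,32}")


def canonicalize(value: str) -> str:
    """Fold compatibility forms and case so equivalent spellings compare equal."""
    return unicodedata.normalize("NFKC", value).casefold()


def is_allowed_action(raw: str) -> bool:
    """Accept only values whose canonical form is well-formed and allowlisted."""
    canonical = canonicalize(raw)
    # Reject anything outside the expected character set, which also
    # rejects cross-script look-alikes that NFKC does not fold.
    if _ACTION_SYNTAX.fullmatch(canonical) is None:
        return False
    return canonical in ALLOWED_ACTIONS
```

With this shape, "STATUS" and its fullwidth spelling are accepted because both canonicalize to "status", while a spelling that substitutes a Cyrillic letter is rejected by the syntax rule before the allowlist is consulted.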
Common Consequences
| Impact | Details |
|---|---|
| Other | Security bypass depending on exploitation method. |
| Access Control | Filters and restrictions can be bypassed. |
| Integrity | Malicious input may reach downstream components. |
Example Code
Vulnerable Code
# Vulnerable: Case-sensitive XSS filter
import re

# VULNERABLE: Case-sensitive blacklist
DANGEROUS_TAGS = ['<script>', '</script>', '<iframe>', '<object>']

def vulnerable_xss_filter(input_html):
    """Filter XSS - VULNERABLE to case variation bypass."""
    result = input_html
    # VULNERABLE: Case-sensitive matching
    for tag in DANGEROUS_TAGS:
        result = result.replace(tag, '')
    return result

# Attacker uses: <ScRiPt>alert(1)</sCrIpT>
# Filter doesn't catch mixed case
# Browser still executes it (HTML is case-insensitive)

# VULNERABLE: Case-sensitive file extension check
def vulnerable_file_upload(filename):
    """Check file extension - VULNERABLE to case bypass."""
    BLOCKED_EXTENSIONS = ['.exe', '.bat', '.cmd', '.ps1']
    # VULNERABLE: Case-sensitive comparison
    for ext in BLOCKED_EXTENSIONS:
        if filename.endswith(ext):
            return False
    return True  # Allow upload

# Attacker uploads: malware.EXE or malware.Exe
# Filter doesn't catch uppercase extensions

# VULNERABLE: URL encoding not considered
def vulnerable_path_check(path):
    """Check for directory traversal - VULNERABLE to encoding bypass."""
    # VULNERABLE: Only checks literal "../"
    if '../' in path:
        return False
    return True

# Attacker uses: %2e%2e%2f (URL-encoded ../)
# Or: ..%2f (partial encoding)
# Or: ..\ (backslash on Windows)
// Vulnerable: JavaScript with equivalence issues

// VULNERABLE: Case-sensitive hostname check
function vulnerableCheckRedirect(url) {
    // VULNERABLE: Case-sensitive substring comparison on the raw string
    if (url.includes('malicious.com')) {
        return false;
    }
    return true;
    // Attacker uses: MALICIOUS.COM or Malicious.Com
    // DNS is case-insensitive, redirect succeeds
}

// VULNERABLE: Whitespace equivalence not considered
function vulnerableValidateUrl(url) {
    const allowedHosts = ['example.com', 'trusted.com'];
    // Parse URL (throws on malformed input; the exception is not handled here)
    const parsed = new URL(url);
    // VULNERABLE: Doesn't consider whitespace variants
    if (!allowedHosts.includes(parsed.hostname)) {
        return false;
    }
    return true;
    // Attacker uses: "https://evil.com%20.example.com"
    // Or encoded whitespace that affects parsing
    // The risk is a downstream component that parses the same string
    // differently from this validator
}

// VULNERABLE: Unicode equivalence
function vulnerableUsernameCheck(username) {
    const blockedUsers = ['admin', 'root', 'administrator'];
    // VULNERABLE: Doesn't normalize Unicode
    if (blockedUsers.includes(username.toLowerCase())) {
        return false;
    }
    return true;
    // Attacker uses: "ɑdmin" (Latin alpha instead of 'a')
    // Or: "аdmin" (Cyrillic 'а' instead of Latin 'a')
}
// Vulnerable: C with equivalence bypass issues
#include <string.h>
#include <ctype.h>

// VULNERABLE: Case-sensitive command check
int vulnerable_check_command(const char* cmd) {
    // VULNERABLE: Only blocks exact lowercase match
    if (strcmp(cmd, "shutdown") == 0 ||
        strcmp(cmd, "reboot") == 0 ||
        strcmp(cmd, "format") == 0) {
        return 0; // Blocked
    }
    return 1; // Allowed
    // Attacker uses: "SHUTDOWN", "Shutdown", "sHuTdOwN"
}
// VULNERABLE: Path check without normalization
int vulnerable_check_path(const char* path) {
    // VULNERABLE: Only checks literal pattern
    if (strstr(path, "..") != NULL) {
        return 0; // Blocked
    }
    if (strstr(path, "/etc/") != NULL) {
        return 0; // Blocked
    }
    return 1;
    // Attacker uses equivalent forms that contain neither literal pattern:
    // - "%2e%2e%2fetc%2fpasswd" (URL-encoded, if decoded after this check)
    // - "/ETC/passwd" (on a case-insensitive file system)
    // - a symbolic link inside an allowed directory that points into /etc
}
// VULNERABLE: Extension check without case normalization
int vulnerable_check_extension(const char* filename) {
    const char* ext = strrchr(filename, '.');
    if (ext == NULL) return 0;
    // VULNERABLE: Case-sensitive comparison
    if (strcmp(ext, ".php") == 0 ||
        strcmp(ext, ".asp") == 0 ||
        strcmp(ext, ".jsp") == 0) {
        return 0; // Blocked
    }
    return 1;
    // Attacker uploads: shell.PHP, shell.Php, shell.pHp
}
Fixed Code
# Fixed: Proper equivalence handling
import re
import html
import urllib.parse
import unicodedata

# FIXED: Case-insensitive XSS filter with normalization
def secure_xss_filter(input_html):
    """Filter XSS with proper case handling."""
    # FIXED: Use allowlist approach instead of denylist
    # Only allow specific safe tags
    # Approach 1: Escape all HTML
    escaped = html.escape(input_html)
    return escaped
    # Approach 2: If HTML needed, use proper sanitizer
    # import bleach
    # return bleach.clean(input_html, tags=['p', 'b', 'i', 'em', 'strong'])

# FIXED: Case-insensitive file extension check
def secure_file_upload(filename):
    """Check file extension with case normalization."""
    ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif', '.pdf', '.txt'}
    # FIXED: Normalize to lowercase before checking
    normalized = filename.lower()
    # FIXED: Extract extension properly
    if '.' not in normalized:
        return False
    ext = '.' + normalized.rsplit('.', 1)[-1]
    # FIXED: Use allowlist, not denylist
    if ext not in ALLOWED_EXTENSIONS:
        return False
    # FIXED: Also check for double extensions
    if normalized.count('.') > 1:
        # Additional scrutiny for files like "image.jpg.php"
        parts = normalized.split('.')
        for part in parts[:-1]:
            if part in ['php', 'asp', 'jsp', 'exe', 'bat']:
                return False
    return True

# FIXED: Path check with normalization
import os

def secure_path_check(path, base_dir):
    """Check path with proper normalization."""
    # FIXED: URL decode first
    decoded = urllib.parse.unquote(path)
    # FIXED: Normalize path (resolves .., ., multiple slashes)
    normalized = os.path.normpath(decoded)
    # FIXED: Resolve to absolute path
    full_path = os.path.abspath(os.path.join(base_dir, normalized))
    base_path = os.path.abspath(base_dir)
    # FIXED: Ensure path is under base directory
    if not full_path.startswith(base_path + os.sep):
        return False
    return True

# FIXED: Unicode normalization for username
def secure_username_check(username):
    """Check username with Unicode normalization."""
    # FIXED: NFKC folds compatibility forms (e.g. fullwidth letters) to
    # their plain equivalents. It does NOT map cross-script look-alikes
    # such as Latin alpha 'ɑ' or Cyrillic 'а' to ASCII 'a'.
    normalized = unicodedata.normalize('NFKC', username)
    # FIXED: Case-insensitive comparison
    normalized = normalized.casefold()
    # FIXED: Allowlist the character set, so look-alike characters are
    # rejected outright instead of slipping past the blocked-name check
    if not re.fullmatch(r'[a-z0-9_-]+', normalized):
        return False  # Rejects "ɑdmin" and "аdmin"
    # FIXED: Check against blocked users
    blocked_users = {'admin', 'root', 'administrator', 'system'}
    if normalized in blocked_users:
        return False
    return True
// Fixed: JavaScript with proper equivalence handling

// FIXED: Case-insensitive hostname check
function secureCheckRedirect(url) {
    try {
        const parsed = new URL(url);
        // FIXED: Normalize hostname to lowercase
        const hostname = parsed.hostname.toLowerCase();
        // FIXED: Use allowlist instead of denylist
        const allowedHosts = ['example.com', 'trusted.com', 'api.example.com'];
        if (!allowedHosts.includes(hostname)) {
            return false;
        }
        // FIXED: Also check protocol
        if (parsed.protocol !== 'https:') {
            return false;
        }
        return true;
    } catch (e) {
        // Invalid URL
        return false;
    }
}

// FIXED: URL validation with normalization
function secureValidateUrl(url) {
    try {
        // FIXED: Normalize the URL
        const parsed = new URL(url);
        // FIXED: Check for whitespace in hostname (suspicious)
        if (/\s/.test(parsed.hostname)) {
            return false;
        }
        // FIXED: Normalize and check hostname
        const hostname = parsed.hostname.toLowerCase().trim();
        const allowedHosts = ['example.com', 'trusted.com'];
        // FIXED: Exact match or subdomain match
        for (const allowed of allowedHosts) {
            if (hostname === allowed ||
                hostname.endsWith('.' + allowed)) {
                return true;
            }
        }
        return false;
    } catch (e) {
        return false;
    }
}

// FIXED: Username check with normalization
function secureUsernameCheck(username) {
    // FIXED: Normalize Unicode using String.normalize()
    const normalized = username.normalize('NFKC').toLowerCase();
    const blockedUsers = new Set(['admin', 'root', 'administrator', 'system']);
    if (blockedUsers.has(normalized)) {
        return false;
    }
    // FIXED: Check if contains only allowed characters
    if (!/^[a-z0-9_-]+$/.test(normalized)) {
        // Contains characters outside basic ASCII
        // Could be Unicode confusables
        return false;
    }
    return true;
}
// Fixed: C with proper equivalence handling
#include <string.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// FIXED: Case-insensitive command check
bool secure_check_command(const char* cmd) {
    // FIXED: Convert to lowercase for comparison
    char* lower = strdup(cmd);
    if (lower == NULL) return false;
    for (char* p = lower; *p; p++) {
        // Cast to unsigned char: passing a negative char to tolower() is undefined
        *p = (char)tolower((unsigned char)*p);
    }
    // FIXED: Use allowlist approach
    const char* allowed_commands[] = {"list", "status", "help", "info"};
    bool allowed = false;
    for (size_t i = 0; i < sizeof(allowed_commands)/sizeof(allowed_commands[0]); i++) {
        if (strcmp(lower, allowed_commands[i]) == 0) {
            allowed = true;
            break;
        }
    }
    free(lower);
    return allowed;
}
// FIXED: Path check with normalization
bool secure_check_path(const char* path, const char* base_dir,
                       char* result, size_t result_size) {
    // FIXED: Canonicalize the path (resolves "..", ".", and symlinks);
    // realpath() allocates the buffer when its second argument is NULL
    char* normalized = realpath(path, NULL);
    if (normalized == NULL) {
        // Path doesn't exist or is invalid
        return false;
    }
    // FIXED: Get canonical base directory
    char* base_canonical = realpath(base_dir, NULL);
    if (base_canonical == NULL) {
        free(normalized);
        return false;
    }
    // FIXED: Check that normalized path starts with base directory
    size_t base_len = strlen(base_canonical);
    bool safe = (strncmp(normalized, base_canonical, base_len) == 0) &&
                (normalized[base_len] == '/' || normalized[base_len] == '\0');
    if (safe && result != NULL) {
        size_t len = strlen(normalized);
        if (len >= result_size) {
            // Fail closed rather than return a truncated path
            // (this also covers result_size == 0)
            safe = false;
        } else {
            memcpy(result, normalized, len + 1);
        }
    }
    free(normalized);
    free(base_canonical);
    return safe;
}
// FIXED: Extension check with case normalization
bool secure_check_extension(const char* filename) {
    const char* ext = strrchr(filename, '.');
    if (ext == NULL) return false;
    // FIXED: Convert extension to lowercase
    char lower_ext[16];
    size_t i;
    for (i = 0; ext[i] && i < sizeof(lower_ext) - 1; i++) {
        // Cast to unsigned char: passing a negative char to tolower() is undefined
        lower_ext[i] = (char)tolower((unsigned char)ext[i]);
    }
    lower_ext[i] = '\0';
    // FIXED: Use allowlist of safe extensions
    const char* safe_extensions[] = {".txt", ".pdf", ".jpg", ".jpeg", ".png", ".gif"};
    for (size_t j = 0; j < sizeof(safe_extensions)/sizeof(safe_extensions[0]); j++) {
        if (strcmp(lower_ext, safe_extensions[j]) == 0) {
            return true;
        }
    }
    return false;
}
// FIXED: Helper function for case-insensitive string comparison
int strcasecmp_safe(const char* s1, const char* s2) {
    while (*s1 && *s2) {
        int c1 = tolower((unsigned char)*s1);
        int c2 = tolower((unsigned char)*s2);
        if (c1 != c2) return c1 - c2;
        s1++;
        s2++;
    }
    return tolower((unsigned char)*s1) - tolower((unsigned char)*s2);
}
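The Python path check in this section percent-decodes its input once, so a doubly encoded value such as %252e%252e%252f would still carry an encoded ../ after that step. One possible complement, sketched below in Python, is to decode until the value stops changing and to fail closed when it does not settle. The names `fully_decode` and `has_traversal` and the bound `MAX_DECODE_ROUNDS` are assumptions made for this illustration, and whether repeated decoding is appropriate depends on how many decoding layers the downstream components actually apply.

```python
from urllib.parse import unquote

MAX_DECODE_ROUNDS = 5  # assumed bound; choose one that suits the application


def fully_decode(value: str) -> str:
    """Percent-decode repeatedly until the value stops changing.

    Raises ValueError if the value has not settled within
    MAX_DECODE_ROUNDS passes, which is treated here as suspicious.
    """
    current = value
    for _ in range(MAX_DECODE_ROUNDS):
        decoded = unquote(current)
        if decoded == current:
            return current
        current = decoded
    raise ValueError("input is encoded too many times")


def has_traversal(raw_path: str) -> bool:
    """Report whether any segment of the fully decoded path is '..'."""
    try:
        decoded = fully_decode(raw_path)
    except ValueError:
        return True  # fail closed
    # Treat backslashes as separators so "..\" is caught as well.
    segments = decoded.replace("\\", "/").split("/")
    return ".." in segments
```

Comparing whole segments rather than searching for a substring keeps ordinary names such as notes..txt from being rejected. A check like this is a pre-filter only; resolving the path and confirming it stays under the base directory remains the stronger control.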
CVE Examples
- CVE-2021-39155: Hostname comparison using case-sensitive matching bypassed authorization checks.
- CVE-2020-11053: HTML-encoded whitespace bypassed redirect URL validation.
- CVE-2005-0269: File upload filter checked only lowercase extensions, allowing bypass with uppercase.
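- CVE-2001-1238: Task manager refused to end processes whose names matched protected system processes when written in uppercase, so a malicious process using such a name could not be stopped.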
- CVE-2004-2214: Mixed-case URIs bypassed access restrictions.
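Several of the entries above come down to comparing a host name or URI without first folding equivalent spellings. A minimal Python sketch of a Host header comparison that folds case, an optional port, and a single trailing dot is shown here. The names `ALLOWED_HOSTS`, `canonical_host`, and `is_host_authorized` are hypothetical, the allowlist values are placeholders, and the sketch deliberately does not handle IPv6 literals or internationalized domain names.

```python
import re
from typing import Optional

# Hypothetical allowlist for illustration only.
ALLOWED_HOSTS = frozenset({"example.com", "api.example.com"})

# Assumed shape of a Host header value: an ASCII host name with an optional port.
_HOST_HEADER = re.compile(r"(?P<host>[A-Za-z0-9.-]+)(?::(?P<port>[0-9]{1,5}))?")


def canonical_host(host_header: str) -> Optional[str]:
    """Return the lowercased host without port or trailing dot, or None if malformed."""
    match = _HOST_HEADER.fullmatch(host_header)
    if match is None:
        return None
    host = match.group("host").lower()
    # "example.com." and "example.com" name the same host; drop one trailing dot.
    if host.endswith("."):
        host = host[:-1]
    return host or None


def is_host_authorized(host_header: str) -> bool:
    """Compare the canonical host against the allowlist."""
    host = canonical_host(host_header)
    return host is not None and host in ALLOWED_HOSTS
```

Because the comparison runs on the canonical form, "Example.COM", "EXAMPLE.com:8443", and "example.com." are all treated as example.com, while values containing whitespace or other unexpected characters are rejected before any comparison takes place.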
Related CWEs
- CWE-20: Improper Input Validation (parent)
- CWE-41: Improper Resolution of Path Equivalence (related)
- CWE-178: Improper Handling of Case Sensitivity (related)
- CWE-1215: Data Validation Issues (category)
References
- MITRE Corporation. "CWE-1289: Improper Validation of Unsafe Equivalence in Input." https://cwe.mitre.org/data/definitions/1289.html
- OWASP. "Input Validation Cheat Sheet"
- Unicode Consortium. "Unicode Security Considerations"