Configuration Management / HTML Uses Unrecognized Charset

Web and API

Description

Applications may specify a non-standard character set because of a typographical error in the code base, or because they intentionally use an unusual character set that is not universally recognized by browsers. If the browser does not recognize the character set specified by the application, it may analyze the HTML and attempt to determine which character set the content appears to use. Even if the majority of the HTML actually employs a standard character set such as UTF-8, the presence of non-standard characters anywhere in the response may cause the browser to interpret the content using a different character set.
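As a rough illustration of how such a mistake might be caught, the Python sketch below extracts the charset parameter from a Content-Type value and compares it against a small allowlist. The function names and the allowlist are assumptions made for this example; browsers recognize many more labels than the three shown here.

```python
from email.message import Message

# Illustrative allowlist only (an assumption for this sketch);
# browsers recognize many more charset labels than these.
RECOGNIZED_CHARSETS = {"utf-8", "iso-8859-1", "windows-1252"}


def declared_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def has_recognized_charset(content_type):
    """True if the header declares a charset that is on the allowlist."""
    return declared_charset(content_type) in RECOGNIZED_CHARSETS


if __name__ == "__main__":
    for value in ("text/html; charset=UTF-8",
                  "text/html; charset=UFT-8",   # typo in the label
                  "text/html"):                 # no charset at all
        print(value, "->", has_recognized_charset(value))
```

A check like this could run in a test suite or a response-inspecting proxy; a typo such as "UFT-8" is reported the same way as a missing directive.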

Risk

In most cases, the absence of a valid charset directive does not constitute a security flaw, particularly if the response contains static content. However, if the response includes user-controllable data, the browser's fallback to a guessed character set can lead to cross-site scripting vulnerabilities: an attacker can supply input in a non-standard encoding such as UTF-7 that passes the application's defensive filters and is then interpreted by the browser as markup.
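The UTF-7 case can be shown with a short Python sketch. The filter here is a hypothetical, deliberately naive one that only looks for literal angle brackets; it is not taken from any real application.

```python
# A script tag written in UTF-7: "+ADw-" is "<" and "+AD4-" is ">".
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"


def naive_filter_allows(data):
    """Hypothetical filter: rejects input containing literal angle brackets."""
    return b"<" not in data and b">" not in data


# What a consumer sees if it decides to interpret the bytes as UTF-7.
as_utf7 = payload.decode("utf-7")

if __name__ == "__main__":
    print(naive_filter_allows(payload))  # the raw bytes contain no "<" or ">"
    print(as_utf7)
```

The payload contains only plain ASCII characters, so a filter operating on the raw bytes has nothing to match, yet the decoded text is a complete script element.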

Solution

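For every response containing HTML content, the application should include within the Content-Type header a directive specifying a standard, recognized character set, for example charset=UTF-8 or charset=ISO-8859-1. The declared character set should match the encoding actually used for the response body.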
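A minimal sketch of this in Python, assuming a plain WSGI application (the app function and its HTML body are placeholders for this example), declares UTF-8 in the header and encodes the body with the same charset:

```python
def app(environ, start_response):
    """Minimal WSGI app that declares the charset it actually uses."""
    html = "<!doctype html><title>Example</title><p>h\u00e9llo</p>"
    body = html.encode("utf-8")  # encode with the same charset named in the header
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Serve locally with the standard library's reference WSGI server.
    from wsgiref.simple_server import make_server
    with make_server("127.0.0.1", 8000, app) as server:
        server.serve_forever()
```

Most web frameworks offer a global default for this header, which is usually less error-prone than setting it per response.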
