CWE-176 – Improper Handling of Unicode Encoding

Read Time:50 Second

Description

The software does not properly handle when an input contains Unicode encoding.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-172

 

Consequences

Integrity: Unexpected State

 

Potential Mitigations

Phase: Architecture and Design

Description: 

Avoid making decisions based on names of resources (e.g. files) if those resources can have alternate names.

Phase: Implementation

Description: 

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

  • CVE-2000-0884
    • Server allows remote attackers to read documents outside of the web root, and possibly execute arbitrary commands, via malformed URLs that contain Unicode encoded characters.
  • CVE-2001-0709
    • Server allows a remote attacker to obtain source code of ASP files via a URL encoded with Unicode.

CWE-175 – Improper Handling of Mixed Encoding

Read Time:1 Minute, 12 Second

Description

The software does not properly handle when the same input uses several different (mixed) encodings.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-172

 

Consequences

Integrity: Unexpected State

 

Potential Mitigations

Phase: Architecture and Design

Description: 

Avoid making decisions based on names of resources (e.g. files) if those resources can have alternate names.

Phase: Implementation

Description: 

Phase: Implementation

Description: 

Use and specify an output encoding that can be handled by the downstream component that is reading the output. Common encodings include ISO-8859-1, UTF-7, and UTF-8. When an encoding is not specified, a downstream component may choose a different encoding, either by assuming a default encoding or automatically inferring which encoding is being used, which can be erroneous. When the encodings are inconsistent, the downstream component might treat some character or byte sequences as special, even if they are not special in the original encoding. Attackers might then be able to exploit this discrepancy and conduct injection attacks; they even might be able to bypass protection mechanisms that assume the original encoding is also being used by the downstream component.

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

CWE-174 – Double Decoding of the Same Data

Read Time:1 Minute, 56 Second

Description

The software decodes the same input twice, which can limit the effectiveness of any protection mechanism that occurs in between the decoding operations.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-172
CWE-675

 

Consequences

Access Control, Confidentiality, Availability, Integrity, Other: Bypass Protection Mechanism, Execute Unauthorized Code or Commands, Varies by Context

 

Potential Mitigations

Phase: Architecture and Design

Description: 

Avoid making decisions based on names of resources (e.g. files) if those resources can have alternate names.

Phase: Implementation

Description: 

Phase: Implementation

Description: 

Use and specify an output encoding that can be handled by the downstream component that is reading the output. Common encodings include ISO-8859-1, UTF-7, and UTF-8. When an encoding is not specified, a downstream component may choose a different encoding, either by assuming a default encoding or automatically inferring which encoding is being used, which can be erroneous. When the encodings are inconsistent, the downstream component might treat some character or byte sequences as special, even if they are not special in the original encoding. Attackers might then be able to exploit this discrepancy and conduct injection attacks; they even might be able to bypass protection mechanisms that assume the original encoding is also being used by the downstream component.

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

  • CVE-2004-1315
    • Forum software improperly URL decodes the highlight parameter when extracting text to highlight, which allows remote attackers to execute arbitrary PHP code by double-encoding the highlight value so that special characters are inserted into the result.
  • CVE-2004-1939
    • XSS protection mechanism attempts to remove “/” that could be used to close tags, but it can be bypassed using double encoded slashes (%252F)
  • CVE-2004-1938
    • “%2527” (double-encoded single quote) used in SQL injection.
  • CVE-2005-0054
    • Browser executes HTML at higher privileges via URL with hostnames that are double hex encoded, which are decoded twice to generate a malicious hostname.

CWE-173 – Improper Handling of Alternate Encoding

Read Time:1 Minute, 17 Second

Description

The software does not properly handle when an input uses an alternate encoding that is valid for the control sphere to which the input is being sent.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-172
CWE-289

 

Consequences

Access Control: Bypass Protection Mechanism

 

Potential Mitigations

Phase: Architecture and Design

Description: 

Avoid making decisions based on names of resources (e.g. files) if those resources can have alternate names.

Phase: Implementation

Description: 

Phase: Implementation

Description: 

Use and specify an output encoding that can be handled by the downstream component that is reading the output. Common encodings include ISO-8859-1, UTF-7, and UTF-8. When an encoding is not specified, a downstream component may choose a different encoding, either by assuming a default encoding or automatically inferring which encoding is being used, which can be erroneous. When the encodings are inconsistent, the downstream component might treat some character or byte sequences as special, even if they are not special in the original encoding. Attackers might then be able to exploit this discrepancy and conduct injection attacks; they even might be able to bypass protection mechanisms that assume the original encoding is also being used by the downstream component.

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

CWE-172 – Encoding Error

Read Time:57 Second

Description

The software does not properly encode or decode the data, resulting in unexpected values.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-707
CWE-22
CWE-41

 

Consequences

Integrity: Unexpected State

 

Potential Mitigations

Phase: Implementation

Description: 

Phase: Implementation

Description: 

While it is risky to use dynamically-generated query strings, code, or commands that mix control and data together, sometimes it may be unavoidable. Properly quote arguments and escape any special characters within those arguments. The most conservative approach is to escape or filter all characters that do not pass an extremely strict allowlist (such as everything that is not alphanumeric or white space). If some special characters are still needed, such as white space, wrap each argument in quotes after the escaping/filtering step. Be careful of argument injection (CWE-88).

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

CWE-170 – Improper Null Termination

Read Time:3 Minute, 6 Second

Description

The software does not terminate or incorrectly terminates a string or array with a null character or equivalent terminator.

Null termination errors frequently occur in two different ways. An off-by-one error could cause a null to be written out of bounds, leading to an overflow. Or, a program could use a strncpy() function call incorrectly, which prevents a null terminator from being added at all. Other scenarios are possible.

Modes of Introduction:

– Implementation

 

Likelihood of Exploit: Medium

 

Related Weaknesses

CWE-707
CWE-120
CWE-126
CWE-147
CWE-464
CWE-463
CWE-20

 

Consequences

Confidentiality, Integrity, Availability: Read Memory, Execute Unauthorized Code or Commands

The case of an omitted null character is the most dangerous of the possible issues. This will almost certainly result in information disclosure, and possibly a buffer overflow condition, which may be exploited to execute arbitrary code.

Confidentiality, Integrity, Availability: DoS: Crash, Exit, or Restart, Read Memory, DoS: Resource Consumption (CPU), DoS: Resource Consumption (Memory)

If a null character is omitted from a string, then most string-copying functions will read data until they locate a null character, even outside of the intended boundaries of the string. This could: cause a crash due to a segmentation fault cause sensitive adjacent memory to be copied and sent to an outsider trigger a buffer overflow when the copy is being written to a fixed-size buffer.

Integrity, Availability: Modify Memory, DoS: Crash, Exit, or Restart

Misplaced null characters may result in any number of security problems. The biggest issue is a subset of buffer overflow, and write-what-where conditions, where data corruption occurs from the writing of a null character over valid data, or even instructions. A randomly placed null character may put the system into an undefined state, and therefore make it prone to crashing. A misplaced null character may corrupt other data in memory.

Integrity, Confidentiality, Availability, Access Control, Other: Alter Execution Logic, Execute Unauthorized Code or Commands

Should the null character corrupt the process flow, or affect a flag controlling access, it may lead to logical errors which allow for the execution of arbitrary code.

 

Potential Mitigations

Phase: Requirements

Description: 

Use a language that is not susceptible to these issues. However, be careful of null byte interaction errors (CWE-626) with lower-level constructs that may be written in a language that is susceptible.

Phase: Implementation

Description: 

Ensure that all string functions used are understood fully as to how they append null characters. Also, be wary of off-by-one errors when appending nulls to the end of strings.

Phase: Implementation

Description: 

If performance constraints permit, special code can be added that validates null-termination of string buffers, this is a rather naive and error-prone solution.

Phase: Implementation

Description: 

Switch to bounded string manipulation functions. Inspect buffer lengths involved in the buffer overrun trace reported with the defect.

Phase: Implementation

Description: 

Add code that fills buffers with nulls (however, the length of buffers still needs to be inspected, to ensure that the non null-terminated string is not written at the physical end of the buffer).

CVE References

  • CVE-2000-0312
    • Attacker does not null-terminate argv[] when invoking another program.
  • CVE-2003-0777
    • Interrupted step causes resultant lack of null termination.
  • CVE-2004-1072
    • Fault causes resultant lack of null termination, leading to buffer expansion.
  • CVE-2001-1389
    • Multiple vulnerabilities related to improper null termination.
  • CVE-2003-0143
    • Product does not null terminate a message buffer after snprintf-like call, leading to overflow.
  • CVE-2009-2523
    • Chain: product does not handle when an input string is not NULL terminated (CWE-170), leading to buffer over-read (CWE-125) or heap-based buffer overflow (CWE-122).

CWE-168 – Improper Handling of Inconsistent Special Elements

Read Time:53 Second

Description

The software does not properly handle input in which an inconsistency exists between two or more special characters or reserved words.

An example of this problem would be if paired characters appear in the wrong order, or if the special characters are not properly nested.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-159
CWE-703

 

Consequences

Availability, Access Control, Non-Repudiation: DoS: Crash, Exit, or Restart, Bypass Protection Mechanism, Hide Activities

 

Potential Mitigations

Phase:

Description: 

Developers should anticipate that inconsistent special elements will be injected/manipulated in the input vectors of their software system. Use an appropriate combination of denylists and allowlists to ensure only valid, expected and appropriate input is processed by the system.

Phase: Implementation

Description: 

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

CWE-167 – Improper Handling of Additional Special Element

Read Time:1 Minute, 13 Second

Description

The software receives input from an upstream component, but it does not handle or incorrectly handles when an additional unexpected special element is provided.

Modes of Introduction:

– Implementation

 

 

Related Weaknesses

CWE-159
CWE-703

 

Consequences

Integrity: Unexpected State

 

Potential Mitigations

Phase:

Description: 

Developers should anticipate that extra special elements will be injected in the input vectors of their software system. Use an appropriate combination of denylists and allowlists to ensure only valid, expected and appropriate input is processed by the system.

Phase: Implementation

Description: 

Phase: Implementation

Description: 

While it is risky to use dynamically-generated query strings, code, or commands that mix control and data together, sometimes it may be unavoidable. Properly quote arguments and escape any special characters within those arguments. The most conservative approach is to escape or filter all characters that do not pass an extremely strict allowlist (such as everything that is not alphanumeric or white space). If some special characters are still needed, such as white space, wrap each argument in quotes after the escaping/filtering step. Be careful of argument injection (CWE-88).

Phase: Implementation

Description: 

Inputs should be decoded and canonicalized to the application’s current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

CVE References

  • CVE-2002-2086