Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

SQL injection remains a persistent threat in many applications. Even though numerous guidelines emphasize escaping untrusted data, certain MySQL settings allow attackers to bypass well-known functions such as mysql_real_escape_string(). Some character encodings, for instance, can reinterpret escaped bytes in ways that reopen the door to injection. Additionally, specific SQL modes negate backslash escaping entirely.

In this article, we’ll go through the most obscure corner cases that enable malicious inputs to slip past mysql_real_escape_string().

2. Vulnerabilities of the mysql_real_escape_string() Function

The mysql_real_escape_string() function escapes special characters in a string that could be used to manipulate SQL queries. It handles the following characters:

  • Single quotes (‘)
  • Double quotes (“)
  • Backslashes (\)
  • NULL bytes
  • Carriage returns (\r)
  • Line feeds (\n)
  • Control-Z (\x1a)

For example:

$input = "O'Reilly; DROP TABLE users;";
$escaped = mysql_real_escape_string($input);

This results in:

O\'Reilly; DROP TABLE users;

Several edge cases enable attackers to bypass mysql_real_escape_string(), primarily through mismatched character sets or certain SQL modes that invalidate its escaping mechanism.

Let’s examine the key elements that make these attacks possible, including the use of SET NAMES gbk, the misunderstanding of client vs. server encodings, PDO’s emulated prepared statements, and the NO_BACKSLASH_ESCAPES SQL mode.

2.1. Character Encoding Vulnerabilities

We’ll begin exploring character encoding vulnerabilities with the following code:

mysql_query('SET NAMES gbk');
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var' LIMIT 1");

When the application sets NAMES gbk, the server starts interpreting incoming strings as GBK-encoded data, while the client library in PHP might still believe it is using latin1. Consequently, mysql_real_escape_string() inserts a backslash in front of 0x27, thinking that it is neutralizing a single quote in latin1. However, once the server receives the query, the backslash can merge with preceding bytes under GBK, leaving the ‘ (single quote) free to break out of the string literal. This behavior creates an opportunity to append OR 1=1 /* within the query, ultimately causing more rows to be returned.

Furthermore, emulated prepared statements in PDO can also cause the problem. By default, PDO transforms bound parameters into escaped strings client-side rather than performing a true server-side prepare:

$pdo->query('SET NAMES gbk');
$stmt = $pdo->prepare('SELECT * FROM test WHERE name = ? LIMIT 1');
$stmt->execute(["\xbf\x27 OR 1=1 /*"]);

In this scenario, PDO itself may apply mysql_real_escape_string() under the wrong assumptions, which leads to the same bug. Disabling emulation by executing $pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false) forces server-side prepared statements and avoids the client-side encoding mismatch.

2.2. Attacks Under NO_BACKSLASH_ESCAPES

Another frequently cited issue arises when the server runs with NO_BACKSLASH_ESCAPES enabled:

mysql_query('SET SQL_MODE="NO_BACKSLASH_ESCAPES"');
$var = mysql_real_escape_string('" OR 1=1 -- ');
mysql_query('SELECT * FROM test WHERE name = "'.$var.'" LIMIT 1');

In this mode, MySQL ignores backslashes in quoted strings, so mysql_real_escape_string() loses its primary mechanism for escaping quotes. The function still inserts \ before characters like or , but those escapes have no effect once NO_BACKSLASH_ESCAPES is active. As a result, any malicious payload that starts with a double quote will remain unescaped inside the final query, enabling injection. This behavior was filed as a bug and only fully resolved in MySQL 5.7.6 through the introduction of mysql_real_escape_string_quote().

2.3. Impact of Older MySQL Releases

Earlier versions of MySQL (before 4.1.20, 5.0.22, or 5.1.11) contain bugs in mysql_real_escape_string() that worsen these issues. For instance, older implementations treat invalid multibyte sequences incorrectly, allowing an attacker to slip malicious bytes into otherwise “escaped” strings.

Moreover, older PDO releases (prior to PHP 5.3.6) fail to expose mysql_set_charset(), making it difficult to align both client and server encoding without resorting to SET NAMES. Because SET NAMES adjusts only the server-side configuration, the client can still misunderstand how to escape incoming data.

All these factors combine to create a perfect storm for attackers who understand how MySQL handles partial multibyte sequences or ignores backslashes. Upgrading to newer MySQL versions, disabling PDO emulation, using modern encodings, and carefully verifying SQL modes significantly reduce the likelihood of injection.

3. Using mysql_real_escape_string() Securely

Despite these pitfalls, multiple techniques eliminate or drastically reduce the risk posed by these attacks. Let’s examine the strategies below to address different configurations, from modern MySQL releases to older versions that cannot be upgraded.

3.1. Using a Non-Vulnerable Character Set

One direct solution is selecting an encoding that lacks the problematic partial-byte interpretations of gbk, big5, cp932, and similar character sets:

mysql_query('SET NAMES utf8mb4');
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var'");

With a reliable encoding like utf8 or utf8mb4, those ambiguous multibyte edges do not arise, so the inserted backslash in front of 0x27 remains meaningful. Consequently, the payload does not break the string.

3.2. Correctly Matching Client and Server Encodings

Another crucial approach involves calling mysql_set_charset() or the equivalent in newer libraries:

mysql_set_charset('gbk'); 
$var = mysql_real_escape_string("\xbf\x27 OR 1=1 /*");
mysql_query("SELECT * FROM test WHERE name = '$var'");

When both client and server truly agree on gbk, the escaping function can accurately interpret the payload, thus preventing partial multibyte injection. This alignment applies to any chosen encoding, but both the client call and the server settings must remain consistent.

3.3. Disabling Emulated Statements in PDO

In case of relying on PDO we can set the following attribute:

$pdo->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$pdo->query('SET NAMES gbk');
$stmt = $pdo->prepare('SELECT * FROM test WHERE name = ?');
$stmt->execute(["\xbf\x27 OR 1=1 /*"]);

Disabling emulation causes the statement to be prepared on the server, ensuring that user input never modifies the query structure. When the PDO driver does not attempt to escape strings itself, it avoids the mismatch problem entirely.

3.4. Avoiding NO_BACKSLASH_ESCAPES

Even when NO_BACKSLASH_ESCAPES is disabled, mysql_real_escape_string() can still function correctly under most conditions. If we must keep NO_BACKSLASH_ESCAPES for other reasons, then we should ensure that queries consistently use single quotes, or switch to prepared statements entirely.

For MySQL 5.7.6 and later, mysql_real_escape_string_quote() solves this issue by receiving the necessary context regarding which quotes the string will use.

4. Conclusion

As we’ve seen, under specific conditions, mysql_real_escape_string() may fail to prevent SQL injection.

However, these vulnerabilities vanish when modern best practices are followed. Relying on true prepared statements (MySQLi or PDO with emulation disabled) ensures query strings remain intact, and user data never shifts the query’s logic. Alternatively, consistently matching client/server character sets (for example, through mysql_set_charset() or a DSN parameter) and selecting encodings that lack problematic edge cases (such as utf8mb4) ensures that escaping remains valid.

Finally, double-checking SQL modes to avoid NO_BACKSLASH_ESCAPES or using single quotes for string literals provides an additional safeguard. In current MySQL releases (≥5.7.6), developers can adopt mysql_real_escape_string_quote() if required. By following these guidelines, we can avoid all known pitfalls, even though mysql_real_escape_string() has historically been considered a reliable escape function.