Learn through the super-clean Baeldung Pro experience:
>> Membership and Baeldung Pro.
No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.
Last updated: July 7, 2024
When managing text data in a database, selecting the appropriate datatype is important. SQL Server offers several datatypes for storing character data, including CHAR, NCHAR, VARCHAR, and NVARCHAR. Although these types all store textual information, they differ in terms of storage size, flexibility, and Unicode support.
In this tutorial, let’s look at the difference between these four datatypes in SQL Server. We’ll test the queries in SQL Server 2022, but they should work similarly in other versions as well.
Before exploring the differences, let’s define a table with columns for each of these datatypes:
CREATE TABLE CharacterTypesDemo (
char_column CHAR(10),
nchar_column NCHAR(10),
varchar_column VARCHAR(10),
nvarchar_column NVARCHAR(10)
);
Next, we can insert sample data into the table defined above:
INSERT INTO CharacterTypesDemo
(char_column, nchar_column, varchar_column, nvarchar_column)
VALUES
('Baeldung', N'Baeλdμng', 'Baeldung', N'Baeλdμng');
Notably, the inserted data includes non-English characters.
The CHAR datatype in SQL Server is used to store fixed-length, non-Unicode character data. When a column is created with the CHAR type, it always occupies the specified storage size, even if the actual data is shorter.
For example, in the above script, we defined the column char_column as CHAR(10) and inserted the text Baeldung into it. SQL Server still allocates the full 10 characters of storage, even though Baeldung contains only eight characters. Moreover, the CHAR type doesn’t support Unicode characters. The CHAR datatype is ideal for fields where the length of the data is uniform, such as state abbreviations or fixed-format codes.
Let’s check what happens if we try to insert Unicode text in the column:
INSERT INTO CharacterTypesDemo
(char_column)
VALUES
(N'Baeλdung');
Here, we inserted a string with a Unicode character. The N prefix indicates to SQL Server that the text is Unicode. We can execute the query to view the inserted result:
SELECT char_column
FROM CharacterTypesDemo;
Executing this query returns the following output:
+--------------+
| char_column |
+--------------+
| Bae?dung |
+--------------+
We can see that the Unicode characters aren’t handled correctly and are replaced with a ? symbol.
Furthermore, let’s check the size of the inserted data in bytes. We can use the function DATALENGTH to determine the number of bytes used to represent the text:
SELECT DATALENGTH(char_column)
FROM CharacterTypesDemo;
Executing this query gives the output:
+---------+
| Storage |
+---------+
| 10 |
+---------+
This confirms that the CHAR type allocates one byte per character and reserves the entire ten bytes, even though the length of the text is only eight characters.
The NCHAR datatype in SQL Server is designed to store fixed-length Unicode character data. Unlike CHAR, which is for non-Unicode data, NCHAR uses two bytes for each character, allowing it to support a wide range of international characters. When a column is defined with the NCHAR type, it always occupies the specified number of characters, regardless of the actual length of the data, similar to the CHAR datatype. In the table above, we defined the column nchar_column with the type NCHAR(10).
Let’s insert some test data with Unicode characters into this column to examine the details:
INSERT INTO CharacterTypesDemo
(nchar_column)
VALUES
(N'Baeλdung');
Now, we can retrieve the data:
SELECT nchar_column
FROM CharacterTypesDemo;
When we execute this query, we get the result:
+--------------+
| char_column |
+--------------+
| Baeλdung |
+--------------+
We can see that the Unicode character is handled correctly when using the NCHAR datatype.
Finally, let’s examine the storage required for this data by using the DATALENGTH function as we did previously:
SELECT DATALENGTH(nchar_column) AS Storage
FROM CharacterTypesDemo;
Executing this query returns this output:
+---------+
| Storage |
+---------+
| 20 |
+---------+
This confirms that 20 bytes are reserved to store 10 Unicode characters. This datatype is ideal for storing text that requires consistent length and includes multilingual support.
The VARCHAR datatype in SQL Server is used to store variable-length non-Unicode character data. Unlike the CHAR type, which always allocates a fixed amount of storage, the VARCHAR only allocates the actual length of the data. This makes VARCHAR more efficient for columns where the length of data can vary significantly, but it doesn’t support Unicode characters.
Let’s insert some test data into the VARCHAR column we defined previously:
INSERT INTO CharacterTypesDemo
(varchar_column)
VALUES
('Baeldung');
We can confirm the size of the inserted column using the same method as before:
SELECT DATALENGTH(varchar_column) AS Storage
FROM CharacterTypesDemo;
Executing this query retrieves the result:
+---------+
| Storage |
+---------+
| 8 |
+---------+
Here, we can see that the VARCHAR column allocated only the necessary bytes to store the data. The VARCHAR datatype is ideal for fields where the length of the text varies, such as names, descriptions, and other variable-length non-Unicode strings.
The NVARCHAR type in SQL Server is similar to VARCHAR but supports Unicode characters. Therefore, each character is stored using two bytes of storage, effectively doubling the space allocated compared to non-Unicode VARCHAR types.
Let’s insert the sample data into the previously created table for the nvarchar_column:
INSERT INTO CharacterTypesDemo (nvarchar_column)
VALUES (N'Baeλdung');
We can notice the usage of the N prefix to indicate that the text is Unicode. Let’s verify that the Unicode text is handled correctly:
SELECT nvarchar_column
FROM CharacterTypesDemo;
Executing this query returns the output:
+-----------------+
| nvarchar_column |
+-----------------+
| Baeλdung |
+-----------------+
We can see that the NVARCHAR column properly handles Unicode characters.
Finally, let’s examine the storage utilized by this column:
SELECT DATALENGTH(nvarchar_column) AS Storage
FROM CharacterTypesDemo;
When we execute this query, we get the result:
+---------+
| Storage |
+---------+
| 16 |
+---------+
Since the text contains eight characters, this allocation reserves 16 bytes to support Unicode as well.
To compare these four types effectively, let’s summarize their behaviors in a table structure:
| Property | CHAR | NCHAR | VARCHAR | NVARCHAR |
|---|---|---|---|---|
| Description | Fixed-length non-Unicode character | Fixed-length Unicode character | Variable-length non-Unicode character | Variable-length Unicode character |
| Storage | 1 byte per character | 2 bytes per character | 1 byte per character | 2 bytes per character |
| Allocation | Allocates full length | Allocates full length | Allocates only needed | Allocates only needed |
| Unicode Support | No | Yes | No | Yes |
| Example Practical Usage | State or Country Codes | Unicode Country Code | Product Description | Customer feedback |
In this article, we explored the differences among four seemingly similar text data types in SQL Server. We’ve compared their behaviors, including Unicode support, storage allocation, and more. Additionally, we examined specific scenarios where one datatype is more appropriate to use over another.
This understanding facilitates informed decisions in database design, ensuring optimal storage efficiency and effective support for application requirements in SQL Server.