
1. Overview

The illegal character compilation error is a file encoding error: it's produced when a file was created with an incorrect encoding. As a result, in languages like Java, we can get this type of error when we try to compile our project. In this tutorial, we'll describe the problem in detail, along with some scenarios where we may encounter it, and then we'll present some examples of how to resolve it.

2. Illegal Character Compilation Error

2.1. Byte Order Mark (BOM)

Before we go into the byte order mark, we need to take a quick look at the UCS (Unicode) Transformation Format (UTF). UTF is a character encoding format that can encode all of the possible character code points in Unicode. There are several kinds of UTF encodings. Among all these, UTF-8 has been the most used.

UTF-8 uses an 8-bit variable-width encoding to maximize compatibility with ASCII. When we use this encoding, some files may begin with a few bytes that encode the Unicode code point U+FEFF, known as the byte order mark (BOM). When used correctly, this mark is invisible. However, in some cases, it can lead to data errors.

In the UTF-8 encoding, the BOM serves no real purpose. Although it isn't required, it may still appear in UTF-8 encoded text, added either by an encoding conversion or by a text editor that flags the content as UTF-8.
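To make the mark concrete, we can print the bytes that U+FEFF becomes under UTF-8; this is a minimal, self-contained sketch:

```java
import java.nio.charset.StandardCharsets;

public class BomBytes {
    public static void main(String[] args) {
        // U+FEFF encoded as UTF-8 produces the three-byte sequence EF BB BF
        byte[] bom = "\uFEFF".getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bom) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Running this prints EF BB BF, the exact byte sequence a BOM-adding editor places at the start of a UTF-8 file.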

Text editors like older versions of Windows Notepad add the BOM when saving files as UTF-8. As a consequence, when we use such an editor to create a code example and try to run it, we could get a compilation error. In contrast, modern IDEs encode created files as UTF-8 without the BOM. The next sections will show some examples of this problem.

2.2. Class with Illegal Character Compilation Error

Typically, we work with advanced IDEs, but sometimes we use a plain text editor instead. Unfortunately, as we've learned, some text editors can create more problems than they solve, because saving a file with a BOM leads to a compilation error in Java. The illegal character error occurs in the compilation phase, so it's quite easy to detect. The next example shows how it works.

First, let's write a simple class in our text editor, such as Notepad. This class is just a placeholder, as any code would do for the test. Next, we save the file with a BOM:

public class TestBOM {
    public static void main(String ...args){
        System.out.println("BOM Test");
    }
}

Now, when we try to compile this file using the javac command:

$ javac ./TestBOM.java

Consequently, we get the error message:

.\TestBOM.java:1: error: illegal character: '\u00bb'
public class TestBOM {
 ^
.\TestBOM.java:1: error: illegal character: '\u00bf'
public class TestBOM {
  ^
2 errors

To fix this problem, we just need to save the file as UTF-8 without BOM encoding. After that, the error disappears. Therefore, we should always check that our files are saved without a BOM.

Another way to fix this issue is with a tool like dos2unix. This tool will remove the BOM and also take care of other idiosyncrasies of Windows text files.
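If we'd rather stay inside the JVM, we can trim the mark ourselves. This is just a sketch, and the stripUtf8Bom helper is our own illustration, not part of any library:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class StripBom {
    // Hypothetical helper: rewrites the file without its leading UTF-8 BOM, if one is present
    static void stripUtf8Bom(Path path) throws IOException {
        byte[] bytes = Files.readAllBytes(path);
        if (bytes.length >= 3
              && (bytes[0] & 0xFF) == 0xEF
              && (bytes[1] & 0xFF) == 0xBB
              && (bytes[2] & 0xFF) == 0xBF) {
            Files.write(path, Arrays.copyOfRange(bytes, 3, bytes.length));
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("TestBOM", ".java");
        // Simulate a Notepad-style file: BOM bytes followed by source text
        Files.write(tmp, new byte[] { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'p', 'u', 'b' });
        stripUtf8Bom(tmp);
        System.out.println(new String(Files.readAllBytes(tmp)));
        Files.delete(tmp);
    }
}
```

After the helper runs, only the source text remains, and javac can compile the file normally.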

3. Reading Files

Additionally, let’s analyze some examples of reading files encoded with BOM.

Initially, we need to create a file with a BOM to use in our tests. This file contains our sample text, “Hello world with BOM.”, which is also our expected string. Next, let's start testing.
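For reference, such a file can be produced programmatically by writing U+FEFF before the text; a minimal sketch (the temporary file name is just an example):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateBomFile {
    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("sample_file_with_BOM", ".txt");
        // Writing U+FEFF first makes the UTF-8 file start with the bytes EF BB BF
        Files.write(path, ("\uFEFF" + "Hello world with BOM.").getBytes(StandardCharsets.UTF_8));
        // 3 BOM bytes + 21 bytes of ASCII text = 24 bytes on disk
        System.out.println(Files.size(path));
        Files.delete(path);
    }
}
```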

3.1. Reading Files Using BufferedReader

First, we’ll test the file using the BufferedReader class:

@Test
public void whenInputFileHasBOM_thenUseInputStream() throws IOException {
    String line;
    String actual = "";
    try (BufferedReader br = new BufferedReader(new InputStreamReader(ioStream))) {
        while ((line = br.readLine()) != null) {
            actual += line;
        }
    }
    assertEquals(expected, actual);
}

In this case, when we try to assert that the strings are equal, we get an error:

org.opentest4j.AssertionFailedError: expected: <Hello world with BOM.> but was: <Hello world with BOM.>
Expected :Hello world with BOM.
Actual   :Hello world with BOM.

Actually, if we skim the test output, the two strings look identical. Even so, the actual value of the string contains the BOM at the beginning. As a result, the strings aren't equal.
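We can reproduce this in isolation with two strings that print identically but differ by the leading BOM code point (the literals here are just for illustration):

```java
public class InvisibleBom {
    public static void main(String[] args) {
        String expected = "Hello world with BOM.";
        String actual = "\uFEFF" + expected;  // what a BOM-unaware read gives us
        // Both strings print the same, yet they are not equal
        System.out.println(actual.equals(expected));
        System.out.println(actual.length() - expected.length());
        System.out.printf("U+%04X%n", (int) actual.charAt(0));
    }
}
```

The extra, invisible character at index zero is exactly U+FEFF, which is why the assertion in our test fails.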

Moreover, a quick fix would be to replace BOM characters:

@Test
public void whenInputFileHasBOM_thenUseInputStreamWithReplace() throws IOException {
    String line;
    String actual = "";
    try (BufferedReader br = new BufferedReader(new InputStreamReader(ioStream))) {
        while ((line = br.readLine()) != null) {
            actual += line.replace("\uFEFF", "");
        }
    }
    assertEquals(expected, actual);
}

The replace method clears the BOM from our string, so our test passes. However, we need to use replace carefully, as running it over a huge number of files can lead to performance issues.

3.2. Reading Files Using Apache Commons IO

In addition, the Apache Commons IO library provides the BOMInputStream class. This class is a wrapper stream that detects a leading encoded ByteOrderMark and, by default, excludes it from the data we read. Let's see how it works:

@Test
public void whenInputFileHasBOM_thenUseBOMInputStream() throws IOException {
    String line;
    String actual = "";
    ByteOrderMark[] byteOrderMarks = new ByteOrderMark[] {
      ByteOrderMark.UTF_8, ByteOrderMark.UTF_16BE, ByteOrderMark.UTF_16LE,
      ByteOrderMark.UTF_32BE, ByteOrderMark.UTF_32LE
    };
    InputStream inputStream = new BOMInputStream(ioStream, false, byteOrderMarks);
    try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {
        while ((line = br.readLine()) != null) {
            actual += line;
        }
    }
    assertEquals(expected, actual);
}

The code is similar to previous examples, but we pass the BOMInputStream as a parameter into the InputStreamReader.
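Under the hood, such a wrapper essentially peeks at the first bytes of the stream and consumes them when they match a known BOM. A stdlib-only sketch of that idea for the UTF-8 case, using PushbackInputStream (this is our own illustration, not how Commons IO is implemented):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class SkipBom {
    // Wraps the stream and consumes a leading UTF-8 BOM (EF BB BF), if present
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = pushback.read(head, 0, 3);
        boolean isBom = read == 3
              && (head[0] & 0xFF) == 0xEF
              && (head[1] & 0xFF) == 0xBB
              && (head[2] & 0xFF) == 0xBF;
        if (!isBom && read > 0) {
            pushback.unread(head, 0, read);  // not a BOM: put the bytes back
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'H', 'i' };
        InputStream in = skipUtf8Bom(new ByteArrayInputStream(data));
        System.out.println((char) in.read());
        System.out.println((char) in.read());
    }
}
```

The BOM bytes are silently consumed, so the first character we read is the actual content.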

3.3. Reading Files Using Google Data (GData)

On the other hand, another helpful library for handling the BOM is Google Data (GData). This is an older library that uses XML as its underlying format, but its UnicodeReader class handles the BOM in files transparently. Let's see it in action:

@Test
public void whenInputFileHasBOM_thenUseGoogleGdata() throws IOException {
    char[] actual = new char[21];
    try (Reader r = new UnicodeReader(ioStream, null)) {
        r.read(actual);
    }
    assertEquals(expected, String.valueOf(actual));
}

Finally, as we observed in the previous examples, removing the BOM from our files is important. If we don't handle it properly, we'll get unexpected results when the data is read. That's why we need to be aware of the existence of this mark in our files.

4. Conclusion

In this article, we covered several topics regarding the illegal character compilation error in Java. First, we learned what UTF is and how the BOM is integrated into it. Second, we showed a sample class created using a text editor – Windows Notepad, in this case. The generated class threw the compilation error for the illegal character. Finally, we presented some code examples on how to read files with a BOM.

As usual, all the code used for this example can be found over on GitHub.
