1. Overview
When we work with files in Java, we often need to extract the filename from a given absolute path.
In this tutorial, we'll explore several ways to extract the filename.
2. Introduction to the Problem
The problem is pretty straightforward. Imagine we're given an absolute file path string. We want to extract the filename from it. A couple of examples may explain the problem quickly:
String PATH_LINUX = "/root/with space/subDir/myFile.linux";
String EXPECTED_FILENAME_LINUX = "myFile.linux";

String PATH_WIN = "C:\\root\\with space\\subDir\\myFile.win";
String EXPECTED_FILENAME_WIN = "myFile.win";
As we've seen, different filesystems may have different file separators. Therefore, in this tutorial, we'll address some platform-independent solutions. In other words, the same implementation will work on both *nix and Windows systems.
For simplicity, we'll use unit test assertions to verify if the solutions work as expected.
Next, let's see them in action.
3. Parsing the Absolute Path as a String
First of all, filesystems don't allow filenames to contain file separators. So, for example, we cannot create a file whose name contains “/” on Linux's Ext2, Ext3, or Ext4 filesystems:
$ touch "a/b.txt"
touch: cannot touch 'a/b.txt': No such file or directory
In the example above, the filesystem treats “a/” as a directory. Based on this rule, an idea to solve the problem is to take out the substring from the last file separator until the end of the string.
String's lastIndexOf() method returns the index of the last occurrence of a substring in that string, or -1 if the substring doesn't occur. We can then get the filename by calling absolutePath.substring(lastIndex + 1).
The implementation is straightforward. However, to make our solution system-independent, we shouldn't hard-code the file separator as “\\” for Windows or “/” for *nix systems. Instead, let's use File.separator in our code so that our program automatically adapts to the system it's running on:
int index = PATH_LINUX.lastIndexOf(File.separator);
String filenameLinux = PATH_LINUX.substring(index + 1);
assertEquals(EXPECTED_FILENAME_LINUX, filenameLinux);
The test above passes if we run it on a Linux machine. Similarly, the test below passes on a Windows machine:
int index = PATH_WIN.lastIndexOf(File.separator);
String filenameWin = PATH_WIN.substring(index + 1);
assertEquals(EXPECTED_FILENAME_WIN, filenameWin);
As we can see, the same implementation works on both systems, as long as the path format matches the system the code runs on.
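To see why the string-based approach depends on the platform, here's a minimal sketch that takes the separator as a parameter instead of reading File.separator. The class name LastSegmentSketch and the method lastSegment() are hypothetical names used only for illustration:

```java
final class LastSegmentSketch {

    // Returns everything after the last occurrence of the separator.
    // lastIndexOf() returns -1 when the separator is absent, so
    // substring(0) then yields the whole input unchanged.
    static String lastSegment(String path, char separator) {
        int index = path.lastIndexOf(separator);
        return path.substring(index + 1);
    }

    public static void main(String[] args) {
        String pathLinux = "/root/with space/subDir/myFile.linux";
        String pathWin = "C:\\root\\with space\\subDir\\myFile.win";

        System.out.println(lastSegment(pathLinux, '/'));  // myFile.linux
        System.out.println(lastSegment(pathWin, '\\'));   // myFile.win

        // Mismatched separator: the Windows path contains no '/',
        // so the whole path comes back instead of the filename.
        System.out.println(lastSegment(pathWin, '/'));
    }
}
```

The last call shows what happens when a Windows-style path is parsed with the *nix separator: no exception is thrown, but the result is the full path rather than the filename.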
Apart from parsing the absolute path as a string, we can use the standard File class to solve the problem.
4. Using the File.getName() Method
The File class provides the getName() method to get the filename directly. Further, we can construct a File object from the given absolute path string.
Let's first test it on the Linux system:
File fileLinux = new File(PATH_LINUX);
assertEquals(EXPECTED_FILENAME_LINUX, fileLinux.getName());
The test passes if we give it a run. As File uses File.separator internally, if we test the same solution on a Windows system, it passes as well:
File fileWin = new File(PATH_WIN);
assertEquals(EXPECTED_FILENAME_WIN, fileWin.getName());
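As a small self-contained sketch of the same idea, the example below builds its test path with File.separator so that it matches whichever system runs it. The class name FileGetNameSketch and the helper fileName() are hypothetical, chosen only for this illustration:

```java
import java.io.File;

final class FileGetNameSketch {

    // Delegates to File.getName(), which splits on the running platform's separator.
    static String fileName(String path) {
        return new File(path).getName();
    }

    public static void main(String[] args) {
        // Build the path with File.separator so the demo matches the current OS.
        String path = File.separator
          + String.join(File.separator, "root", "with space", "subDir", "myFile.txt");

        System.out.println(fileName(path));                   // myFile.txt

        // File drops a trailing separator when it normalizes the path string,
        // so the last name in the path is still returned.
        System.out.println(fileName(path + File.separator));  // myFile.txt
    }
}
```

Compared with the plain substring approach, which would return an empty string for a path ending in a separator, File.getName() still returns the last name in the path.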
5. Using the Path.getFileName() Method
Next, let's create a Path instance from the given PATH_LINUX string and test the solution on Linux:
Path pathLinux = Paths.get(PATH_LINUX);
assertEquals(EXPECTED_FILENAME_LINUX, pathLinux.getFileName().toString());
When we execute the test, it passes. It's worth mentioning that Path.getFileName() returns a Path object. Therefore, we call the toString() method explicitly to convert it into a string.
The same implementation works on a Windows system with PATH_WIN as the path string too. This is because Paths.get() creates the Path using the default FileSystem of the platform it's running on:
Path pathWin = Paths.get(PATH_WIN);
assertEquals(EXPECTED_FILENAME_WIN, pathWin.getFileName().toString());
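One detail worth handling when we call toString() on the result: Path.getFileName() returns null for a path with no name elements, such as a filesystem root. The following is a minimal null-safe sketch. The class PathFileNameSketch and the helper fileNameOrEmpty() are hypothetical names, and returning an empty string for a root is just one possible convention:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathFileNameSketch {

    // Returns the last name element as a string, or "" when the path
    // has no name elements (for example, a filesystem root).
    static String fileNameOrEmpty(Path path) {
        Path name = path.getFileName();
        return name == null ? "" : name.toString();
    }

    public static void main(String[] args) {
        // Paths.get(first, more...) joins the parts with the platform's separator.
        Path file = Paths.get("root", "with space", "subDir", "myFile.txt");
        System.out.println(fileNameOrEmpty(file));            // myFile.txt

        // The root of an absolute path has no name elements.
        Path root = file.toAbsolutePath().getRoot();
        System.out.println(fileNameOrEmpty(root).isEmpty());  // true
    }
}
```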
6. Using the FilenameUtils.getName() From Apache Commons IO
So far, we've addressed three solutions to extract the filename from an absolute path. As we've mentioned, they're platform-independent. However, all three solutions work correctly only if the given absolute path matches the system the program is running on. For instance, our program can only handle Windows paths if it runs on Windows.
6.1. The Intelligent FilenameUtils.getName() Method
In practice, we rarely need to parse a path written in a different system's format. However, Apache Commons IO's FilenameUtils class can “intelligently” extract the filename from different path formats. So if our program runs on Windows, it can also work for Linux file paths and vice versa.
Next, let's create a test:
String filenameLinux = FilenameUtils.getName(PATH_LINUX);
assertEquals(EXPECTED_FILENAME_LINUX, filenameLinux);

String filenameWin = FilenameUtils.getName(PATH_WIN);
assertEquals(EXPECTED_FILENAME_WIN, filenameWin);
As we can see, the test above parses both PATH_LINUX and PATH_WIN. The test passes no matter whether we run it on Linux or Windows.
So next, we may want to know how FilenameUtils can automatically handle paths of different systems.
6.2. How FilenameUtils.getName() Works
If we have a look at the implementation of FilenameUtils.getName(), its logic is similar to our “lastIndexOf” file separator approach. The difference is that FilenameUtils calls the lastIndexOf() method twice, once with the *nix separator (/), then with the Windows file separator (\). Finally, it takes the greater index as the “lastIndex”:
...
final int lastUnixPos = fileName.lastIndexOf(UNIX_SEPARATOR); // UNIX_SEPARATOR = '/'
final int lastWindowsPos = fileName.lastIndexOf(WINDOWS_SEPARATOR); // WINDOWS_SEPARATOR = '\\'
return Math.max(lastUnixPos, lastWindowsPos);
Therefore, FilenameUtils.getName() doesn't check the current filesystem or the system's file separator. Instead, it finds the last file separator's index, no matter which system it belongs to, and then extracts the substring from this index until the end of the string as the final result.
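To make this logic concrete without adding the library as a dependency, here's a minimal standalone sketch that mirrors the approach described above. It isn't the library's actual source. The class AnySeparatorSketch and its methods are hypothetical names used only for illustration:

```java
final class AnySeparatorSketch {

    // Finds the last separator of either style; -1 if the path has none.
    static int indexOfLastSeparator(String path) {
        int lastUnixPos = path.lastIndexOf('/');
        int lastWindowsPos = path.lastIndexOf('\\');
        return Math.max(lastUnixPos, lastWindowsPos);
    }

    // Takes the substring after the last separator, whichever style it is.
    static String name(String path) {
        return path.substring(indexOfLastSeparator(path) + 1);
    }

    public static void main(String[] args) {
        System.out.println(name("/root/with space/subDir/myFile.linux"));      // myFile.linux
        System.out.println(name("C:\\root\\with space\\subDir\\myFile.win"));  // myFile.win
        System.out.println(name("noSeparator.txt"));                           // noSeparator.txt
    }
}
```

Because the sketch never consults File.separator, it returns the same results regardless of the system it runs on.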
6.3. An Edge Case That Makes FilenameUtils.getName() Fail
Now we understand how FilenameUtils.getName() works. It's indeed a clever solution, and it works in most cases. However, many Linux-supported filesystems allow a filename to contain backslashes (\):
$ echo 'Hi there!' > 'my\file.txt'

$ ls -l my*
-rw-r--r-- 1 kent kent 10 Sep 13 23:55 'my\file.txt'

$ cat 'my\file.txt'
Hi there!
If the filename in the given Linux file path contains backslashes, FilenameUtils.getName() returns only the part after the last backslash instead of the full filename. A test may explain it clearly:
String filenameToBreak = FilenameUtils.getName("/root/somedir/magic\\file.txt");
assertNotEquals("magic\\file.txt", filenameToBreak); // <-- filenameToBreak = "file.txt", but we expect: magic\file.txt
We should keep this case in mind when we use this method.
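If we know which kind of system produced a path, one possible way around this edge case is to state the path style explicitly instead of guessing the separator. The sketch below is only an illustration under that assumption. The PathStyleSketch class and its PathStyle enum are hypothetical and aren't part of Apache Commons IO or the JDK:

```java
final class PathStyleSketch {

    enum PathStyle {
        UNIX("/"),        // only '/' separates names; '\' may be part of a filename
        WINDOWS("\\/");   // assumption: treat both '\' and '/' as separators

        private final String separators;

        PathStyle(String separators) {
            this.separators = separators;
        }

        // Returns the substring after the last separator valid for this style.
        String fileName(String path) {
            int last = -1;
            for (char separator : separators.toCharArray()) {
                last = Math.max(last, path.lastIndexOf(separator));
            }
            return path.substring(last + 1);
        }
    }

    public static void main(String[] args) {
        String tricky = "/root/somedir/magic\\file.txt";

        System.out.println(PathStyle.UNIX.fileName(tricky));     // magic\file.txt
        System.out.println(PathStyle.WINDOWS.fileName(tricky));  // file.txt
        System.out.println(PathStyle.WINDOWS.fileName("C:\\root\\subDir\\myFile.win"));  // myFile.win
    }
}
```

The trade-off is that the caller must know where the path came from, which the separator-agnostic approach doesn't require.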
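7. Conclusion
In this article, we've learned how to extract the filename from a given absolute path string: by parsing the string, with File.getName(), with Path.getFileName(), and with FilenameUtils.getName() from Apache Commons IO.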
As always, the full source code of the example is available over on GitHub.