1. Overview
In this short article, we'll take a look at different ways to get a domain name from a given URL in Java.
2. What Is a Domain Name?
Simply put, a domain name represents a string that points to an IP address. It is part of the Uniform Resource Locator (URL). Using the domain name, users can access a specific website through the client software.
A domain name usually consists of two or three parts, each separated by a dot.
Starting from the end, the domain name may include:
- top-level domain (e.g., com in baeldung.com),
- second-level domain (e.g., co in google.co.uk or baeldung in baeldung.com),
- third-level domain (e.g., google in google.co.uk)
Domain names need to follow the rules and procedures specified by the Domain Name System (DNS).
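To make the level naming concrete, here is a minimal sketch that lists the labels of a domain name from right to left. The class and method names (DomainLevels, levelsFromTop) are hypothetical and exist only for this illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name): lists the labels of a domain name
// from right to left, so index 0 is the top-level domain.
final class DomainLevels {

    static List<String> levelsFromTop(String domainName) {
        // Labels are separated by dots
        String[] labels = domainName.split("\\.");
        List<String> levels = new ArrayList<>();
        // Walk backwards: the last label is the top-level domain
        for (int i = labels.length - 1; i >= 0; i--) {
            levels.add(labels[i]);
        }
        return levels;
    }

    public static void main(String[] args) {
        // google.co.uk: top-level = uk, second-level = co, third-level = google
        System.out.println(levelsFromTop("google.co.uk")); // [uk, co, google]
        System.out.println(levelsFromTop("baeldung.com")); // [com, baeldung]
    }
}
```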
3. Using the URI Class
Let's see how to extract the domain name from a URL using the java.net.URI class. The URI class provides the getHost() method, which returns the host component of the URL:
URI uri = new URI("https://www.baeldung.com/domain");
String host = uri.getHost();
assertEquals("www.baeldung.com", host);
The host contains the sub-domain (www) as well as the second-level and top-level domains (baeldung.com).
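One detail worth keeping in mind is that getHost() returns null when the input has no scheme, because the URI class then parses the whole string as a relative path. The following sketch wraps the call in an Optional to make that case explicit. HostExtractor and hostOf are hypothetical names, and treating a malformed URL as "no host" is an assumption made for this example:

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

// Hypothetical wrapper around URI.getHost() that avoids returning null
final class HostExtractor {

    static Optional<String> hostOf(String url) {
        try {
            // getHost() is null when the URI has no authority part,
            // e.g. when the scheme is missing
            return Optional.ofNullable(new URI(url).getHost());
        } catch (URISyntaxException e) {
            // Assumption for this sketch: a malformed URL simply has no host
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOf("https://www.baeldung.com/domain").isPresent()); // true
        System.out.println(hostOf("www.baeldung.com/domain").isPresent());         // false
    }
}
```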
To get the domain name, we need to remove the sub-domain from the host:
String domainName = host.startsWith("www.") ? host.substring(4) : host;
assertEquals("baeldung.com", domainName);
However, this approach only works when we know the sub-domain in advance. If its exact value is unknown, we can't tell which part of the host to remove.
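The limitation is easy to see with a sub-domain other than www. The sketch below repeats the prefix check from above in a hypothetical helper (WwwPrefixStripper.stripWww) and assumes the input is an absolute URL, so getHost() is not null:

```java
import java.net.URI;

// Hypothetical helper that mirrors the approach above:
// only the literal "www." prefix is removed from the host
final class WwwPrefixStripper {

    static String stripWww(String url) {
        // Assumes an absolute URL; getHost() would be null otherwise
        String host = URI.create(url).getHost();
        return host.startsWith("www.") ? host.substring(4) : host;
    }

    public static void main(String[] args) {
        System.out.println(stripWww("https://www.baeldung.com/domain"));  // baeldung.com
        // Any other sub-domain is left untouched
        System.out.println(stripWww("https://jira.baeldung.com/domain")); // jira.baeldung.com
    }
}
```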
4. Using the InternetDomainName Class from Guava Library
Now we'll see how to get the domain name using the Guava library and the InternetDomainName class.
The InternetDomainName class provides the topPrivateDomain() method, which returns the part of the given domain name that is one level beneath the public suffix. In other words, the method returns the public suffix (such as com or co.uk) together with the single label directly to its left.
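To illustrate what "one level beneath the public suffix" means, here is a simplified sketch in plain Java. It is not Guava's implementation: the three hard-coded suffixes are an assumption made for this example, whereas Guava relies on the full Public Suffix List. The class name TopPrivateDomainSketch is hypothetical as well:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified illustration of the idea, not Guava's actual implementation
final class TopPrivateDomainSketch {

    // Tiny, hard-coded stand-in for the Public Suffix List (assumption for illustration only)
    private static final Set<String> PUBLIC_SUFFIXES =
            new HashSet<>(Arrays.asList("com", "uk", "co.uk"));

    static String topPrivateDomain(String host) {
        String[] labels = host.split("\\.");
        // Try the longest candidate suffix first, always leaving at least one label before it
        for (int i = 1; i < labels.length; i++) {
            String candidate = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (PUBLIC_SUFFIXES.contains(candidate)) {
                // The public suffix plus the single label directly before it
                return labels[i - 1] + "." + candidate;
            }
        }
        throw new IllegalArgumentException("No known public suffix in: " + host);
    }

    public static void main(String[] args) {
        System.out.println(topPrivateDomain("www.google.co.uk"));  // google.co.uk
        System.out.println(topPrivateDomain("jira.baeldung.com")); // baeldung.com
    }
}
```

Checking the longest candidate first matters here: for www.google.co.uk, the suffix co.uk has to win over the shorter uk, otherwise the result would be co.uk.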
First, we need to extract the host from the given URL. We can use the URI class:
String urlString = "https://www.baeldung.com/java-tutorial";
URI uri = new URI(urlString);
String host = uri.getHost();
Next, let's get the domain name using the InternetDomainName class and its topPrivateDomain() method:
InternetDomainName internetDomainName = InternetDomainName.from(host).topPrivateDomain();
String domainName = internetDomainName.toString();
assertEquals("baeldung.com", domainName);
Unlike the approach based on the URI class alone, InternetDomainName omits the sub-domain from the returned value, whatever the sub-domain is.
Lastly, we can remove the public suffix (the top-level domain in this example) from the domain name as well:
String publicSuffix = internetDomainName.publicSuffix().toString();
String name = domainName.substring(0, domainName.lastIndexOf("." + publicSuffix));
Let's verify this with a test, assuming that domainNameClient.getName() applies the steps above to a given host:
assertEquals("baeldung", domainNameClient.getName("jira.baeldung.com"));
assertEquals("google", domainNameClient.getName("www.google.co.uk"));
We can see that both the sub-domains and the public suffixes (com and co.uk) are removed from the result.
5. Using Regular Expression
Obtaining the domain name using regular expressions can be challenging. For instance, if we don't know the exact sub-domain value, we cannot determine what word (if any) should be extracted from the given URL.
On the other hand, if we know the sub-domain value, we can remove it from the URL using a regular expression:
String url = "https://www.baeldung.com/domain";
String domainName = url.replaceAll("http(s)?://|www\\.|/.*", "");
assertEquals("baeldung.com", domainName);
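When the known sub-domain is something other than www, the same idea can be parameterized. The following is a minimal sketch rather than a general-purpose parser: RegexDomainExtractor and its domainName method are hypothetical names, and the pattern assumes an http or https URL without a port or user info:

```java
import java.util.regex.Pattern;

// Hypothetical helper: strips the scheme, one known sub-domain, and the path
final class RegexDomainExtractor {

    static String domainName(String url, String knownSubDomain) {
        // Pattern.quote() makes sure the sub-domain is matched literally.
        // First alternative: the scheme plus the optional known sub-domain at the start.
        // Second alternative: everything from the first remaining "/" to the end.
        String regex = "^https?://(" + Pattern.quote(knownSubDomain) + "\\.)?|/.*$";
        return url.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        System.out.println(domainName("https://www.baeldung.com/domain", "www"));   // baeldung.com
        System.out.println(domainName("https://jira.baeldung.com/domain", "jira")); // baeldung.com
    }
}
```

Anchoring the first alternative with ^ keeps the pattern from removing the sub-domain text if it happens to appear elsewhere in the host.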
6. Conclusion
In this article, we looked at how to extract the domain name from the given URL. As always, the source code for the examples is available over on GitHub.