Java Top

Get started with Spring 5 and Spring Boot 2, through the Learn Spring course:

> CHECK OUT THE COURSE

1. Overview

Simply put, URL encoding translates special characters from the URL to a representation that adheres to the spec and can be correctly understood and interpreted.

In this tutorial, we'll focus on how to encode/decode the URL or form data so that it adheres to the spec and transmits over the network correctly.

2. Analyze the URL

Let's first look at a basic URI syntax:

scheme:[//[user:[email protected]]host[:port]][/]path[?query][#fragment]

The first step into encoding a URI is examining its parts and then encoding only the relevant portions.

Now let's look at an example of a URI:

String testUrl = 
  "http://www.baeldung.com?key1=value+1&key2=value%40%21%242&key3=value%253";

One way to analyze the URI is loading the String representation to a java.net.URI class:

@Test
public void givenURL_whenAnalyze_thenCorrect() throws Exception {
    URI uri = new URI(testUrl);

    assertThat(uri.getScheme(), is("http"));
    assertThat(uri.getHost(), is("www.baeldung.com"));
    assertThat(uri.getRawQuery(),
      .is("key1=value+1&key2=value%40%21%242&key3=value%253"));
}

The URI class parses the string representation URL and exposes its parts via a simple API, e.g., getXXX.

3. Encode the URL

When encoding URI, one of the common pitfalls is encoding the complete URI. Typically, we need to encode only the query portion of the URI.

Let's encode the data using the encode(data, encodingScheme) method of the URLEncoder class:

private String encodeValue(String value) {
    return URLEncoder.encode(value, StandardCharsets.UTF_8.toString());
}

@Test
public void givenRequestParam_whenUTF8Scheme_thenEncode() throws Exception {
    Map<String, String> requestParams = new HashMap<>();
    requestParams.put("key1", "value 1");
    requestParams.put("key2", "[email protected]!$2");
    requestParams.put("key3", "value%3");

    String encodedURL = requestParams.keySet().stream()
      .map(key -> key + "=" + encodeValue(requestParams.get(key)))
      .collect(joining("&", "http://www.baeldung.com?", ""));

    assertThat(testUrl, is(encodedURL));

The encode method accepts two parameters:

  1. data – string to be translated
  2. encodingScheme – name of the character encoding

This encode method converts the string into application/x-www-form-urlencoded format.

The encoding scheme will convert special characters into two digits hexadecimal representation of eight bits that will be represented in the form of “%xy“. When we are dealing with path parameters or adding parameters that are dynamic, we will encode the data and then send to the server.

Note: The World Wide Web Consortium Recommendation states that we should use UTF-8. Not doing so may introduce incompatibilities. (Reference: https://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html)

4. Decode the URL

Let's now decode the previous URL using the decode method of the URLDecoder:

private String decode(String value) {
    return URLDecoder.decode(value, StandardCharsets.UTF_8.toString());
}

@Test
public void givenRequestParam_whenUTF8Scheme_thenDecodeRequestParams() {
    URI uri = new URI(testUrl);

    String scheme = uri.getScheme();
    String host = uri.getHost();
    String query = uri.getRawQuery();

    String decodedQuery = Arrays.stream(query.split("&"))
      .map(param -> param.split("=")[0] + "=" + decode(param.split("=")[1]))
      .collect(Collectors.joining("&"));

    assertEquals(
      "http://www.baeldung.com?key1=value [email protected]!$2&key3=value%3",
      scheme + "://" + host + "?" + decodedQuery);
}

There are two important points to remember here:

  • Analyze URL before decoding
  • Use the same encoding scheme for encoding and decoding

If we were to decode and then analyze, URL portions might not be parsed correctly. If we used another encoding scheme to decode the data, it would result in garbage data.

5. Encode a Path Segment

We can't use URLEncoder for encoding path segments of the URL. Path component refers to the hierarchical structure that represents a directory path, or it serves to locate resources separated by “/”.

Reserved characters in path segments are different than in query parameter values. For example, a “+” sign is a valid character in path segments and therefore should not be encoded.

To encode the path segment, we use the UriUtils class by Spring Framework instead.

UriUtils class provides encodePath and encodePathSegment methods for encoding path and path segment respectively:

private String encodePath(String path) {
    try {
        path = UriUtils.encodePath(path, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        LOGGER.error("Error encoding parameter {}", e.getMessage(), e);
    }
    return path;
}
@Test
public void givenPathSegment_thenEncodeDecode() 
  throws UnsupportedEncodingException {
    String pathSegment = "/Path 1/Path+2";
    String encodedPathSegment = encodePath(pathSegment);
    String decodedPathSegment = UriUtils.decode(encodedPathSegment, "UTF-8");
    
    assertEquals("/Path%201/Path+2", encodedPathSegment);
    assertEquals("/Path 1/Path+2", decodedPathSegment);
}

In the above code snippet, we can see that when we used the encodePathSegment method, it returned the encoded value, and + is not encoded because it is a value character in the path component.

Let's add a path variable to our test URL:

String testUrl
  = "/path+1?key1=value+1&key2=value%40%21%242&key3=value%253";

And to assemble and assert a properly encoded URL, we'll change the test from Section 2:

String path = "path+1";
String encodedURL = requestParams.keySet().stream()
  .map(k -> k + "=" + encodeValue(requestParams.get(k)))
  .collect(joining("&", "/" + encodePath(path) + "?", ""));
assertThat(testUrl, CoreMatchers.is(encodedURL));

6. Conclusion

In this article, we saw how to encode and decode the data so that it can be transferred and interpreted correctly.

While the article focused on encoding/decoding URI query parameter values, the approach applies to HTML form parameters as well.

As always, the source code is available over on GitHub.

Java bottom

Get started with Spring 5 and Spring Boot 2, through the Learn Spring course:

>> CHECK OUT THE COURSE
Generic footer banner
Comments are closed on this article!