1. Overview

Simply put, URL encoding translates special characters from the URL to a representation that adheres to the spec and can be correctly understood and interpreted.

In this tutorial, we'll focus on how to encode/decode the URL or form data so that it adheres to the spec and transmits over the network correctly.

2. Analyze the URL

Let's first look at a basic URI syntax:

scheme:[//[user:password@]host[:port]][/]path[?query][#fragment]

The first step in encoding a URI is examining its parts and then encoding only the relevant portions.

Now let's look at an example of a URI:

String testUrl = 
  "http://www.baeldung.com?key1=value+1&key2=value%40%21%242&key3=value%253";

One way to analyze the URI is to load its String representation into a java.net.URI instance:

@Test
public void givenURL_whenAnalyze_thenCorrect() throws Exception {
    URI uri = new URI(testUrl);

    assertThat(uri.getScheme(), is("http"));
    assertThat(uri.getHost(), is("www.baeldung.com"));
    assertThat(uri.getRawQuery(),
      is("key1=value+1&key2=value%40%21%242&key3=value%253"));
}

The URI class parses the string representation of the URL and exposes its parts via simple getters, e.g., getScheme, getHost and getRawQuery.
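To make the difference between the raw and the decoded accessors concrete, here is a minimal sketch that parses the same test URL. The class name UriPartsSketch and its two helper methods are hypothetical and exist only for illustration; getRawQuery and getQuery are standard java.net.URI methods.

```java
import java.net.URI;

public class UriPartsSketch {

    // Same URL as testUrl above
    static final String TEST_URL =
      "http://www.baeldung.com?key1=value+1&key2=value%40%21%242&key3=value%253";

    // Returns the query exactly as written in the URL, escapes untouched
    static String rawQuery(String url) {
        return URI.create(url).getRawQuery();
    }

    // Returns the query with percent-escapes decoded; "+" is left as-is
    static String decodedQuery(String url) {
        return URI.create(url).getQuery();
    }

    public static void main(String[] args) {
        System.out.println(rawQuery(TEST_URL));
        System.out.println(decodedQuery(TEST_URL));
    }
}
```

Note that getQuery only resolves %xy escapes. It does not turn "+" into a space, because that convention belongs to form encoding rather than to the URI syntax itself.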

3. Encode the URL

When encoding a URI, one of the common pitfalls is encoding the complete URI. Typically, we need to encode only the query portion of the URI.

Let's encode the data using the encode(data, encodingScheme) method of the URLEncoder class:

private String encodeValue(String value) {
    try {
        return URLEncoder.encode(value, StandardCharsets.UTF_8.toString());
    } catch (UnsupportedEncodingException e) {
        // UTF-8 is always available, so this should never happen
        throw new IllegalStateException(e);
    }
}

@Test
public void givenRequestParam_whenUTF8Scheme_thenEncode() throws Exception {
    // LinkedHashMap keeps insertion order, so the query string is deterministic
    Map<String, String> requestParams = new LinkedHashMap<>();
    requestParams.put("key1", "value 1");
    requestParams.put("key2", "value@!$2");
    requestParams.put("key3", "value%3");

    String encodedURL = requestParams.keySet().stream()
      .map(key -> key + "=" + encodeValue(requestParams.get(key)))
      .collect(joining("&", "http://www.baeldung.com?", ""));

    assertThat(testUrl, is(encodedURL));
}

The encode method accepts two parameters:

  1. data – string to be translated
  2. encodingScheme – name of the character encoding

This encode method converts the string into application/x-www-form-urlencoded format.

The encoding scheme converts each special character into one or more bytes, and each byte is written as a two-digit hexadecimal value in the form “%xy”. When we build a URL from dynamic parameter values, we encode the data first and then send it to the server.

Note: The World Wide Web Consortium Recommendation states that we should use UTF-8. Not doing so may introduce incompatibilities. (Reference: https://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html)
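As a quick illustration of the “%xy” conversion described above, the following sketch encodes a few sample values with URLEncoder. The class name FormEncodingSketch is hypothetical, and the last sample value ("café") is an assumption added here to show a multi-byte character; it does not appear in the article's test data.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormEncodingSketch {

    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeValue("value 1"));    // space becomes "+"
        System.out.println(encodeValue("value@!$2"));  // each special character becomes %xy
        System.out.println(encodeValue("value%3"));    // "%" itself is escaped
        System.out.println(encodeValue("caf\u00e9"));  // non-ASCII: one %xy per UTF-8 byte
    }
}
```

The last call shows why the charset matters: "é" occupies two bytes in UTF-8, so it yields two %xy groups, whereas a single-byte charset would yield only one.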

4. Decode the URL

Let's now decode the previous URL using the decode method of the URLDecoder:

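private String decode(String value) {
    try {
        return URLDecoder.decode(value, StandardCharsets.UTF_8.toString());
    } catch (UnsupportedEncodingException e) {
        // UTF-8 is always available, so this should never happen
        throw new IllegalStateException(e);
    }
}

@Test
public void givenRequestParam_whenUTF8Scheme_thenDecodeRequestParams() throws Exception {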
    URI uri = new URI(testUrl);

    String scheme = uri.getScheme();
    String host = uri.getHost();
    String query = uri.getRawQuery();

    String decodedQuery = Arrays.stream(query.split("&"))
      .map(param -> param.split("=")[0] + "=" + decode(param.split("=")[1]))
      .collect(Collectors.joining("&"));

    assertEquals(
      "http://www.baeldung.com?key1=value 1&key2=value@!$2&key3=value%3",
      scheme + "://" + host + "?" + decodedQuery);
}

There are two important points to remember here:

  • Analyze URL before decoding
  • Use the same encoding scheme for encoding and decoding

If we were to decode first and then analyze, the URL portions might not be parsed correctly, because a decoded "&" or "=" inside a value would be mistaken for a delimiter. If we used a different encoding scheme to decode the data, the result would be garbled.
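The first point can be demonstrated with a small sketch. The class DecodeOrderSketch, its methods and the sample query "key1=a%26b&key2=c" are all hypothetical and chosen only to show what goes wrong when a value contains an encoded "&".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class DecodeOrderSketch {

    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Correct order: split the raw query first, then decode each part
    static Map<String, String> parseThenDecode(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : rawQuery.split("&")) {
            String[] kv = pair.split("=", 2);
            params.put(decode(kv[0]), kv.length > 1 ? decode(kv[1]) : "");
        }
        return params;
    }

    // Wrong order: decode the whole query, then split on "&"
    static int partsAfterDecodingFirst(String rawQuery) {
        return decode(rawQuery).split("&").length;
    }

    public static void main(String[] args) {
        String rawQuery = "key1=a%26b&key2=c";
        System.out.println(parseThenDecode(rawQuery));          // two parameters
        System.out.println(partsAfterDecodingFirst(rawQuery));  // three fragments
    }
}
```

Decoding first turns %26 back into "&", so the value of key1 is split in two and the query appears to contain three parts instead of two.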

5. Encode a Path Segment

We can't use URLEncoder for encoding path segments of the URL. The path component is the hierarchical part of the URL that represents a directory path or otherwise locates a resource, with its segments separated by “/”.

Reserved characters in path segments are different than in query parameter values. For example, a “+” sign is a valid character in path segments and therefore should not be encoded.

To encode the path segment, we use the UriUtils class by Spring Framework instead.

The UriUtils class provides the encodePath and encodePathSegment methods for encoding a path and a single path segment, respectively:

private String encodePath(String path) {
    try {
        path = UriUtils.encodePath(path, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        LOGGER.error("Error encoding parameter {}", e.getMessage(), e);
    }
    return path;
}

@Test
public void givenPathSegment_thenEncodeDecode() 
  throws UnsupportedEncodingException {
    String pathSegment = "/Path 1/Path+2";
    String encodedPathSegment = encodePath(pathSegment);
    String decodedPathSegment = UriUtils.decode(encodedPathSegment, "UTF-8");
    
    assertEquals("/Path%201/Path+2", encodedPathSegment);
    assertEquals("/Path 1/Path+2", decodedPathSegment);
}

In the above code snippet, we can see that when we used the encodePath method, it returned the encoded value, and + is not encoded because it is a valid character in the path component.
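For readers without Spring on the classpath, the same path rules can be observed with the JDK alone. This is not the approach used in this article, only an alternative sketch: it relies on the multi-argument java.net.URI constructor, which quotes illegal characters in the path, and contrasts it with URLEncoder. The class name PathEncodingSketch and its methods are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class PathEncodingSketch {

    // Encodes a path via the (scheme, host, path, fragment) URI constructor
    static String encodePathWithUri(String path) {
        try {
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Form encoding applied to a path, shown only for contrast
    static String encodeWithUrlEncoder(String path) {
        try {
            return URLEncoder.encode(path, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/Path 1/Path+2";
        System.out.println(encodePathWithUri(path));    // keeps "/" and "+", escapes the space
        System.out.println(encodeWithUrlEncoder(path)); // escapes "/" and "+", space becomes "+"
    }
}
```

The contrast shows why URLEncoder is unsuitable here: it escapes the "/" separators and the "+" sign, and it writes the space as "+", which a server would read as a literal plus inside a path.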

Let's add a path variable to our test URL:

String testUrl
  = "/path+1?key1=value+1&key2=value%40%21%242&key3=value%253";

And to assemble and assert a properly encoded URL, we'll change the test from Section 3:

String path = "path+1";
String encodedURL = requestParams.keySet().stream()
  .map(k -> k + "=" + encodeValue(requestParams.get(k)))
  .collect(joining("&", "/" + encodePath(path) + "?", ""));
assertThat(testUrl, CoreMatchers.is(encodedURL));

6. Conclusion

In this article, we saw how to encode and decode the data so that it can be transferred and interpreted correctly.

While the article focused on encoding/decoding URI query parameter values, the approach applies to HTML form parameters as well.

As always, the source code is available over on GitHub.
