How to Split a String in Java and Important Considerations

Basic Usage of the split Method

In Java, you can split a string using the split method.

The following example demonstrates how to split a string by commas (,) and receive the result as an array.

String target = "apple,melon,banana,grape";
String[] array = target.split(",");

for (String element : array) {
  System.out.println(element);
}
// Output (one element per line):
// apple
// melon
// banana
// grape

The split method returns an array (String[]). However, an array has limited functionality and is not always convenient to work with. In many cases, you may prefer to receive the split data as a list rather than an array.

To get the result as a list, you can pass the array returned by the split method as an argument to the Arrays.asList method.

import java.util.Arrays;
import java.util.List;

// ...Omitted...

String target = "apple,melon,banana,grape";
List<String> list = Arrays.asList(target.split(","));

System.out.println(list);
// Prints: [apple, melon, banana, grape]
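
One caveat worth keeping in mind: the list returned by Arrays.asList is a fixed-size view backed by the array, so calling add or remove on it throws UnsupportedOperationException. If a resizable list is needed, one option is to copy the result into an ArrayList. The following is a minimal sketch; the class name SplitToListExample and the helper toMutableList are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SplitToListExample {

  // Hypothetical helper: splits on commas and copies the result into a resizable list.
  static List<String> toMutableList(String target) {
    return new ArrayList<>(Arrays.asList(target.split(",")));
  }

  public static void main(String[] args) {
    List<String> fixed = Arrays.asList("apple,melon".split(","));
    try {
      fixed.add("grape"); // fixed-size list: this call is rejected
    } catch (UnsupportedOperationException e) {
      System.out.println("The Arrays.asList result cannot grow");
    }

    List<String> mutable = toMutableList("apple,melon");
    mutable.add("grape"); // works because the ArrayList is an independent copy
    System.out.println(mutable);
  }
}
```

If the list is only read and never modified, the plain Arrays.asList call is sufficient and avoids the extra copy.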

Cautions for the split Method

While the split method may seem straightforward, not fully understanding its specifications can lead to unexpected pitfalls.

In the following sections, I will explain some key points to be aware of when using the split method.

Delimiters are treated as regular expressions

The delimiter passed as an argument to the split method is treated as a regular expression internally. A simplified excerpt of the implementation of the split method is shown below for reference.

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

    public String[] split(String regex, int limit) {

        // ...Omitted...

        return Pattern.compile(regex).split(this, limit);
    }
}

Therefore, be cautious when using symbols that have a special meaning in regular expressions (such as . | * + ? ( ) [ ] { } ^ $ \) as delimiters.

For example, if you try to split a string using a dot (.) without realizing that it is treated as a regular expression, the string will not be split as expected.

String target = "192.168.1.0";
String[] result = target.split(".");
// []

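In the above example, splitting the string results in an empty array.
This is because, in regular expressions, a dot (.) represents “any single character,” so every character in the string is treated as a delimiter. Every resulting element is therefore an empty string, and since trailing empty elements are removed by default (explained later in this article), nothing remains.

If you want to split a string using a literal dot (.), you need to escape it with a backslash (\.). In a Java string literal, the backslash itself must also be escaped, so the argument is written as "\\.".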

String target = "192.168.1.0";
String[] result = target.split("\\.");
// ["192", "168", "1", "0"]
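
When the delimiter is not known in advance (for example, when it comes from a configuration value), escaping by hand is error-prone. One alternative is Pattern.quote from the standard library, which turns any string into a pattern that matches it literally. The sketch below assumes a hypothetical helper named splitLiteral; it is not part of the original article.

```java
import java.util.regex.Pattern;

final class LiteralSplitExample {

  // Hypothetical helper: treats the delimiter as plain text rather than as a regular expression.
  static String[] splitLiteral(String target, String delimiter) {
    return target.split(Pattern.quote(delimiter));
  }

  public static void main(String[] args) {
    // Prints 192, 168, 1 and 0 on separate lines.
    for (String part : splitLiteral("192.168.1.0", ".")) {
      System.out.println(part);
    }
  }
}
```

The same helper also works for other metacharacters such as the pipe (|), which would otherwise be read as the regex alternation operator.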

By leveraging the fact that the split method processes regular expressions, you can implement the following logic.

String target = "apple,melon, banana, grape";
String[] result = target.split(",\\s*");
// ["apple", "melon", "banana", "grape"]

\s is a regular expression that matches any whitespace character (space, tab, newline, and so on), and * means that the preceding element can occur zero or more times.
In other words, this splits the text by “comma followed by any amount of whitespace (if present).”

With this splitting method, the elements of the resulting array will not start with whitespace, regardless of whether there is a space after the comma. Note that whitespace before a comma is not removed by this pattern.
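
If the input may also contain whitespace before a comma, or at the very start or end of the string, the pattern can be extended. The following is a minimal sketch under that assumption; the helper name splitAndTrim is hypothetical.

```java
final class TrimmedSplitExample {

  // Hypothetical helper: splits on commas and drops whitespace on both sides of each comma.
  static String[] splitAndTrim(String target) {
    // trim() removes whitespace at the very start and end of the input;
    // \s*,\s* absorbs whitespace immediately before and after each comma.
    return target.trim().split("\\s*,\\s*");
  }

  public static void main(String[] args) {
    // Prints [apple], [melon], [banana] and [grape] on separate lines.
    for (String element : splitAndTrim(" apple , melon,banana ,  grape ")) {
      System.out.println("[" + element + "]");
    }
  }
}
```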

By default, all trailing empty elements in the array are removed

If the second argument of the split method is not specified, or if 0 is specified, the trailing empty elements are removed from the resulting array.

String target = "apple,melon,,banana,,";
// When the second argument is not specified
String[] result = target.split(",");
// ["apple", "melon", "", "banana"]

For example, if the above string is split using a comma as the delimiter, the two elements after “banana” will be removed because they are empty strings.
This behavior requires special attention when implementing processes like reading rows from a CSV file that is assumed to have a fixed number of columns.

You can avoid this issue by specifying a non-zero value for the second argument of the split method.

If a negative value is specified for the second argument, empty elements will not be removed, as shown below.

String target = "apple,melon,,banana,,";
String[] resultMinus1 = target.split(",", -1);
// ["apple", "melon", "", "banana", "", ""]
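
To illustrate the fixed-length case, here is a minimal sketch of a row parser that keeps trailing empty columns and checks the column count. The helper name parseRow is hypothetical, and the sketch assumes a simple format with no quoted fields or embedded commas; real CSV data usually calls for a dedicated parser.

```java
final class CsvRowExample {

  // Hypothetical helper: splits one comma-separated row and verifies the column count.
  // Assumes a simple format with no quoted fields or embedded commas.
  static String[] parseRow(String row, int expectedColumns) {
    String[] columns = row.split(",", -1); // -1 keeps trailing empty columns
    if (columns.length != expectedColumns) {
      throw new IllegalArgumentException(
          "Expected " + expectedColumns + " columns but found " + columns.length);
    }
    return columns;
  }

  public static void main(String[] args) {
    String[] columns = parseRow("apple,melon,,banana,,", 6);
    System.out.println(columns.length); // 6
  }
}
```

Without the -1 argument, the same row would yield only four elements and the length check would fail.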

If a positive value is specified as the second argument of the split method, the resulting array will contain at most that many elements (the delimiter is applied at most limit - 1 times).
If the array has a fixed length, it’s best to specify that value.

However, be aware that the last element will contain all remaining characters that were not split.

String target = "apple,melon,,banana,,";

String[] result4 = target.split(",", 4);
// ["apple", "melon", "", "banana,,"]

String[] result5 = target.split(",", 5);
// ["apple", "melon", "", "banana", ","]

String[] result6 = target.split(",", 6);
// ["apple", "melon", "", "banana", "", ""]

String[] result7 = target.split(",", 7);
// ["apple", "melon", "", "banana", "", ""]
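
A common use of a positive limit is splitting a key=value entry where the value itself may contain the delimiter. The sketch below assumes a hypothetical helper named splitKeyValue and an example entry made up for illustration.

```java
final class KeyValueExample {

  // Hypothetical helper: splits at the first "=" only, so the value may contain "=".
  // If the entry contains no "=", the returned array has a single element.
  static String[] splitKeyValue(String entry) {
    return entry.split("=", 2);
  }

  public static void main(String[] args) {
    String[] pair = splitKeyValue("url=https://example.com/?q=java");
    System.out.println(pair[0]); // url
    System.out.println(pair[1]); // https://example.com/?q=java
  }
}
```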

In my opinion, when using the split method to split a string, it’s generally better to specify a negative value (-1) for the second argument. There aren’t many situations where you would want to remove empty string elements.

Splitting an empty string does not return an empty array

String empty = "";
String[] array = empty.split(",");
// [""]

When you split an empty string, the result is not an empty array, but an array with a single element: an empty string at index 0.
This is because the delimiter does not match any part of the input, and in that case split returns the original string as the only element.

By the way, this behavior is not unique to Java; when a delimiter is given, the split methods in JavaScript and Python behave the same way.
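
A related case that can be just as surprising: a string that consists only of delimiters does produce an empty array, because every element is empty and trailing empty elements are removed by default. The sketch below contrasts the two cases; the helper name countElements is hypothetical.

```java
final class EmptyInputExample {

  // Hypothetical helper: returns how many elements a default comma split produces.
  static int countElements(String target) {
    return target.split(",").length;
  }

  public static void main(String[] args) {
    System.out.println(countElements(""));   // 1: no match, so the input itself is returned
    System.out.println(countElements(","));  // 0: two empty elements, both removed as trailing
    System.out.println(countElements(",,")); // 0: same reasoning with three empty elements
  }
}
```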

Splitting null throws an exception

The split method cannot be used on null; an exception will be thrown if you try to use it on a null value.

String target = null;
String[] array = target.split(",");
// NullPointerException
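
If null inputs are possible, the check has to happen before calling split. Below is a minimal sketch of a null-tolerant wrapper; the helper name splitOrEmpty is hypothetical, and mapping null to an empty list is only one possible design choice (rejecting null early with a clear error message is another).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class NullSafeSplitExample {

  // Hypothetical helper: returns an empty list for null instead of throwing.
  // A negative limit keeps trailing empty elements.
  static List<String> splitOrEmpty(String target, String regex) {
    if (target == null) {
      return Collections.emptyList();
    }
    return Arrays.asList(target.split(regex, -1));
  }

  public static void main(String[] args) {
    System.out.println(splitOrEmpty(null, ",").size());          // 0
    System.out.println(splitOrEmpty("apple,melon", ",").size()); // 2
  }
}
```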

Considering a Generic Method for Splitting

I have created a general-purpose method for splitting strings based on the above information.
I hope it will be useful to you when implementing something.

This method has the following features:

  • The return value is not an array, but a more convenient list type.
  • If the string to be split is an empty string, an empty list is returned.
  • Empty elements are not deleted in the splitting process. (A negative value is set to the second argument of split by default.)

import java.util.Arrays;
import java.util.List;

public final class CollectionsUtil {

  private CollectionsUtil() {}

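  // Splits target by delimiter (a regular expression), keeping trailing empty elements.
  public static List<String> toList(String target, String delimiter) {
    return toList(target, delimiter, -1);
  }

  // Same as above, but passes limit through to String.split.
  // Note: target must not be null, and the returned list is fixed-size.
  public static List<String> toList(String target, String delimiter, int limit) {
    if (target.isEmpty()) {
      // "".split(...) would return [""], so return an empty list instead.
      return Arrays.asList();
    }
    return Arrays.asList(target.split(delimiter, limit));
  }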
}

Results of calling the above methods

List<String> list1 = CollectionsUtil.toList("apple,melon,banana,grape", ",");
// ["apple", "melon", "banana", "grape"]

List<String> list2 = CollectionsUtil.toList("apple,melon,,banana,,", ",");
// ["apple", "melon", "", "banana", "", ""]

List<String> list3 = CollectionsUtil.toList("apple,melon,,banana,,", ",", 4);
// ["apple", "melon", "", "banana,,"]

List<String> list4 = CollectionsUtil.toList("", ",");
// []

References

String (Java Platform SE 8)

Notes

If you are using Google Guava, one of the most popular third-party libraries, you can opt to use Splitter from Google Guava instead of the split method provided by the Java standard library.
However, Splitter differs significantly from the split method, as shown below, so it’s important to read the documentation to understand how to use it properly.
(Which is easier to learn: mastering the split method or mastering Splitter…?)

import com.google.common.base.Splitter;

// ...Omitted...

String target = "apple,melon, ,banana,,";
List<String> list = Splitter.on(',')
    .trimResults()
    .omitEmptyStrings()
    .splitToList(target);
// ["apple", "melon", "banana"]

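Apache Commons Lang also provides functionality for splitting strings.
However, be aware that the split method in Apache Commons behaves differently from the one provided by the Java standard library: its second argument is a set of separator characters rather than a regular expression, and adjacent separators are treated as a single separator, so empty elements do not appear in the result.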

import org.apache.commons.lang3.StringUtils;

// ...Omitted...

String target = "apple,melon,,banana,,";

// split method in Java standard library
List<String> list1 = Arrays.asList(target.split(","));
// ["apple", "melon", "", "banana"]

// split method in Apache Commons
List<String> list2 = Arrays.asList(StringUtils.split(target, ","));
// ["apple", "melon", "banana"]
