Introduction
String Manipulation is a common and important operation performed on Strings. Appending, splitting and replacing characters are some types of String manipulation operations. The StringTokenizer class is explicitly used to split strings into two or more tokens, with a delimiter as its reference. Delimiters are characters that divide a string into two or more parts. For example, a string that contains a comma can be separated into two parts with the comma as the delimiter.
Some common delimiters are commas ( , ), semicolons ( ; ), slashes ( / \ ) and pipes ( | ). If no delimiter is specified, whitespace characters act as the delimiters. The StringTokenizer keeps track of its current position in the string internally and computes the next token from that position.
What is Java StringTokenizer?
The behavior of an instance of the StringTokenizer class depends on a boolean parameter known as the returnDelims flag.
When the flag is set to false, the delimiter characters only separate tokens. Each token returned is a maximal sequence of consecutive characters that are not delimiters.
When the flag is set to true, the delimiter characters are themselves considered tokens. Thus, a token is either a single delimiter character or a maximal sequence of consecutive characters that are not delimiters.
Class signature:
public class StringTokenizer
extends Object
implements Enumeration<Object>
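To see the effect of the returnDelims flag in practice, the sketch below tokenizes the same string twice, once with the flag set to false and once with it set to true. The class name ReturnDelimsDemo, the helper method tokens, and the sample input are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {

    // Hypothetical helper: drains a tokenizer into a list so the results are easy to compare.
    static List<String> tokens(String text, String delims, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(text, delims, returnDelims);
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "10+20-30";

        // Flag false: '+' and '-' only separate the tokens and are discarded.
        System.out.println(tokens(text, "+-", false)); // [10, 20, 30]

        // Flag true: each delimiter character comes back as its own one-character token.
        System.out.println(tokens(text, "+-", true));  // [10, +, 20, -, 30]
    }
}
```

Setting the flag to true is mainly useful when the delimiters carry meaning, such as the operators in an arithmetic expression.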
Why Was StringTokenizer Introduced in Java?
The StringTokenizer class was introduced in early versions of Java to solve a common problem: splitting a string into smaller parts (tokens) based on delimiters like spaces, commas, or custom characters. Before the String.split() method was added in Java 1.4, there was no simple built-in way to parse strings. StringTokenizer offered a quick and easy way to break down text, making it useful for tasks like reading command-line arguments, processing user input, or parsing structured data formats.
It belongs to the java.util package and became a standard utility in legacy Java code. Although it is now considered outdated and has been largely replaced by split() and Scanner, it played an important role in early Java development.
Constructors of the StringTokenizer Class in Java
The StringTokenizer class in Java has three constructors:
1. StringTokenizer(String str)
It constructs a string tokenizer with the default delimiter set for the given string.
The default delimiters are the space ( ), tab (\t), newline (\n), carriage-return (\r), and form-feed (\f) characters.
The delimiters are not counted as tokens.
public StringTokenizer(String str)
2. StringTokenizer(String str, String delim)
It constructs a string tokenizer of the specified string with the characters present in the delim variable as the delimiters.
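If delim is null, no exception is thrown when the string tokenizer is created. However, invoking other methods on the resulting string tokenizer may throw a NullPointerException.
3. StringTokenizer(String str, String delim, boolean returnDelims)
It constructs a string tokenizer of the specified string with the characters present in delim as the delimiters.
If returnDelims is true, the delimiter characters are also returned as tokens, each as a string of length one. If it is false, the delimiters are skipped and only separate the tokens.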
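The null-delimiter behavior of the two-argument constructor can be checked with a small sketch like the one below. The class name NullDelimiterDemo and the helper failsOnUse are hypothetical. The Javadoc only says that later calls "may" fail, so treat the exception on the first call as the behavior expected on OpenJDK-based runtimes rather than a guarantee.

```java
import java.util.StringTokenizer;

class NullDelimiterDemo {

    // Hypothetical helper: reports whether using a tokenizer built with a null
    // delimiter string fails with a NullPointerException.
    static boolean failsOnUse(String text) {
        // Construction itself succeeds even though delim is null.
        StringTokenizer st = new StringTokenizer(text, null);
        try {
            st.hasMoreTokens(); // first call that actually needs the delimiter set
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NullPointerException on first use: " + failsOnUse("a b c"));
    }
}
```

Because the failure shows up away from the constructor call, it is usually safer to validate the delimiter string before building the tokenizer.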
Examples of Constructors of String Tokenizer in Java
In Java, StringTokenizer is a utility class used to split strings into tokens. The class provides multiple constructors to customize how strings are tokenized. Here are examples of constructors of StringTokenizer with code implementations and their output:
Constructor with Only String Input
This constructor tokenizes a string using default delimiters (space, tab, newline, etc.).
Syntax:
StringTokenizer(String str)
Code Implementation:
import java.util.StringTokenizer;

public class TokenizerExample1 {
    public static void main(String[] args) {
        String str = "Java String Tokenizer Example";
        StringTokenizer tokenizer = new StringTokenizer(str);
        System.out.println("Tokens:");
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
Methods of the StringTokenizer Class in Java
Method | Return type | Description
hasMoreTokens() | boolean | It checks if any more tokens are available from the tokenizer's string.
hasMoreElements() | boolean | It returns the same value as the hasMoreTokens() method.
nextElement() | Object | It returns the same value as the nextToken() method, but typed as Object rather than String.
nextToken() | String | It returns the next token from the tokenizer as a string.
nextToken(String delim) | String | It switches to the new delimiter set delim, which stays in effect for later calls, and returns the next token as a string.
countTokens() | int | It returns the number of tokens remaining in the string. In other words, it counts the number of times the nextToken() method can be called before an exception is generated.
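As a rough sketch of how these methods interact, the example below watches countTokens() shrink as tokens are consumed and shows what happens when nextToken() is called on an exhausted tokenizer. The class name TokenizerMethodsDemo, the helper remainingAfter, and the sample string are assumptions made for this illustration.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class TokenizerMethodsDemo {

    // Hypothetical helper: value of countTokens() after consuming `consumed` tokens.
    static int remainingAfter(String text, int consumed) {
        StringTokenizer st = new StringTokenizer(text);
        for (int i = 0; i < consumed; i++) {
            st.nextToken();
        }
        return st.countTokens();
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("red green blue");

        System.out.println(st.countTokens()); // 3: nothing consumed yet
        String first = st.nextToken();        // "red"
        Object second = st.nextElement();     // "green", typed as Object
        System.out.println(first + " " + second);
        System.out.println(st.countTokens()); // 1: only "blue" is left

        st.nextToken();                       // consumes "blue"
        try {
            st.nextToken();                   // no tokens left
        } catch (NoSuchElementException e) {
            System.out.println("No more tokens");
        }
    }
}
```

Note that countTokens() does not advance the tokenizer, so it can be called at any point without losing tokens.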
Examples of Java String Tokenizer Methods
The StringTokenizer class in Java provides methods to break a string into tokens. Here’s an overview of some commonly used StringTokenizer methods with code examples and outputs.
1. hasMoreTokens()
This method checks if there are more tokens available in the string.
Code Implementation:
import java.util.StringTokenizer;

public class HasMoreTokensExample {
    public static void main(String[] args) {
        String str = "Java String Tokenizer Example";
        StringTokenizer tokenizer = new StringTokenizer(str);
        System.out.println("Tokens:");
        while (tokenizer.hasMoreTokens()) {
            System.out.println(tokenizer.nextToken());
        }
    }
}
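The following program uses all three constructors together with the countTokens(), nextElement() and nextToken(String delim) methods:
import java.util.StringTokenizer;

public class Main {
    public static void main(String[] args) {
        String str = "The StringBuilder class in Java";
        System.out.println("String => " + str);

        // Constructor 1: default (whitespace) delimiters
        System.out.println("\nConstructor1: ");
        StringTokenizer st1 = new StringTokenizer(str);
        System.out.println("Total number of tokens with space as the delimiter: " + st1.countTokens());
        while (st1.hasMoreTokens())
            System.out.println(st1.nextToken());

        // Constructor 2: each character of " in " (space, 'i', 'n') is a separate delimiter
        System.out.println("\nConstructor2: ");
        StringTokenizer st2 = new StringTokenizer(str, " in ");
        while (st2.hasMoreTokens())
            System.out.println(st2.nextElement());

        // Constructor 3: returnDelims is true, so delimiters come back as tokens;
        // nextToken("in") switches the delimiter set to 'i' and 'n' from the first call on
        System.out.println("\nConstructor3: ");
        StringTokenizer st3 = new StringTokenizer(str, " in ", true);
        while (st3.hasMoreElements())
            System.out.println(st3.nextToken("in"));
    }
}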
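Output:
String => The StringBuilder class in Java
Constructor1:
Total number of tokens with space as the delimiter: 5
The
StringBuilder
class
in
Java
Constructor2:
The
Str
gBu
lder
class
Java
Constructor3:
The Str
i
n
gBu
i
lder class
i
n
Java
In the second case, the delimiter string " in " is not matched as a whole. The space, i and n each act as a delimiter, which is why StringBuilder is broken into Str, gBu and lder. In the third case, the call nextToken("in") replaces the delimiter set with i and n, so spaces stay inside the tokens and each delimiter is returned as a token of its own.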
Difference between the StringTokenizer and the split method
StringTokenizer is a legacy class that breaks strings into two or more tokens, while the split method splits a string according to the matches of regular expressions.
The StringTokenizer returns one substring at a time, while the split method returns an array of separated character sequences.
StringTokenizer, as a class, takes its delimiter characters through its constructors, while split is a method of the String class that takes the delimiter as a regular expression argument.
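The differences listed above can be sketched side by side. In the example below, split() uses a regular expression that also swallows the spaces around each comma, while StringTokenizer only knows the comma character and leaves the spaces in place. The class name SplitVsTokenizerDemo, the helper methods, and the sample input are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class SplitVsTokenizerDemo {

    // split() takes a regular expression and returns every piece at once as an array.
    static List<String> withSplit(String text) {
        return Arrays.asList(text.split("\\s*,\\s*"));
    }

    // StringTokenizer takes plain delimiter characters and hands out one token per call.
    static List<String> withTokenizer(String text) {
        StringTokenizer st = new StringTokenizer(text, ",");
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "red , green,blue";
        System.out.println(withSplit(text)); // [red, green, blue]

        // The tokens keep the surrounding spaces: "red ", " green", "blue"
        for (String token : withTokenizer(text)) {
            System.out.println("\"" + token + "\"");
        }
    }
}
```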
When to Use StringTokenizer
While largely outdated, StringTokenizer can still be useful in limited scenarios. It’s suitable for quick and simple token parsing when you don’t need the power of regular expressions or the overhead of using Scanner. For example, if you need to split a basic input string using a single-character delimiter, StringTokenizer offers a lightweight solution. It's also relevant in maintaining legacy Java applications where refactoring code may not be feasible or necessary. However, for most modern applications, developers prefer String.split() or Scanner.
Real-Life Use Cases
1. Parsing CSV-Like Input
Suppose you have a line from a CSV file:
String line = "Rahul,23,Engineer";
StringTokenizer st = new StringTokenizer(line, ",");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
This will print each value separately. While not suitable for complex CSVs (with quotes or escaped characters), it's quick for simple, flat data parsing.
2. Tokenizing Command-Line Arguments
For command-line apps that accept user input in a single line, StringTokenizer helps split the input into commands and parameters:
String input = "copy file1.txt file2.txt";
StringTokenizer st = new StringTokenizer(input, " ");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
This prints each word individually, allowing you to process commands and arguments easily.
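One possible way to go a step further is to treat the first token as the command name and the remaining tokens as its arguments. The sketch below assumes whitespace-separated input with no quoting. The class name CommandLineDemo and the helpers commandOf and argumentsOf are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class CommandLineDemo {

    // Hypothetical helper: returns the first token, or an empty string for blank input.
    static String commandOf(String input) {
        StringTokenizer st = new StringTokenizer(input);
        return st.hasMoreTokens() ? st.nextToken() : "";
    }

    // Hypothetical helper: returns every token after the first one.
    static List<String> argumentsOf(String input) {
        StringTokenizer st = new StringTokenizer(input);
        List<String> arguments = new ArrayList<>();
        if (st.hasMoreTokens()) {
            st.nextToken(); // skip the command name
        }
        while (st.hasMoreTokens()) {
            arguments.add(st.nextToken());
        }
        return arguments;
    }

    public static void main(String[] args) {
        String input = "copy file1.txt file2.txt";
        System.out.println("Command:   " + commandOf(input));   // copy
        System.out.println("Arguments: " + argumentsOf(input)); // [file1.txt, file2.txt]
    }
}
```

Since StringTokenizer skips runs of delimiters, extra spaces between the words do not produce empty arguments.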
Advantages and Limitations of StringTokenizer
Advantages
Simple to Use: Ideal for basic string splitting using fixed delimiters.
Lightweight: No complex setup or regex required.
Legacy Compatibility: Still found in older Java code and quick scripts.
Limitations
No Regex Support: Cannot handle complex patterns or conditions like String.split().
Outdated: Considered obsolete; newer classes like Scanner and String.split() are preferred.
Limited Flexibility: Doesn't handle edge cases like empty tokens or quoted strings well.
Frequently Asked Questions
Why should I use StringTokenizer?
StringTokenizer is useful for quickly breaking down a string into tokens based on delimiters without needing to implement complex parsing logic. It's simple and efficient for processing basic string splitting in Java.
Is StringTokenizer thread-safe?
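No. The methods of StringTokenizer are not synchronized, and each instance keeps mutable state (its current position in the string), so a single tokenizer should not be shared between threads without external synchronization. Creating a separate tokenizer per thread is safe. For new code, String.split() or the classes in java.util.regex are generally recommended for their flexibility.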
Can StringTokenizer handle multiple delimiters at once?
StringTokenizer can handle multiple delimiters but treats each delimiter character independently. It cannot interpret multiple-character delimiters or distinguish between delimiter sequences without additional logic.
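A short sketch of this limitation: passing "::" to StringTokenizer does not create a two-character delimiter, it only names the colon character (twice), whereas split("::") matches the pair as a unit. The class name MultiCharDelimiterDemo, its helper methods, and the sample input are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class MultiCharDelimiterDemo {

    // Every character in delims is an independent delimiter.
    static List<String> withTokenizer(String text, String delims) {
        StringTokenizer st = new StringTokenizer(text, delims);
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    // split() treats the whole pattern as one delimiter.
    static List<String> withSplit(String text, String regex) {
        return Arrays.asList(text.split(regex));
    }

    public static void main(String[] args) {
        String text = "a::b:c";
        System.out.println(withTokenizer(text, "::")); // [a, b, c]
        System.out.println(withSplit(text, "::"));     // [a, b:c]
    }
}
```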
What happens to empty tokens in StringTokenizer?
StringTokenizer ignores empty tokens between delimiters, so sequences like "a,,b" will tokenize as ["a", "b"]. If empty tokens are needed, consider using String.split() or java.util.Scanner.
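The "a,,b" case from this answer can be checked with a sketch like the following, which counts the fields each approach reports. The class name EmptyTokenDemo and its helper methods are hypothetical.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class EmptyTokenDemo {

    // Hypothetical helper: number of tokens StringTokenizer finds with ',' as the delimiter.
    static int tokenizerFieldCount(String text) {
        return new StringTokenizer(text, ",").countTokens();
    }

    // Hypothetical helper: fields produced by String.split on ','.
    static String[] splitFields(String text) {
        return text.split(",");
    }

    public static void main(String[] args) {
        String text = "a,,b";

        // The two commas are skipped as one gap, so the empty field disappears.
        System.out.println(tokenizerFieldCount(text)); // 2

        // split() keeps the empty string between the two commas.
        String[] parts = splitFields(text);
        System.out.println(parts.length);              // 3
        System.out.println(Arrays.toString(parts));    // [a, , b]
    }
}
```

Keep in mind that split(",") still drops trailing empty strings by default; passing a negative limit, as in split(",", -1), keeps them.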
Conclusion
The use of the StringTokenizer class is discouraged in new code because it is not as flexible and robust as the split method of the String class or the java.util.regex package. However, it is still in use for compatibility reasons and for its execution speed. This blog explained the StringTokenizer class in Java and briefly discussed its constructors and methods along with their implementation.