Table of contents
1. Introduction
2. What is a Token in Java?
3. Types of Java Tokens
   3.1. Keywords
   3.2. Identifiers
   3.3. Literals
   3.4. Special Symbol
   3.5. Operators
   3.6. Comments
   3.7. Separators
4. What is tokenization in Java?
5. Why are Java Tokens Important?
6. Where are these tokens used?
7. Frequently Asked Questions
   7.1. What are the Java tokens?
   7.2. What are the tokens and variables in Java?
   7.3. Is Punctuator a token in Java?
   7.4. Can Java tokens span across multiple lines?
   7.5. How are Java tokens generated by the compiler?
8. Conclusion
Last Updated: Dec 16, 2024

Java Tokens

Author Manish Kumar

Introduction

In Java, tokens are the smallest elements of a program that the compiler recognizes as meaningful. These building blocks form the foundation of every Java program, as they define the structure and syntax. Whether you're writing a simple "Hello, World!" program or a complex application, Java tokens play a crucial role in ensuring your code is syntactically correct.

Java tokens are categorized into keywords, identifiers, literals, operators, separators, and special symbols. Understanding these tokens is essential for beginners and experienced developers alike, as they are key to writing clear and error-free code.


What is a Token in Java?

Tokens are the smallest units of a Java program. The compiler breaks each line of code down into tokens before it analyzes the program any further. At a high level, any Java code comprises classes and methods; at the lowest level, those classes and methods are themselves made up of tokens.

Let's see an example to understand tokens.

// program to understand java tokens

class HelloToken {
    public static void main(String[] args) {
        System.out.println("Hello Ninja!!"); 
    }
}

 

In the above program, class, HelloToken, {, public, static, void, main, (, String, [, ], args, ), System, ., out, println, "Hello Ninja!!", ;, and } are all tokens.


Types of Java Tokens

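The Java Language Specification defines five kinds of tokens: keywords, identifiers, literals, operators, and separators. The classification is based on the role each plays; some define names, and others perform operations such as arithmetic. This article also covers special symbols, which overlap with separators and operators, and comments, which the compiler discards rather than treating as tokens. The sections below therefore cover: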

  1. Keywords
  2. Identifiers
  3. Literals/Constants
  4. Special Symbols
  5. Operators
  6. Comments
  7. Separators

1. Keywords

Keywords are reserved words with a predefined meaning in the language. As a result, they cannot be used as names of variables, methods, or classes. Keywords are case-sensitive and always written in lowercase. Java has the following keywords:

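abstract      assert         boolean      break        byte
case          catch          char         class        const*
continue      default        do           double       else
enum          extends        final        finally      float
for           goto*          if           implements   import
instanceof    int            interface    long         native
new           package        private      protected    public
return        short          static       strictfp     super
switch        synchronized   this         throw        throws
transient     try            void         volatile     while

* const and goto are reserved but not currently used. Note that null, true, and false are literals rather than keywords, although they cannot be used as identifiers either.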

2. Identifiers

A method name, class name, variable name, or label is an identifier in Java. The user typically defines these. The identifier names cannot be the same as any reserved keyword. Let's see an example to understand identifiers:

public class Test
{
    public static void main(String[] args)
    {
        int num = 10;
    }
}

 

Identifiers present in the above program are:

Test: The name of the class.

main: The name of a method.

String: A predefined class name.

args: A variable name.

num: A variable name.
 

There are rules for naming identifiers.

  • The characters commonly used are [A-Z], [a-z], [0-9], _ and $. Java also accepts other Unicode letters and digits.
  • Identifiers are case-sensitive. That is, “ninja” is not the same as “NINJA”.
  • Identifier names cannot start with a digit. For example, “007IamNinja” is an invalid identifier.
  • Whitespace is not allowed inside an identifier.
  • Keywords can’t be used as identifiers.
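The rules above can be checked programmatically. The following is a minimal sketch, not a full validator: the class name IdentifierCheck and the method isValidIdentifier are hypothetical, and it relies on the standard Character.isJavaIdentifierStart, Character.isJavaIdentifierPart, and javax.lang.model.SourceVersion.isKeyword methods. It looks at one char at a time, so it does not handle supplementary Unicode characters.

```java
import javax.lang.model.SourceVersion;

class IdentifierCheck {
    // Hypothetical helper: returns true if s follows the identifier rules listed above
    static boolean isValidIdentifier(String s) {
        // Must be non-empty and must not start with a digit or other illegal character
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        // Every remaining character must be a letter, digit, '_' or '$' (no whitespace)
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        // Keywords (and the literals true, false, null) are not usable as identifiers
        return !SourceVersion.isKeyword(s);
    }

    public static void main(String[] args) {
        String[] candidates = {"ninja", "NINJA", "007IamNinja", "my name", "class", "$value"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isValidIdentifier(candidate));
        }
    }
}
```

Here "ninja", "NINJA", and "$value" are accepted, while "007IamNinja", "my name", and "class" are rejected.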

3. Literals

Literals represent fixed values written directly in the source code, such as 22 or "Coding Ninjas". Unlike a variable, the value of a literal cannot change. Literals can be classified as integer literals, string literals, boolean literals, and so on. They are often used to initialize constants.

Syntax to define a constant from a literal:

final data_type variable_name = literal;

The five most common types of literals in Java are:

  • Integer
  • Floating Point
  • Boolean
  • Character
  • String

Type      Example
int       22
double    25.08
boolean   true
char      'N'
String    "Coding Ninjas"
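As a small illustration, the sketch below declares one constant per literal type, reusing the example values from this section. The class and field names are hypothetical. It also shows that an integer literal can be written in hexadecimal or binary, and that underscores may be placed between digits.

```java
class LiteralDemo {
    static final int COUNT = 22;                 // integer literal (decimal)
    static final int COUNT_HEX = 0x16;           // the same value as a hexadecimal literal
    static final int COUNT_BIN = 0b10110;        // the same value as a binary literal
    static final long MILLION = 1_000_000L;      // underscores between digits, L suffix for long
    static final double PRICE = 25.08;           // floating-point literal
    static final boolean ACTIVE = true;          // boolean literal (lowercase)
    static final char INITIAL = 'N';             // character literal in single quotes
    static final String NAME = "Coding Ninjas";  // string literal in double quotes

    public static void main(String[] args) {
        // All three integer literals denote the same value
        System.out.println(COUNT == COUNT_HEX && COUNT == COUNT_BIN);
        System.out.println(NAME + " " + INITIAL + " " + PRICE + " " + ACTIVE + " " + MILLION);
    }
}
```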

4. Special Symbol

Special symbols are characters with a fixed meaning in the language. Many of them separate Java tokens from one another, and they are sometimes called punctuators. Because each has a special meaning, it cannot be used for anything else.

  • Brackets []: These are used to declare arrays and to write single and multi-dimensional subscripts.
  • Braces {}: These mark the start and end of code blocks and array initializers.
  • Parentheses (): Used for method calls and parameter lists.
  • Comma (,): Separates items in a list, for example the parameters of a method.
  • Semicolon (;): Terminates a statement.
  • Asterisk (*): Java has no pointer variables. The asterisk is the multiplication operator and also serves as the wildcard in import statements such as import java.util.*;
  • Assignment operator (=): It assigns values to variables. For example, in a = 10 we are giving the value 10 to the variable 'a'.
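To see several of these symbols together, here is a short annotated sketch. The class SymbolDemo and its sum method are hypothetical names chosen for illustration.

```java
class SymbolDemo {                                   // braces open and close the class body
    static int sum(int[] values) {                   // parentheses hold the parameter list; [] marks an array type
        int total = 0;                               // '=' assigns a value; ';' ends the statement
        for (int i = 0; i < values.length; i++) {    // semicolons separate the three parts of the for header
            total += values[i];                      // brackets index into the array
        }
        return total;
    }

    public static void main(String[] args) {
        int[] scores = {10, 20, 30};                 // commas separate the array elements
        System.out.println(sum(scores));             // the dot selects a member; parentheses call the method
    }
}
```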

5. Operators

As the name suggests, operators perform operations on one or more operands. Whenever the compiler sees an operator, it reads it as a single token and proceeds further. Java groups its operators by functionality. In total, there are eight types of operators in Java.

Let's see these operators along with their examples.

Operator     Examples
Arithmetic   +, -, *, /, %
Assignment   =, +=, -=, *=, /=
Unary        ++, --, !
Logical      &&, ||
Relational   ==, !=, >=, <=, >, <
Ternary      condition ? expr1 : expr2
Bitwise      &, |, ~, ^
Shift        >>, <<, >>>
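The sketch below exercises one operator from most of these groups. The class OperatorDemo and its helper methods inRange and max are hypothetical names used only for this illustration.

```java
class OperatorDemo {
    // Relational (>=, <=) and logical (&&) operators combined
    static boolean inRange(int x, int low, int high) {
        return x >= low && x <= high;
    }

    // Ternary operator: picks one of two values based on a condition
    static int max(int a, int b) {
        return a > b ? a : b;
    }

    public static void main(String[] args) {
        int a = 10, b = 3;
        System.out.println(a % b);               // arithmetic: remainder of 10 / 3
        System.out.println(a & b);               // bitwise AND
        System.out.println(a << 1);              // shift left by one bit
        System.out.println(inRange(a, 0, 20));   // relational and logical
        System.out.println(max(a, b));           // ternary
        a += 5;                                  // compound assignment
        a++;                                     // unary increment
        System.out.println(a);
    }
}
```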

6. Comments

Java supports three types of comments. Strictly speaking, the compiler discards comments during lexical analysis rather than treating them as tokens, but they are covered here because they appear alongside tokens in source code:

  1. Single-line comments: These start with // and run to the end of the line. They are used to comment a single line of code.
  2. Multi-line comments: These start with /* and end with */. They are used when a comment needs to span multiple lines.
  3. Javadoc comments: These start with /** and end with */. They document Java code, and the javadoc tool processes them to generate HTML documentation.
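All three comment styles can appear in the same file. In this sketch, the class CommentDemo and its square method are hypothetical names.

```java
/**
 * Javadoc comment: describes the class for the javadoc tool.
 */
class CommentDemo {
    /**
     * Returns the square of n.
     *
     * @param n the value to square
     * @return n multiplied by itself
     */
    static int square(int n) {
        // Single-line comment: everything up to the end of this line is ignored
        /* Multi-line comment:
           ignored by the compiler, however many lines it spans */
        return n * n;
    }

    public static void main(String[] args) {
        System.out.println(square(4));
    }
}
```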

7. Separators

Separators help to define the structure of a program by separating different parts of the code. There are various separators in Java. The most commonly used separator is the semicolon (;).

Other separators include the comma, the period, parentheses, braces, and brackets.

What is tokenization in Java?

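Tokenization is the process of splitting text into tokens. The compiler tokenizes source code itself; for ordinary strings at runtime, Java has a built-in class named java.util.StringTokenizer that lets an application break a string into tokens at delimiter characters. Another class, java.io.StreamTokenizer, can additionally recognize numbers, quoted strings, and comments, but is slightly more complex to use.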

Let's discuss the features of the StringTokenizer class:

  • It does not distinguish between identifiers, numbers, and quoted strings.
  • It does not recognize or skip comments.
  • It maintains a current position within the string being tokenized.
  • Each token is returned as a substring of the String used to create the StringTokenizer object.
  • The delimiters can be specified at creation time or on a per-token basis.
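By contrast, java.io.StreamTokenizer can tell words from numbers and can skip comments. The following is a rough sketch under stated assumptions: the class StreamTokenizerDemo, the tokenize helper, and the WORD/NUMBER/STRING/SYMBOL labels are hypothetical, and the input is a deliberately simple line. StreamTokenizer is a general-purpose tokenizer, not a Java lexer, so it will not classify real Java source accurately.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class StreamTokenizerDemo {
    // Hypothetical helper: returns each token prefixed with a label describing its kind
    static List<String> tokenize(String source) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(source));
        st.ordinaryChar('/');          // a lone '/' is an ordinary character, not a comment start
        st.slashSlashComments(true);   // recognize and skip // comments
        st.slashStarComments(true);    // recognize and skip /* ... */ comments
        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        tokens.add("WORD " + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        tokens.add("NUMBER " + st.nval);   // numbers are always parsed as double
                        break;
                    case '"':
                        tokens.add("STRING " + st.sval);   // body of a double-quoted string
                        break;
                    default:
                        tokens.add("SYMBOL " + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        tokenize("int num = 10; // count").forEach(System.out::println);
    }
}
```

For this input the sketch reports two words, a symbol, a number, and another symbol; the trailing comment is skipped.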

 

The class's behaviour is based on the returnDelims flag:

  • When the flag is false (the default), delimiter characters only separate tokens. A token is a maximal sequence of consecutive characters that are not delimiters.
  • When the flag is true, the delimiter characters are themselves considered tokens. Thus, a token is either a single delimiter character or a maximal sequence of consecutive non-delimiter characters.
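A small sketch to see the flag in action, using the three-argument StringTokenizer constructor. With returnDelims set to false the delimiters are dropped, and with true they come back as tokens of their own. The class ReturnDelimsDemo and the helper split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {
    // Hypothetical helper: collects every token produced for the given flag value
    static List<String> split(String text, String delimiters, boolean returnDelims) {
        StringTokenizer tokenizer = new StringTokenizer(text, delimiters, returnDelims);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("a+b-c", "+-", false)); // prints [a, b, c]
        System.out.println(split("a+b-c", "+-", true));  // prints [a, +, b, -, c]
    }
}
```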

Example of tokenization in Java

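import java.util.StringTokenizer;

// Split on the default delimiters (whitespace) and print each token
StringTokenizer ninja = new StringTokenizer("Coding Ninjas is Best");
while (ninja.hasMoreTokens()) {
    System.out.println(ninja.nextToken());
}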

 

Output

Coding
Ninjas
is
Best

 

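Note: StringTokenizer is a legacy class. It is not formally deprecated, but it is retained only for compatibility reasons and its use is discouraged in new code; String.split or the java.util.regex package is recommended instead.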
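For new code, the same split can be written with String.split and a whitespace regular expression. This is a minimal sketch; the class SplitDemo and the method words are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class SplitDemo {
    // Hypothetical helper: splits text on runs of whitespace
    static List<String> words(String text) {
        // trim() avoids an empty first element when the text starts with whitespace
        return Arrays.asList(text.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Prints Coding, Ninjas, is, Best on separate lines
        words("Coding Ninjas is Best").forEach(System.out::println);
    }
}
```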


Why are Java Tokens Important?

Java Tokens are essential because of the following reasons:

  • Tokens are the fundamental components from which all Java code is built.

  • Well-chosen identifiers and consistent use of tokens improve the readability and maintainability of the code.

  • They let the compiler parse and understand the code correctly.

  • The compiler's lexical analyzer recognises tokens using whitespace and separators as boundaries, and later compilation stages turn the resulting program into bytecode.
     

Where are these tokens used?

These tokens are used in the source code of every program. Each token must be placed at the correct position so that it gives the code its intended meaning. The placement of these tokens is critical because their order defines the structure of the program that the compiler translates into bytecode.

Example:

class HelloToken {
    public static void main(String[] args) {
        System.out.println("Hello Ninja!!"); 
    }
}

 

In the above example, the tokens are meaningfully placed in their correct places. For example, if "main" comes before "void", then it will be a syntactical error, and the code will not compile.

Frequently Asked Questions

What are the Java tokens?

Java tokens are the basic building blocks of a Java program. They are the smallest units the compiler recognizes, and they include keywords, identifiers, literals, operators, and separators. Separators are symbols that separate different parts of a program. Comments are written alongside tokens but are discarded by the compiler.

What are the tokens and variables in Java?

In Java, tokens are the smallest individual units of a program, such as keywords, identifiers, literals, and operators. Variables, on the other hand, are named storage locations that hold data and can be manipulated within a program.

Is Punctuator a token in Java?

Yes, punctuators (also called separators) are tokens in Java. They include symbols like {, }, ;, ,, (), and [], which define code structure, separate statements, and mark blocks in Java programs.

Can Java tokens span across multiple lines?

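Most Java tokens cannot span multiple lines: keywords, identifiers, operators, and ordinary string literals must each be written as a single, contiguous unit on one line. The exception is the text block, a string literal delimited by """ and available since Java 15, which can span several lines. Comments can also span multiple lines, but the compiler discards them rather than treating them as tokens.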

How are Java tokens generated by the compiler?

The compiler uses a lexical analyzer to scan the source code, splitting it into recognizable tokens like keywords, operators, identifiers, and literals. These tokens are then processed during syntax and semantic analysis to ensure code correctness.
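To make the idea of a lexical analyzer concrete, here is a deliberately tiny sketch built on java.util.regex. It is not how javac is implemented: the class MiniLexer, its tokenize method, the small keyword subset, and the simplified patterns are all assumptions made for illustration. It handles only integer literals, simple string literals, and line comments, and it treats any run of operator characters as one operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniLexer {
    // Tiny keyword subset, for illustration only
    private static final Set<String> KEYWORDS =
            Set.of("class", "public", "static", "void", "int", "return");

    // Alternatives are tried left to right at the current position
    private static final Pattern TOKEN = Pattern.compile(
            "\\s+|//[^\\n]*"                          // whitespace and line comments: skipped
            + "|(?<word>[A-Za-z_$][A-Za-z0-9_$]*)"    // keyword or identifier
            + "|(?<number>\\d+)"                      // integer literal
            + "|(?<string>\"[^\"\\n]*\")"             // simple string literal without escapes
            + "|(?<separator>[(){}\\[\\];,.])"        // separators
            + "|(?<operator>[=+\\-*/%<>!&|^~?:]+)");  // run of operator characters

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at index " + pos);
            }
            String text = m.group();
            if (m.group("word") != null) {
                tokens.add((KEYWORDS.contains(text) ? "KEYWORD " : "IDENTIFIER ") + text);
            } else if (m.group("number") != null || m.group("string") != null) {
                tokens.add("LITERAL " + text);
            } else if (m.group("separator") != null) {
                tokens.add("SEPARATOR " + text);
            } else if (m.group("operator") != null) {
                tokens.add("OPERATOR " + text);
            }
            // Otherwise the match was whitespace or a comment, which is not a token
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        tokenize("int num = 10; // count").forEach(System.out::println);
    }
}
```

Running the sketch on int num = 10; // count yields a keyword, an identifier, an operator, a literal, and a separator, with the whitespace and the comment dropped.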

Conclusion

We discussed Java tokens, their types, and their usage. Understanding tokens is essential to writing syntactically correct, error-free code. We also saw that ordinary strings can be split into tokens at runtime using Java's StringTokenizer class.

We hope this blog has helped you.
