Remember the days when you used to read about parts of speech, such as verbs and adjectives, in your high school English class? These are the basic building blocks of the English language. Without them, there would be no structure or order. Tokens serve a similar function in any programming language. You might be familiar with terms such as int, bool, public, private, and so on. What exactly are these? How do compilers separate them from each other?
Stay with us, and this article will solve all such queries. In this blog, we will explore Java Tokens in detail and understand the types of Java tokens.
What are Tokens?
Tokens are the basic building blocks of any source code. They include keywords, operators, constants, and other syntactic elements specific to the language. The Java compiler identifies these tokens with the help of delimiters and then translates the program into bytecode. Now, you must be wondering what these delimiters are and what bytecode is.
Delimiters
Most languages treat spaces, tabs, newlines, carriage returns, and form feed characters as delimiters. These delimiters are not part of any token, but they help the compiler tell where one token ends and the next begins.
Bytecode
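Bytecode is the intermediate code that the Java compiler produces from source code. The Java Virtual Machine (JVM) then interprets it, or compiles it further, into machine code that the computer understands. Bytecode consists of compact numeric codes, references, and constants that encode the result of compilation.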
What is a Token in Java?
The concept of tokens is central to the functioning of Java. When you look closely at any Java code, you will find that it comprises classes and methods. But what are these methods made up of? Yes, you guessed it right, these are the tokens.
Tokens are the smallest units of a Java program. The compiler breaks down the lines of code into tokens. Let's see an example to understand tokens.
// program to understand java tokens
class HelloToken {
    public static void main(String[] args) {
        System.out.println("Hello Ninja!!");
    }
}
In the above program, “class, HelloToken, {, static, void, main, (, String, args, [, ], ), System, ., out, println” etc. are tokens.
There are five types of Java tokens: keywords, identifiers, literals, operators and separators. The classification is based on the role each token plays: for example, identifiers name things, while operators perform operations such as arithmetic.
Keywords
Identifiers
Literals
Separators (special symbols)
Operators
Keywords
Keywords are reserved words in a programming language. They indicate predefined terms and actions in a program. As a result, these words cannot be used as names of variables, methods, or classes. Keywords are case-sensitive and always written in lowercase. Java has the following keywords:
abstract
assert
boolean
break
byte
case
catch
char
class
continue
default
do
double
else
enum
extends
final
finally
float
for
if
implements
import
instanceof
int
interface
long
native
new
null (strictly a reserved literal rather than a keyword)
package
private
protected
public
return
short
static
strictfp
super
switch
synchronized
this
throw
throws
transient
try
void
volatile
while
const and goto (reserved, but not currently used)
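As a quick check of the list above, the following sketch asks the JDK itself whether a word is reserved. It assumes the standard javax.lang.model.SourceVersion class is available (it ships with the JDK in the java.compiler module). The class name KeywordCheck and the helper isReserved are hypothetical names chosen for illustration. Note that SourceVersion.isKeyword also reports the literals true, false, and null as reserved.

```java
import javax.lang.model.SourceVersion;

public class KeywordCheck {

    // Hypothetical helper: true if the word is reserved in the latest Java version.
    static boolean isReserved(String word) {
        return SourceVersion.isKeyword(word);
    }

    public static void main(String[] args) {
        String[] candidates = {"class", "Class", "synchronized", "goto", "ninja"};
        for (String word : candidates) {
            System.out.println(word + " reserved? " + isReserved(word));
        }
    }
}
```

Because keywords are case-sensitive, class is reserved while Class is an ordinary identifier.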
Identifiers
A method name, class name, variable name, or label is an identifier in Java. The user typically defines these. The identifier names cannot be the same as any reserved keyword. Let's see an example to understand identifiers:
public class Test
{
    public static void main(String[] args)
    {
        int num = 10;
    }
}
Identifiers present in the above program are:
Test: The name of the class.
main: The name of a method.
String: A predefined class name.
args: A variable name.
num: A variable name.
There are rules for naming identifiers.
The characters allowed are [A-Z], [a-z], [0-9], _ and $.
Identifiers are case-sensitive. That is, “ninja” is not the same as “NINJA”.
Identifier names should not start with a digit. For example, “007IamNinja” is an invalid identifier.
Whitespace is not allowed inside an identifier.
Keywords can’t be used as an identifier.
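The rules above can be sketched as a small validator. This is only an illustration: IdentifierCheck and isValidIdentifier are hypothetical names, and the check relies on the standard Character.isJavaIdentifierStart, Character.isJavaIdentifierPart, and SourceVersion.isKeyword methods. Those methods also accept Unicode letters beyond the ASCII ranges listed above, so the sketch is slightly more permissive than the list.

```java
import javax.lang.model.SourceVersion;

public class IdentifierCheck {

    // Hypothetical helper: applies the naming rules to a candidate name.
    static boolean isValidIdentifier(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // The first character may not be a digit.
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        // Remaining characters: letters, digits, _ or $, and no whitespace.
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        // Reserved words cannot be used as identifiers.
        return !SourceVersion.isKeyword(name);
    }

    public static void main(String[] args) {
        String[] names = {"ninja", "NINJA", "007IamNinja", "my name", "class", "_count"};
        for (String name : names) {
            System.out.println(name + " valid? " + isValidIdentifier(name));
        }
    }
}
```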
Literals
Literals represent fixed values written directly in source code, such as 22 or "Coding Ninjas". Unlike variables, their values cannot change. They can be classified as integer literals, string literals, boolean literals, and so on. A literal is often assigned to a final variable to define a named constant.
Syntax to define a constant from a literal:
final data_type variable_name = literal_value;
There are five types of literals in Java:
Integer
Floating Point
Boolean
Character
String
Type       Example
int        22
double     25.08
boolean    true
char       'N'
String     "Coding Ninjas"
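To see the literals in context, here is a minimal sketch that assigns one literal of each type listed above to a variable. The class name LiteralDemo, the method describe, and the variable names are hypothetical and chosen only for illustration.

```java
public class LiteralDemo {

    // Builds one line of text from a literal of each type.
    static String describe() {
        int age = 22;                    // integer literal
        double price = 25.08;            // floating-point literal
        boolean enrolled = true;         // boolean literal (lowercase in Java)
        char initial = 'N';              // character literal, in single quotes
        String name = "Coding Ninjas";   // string literal, in double quotes
        return age + " " + price + " " + enrolled + " " + initial + " " + name;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```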
Operators
As the name suggests, operators perform operations on one or more operands. Whenever the compiler sees an operator, it tokenizes it and proceeds further. Java groups its operators by functionality. In total, there are eight types of operators in Java.
Let's see these operators along with their examples.
Operator      Examples
Arithmetic    +, -, *, /, %
Assignment    =, +=, -=, *=, /=
Unary         ++, --, !
Logical       &&, ||
Relational    ==, !=, >=, <=
Ternary       condition ? expr1 : expr2
Bitwise       &, |, ~, ^
Shift         >>, <<, >>>
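The sketch below evaluates one expression from each of the eight operator categories. The class name OperatorDemo, the method evaluate, and the sample values 10 and 3 are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OperatorDemo {

    // Evaluates one expression per operator category and records the result.
    static Map<String, Object> evaluate() {
        int a = 10, b = 3;
        Map<String, Object> results = new LinkedHashMap<>();

        results.put("arithmetic a % b", a % b);        // remainder of 10 / 3

        int c = a;
        c += b;                                        // same as c = c + b
        results.put("assignment c += b", c);

        int d = a;
        d++;                                           // increments d by one
        results.put("unary d++", d);

        results.put("logical", a > b && b > 0);        // both conditions must hold
        results.put("relational a != b", a != b);
        results.put("ternary", a > b ? a : b);         // picks the larger value
        results.put("bitwise a & b", a & b);           // 1010 & 0011 in binary
        results.put("shift a << 1", a << 1);           // shifts bits left by one
        return results;
    }

    public static void main(String[] args) {
        evaluate().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```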
Separators
These are the special symbols used to separate Java tokens. These are sometimes called punctuators. The separators have special meaning and thus should not be used for anything else.
Brackets []: These are used to declare arrays and to index into single- and multi-dimensional arrays.
Braces {}: These mark the start and end of multi-line code blocks.
Parentheses (): Used for method calls and parameter lists.
Comma (,): It separates items in a list, for example the parameters of a method or the variables in a declaration.
Semicolon (;): It terminates a statement.
Asterisk (*): Java has no pointer variables. The asterisk is the multiplication operator and also appears as the wildcard in import statements, so it is not a separator in Java.
Assignment operator (=): It assigns values to variables. For example, a=10, here we are giving value 10 to variable 'a'. Strictly speaking, it is classified as an operator rather than a separator.
Comments
Java supports three types of comments. Comments are not tokens themselves, because the compiler discards them, but they appear alongside tokens in source code:
Single-line comments: These comments are denoted by // and are used to comment a single line of code.
Multi-line comments: These comments start with /* and end with */. They are used when a comment needs to span multiple lines.
Javadoc comments: These comments start with /** and end with */. They are used to document Java code, and the javadoc tool processes them to generate HTML documentation.
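The three comment styles can be seen together in the following minimal sketch. The class CommentDemo and its add method are hypothetical and exist only to give the comments something to describe.

```java
/**
 * Javadoc comment: describes the class for the generated HTML documentation.
 */
public class CommentDemo {

    /**
     * Returns the sum of two numbers.
     *
     * @param a the first number
     * @param b the second number
     * @return the sum of a and b
     */
    static int add(int a, int b) {
        // Single-line comment: everything after the slashes is ignored.
        /*
         * Multi-line comment: it can span
         * several lines.
         */
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```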
Separators
Separators help to define the structure of a class. They are used to separate different parts of the code. Java has several separators, and the most commonly used one is the semicolon (;).
Other separators include comma, parentheses, brackets, colon, and many more.
What is tokenization in Java?
Tokenization is the process of extracting tokens from text such as source code. Java has a built-in class named StringTokenizer (in the java.util package) that lets an application break a string into tokens. Another class, StreamTokenizer (in the java.io package), also exists but is slightly more complex to use.
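To give a feel for StreamTokenizer, here is a minimal sketch that labels the tokens in one line of source-like text. The class name StreamTokenizerSketch, the method tokenize, and the WORD/NUMBER/STRING/CHAR labels are hypothetical; only the StreamTokenizer calls themselves come from the JDK. It is an illustration of the idea, not how the Java compiler actually tokenizes code.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class StreamTokenizerSketch {

    // Hypothetical helper: splits source-like text into labelled tokens.
    static List<String> tokenize(String source) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(source));
        st.ordinaryChar('/');          // a lone '/' is an ordinary character
        st.slashSlashComments(true);   // skip // comments
        st.slashStarComments(true);    // skip /* ... */ comments

        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        tokens.add("WORD:" + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        tokens.add("NUMBER:" + st.nval);   // numbers are parsed as double
                        break;
                    case '"':
                        tokens.add("STRING:" + st.sval);   // body of a quoted string
                        break;
                    default:
                        tokens.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int count = 42; // the answer"));
    }
}
```

Running main prints [WORD:int, WORD:count, CHAR:=, NUMBER:42.0, CHAR:;]. The comment is skipped, and the number is reported as a double because StreamTokenizer parses all numbers that way.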
Let's discuss the features of the StringTokenizer class:
It does not distinguish between identifiers, numbers and quoted strings.
It does not recognize or skip comments.
It maintains a current position within the string to be tokenized.
A token is returned by taking a substring of the string that was used to create the StringTokenizer object.
The delimiters can be specified at creation time or on a per-token basis.
The class's behaviour depends on the returnDelims flag:
When the flag is false, delimiter characters only separate tokens. A token is a maximal sequence of consecutive characters that are not delimiters.
When the flag is true, delimiter characters are themselves returned as tokens. A token is then either a single delimiter character or a maximal sequence of consecutive characters that are not delimiters.
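The effect of the returnDelims flag is easiest to see by running the same input through both settings. In this sketch, ReturnDelimsDemo and tokens are hypothetical names; the three-argument StringTokenizer constructor is part of java.util.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ReturnDelimsDemo {

    // Hypothetical helper: collects every token the tokenizer returns.
    static List<String> tokens(String text, String delims, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(text, delims, returnDelims);
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("x=1", "=", false));  // delimiters are dropped
        System.out.println(tokens("x=1", "=", true));   // delimiters come back as tokens
    }
}
```

With the flag set to false the result is [x, 1], and with the flag set to true it is [x, =, 1].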
Example of tokenization in Java
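import java.util.StringTokenizer;

// Split on the default delimiters (whitespace) and print each token.
// These statements belong inside a method such as main.
StringTokenizer ninja = new StringTokenizer("Coding Ninjas is Best");
while (ninja.hasMoreTokens()) {
    System.out.println(ninja.nextToken());
}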
Output
Coding
Ninjas
is
Best
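Note: StringTokenizer is a legacy class. It is retained for compatibility reasons, and its use is discouraged in new code. The split method of String or the java.util.regex package is recommended instead.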
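For new code, the same result can be produced with String.split. The following is a minimal sketch, with SplitDemo and words as hypothetical names, that splits on runs of whitespace using the regular expression \s+.

```java
import java.util.Arrays;
import java.util.List;

public class SplitDemo {

    // Hypothetical helper: splits text on one or more whitespace characters.
    static List<String> words(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        for (String word : words("Coding Ninjas is Best")) {
            System.out.println(word);
        }
    }
}
```

This prints the same four lines as the StringTokenizer example: Coding, Ninjas, is, Best.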
Java Tokens are essential because of the following reasons:
Java tokens are the fundamental components of Java code: every program is made up of them.
Well-chosen tokens, such as descriptive identifiers, improve the readability, maintainability, and flexibility of the code.
They aid the compiler in correctly understanding and executing the code.
The Java compiler recognises these tokens with the help of delimiters and translates the program into bytecode.
Where are these tokens used?
These tokens are used in the source code of every program. Each token must be placed at the correct position so that the code is meaningful. Placement is critical because the compiler relies on the order of the tokens to work out the structure of the program.
Example:
class HelloToken {
    public static void main(String[] args) {
        System.out.println("Hello Ninja!!");
    }
}
In the above example, the tokens are placed in their correct positions. For example, if "main" comes before "void", it is a syntax error, and the code will not compile.
Frequently Asked Questions
What are the Java tokens?
Java tokens are the basic building blocks of a Java program. They are the smallest units of a program, and they include keywords, identifiers, operators, literals, and separators. Comments and whitespace are not tokens, because the compiler discards them. Separators are symbols that separate different parts of a program.
What are the 5 tokens in Java?
The five tokens in Java include keywords, identifiers, operators, literals, and separators. Keywords are reserved words with special meanings in the Java language, and identifiers are names given to classes, methods, and variables.
Why are Java tokens important?
Java tokens are important because they are the fundamental elements of Java code. They help the compiler to understand and execute the code accurately. Java tokens also aid in code readability, maintainability, and modification.
What are the tokens and variables in Java?
In Java, tokens are the smallest individual units of a program, such as keywords, identifiers, literals, and operators. Variables, on the other hand, are named storage locations that hold data and can be manipulated within a program.
Conclusion
We discussed Java tokens in detail and gained a deeper understanding of their usage and types. Understanding Java tokens helps you write correct, error-free code. The StringTokenizer class illustrates the idea of tokenization by splitting a string into tokens.
We hope this blog has helped you.