Table of contents
1.
Introduction
2.
Syntax of Programming Language
3.
Type-checking environments
4.
Type-checking expressions
5.
 
6.
 
7.
Other Features of Type Checking
7.1.
Assignments
7.2.
Data Structures
7.3.
Overloading
7.4.
Type Conversion
7.5.
Polymorphism / Generic Types
7.6.
Implicit Types
8.
Frequently Asked Questions
8.1.
What is type checking?
8.2.
What is overloading?
8.3.
What do you mean by type conversion?
8.4.
Define polymorphism.
8.5.
What do you mean by AST?
9.
Conclusion
Last Updated: Mar 27, 2024
Easy

Advanced Type Checking

Author Aditya Kumar
0 upvote
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?
Compiler Design

Introduction

The technique of validating and enforcing type restrictions in values is type checking.

Many texts are filtered out by the compiler's Lexical Analysis and parsing stages. Still, these approaches cannot handle many programming languages with well-formed needs since they are frequently not context-free and cannot be checked by the membership of context-free grammar.

The type-checking Phases of Compiler design is interleaved with the syntax analysis phase, so it is done before a program's execution or translation (static typing), and the information is gathered for use by subsequent steps, such as the translator, which will naturally combine type calculation with the actual translation.

illustration image

Syntax of Programming Language

Program → Funs

Funs → Fun
Funs → Fun Funs

Fun → TypeId(TypeIds) = Exp

TypeId → int id
TypeId → bool id

TypeIds → TypeId
TypeIds → TypeId, TypeIds

Exp → num
Exp → id
Exp → Exp + Exp
Exp → Exp = Exp
Exp → if Exp then Exp else Exp
Exp → id(Exps)
Exp → let id = Exp in Exp

Exps → Exp
Exps → Exp, Exps

 

It's a recursively defined first-order functional programming language.

A program is a collection of declarations of functions.

No function may be declared more than once since they are mutually exclusive.

Each function specifies the type of its output and the types and names of its parameters.

There are separate namespaces for functions and variables.

Parameters should not be used more than once.

A function's body is an expression that might be an integer constant, variable names, a sum expression, a comparison, a function call, or a locally defined expression.

For both booleans and integers, a comparison is defined.

For integers, the addition is defined.

A program must have a primary function that takes an integer as an input.

When called for execution, the main returns an integer.

Type-checking environments

We require symbol tables to type-check the program since they connect variables and functions to their types. We'll utilize two symbol tables, one for variables and one for functions because variables and functions have different namespaces. A variable is assigned to either an int or a bool type. The type of a function comprises its parameters and the type of its outcome. Function types are represented as a parenthesized list of the argument types, an arrow, and the result type, such as,

(int, bool) →int 

This is a function that takes two parameters of type int and bool and returns an integer.

See Type Checking Environment and Expressions

Type-checking expressions

The symbol tables for variables and functions are inherited properties when we type-check expressions. 

The expression's type (int or bool) is returned as a synthesised attribute. 

We'll allow the type checker function to utilise a notation identical to the concrete syntax for pattern-matching purposes to make the presentation independent of any specific data structure for abstract syntax. 

However, one should continue to think of it as abstract syntax, as all ambiguity and other difficulties have been fixed.

We presume predefined functions for extracting characteristics from terminals (variable names and numeric constants). As a result, id contains a named function that retrieves the identifier's name. Similarly, num has a value function that returns the number's value.

We define one or more functions for each non-terminal that accepts an abstract syntax subtree as an input and returns the synthesised characteristics.

The type-checking function for expressions is demonstrated in the following figure. ‘CheckExp’ is the function for type checking expressions. 

The argument ‘vtable’ specifies the symbol table for variables, whereas the parameter ‘ftable’ sets the symbol table for functions. 

The function error reports a type error. We let the error-reporting function return to allow the type checker to continue and report more than one error. When a typing mistake is reported, the type checker can make an educated guess as to what the type should have been and return that guess, enabling type checking to proceed. However, this assumption might be incorrect, resulting in erroneous type errors being reported later.

      CheckExp(Exp, vtable, ftable) =  

case Exp of
num int
id

t = lookup(vtable, name(id))

if t = unbound

then error(); int

else t

Exp1 + Exp2 

t1 = CheckExp(Exp1, vtable, f table)

t2 = CheckExp(Exp2, vtable, f table)

if t1 = int and t2 = int

then int

else error(); int

Exp1 = Exp2

t1 = CheckExp(Exp1, vtable, ftable) 

t2 = CheckExp(Exp2, vtable, ftable) 

if t1 = t2 

then bool 

else error(); bool 

if Exp1 then Exp2 else Exp3

t1 = CheckExp(Exp1, vtable, ftable)

t2 = CheckExp(Exp2, vtable, ftable)

t3 = CheckExp(Exp3, vtable, ftable)

if t1 = bool and t2 = t3

then t2

else error(); t2

id ( Exps )

t = lookup(ftable, name(id))

if t = unbound

then error(); int

else

((t1, . . . , tn) → t0) = t

[t1,...,tm] = CheckExps(Exps, vtable’, ftable)

if m = n and t1 = t1’,...,tn = tn

then t0

else error(); t0

let id = Exp1 in Exp2

t1 = CheckExp(Exp1, vtable, ftable)

vtable’ = bind(vtable, name(id), t1)

CheckExp(Exp2, vtable’, ftable)

 

Each of the instances covered by ‘CheckExp’ will be briefly explained.

  • The type of a number is int.
  • The type of a variable may be determined by checking up its name in the symbol table for variables. The lookup function returns the unique value unbound if the variable is not found in the symbol table.
  • When this occurs, an error is raised, and the type checker makes an educated guess that the type is int. Otherwise, the type retrieved by lookup is returned.
  • A plus expression(+) has an integer outcome and needs both parameters to be integers.
  • For a comparison to work, the arguments must be of the same type. The outcome is a boolean in either instance.
  • The condition of a conditional expression must be of type bool, and the two branches must be of the same type. The value of one of the branches is the result of a condition; therefore, it has the same type as these. If the branches have different types, the type checker throws an error and takes the type of the then-branch as its best estimate for the type of the whole expression at random.
  • The function name is looked up in the function environment to find the amount and kinds of parameters as well as the return type during a function call. The Call’s number of parameters must match the expected number, and the types of the arguments must match the specified types. The function's return type is the result type. An error is raised if the function name isn't found in ftable, and the type checker estimates the result type to be int at random.
  • A let-expression creates a new variable whose type is the same as the expression that defines the variable's value. The function ‘bind’ is used to expand the symbol table for variables, which is then used to verify the body expression and determine its type, which is then the type of the entire expression.

Since a let-expression can't produce a type error by itself (but its components can), therefore, no checking is done.

We've included CheckExps in the figure since CheckExp mentions the nonterminal Exps and its related type checking function CheckExps.

  CheckExps(Exps, vtable, ftable) =

case Exps of
Exp [CheckExp(Exp, vtable, ftable)]
Exp, Exps CheckExp(Exp, vtable, ftable) :: CheckExps(Exps, vtable, ftable)

 

 

 

Check out this article - Compile Time Polymorphism

Other Features of Type Checking

Our sample language is essential, and it does not cover all elements of type checking. Below are a few examples of various components, along with brief descriptions of how they might be used.

Assignments

When a value is assigned to a variable, the type checker checks that the value is the same type as the declared type.

Data Structures

A data structure can specify a value that has several components, such as a struct, or a value that has various types at different times. To express complicated types, a type checker will require the data structure to describe them to express complicated types. This data structure is identical to the declarations data structure's AST(abstract syntax tree).

Overloading

When the same term is applied to several operations, the ‘=’ operator compares both integers and boolean values in the example language specified in earlier sections. In contrast, most programming languages use the ‘+’ and ‘-’ operators for boolean and integer operations. When these operators are predefined, they only cover a finite number of situations; therefore, they are tested, but this necessitates different argument types for distinct instances.

For example, suppose the read function is defined to read either integers or boolean values from a text stream. In that case, the type checker must pass the intended type of each expression as an inherited property so that the overloaded operator is picked correctly.

Type Conversion

Operators for converting one type to another, such as converting an integer to a floating type, may exist in a language. If the type checker determines that one or both arguments do not have the right type, it will attempt to convert one or both.

Polymorphism / Generic Types

Polymorphism (definition of a function across a vast class of similar types) is allowed in several languages. To construct instances of this type, a function will explicitly define generic components, and the type checker will insert the actual types at every generic/polymorphic function usage.

Implicit Types

Since specific languages, such as Haskell, demand that programs be strongly typed yet do not need explicit variable or function type declarations, a type inference technique is used to gather information about the usage of functions and variables used to infer their types. When there are inconsistencies in inferences, a type error is reported.

Also see, Type Checking a Program and Function Declarations

Frequently Asked Questions

What is type checking?

Type checking is the process of ensuring that each action in a program adheres to the language's type system. This usually signifies that all operands in a given expression are of the correct type and quantity.

What is overloading?

Overloading occurs when a class has many functions with the same name but distinct arguments. Function overloading allows us to use the same name for many functions in the same class, executing the same or specific functions.

What do you mean by type conversion?

Type conversion is an operation that takes a data item of one type and generates data objects of numerous types equal to it. A type conversion operation's signature is as follows:

conversion_op: type1→type2

Define polymorphism.

Polymorphism is a fundamental concept in object-oriented programming (OOP) that covers instances when something appears in several forms. It is a concept in computer science that you may access objects of various kinds using the same interface. This interface can be implemented in a variety of ways by each type.

What do you mean by AST?

A tree representation of the abstract syntactic structure of the text (typically source code) expressed in a formal language is known as an abstract syntax tree (AST). Each branch of the tree represents a textual construct.

Conclusion

In this article, we have discussed advanced type checking. We have discussed attributes, function declarations and a program in type checking. We have also discussed other features of type checking such as data structure, overloading, polymorphism and implicit types.

Recommended Reading:

Do check out The Interview guide for Product Based Companies as well as some of the Popular Interview Problems from Top companies like Amazon, Adobe, Google, Uber, Microsoft, etc. on Coding Ninjas Studio.

Also check out some of the Guided Paths on topics such as Data Structure and Algorithms, Competitive Programming, Operating Systems, Computer Networks, DBMS, System Design, etc. as well as some Contests, Test Series, Interview Bundles, and some Interview Experiences curated by top Industry Experts only on Coding Ninjas Studio.

Do upvote our blog to help other ninjas grow. 

Happy Coding!

Live masterclass