Introduction
In computers, data is stored in memory registers as binary bits (1s and 0s), because computers only understand binary. When we enter data into a computer, it is converted into binary and then processed by the CPU. Memory registers have a fixed size and a specific format for storing data, so real numbers are stored using defined representation schemes in 8-bit, 16-bit, and 32-bit registers.
There are two major approaches for storing real numbers: fixed-point and floating-point representation. In this article, we will learn about both in detail.
Fixed Point Representation
In computers, fixed-point representation is a data type for real numbers. The data is converted into binary form and then processed, stored, and used by the computer. It has a fixed number of digits for the integral and fractional parts. For example, if the given fixed-point format is IIIII.FFF in decimal, the smallest non-zero value we can store is 00000.001 and the largest is 99999.999.
There are three parts of the fixed-point number representation: Sign bit, Integral part, and Fractional part. The below figure depicts it.
Parts of fixed-point representation.
Sign bit: Fixed-point numbers in binary use a sign bit. A negative number has a sign bit of 1, while a positive number has a sign bit of 0.
Integral part: The length of the integral part depends on the register's size; for an 8-bit register, the integral part is 4 bits.
Fractional part: The length of the fractional part also depends on the register's size; for an 8-bit register, the fractional part is 3 bits.
The sizes of the sign bit, integer part, and fractional part for different registers are shown below:
Register | Sign Bit | Integer Part | Fraction Part
8-bit register | 1 bit | 4 bits | 3 bits
16-bit register | 1 bit | 9 bits | 6 bits
32-bit register | 1 bit | 15 bits | 16 bits
How to write numbers in Fixed-point notation?
Now that we have learned about fixed-point number representation, let's see how to represent it.
The number considered is 4.5.
Step 1: Convert the number 4.5 to binary form: (4.5)₁₀ = (100.1)₂.
Step 2: Represent the binary number in fixed-point notation. In the 8-bit format, the sign bit is 0, the integer part is 0100, and the fractional part is 100.
Fixed Point Notation of 4.5
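The two steps above can be sketched in Python. This is only an illustrative sketch, assuming the 8-bit layout described earlier (1 sign bit, 4 integer bits, 3 fraction bits) and a sign-magnitude encoding; the function name to_fixed_point and its parameters are hypothetical.

```python
def to_fixed_point(value, int_bits=4, frac_bits=3):
    """Return (sign, integer_bits, fraction_bits) as bit strings.

    Assumes a sign-magnitude layout; values that are not exactly
    representable are rounded to the nearest step of 2**-frac_bits.
    """
    sign = "1" if value < 0 else "0"
    # Scale by 2**frac_bits so the fraction becomes part of an integer.
    scaled = round(abs(value) * (1 << frac_bits))
    if scaled >= 1 << (int_bits + frac_bits):
        raise OverflowError("value does not fit in the chosen layout")
    bits = format(scaled, f"0{int_bits + frac_bits}b")
    return sign, bits[:int_bits], bits[int_bits:]


print(to_fixed_point(4.5))    # ('0', '0100', '100')
print(to_fixed_point(-2.25))  # ('1', '0010', '010')
```

Scaling by a power of two first keeps the conversion to a single integer-to-binary step, which is why the integer and fraction fields can simply be sliced out of one bit string.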
Floating Point Representation
Floating-point representation doesn't reserve a specific number of bits for the integer or fractional parts. Instead, it reserves a certain number of bits for the number itself (called the significand or mantissa) and a certain number of bits to say where the radix point lies (called the exponent).
The computer uses floating-point number representation to convert the input data into binary form. This binary form number is converted into scientific notation, which is converted into floating-point representation.
The floating-point representation has two types of notation:
1. Scientific notation: Scientific notation is the method of representing numbers in the form a × b^e, where a is the mantissa, b is the base, and e is the exponent. It is further converted into floating-point representation. For example,
Number = 32625
Number in scientific notation = 32.625 × 10^3
A binary number in the same form = 1101.101 × 2^101 (the exponent 101 is written in binary)
Here, the mantissa is 1101.101 and the base part is 2^101.
2. Normalized notation: It is a special case of scientific notation. Normalized means that the most significant digit of the mantissa is non-zero.
A floating-point representation has three parts: Sign bit, Exponent Part, and Mantissa. We can see the below diagram to understand these parts.
Parts of floating-point representation
Sign bit: Floating-point numbers in binary use a sign bit. A negative number has a sign bit of 1, while a positive number has a sign bit of 0. The sign of a number depends on the mantissa, not on the exponent.
Mantissa part: The length of the mantissa depends on the register's size; for example, in a 16-bit register, the mantissa part is 8 bits.
Exponent part: It is the power to which the base is raised. Its length depends on the register's size; for example, in a 16-bit register, the exponent part is 7 bits.
How to write numbers in Floating-point notation
Now that we have learned about floating-point number representation, let's see how to represent it.
The number considered is 53.5.
Step 1: Convert the number 53.5 to binary form: (53.5)₁₀ = (110101.1)₂.
Step 2: Normalize the number (the base is 2): 110101.1 = 1.101011 × 2^5.
Step 3: Represent the binary number in floating-point notation with the following format.
Floating Point Notation of 53.5
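The normalization step can be sketched in Python as follows. This is a rough sketch, assuming the 1.xxx × 2^e form used in the worked example; the function name normalize and the 23-bit cap on the fraction are illustrative choices, not part of any particular register format.

```python
def normalize(value, max_bits=23):
    """Return (sign, fraction_bits, exponent) such that
    value = (-1)**sign * 1.fraction_bits (binary) * 2**exponent."""
    if value == 0:
        raise ValueError("zero has no normalized form")
    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    exponent = 0
    # Move the binary point until exactly one non-zero bit is left of it.
    while magnitude >= 2:
        magnitude /= 2
        exponent += 1
    while magnitude < 1:
        magnitude *= 2
        exponent -= 1
    # Collect the bits that follow the leading 1.
    fraction = magnitude - 1
    bits = ""
    while fraction and len(bits) < max_bits:
        fraction *= 2
        bit = int(fraction)
        bits += str(bit)
        fraction -= bit
    return sign, bits, exponent


print(normalize(53.5))   # (0, '101011', 5)
```

The result matches the hand calculation: sign 0, fraction bits 101011 after the leading 1, and exponent 5.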
De-normalized Notation
De-normalized notation is the reverse of normalized notation. In normalized notation, the most significant bit of the mantissa is 1, whereas in de-normalized notation it is 0. For example, the largest de-normalized number with an excess-64 exponent can be represented as:
Sign Bit | Exponent Part | Mantissa Part
0 | 1111111 | 01111111
Advantages of Fixed Point Representation
The advantages of Fixed Point Representation are as follows:
Fixed-point calculations can be completed more quickly than floating-point calculations. This is because fixed-point arithmetic can be implemented with integer arithmetic, which is frequently quicker than floating-point arithmetic.
Fixed-point numbers can be represented more compactly than floating-point numbers by utilising fewer bits. In embedded systems or other applications where memory is constrained, this could help preserve memory space.
Fixed-point values fit in ordinary integer registers, which can be accessed more quickly than memory, so no separate floating-point hardware is needed to work with them.
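To illustrate the first point, fixed-point values can be stored as scaled integers so that addition and multiplication reduce to plain integer operations. The following is a minimal sketch, assuming 3 fractional bits as in the 8-bit layout; the names FRAC_BITS, to_fixed, fixed_add, and fixed_mul are hypothetical.

```python
FRAC_BITS = 3            # assumed number of fractional bits
SCALE = 1 << FRAC_BITS   # 2**3 = 8 steps per unit


def to_fixed(x):
    """Store x as an integer count of 1/8 steps."""
    return round(x * SCALE)


def to_float(raw):
    """Convert a raw fixed-point integer back to a Python float."""
    return raw / SCALE


def fixed_add(a, b):
    # Both operands share the same scale, so integer addition is enough.
    return a + b


def fixed_mul(a, b):
    # The raw product carries the scale twice; shift once to remove it.
    return (a * b) >> FRAC_BITS


a, b = to_fixed(4.5), to_fixed(1.25)   # raw values 36 and 10
print(to_float(fixed_add(a, b)))       # 5.75
print(to_float(fixed_mul(a, b)))       # 5.625
```

Only integer add, multiply, and shift are used here, which is what makes this approach attractive on processors without a floating-point unit.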
Disadvantages of Fixed Point Representation
The disadvantages of Fixed Point Representation are as follows:
Fixed-point numbers have the potential to overflow or underflow if improperly handled. Calculation errors may result from this.
Compared to floating-point numbers, fixed-point numbers cover a much narrower range of values for the same number of bits, and their resolution is fixed by the number of fractional bits.
Fixed-point programming is sometimes more difficult than floating-point programming. This is because the programmer must keep track of the position of the radix point and take precautions to prevent overflow and underflow.
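The overflow problem can be seen with a small example. This sketch assumes the magnitude part of the 8-bit layout (4 integer bits and 3 fraction bits, so the largest value is 15.875) and hardware that silently drops carry bits; the names add_wrapping and add_saturating are hypothetical.

```python
INT_BITS, FRAC_BITS = 4, 3
SCALE = 1 << FRAC_BITS
MAX_RAW = (1 << (INT_BITS + FRAC_BITS)) - 1   # 127, i.e. 15.875


def add_wrapping(a, b):
    # Bits above the 7-bit magnitude are lost, as in unchecked hardware.
    return (a + b) & MAX_RAW


def add_saturating(a, b):
    # Clamp to the largest representable value instead of wrapping.
    return min(a + b, MAX_RAW)


a, b = 12 * SCALE, 6 * SCALE          # 12.0 and 6.0 as raw integers
print(add_wrapping(a, b) / SCALE)     # 2.0   (wrong: 18.0 does not fit)
print(add_saturating(a, b) / SCALE)   # 15.875
```

Neither result is the true sum of 18.0, which is why fixed-point code usually has to check ranges or choose a wider format before the arithmetic is performed.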
Advantages of Floating Point Representation
Floating-point representation offers several advantages, particularly for scientific and engineering applications that require a wide range of values and precise calculations. Here are the key advantages:
Wide Range of Values- Floating-point numbers can represent very large and very small numbers, making them suitable for scientific calculations involving extreme values.
Precision- They provide a high degree of precision for fractional values, which is essential for accurate mathematical computations.
Dynamic Range- Floating-point representation allows for a dynamic range that covers many orders of magnitude, from very small to very large numbers.
Standardization- The IEEE 754 standard ensures consistency and portability of floating-point arithmetic across different computing platforms and programming languages.
Support for Special Values- Floating-point representation includes special values such as zero, positive and negative infinity, and NaN (Not a Number), which are useful for handling exceptional cases and errors in computations.
Efficiency- Modern processors and hardware are optimized for floating-point arithmetic, providing efficient performance for mathematical and scientific calculations.
Exponent Representation- The use of an exponent in floating-point numbers allows for the representation of very large and very small numbers in a compact format.
Disadvantages of Floating Point Representation
Despite its many advantages, floating-point representation has several disadvantages:
Precision Issues- Floating-point numbers cannot represent all real numbers exactly, leading to rounding errors and limited precision, especially in iterative calculations.
Complexity- Floating-point arithmetic is more complex than integer arithmetic, both in terms of understanding the representation and the hardware required to implement it.
Performance- Floating-point operations are generally slower than integer operations due to the additional complexity in processing.
Representation Gaps- There are gaps between representable numbers, meaning some values cannot be represented exactly, leading to potential inaccuracies in calculations.
Special Cases Handling- Handling special cases like NaN, infinity, and denormalized numbers adds additional complexity to software and hardware implementations.
Consistency- Due to rounding errors and differences in implementation, floating-point calculations can yield slightly different results on different systems or compilers, affecting reproducibility.
Storage Requirements- Floating-point numbers typically require more storage space than integers (e.g., 32 bits for single precision, 64 bits for double precision), which can be an issue in memory-constrained environments.
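The precision issues and representation gaps listed above can be observed directly. The following small sketch uses Python's built-in float, which is an IEEE 754 double on typical platforms; math.ulp requires Python 3.9 or later.

```python
import math

total = 0.1 + 0.2
print(total)                     # 0.30000000000000004
print(total == 0.3)              # False: neither 0.1 nor 0.2 is exact in binary
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead

# Gaps between representable numbers grow with magnitude.
print(math.ulp(1.0))             # 2.220446049250313e-16
print(math.ulp(1e16))            # 2.0
print(1e16 + 1 == 1e16)          # True: adding 1 falls inside the gap
```

Comparing with a tolerance, as math.isclose does, is a common way to work around rounding errors when exact equality is not required.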
Frequently Asked Questions
What are the different types of fixed point representation?
The various types of fixed-point representation include signed, unsigned, fractional, integer, and scaled formats. Each type caters to specific requirements in signal processing and digital systems design, offering a range of precision and numerical representations suitable for diverse applications.
What is normalized floating-point representation?
If the most significant bit of the mantissa is non-zero, then such a representation is known as normalized floating-point.
Which method is preferred in fixed and floating-point representation?
We should choose fixed-point representation if the data stays within the representable range and high performance is required. We may use floating-point representation if the data exceeds that range.
What is fixed-point vs floating-point architecture?
Fixed-point uses a fixed number of digits after the radix point, while floating-point lets the radix point move, which supports a much wider range of values.
What is IEEE 754 floating-point representation?
IEEE 754 standardizes floating-point representation, defining formats for binary and decimal values, ensuring consistency and precision in numerical computations.
What is special value representation?
In the IEEE 754 standard, certain combinations of exponent and mantissa bits are reserved for special values.
All exponent bits 0 with all mantissa bits 0 represents zero: +0 if the sign bit is 0, otherwise -0. All exponent bits 1 with all mantissa bits 0 represents infinity: +∞ if the sign bit is 0, otherwise -∞. All exponent bits 0 with non-zero mantissa bits represents a denormalized number. All exponent bits 1 with non-zero mantissa bits represents NaN (Not a Number), which signals an invalid result.
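These rules can be checked by inspecting the bit fields of a value. The sketch below assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits) and uses Python's standard struct module; the helper names fields and classify are hypothetical.

```python
import struct


def fields(x):
    """Split x into single-precision (sign, exponent, mantissa) fields."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    exponent = (raw >> 23) & 0xFF
    mantissa = raw & 0x7FFFFF
    return sign, exponent, mantissa


def classify(x):
    """Name the category of x using the exponent and mantissa rules."""
    _, exponent, mantissa = fields(x)
    if exponent == 0:
        return "zero" if mantissa == 0 else "denormalized"
    if exponent == 0xFF:
        return "infinity" if mantissa == 0 else "NaN"
    return "normalized"


print(fields(-0.0))              # (1, 0, 0)
print(classify(float("inf")))    # infinity
print(classify(float("nan")))    # NaN
print(classify(1e-40))           # denormalized
print(fields(53.5))              # (0, 132, 5636096)
```

For 53.5, the stored exponent 132 is the true exponent 5 plus the single-precision bias of 127, and the mantissa field 5636096 is the bit pattern 101011 followed by 17 zeros.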
Conclusion
In this article, we have extensively discussed Fixed and Floating-Point Representation. We also learned how to write numbers in Fixed and Floating-Point Representation.
We hope that this blog has helped you enhance your knowledge regarding Fixed and Floating-Point Representation and if you would like to learn more, check out our articles on Repeater in Computer Network, and Register in Computer. Do upvote our blog to help other ninjas grow. Happy Coding!