Table of contents
1. Introduction
2. Representing Numbers
2.1. Python
3. Representing Text
4. Representing Bits & Bytes
5. Data Compression
6. Frequently Asked Questions
6.1. What's the difference between ASCII and Unicode in text representation?
6.2. Can lossy compression be used for text files?
6.3. Why can't we just use lossless compression for everything?
7. Conclusion
Last Updated: Aug 13, 2025

Data Representation

Author Ravi Khorwal

Introduction

Data representation is fundamental in computing. It's all about how computers, which only understand binary, interpret various types of information: numbers, text, images, and more. Think of it like translating a foreign language into one you understand.


This article unfolds the layers of this translation process. We'll start from the basics of representing numbers and text, delve into the nuts & bolts of bits & bytes, and even touch upon how data can be squeezed without losing meaning through compression. 

Representing Numbers

In computing, numbers are the starting point of everything. Whether you're counting clicks, calculating an average, or setting a high score in a game, it's all numbers under the hood. Computers use a binary system, meaning they operate using two digits: 0 and 1. These digits, or bits, are the smallest units of data in computing.

Let's break it down. Think about the decimal system, which is what we use daily. It's based on 10 digits, from 0 to 9. Each step you move to the left increases the value tenfold, right? In binary, it's similar, but you only have 0 and 1. Each step to the left doubles the value. So, 10 in binary is not ten, but two in decimal.

Now, when we talk about representing larger numbers, we string these bits together. For example, the binary number 110 translates to six in decimal (1*2^2 + 1*2^1 + 0*2^0). This way, using just two symbols, computers can represent any number imaginable.
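The conversion above can be checked in Python. As a minimal sketch, the built-in int() function accepts a base argument, so it can turn a binary string back into a decimal number:

```python
# Convert the binary string "110" back to decimal.
# int() with base 2 evaluates 1*2^2 + 1*2^1 + 0*2^0.
binary_number = "110"
decimal_value = int(binary_number, 2)
print(f"Binary {binary_number} in decimal is: {decimal_value}")  # prints 6
```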

Here's a quick example in Python to convert a decimal number to binary:

Python

def decimal_to_binary(decimal):
    return bin(decimal)[2:]

# Convert the number 10 to binary
binary_number = decimal_to_binary(10)
print(f"Decimal 10 in binary is: {binary_number}")

Output

Decimal 10 in binary is: 1010


This code snippet defines a function decimal_to_binary that uses Python's built-in bin function to convert a decimal number to its binary representation, then strips off the '0b' prefix that Python adds to denote a binary number.

Understanding how numbers are represented is crucial because it forms the foundation of more complex data types and operations in computing. From simple arithmetic to complex algorithms, it's all about manipulating these binary digits efficiently.


Imagine every number you know is translated into a unique combination of 0s & 1s for a computer to understand. This process is like using a secret code where each number has its binary equivalent. For example, the number 2 is represented as '10' in binary, and 3 is '11'.

Why does this matter? 

Computers use this binary system to perform calculations, store data, and more. It's the foundation of all computing processes. Here's a simple example to illustrate this. Let's convert the number 5 into binary:

  • Start with the highest power of 2 that fits into 5. That's 2^2 (which is 4), so we mark that position as 1.

  • Subtract 4 from 5, leaving us with 1.

  • Now, we take the next lower power of 2, which is 2^1 (2), but it doesn't fit into 1. So, we mark that position as 0.

  • Finally, 2^0 (which is 1) fits perfectly into our remaining 1, so we mark the last position as 1.

So, the binary representation of the number 5 is '101'. This is a straightforward process that computers use to represent numbers, making calculations and data processing possible.
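The manual steps above can be sketched as a small Python function. This is just an illustration of the power-of-2 method; in practice, Python's built-in bin() does the same job:

```python
def to_binary(n):
    """Build a binary string by checking powers of 2 from highest to lowest,
    mirroring the manual steps above."""
    if n == 0:
        return "0"
    power = 1
    while power * 2 <= n:   # find the highest power of 2 that fits into n
        power *= 2
    bits = []
    while power >= 1:
        if n >= power:      # this power of 2 fits: mark 1 and subtract it
            bits.append("1")
            n -= power
        else:               # it doesn't fit: mark 0
            bits.append("0")
        power //= 2
    return "".join(bits)

print(to_binary(5))   # 101
print(to_binary(10))  # 1010
```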

Understanding this binary system is key to getting how computers work at the most basic level. It's not just about numbers; it's about translating our everyday numerical system into something a computer can work with.

Representing Text

In computing, text is represented using a system that assigns a unique number to each character (like letters, numbers, and symbols). The most common system for this is ASCII (American Standard Code for Information Interchange). Let's simplify this concept.

Imagine your computer has a big chart. On this chart, every letter or symbol you can type is matched with a specific number. For instance, in ASCII, the letter 'A' is matched with the number 65, 'B' with 66, and so on. When you type a letter, your computer sees it as its corresponding number.

Why is this important? 

Well, because computers only understand numbers (or more precisely, binary numbers), this system allows them to understand and store text. Every time you type something, it gets converted into a series of numbers that the computer can process.

Let's take the word "Hi" as an example. In ASCII, 'H' is 72 and 'i' is 105. So, the computer represents "Hi" as the numbers 72 and 105.

It's a bit like translating a language where each letter has its equivalent in another language (in this case, the language of numbers). This process enables computers to handle text data, from simple documents to complex web pages.
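This translation is easy to see in Python, where the built-in ord() and chr() functions convert between characters and their character codes. A quick sketch using the "Hi" example:

```python
# Each character maps to a number: 'H' is 72, 'i' is 105 in ASCII.
word = "Hi"
codes = [ord(ch) for ch in word]
print(codes)  # [72, 105]

# chr() goes the other way, turning codes back into characters.
print("".join(chr(c) for c in codes))  # Hi
```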

Representing Bits & Bytes

In today's digital world, 'bits' and 'bytes' are terms thrown around a lot. But what are they really? Simply put, a bit is the smallest piece of data in a computer, and it can be either a 0 or a 1. A byte, on the other hand, is a group of 8 bits. This might sound too simplistic, but it's the foundation of all digital data.

Think of bits like individual on/off switches. A single bit can represent two states, much like a light switch can be either on or off. When you combine 8 of these switches (bits) together, you get a byte. This byte can represent 256 different states (from 00000000 to 11111111 in binary).

Why does this matter? 

Well, every piece of data on your computer, from the text in a document to the colors in a photo, is ultimately represented by these bits and bytes. For example, a simple text file might use one byte for each character, meaning 'A' could be represented as 01000001 in binary (which is 65 in decimal, as per ASCII).
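A short Python sketch makes this concrete: format() with the "08b" specifier shows a character code as a zero-padded 8-bit byte.

```python
# One byte per ASCII character: 'A' is 65, which is 01000001 in 8 bits.
ch = "A"
code = ord(ch)
byte = format(code, "08b")  # binary, zero-padded to 8 digits
print(f"'{ch}' -> {code} -> {byte}")  # 'A' -> 65 -> 01000001

# 8 bits give 2**8 distinct values.
print(2 ** 8)  # 256
```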

This concept of bits and bytes is crucial for understanding how computers store and process information. It's like the alphabet of the computer world; by combining these bits and bytes in various ways, computers can represent and manage all sorts of complex data.

Understanding bits and bytes gives us insight into the inner workings of digital storage and data processing, illustrating how complex operations are broken down into manageable, bite-sized (or byte-sized!) pieces.

Data Compression

Data compression is about making files smaller so they take up less space on your computer or can be sent faster over the internet. It's like when you pack a suitcase for a trip and try to fit as much as you can into as little space as possible. In computing, there are two main types of compression: lossless and lossy.

Lossless Compression keeps all the original data intact. When you decompress the file, it's exactly the same as it was before. It's perfect for text documents and program files, where you can't afford to lose any information. Imagine you're writing a note and want to make it shorter without changing any words; that's what lossless compression does with data.

Lossy Compression reduces file size by permanently removing some information, usually data that's not essential or that humans can't easily notice, like slight color differences in a photo or very quiet sounds in music. It's commonly used for images, videos, and audio files where a perfect reproduction isn't necessary. Think of telling a story but leaving out some details that aren't crucial to the plot. The main idea is still clear, but the story is shorter.

Let's consider a simple example of lossless compression: Run-Length Encoding (RLE). If you have a file with the content "AAAAAABBBBBCCCCC", RLE would compress this to "6A5B5C", significantly reducing the size without losing any data. It simply counts how many times a character repeats and then writes down the character and its count.
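A minimal sketch of RLE in Python, matching the example above (a toy version: real compressors handle edge cases like digits in the input differently):

```python
def rle_encode(text):
    """Run-Length Encoding: count consecutive repeats of each character
    and write the count followed by the character."""
    if not text:
        return ""
    out = []
    count = 1
    for prev, curr in zip(text, text[1:]):
        if curr == prev:
            count += 1          # same character: extend the current run
        else:
            out.append(f"{count}{prev}")  # run ended: emit count + char
            count = 1
    out.append(f"{count}{text[-1]}")      # emit the final run
    return "".join(out)

print(rle_encode("AAAAAABBBBBCCCCC"))  # 6A5B5C
```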

Data compression is a balancing act. You want to reduce the size as much as possible without losing the essential qualities of the file or making it unusable. It's like packing that suitcase: you want it to be light, but you also need everything for your trip. In the digital world, effective compression means faster downloads, less storage space, and efficient data handling.

Frequently Asked Questions

What's the difference between ASCII and Unicode in text representation?

ASCII uses 7 bits to represent characters, limiting it to 128 unique symbols. Unicode is a more comprehensive system that uses more bits to include characters from many languages and symbols, supporting well over 140,000 characters.

Can lossy compression be used for text files?

Typically, no. Text files usually use lossless compression because every character is important. Lossy compression, which removes data, could alter the meaning of the text.

Why can't we just use lossless compression for everything?

Lossless compression doesn't reduce file sizes as much as lossy compression can. For large files like videos or high-quality images, lossless compression might not sufficiently reduce the file size for efficient storage or transmission.

Conclusion

In this article, we've looked at the essentials of data representation, from numbers and text to the foundational bits and bytes. We also explored the compact world of data compression, understanding its necessity and the trade-offs between its two main types: lossless and lossy. This information helps us appreciate how digital systems manage and optimize data, which is crucial for anyone interested in learning the basics of computers.

