Table of contents
1.
Introduction
2.
Basic Concepts 
2.1.
Notations and Basic Equations
2.2.
XOR and Bias
2.3.
Piling-up Lemma
3.
Linear Approximation of S-boxes
4.
Frequently Asked Questions
4.1.
Do S-boxes perform only permutations?
4.2.
Where is the Linear Approximation Table used?
4.3.
What is linear cryptanalysis?
5.
Conclusion
Last Updated: Mar 27, 2024
Medium

Approximations of S-boxes linearly in Cryptography

Author Satvik Gupta
1 upvote
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Introduction

Cryptography is an important part of the digital world today. It is how we protect the vast amounts of digital information that exist on thousands of servers worldwide. To ensure that cryptography remains strong and cannot be hacked, we ourselves need to study methods to break cryptographic algorithms. This helps us learn about weaknesses in our ciphers and how to remove those weaknesses.

Approximations of S-boxes linearly in Cryptography

S-boxes are a common part of many cryptographic ciphers. There are many ways that have been tried to break them and find flaws. Linear Cryptanalysis is one of them. To perform linear cryptanalysis, we first need to perform a Linear Approximation of the S-box. Let us see what it is and how we can do it. 

Basic Concepts 

Cryptography involves a lot of probability. To properly understand the linear approximation of S-boxes, we will need some basic concepts of probability and random variables. 

To learn about S-boxes, you can read our article SPN in Cryptography. For Random Variables, you can read the articles Introduction to Random Variables and Random Variables

Linear Approximation of S-boxes is important because non-linear S-boxes are considered cryptographically secure. S-boxes, and even other cryptographical tools, that have linear properties are considered vulnerable to many attacks. 

Linear properties are those that we can easily estimate and predict using probabilistic analysis. They allow us to predict which plaintext will result in which ciphertext, or vice versa. This is because S-boxes with linear properties tend to have bias. We will look at all of this in detail in the upcoming sections.

Notations and Basic Equations

We will be using the following notation throughout the article.

Pr[X=x] represents the probability that the random variable X takes the value x.

Pr[x] is used to represent the same thing, when X is understood.

Pr[x,y] represents the probability that X=x and Y=y.

Pr[x|y] represents the probability that X=x, given that Y=y.

The following results can be concluded from the above definition.

Probability Equations

The second equation is the famous Bayes Theorem.

XOR and Bias

Suppose X1, X2,.... are independent random variables that can take values 0 and 1. 

Let 

pi = Pr[Xi=0]

Then,

1-pi = Pr[Xi=1]

 

For Xi and Xj, the independence of Xi and Xj implies that:

 Pr[Xi=0,Xj=0] = pipj

 Pr[Xi=0,Xj=1] = pi(1-pj)

 Pr[Xi=1,Xj=0] =(1- pi )pj

 Pr[Xi=1,Xj=1] = (1-pi)(1-pj )

Let's consider a new random variable Y = Xi ⊕ Xj. 

It's easy to see that 

Pr[Y=0]=Pr[Xi ⊕ Xj=0] = pipj+ (1-pi)(1-pj )

Pr[Y=1]=Pr[Xi ⊕ Xj=1] = pi(1-pj)+ (1-pi)pj 

 

This is because ⊕ gives output 1 when both inputs are different, otherwise the output is 0. 

Bias is a commonly used term in probability theory and is used in probability distribution of random variables. The bias of the random variable Xi is defined as:

i = pi - ½

Piling-up Lemma

Let i1,i2,i3,....ik denote the bias of the random variable Xi1 ⊕ Xi2 ⊕ Xi3 ⊕ …..⊕ Xik, then:

Bias of Random variable

For example: 

The bias of X1 ⊕ X3 ⊕ X6 will be 

1,3,6 = 22 1.3.6 

With these basic concepts, we can start doing the linear approximation of S-boxes. To learn more about Piling-up lemma, you can visit our article on Linear Cryptanalysis.

Linear Approximation of S-boxes

Consider an S-box that:

  • Takes a binary string of length m as input.
     
  • Outputs a binary string of length n.
     
  • m may or not be equal to n.
     

Let us take an input string X = (x1,x2,.....xm). For each xi, there is a random variable Xi that can take the values 0 or 1. Since we are equally likely to choose 0 or 1, the bias of Xi is 0. Also, all the Xi variables are independent of each other.

Similarly, we define an output string Y = (y1,y2,.....yn). There are n random variables Yi that can be 0 or 1. However, the bias of Yi may or may not be 0, and they may or may not be independent of each other. That will depend on the S-box. 

Let us take an example. 

Consider the following S-box. 

S-box

It takes a binary input of length 4 and outputs a binary string of length 4. For simplicity, the binary strings are represented in hexadecimal form. 

The table below shows the value of the random variables Xi and Yi for all possible cases.

Value of random variables for s-box

Let's say we want to find the bias of the random variable X1 ⊕ X4 ⊕ Y. Bias is useful because when the value of bias is away from zero (i.e, the random variable tends to bias towards certain values, compared to others), then we can launch a linear attack. A perfect random variable would have a bias of zero, which means it is equally likely to hold any possible value.

We can do calculate the bias by counting the number of rows in which this variable is 0, and dividing by 16(the total number of rows in the table). This will give us pi for this variable, and using that, we can easily calculate bias. 

The number of rows in which  X1 ⊕ X4 ⊕ Y=0 is 8. 

Therefore, the probability of it being 0 = 8/16 =1/2.

This makes the bias as 0. 

In this way, we can calculate the bias of all random variables. 

Let us represent the random variables in this format: 

 a1X1 ⊕ a2X2 ⊕ a3X3 ⊕ a4X4 ⊕ b1Y1 ⊕ b2Y2 ⊕ b3Y3 ⊕ b4Y4 

Where aand bi can be 0 or 1. 

This may seem confusing. But it's simply a way of representation. 

For example, in our earlier example,, we took the random variable X1 ⊕ X4 ⊕ Y2

To represent it in the new format, we would simply take a1=1, a4=1,b=1, and all the other values of ai and bi as 0. This gives us a common representation for all random variables. 

It also allows us to create the Linear Approximation Table for an S-box. The binary value of (a1,a2,a3,a4) gives the row number. This is also called the input sum. The binary value of (b1,b2,b3,b4) gives the column number. This is also called the output sum. The value in the table with that row and column will be the number of rows in which that random variable is equal to zero. Since we have binary string of length 4, we will represent it in hexadecimal for simplicity. 

For example, in our earlier example:

a=1, a2=0, a=0, a4=1.

The input sum will be (1,0,0,1), which is 9 in hexadecimal. 

b=0, b2=1, b=0, b4=0.

The output sum will be (0,1,0,0), which is 4 in hexadecimal. 

The number of rows with  X1 ⊕ X4 ⊕ Y=0 is 8. 

The input sum (9 in our case) gives the row. The output sum (4 in our case) gives the column. Thus, the value at row number 9 and column number 4 of our table will be 8.

Similarly, we can calculate the same for all other values. The complete table is given below. 

Linear Approximation Table

Frequently Asked Questions

Do S-boxes perform only permutations?

No, S-boxes can transform the data using any method. It is possible that they do a permutation, but it is not necessary. In fact, for S-boxes where m and n are different, it is impossible that the S-box does a permutation.

Where is the Linear Approximation Table used?

The linear approximation table can be used to find out the bias for certain inputs and outputs. A bias means that the random variable is more likely to take on some values as compared to others.This is useful during the Linear cryptanalysis attack.

What is linear cryptanalysis?

Linear cryptanalysis is a known plaintext attack on ciphers. It involves looking at probabilistic linear relationships between the plaintext and the ciphertext to find weaknesses. We try to find out the probability that a particular plaintext will produce a certain ciphertext.

Conclusion

This blog has explored approximation of S-boxes linearly in Cryptography. We have seen the basic probability and statistics required for it. We have looked into bias, and constructed the linear approximation table for an example S-box.

If you liked this article, check out our other articles on Cryptography:

You can practice questions on various problems on Coding Ninjas Studio, attempt mock tests. You can also go through interview experiences, interview bundle, go along guided paths for preparations, and a lot more!

Keep coding, keep reading Ninjas. 

Live masterclass