Table of contents
1. Introduction
2. Character Set in MySQL
3. Collation in MySQL
3.1. Properties of Collation in MySQL
3.2. Sorting and Comparing using Collation in MySQL
4. Setting Character Set and Collation in MySQL
4.1. Setting Character Set and Collation at Server Level
4.1.1. Example
4.2. Setting Character Set and Collation at Database Level
4.2.1. Example
4.3. Setting Character Set and Collation at Table Level
4.3.1. Example
4.4. Setting Character Set and Collation at Column Level
4.4.1. Example
5. Key Points while Setting Character Set and Collation
6. Frequently Asked Questions
6.1. What is a collation in the database?
6.2. What is collation in MySQL used for?
6.3. Can we set collation at table level?
6.4. What is the default collation in MySQL?
7. Conclusion
Last Updated: Mar 27, 2024

Collation In MySQL

Author Gazal Arora

Introduction

MySQL is a free, open-source relational database management system. It stores data in tables and lets users create, alter, and query that data. A collation in MySQL is a set of rules that defines how character strings are sorted and compared.

In this article, we will learn about collation in detail.

Character Set in MySQL

A character set in MySQL is a set of symbols together with their encodings. A collation is a set of rules for comparing the characters in a character set. MySQL lets users store data using a variety of character sets and compare that data using a variety of collations.

NOTE: We can define character sets in MySQL at the server, database, table, and column levels.

Collation in MySQL

A collation is a set of rules in MySQL for comparing and sorting character strings. Each collation is a part of a specific character set. Every character set has at least one collation, and most of them have two or more. However, two character sets cannot have the same collation.

To list all character sets supported by the MySQL server, along with the default collation of each, use the following:

SHOW CHARACTER SET;  


OUTPUT

In the output, the Default collation column gives the name of the default collation for each character set.

[Figure: output of SHOW CHARACTER SET]

NOTE: By default, the SHOW CHARACTER SET statement displays all available character sets. Add a LIKE or WHERE clause to list only the matching character sets. To list the collations that belong to a particular character set, use SHOW COLLATION with a LIKE pattern on the character set name.
Syntax

SHOW COLLATION LIKE 'character_set_name%';  
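
As a small illustration of the filtering described above, the statements below narrow both listings. The pattern 'utf8%' and the utf8mb4 character set are only example values; substitute whichever character set you are interested in.

```sql
-- List only the character sets whose names start with 'utf8'
SHOW CHARACTER SET LIKE 'utf8%';

-- List only the multi-byte character sets (Maxlen is the maximum
-- number of bytes needed to store one character)
SHOW CHARACTER SET WHERE Maxlen > 1;

-- List the collations that belong to one character set
SHOW COLLATION WHERE Charset = 'utf8mb4';
```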

Properties of Collation in MySQL

  1. A collation name starts with the name of its character set and usually ends with a suffix such as _ci (case insensitive), _cs (case sensitive), or _bin (binary)
     
  2. A collation belongs to exactly one character set, which means two character sets cannot share the same collation
     
  3. MySQL 5.7 and earlier use latin1 and latin1_swedish_ci as the default character set and default collation respectively; MySQL 8.0 and later use utf8mb4 and utf8mb4_0900_ai_ci
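
One way to see these properties on a running server is to query the INFORMATION_SCHEMA.COLLATIONS table. This is only a sketch: latin1 is used as an example character set, and the exact rows returned depend on the MySQL version.

```sql
-- Every collation listed here starts with the character set name 'latin1'.
-- IS_DEFAULT is 'Yes' for the character set's default collation.
SELECT COLLATION_NAME, CHARACTER_SET_NAME, IS_DEFAULT
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE CHARACTER_SET_NAME = 'latin1'
ORDER BY COLLATION_NAME;
```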

Sorting and Comparing using Collation in MySQL

A collation determines how character strings are sorted and compared. It does this using weights: each character in a string is mapped to a weight. Two strings compare as equal if their weights are the same.

If the weights differ, the strings are ordered by the relative magnitude of their weights.

The WEIGHT_STRING() function in MySQL returns the weights for a string as a binary string. To display the weights in printable form, wrap the call in HEX(), as in HEX(WEIGHT_STRING(str)).
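
The following sketch shows weights and comparisons side by side. It assumes the utf8mb4 character set with the utf8mb4_general_ci and utf8mb4_bin collations, chosen here only for illustration. The _utf8mb4 introducer fixes the character set of each literal so that the COLLATE clause is valid whatever the connection settings are.

```sql
-- Under a case-insensitive collation, 'a' and 'A' carry the same weight,
-- so the two HEX(WEIGHT_STRING(...)) values should match.
SELECT
  HEX(WEIGHT_STRING(_utf8mb4'a' COLLATE utf8mb4_general_ci)) AS weight_lower,
  HEX(WEIGHT_STRING(_utf8mb4'A' COLLATE utf8mb4_general_ci)) AS weight_upper;

-- Equal weights mean the strings compare as equal: returns 1
SELECT _utf8mb4'a' COLLATE utf8mb4_general_ci = _utf8mb4'A' AS equal_ci;

-- A binary collation compares the encoded values, so case matters: returns 0
SELECT _utf8mb4'a' COLLATE utf8mb4_bin = _utf8mb4'A' AS equal_bin;
```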

Setting Character Set and Collation in MySQL

MySQL allows users to define the character set and collation at the four levels listed below:

  1. Setting Character Set and Collation at the Server Level
  2. Setting Character Set and Collation at the Database Level
  3. Setting Character Set and Collation at the Table Level
  4. Setting Character Set and Collation at the Column Level

Setting Character Set and Collation at Server Level

MySQL also allows users to change the default character set and collation at server startup. If we specify them explicitly at startup, the server uses that character set and collation as the default for every database that does not specify its own.
Syntax to specify the character set and collation during server startup using the command line tool:

mysqld --character-set-server=<character-set-name> --collation-server=<collation-name>


Example

To specify utf8 as the character set and utf8_unicode_ci as the collation at the server level, use the following:

mysqld --character-set-server=utf8 --collation-server=utf8_unicode_ci  
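
Once the server is running, the effective server-level settings can be checked from any client session. A minimal sketch using the character_set_server and collation_server system variables:

```sql
-- Server-level defaults currently in effect
SELECT @@character_set_server AS server_charset,
       @@collation_server     AS server_collation;

-- Related character set and collation system variables
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
```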

Setting Character Set and Collation at Database Level

We can set the character set and collation at the database level when creating a database with CREATE DATABASE, or change them later with the ALTER DATABASE statement.

NOTE: If we don't specify the Character set and collation during database creation, it will take the default server-level configuration of the Character set and collation.

Below is the syntax of setting Character set and collation while creating a database:

CREATE DATABASE database_name  CHARACTER SET character_set_name COLLATE collation_name;  


Below is the syntax of setting Character set and collation while altering a database:

ALTER DATABASE database_name  CHARACTER SET character_set_name COLLATE collation_name;  


NOTE: MySQL uses the character set and collation set at the database level as the default for all tables created within the database, unless a table specifies its own.

Example

To specify utf8 as the character set and utf8_unicode_ci as the collation at the database level, use the following:

CREATE DATABASE codingNinjas CHARACTER SET utf8 COLLATE utf8_unicode_ci;  
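
To confirm what the database ended up with, or to change it later, something like the following can be used. It reuses the codingNinjas database from the example; the switch to utf8mb4 and utf8mb4_unicode_ci is only an illustrative choice.

```sql
-- Shows the CREATE DATABASE statement, including the default character set
SHOW CREATE DATABASE codingNinjas;

-- After USE, these variables reflect the defaults of the selected database
USE codingNinjas;
SELECT @@character_set_database, @@collation_database;

-- Change the defaults later; this affects tables created afterwards,
-- while existing tables keep their current settings
ALTER DATABASE codingNinjas CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```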

Setting Character Set and Collation at Table Level

Each table in a database has its own character set and collation. If they are not specified for a table, the default character set and collation of the database are used.

We can set the character set and collation for a table when creating it with CREATE TABLE, or change them later with the ALTER TABLE statement.

Below is the syntax of setting Character set and collation while creating a table:

CREATE TABLE table_name(  
… )  CHARACTER SET character_set_name  COLLATE collation_name  ;


Below is the syntax of setting Character set and collation while altering a table:

ALTER TABLE table_name  CHARACTER SET character_set_name  COLLATE collation_name  ;


Example

To specify utf8 as the character set and utf8_unicode_ci as the collation at the table level, use the following:

USE codingNinjas;   
CREATE TABLE table1( Column1 VARCHAR(25) ) CHARACTER SET utf8 COLLATE utf8_unicode_ci;  
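
The statements below sketch how to inspect and later change the table-level settings of table1 from the example. The target character set utf8mb4 and collation utf8mb4_unicode_ci are illustrative choices. Note the difference between changing only the table default and converting the existing columns.

```sql
USE codingNinjas;

-- Shows the full table definition, including its default character set
SHOW CREATE TABLE table1;

-- Changes only the table default, which applies to columns added later
ALTER TABLE table1 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Changes the table default and converts existing character columns too
ALTER TABLE table1 CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```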

Setting Character Set and Collation at Column Level

A column can be of any type, but a character set and collation apply only to string types such as CHAR, VARCHAR, or TEXT, not to numeric types such as INT. We can give a string column its own character set and collation, different from the table's default settings.

We can define the column's character set and collation in the column's definition using the CREATE TABLE or ALTER TABLE statements.

Below is the syntax for setting the Character set and collation of a column while creating a table:

CREATE TABLE table_name(column_name column_type(length)  CHARACTER SET character_set_name  COLLATE collation_name );


Below is the syntax for setting the Character set and collation of a column using alter table:

ALTER  TABLE table_name MODIFY column_name column_type(length)  CHARACTER SET character_set_name  COLLATE collation_name;


Example

To specify utf8 as the character set and utf8_unicode_ci as the collation at the column level, use the following:

USE codingNinjas;   
CREATE TABLE table2(  Column1 VARCHAR(25) CHARACTER SET utf8 COLLATE utf8_unicode_ci );
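
A slightly fuller sketch of column-level settings follows. The table collation_demo and its columns are hypothetical names used only for illustration, and the utf8mb4 collations are example choices.

```sql
USE codingNinjas;

-- Hypothetical table: the table default is utf8mb4_unicode_ci,
-- but the code column overrides it with a binary collation
CREATE TABLE collation_demo (
  name VARCHAR(25),
  code VARCHAR(25) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Change a column's collation after the table exists
ALTER TABLE collation_demo
  MODIFY name VARCHAR(25) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- Inspect the per-column settings (see the Collation column)
SHOW FULL COLUMNS FROM collation_demo;
```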

Key Points while Setting Character Set and Collation

The following are the key points or rules for setting the character set and collation:

  • We can set both the character set and the collation in MySQL explicitly
     
  • If we only specify the character set, MySQL uses the character set's default collation
     
  • If we only specify the collation, MySQL uses the character set that the collation belongs to
     
  • If we do not specify the character set and collation, MySQL uses the defaults of the enclosing level
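
The rules above can be tried out with a few throwaway tables. This is a sketch only: the table names ending in _demo are hypothetical, and latin1 and utf8mb4_bin are example values.

```sql
USE codingNinjas;

-- Only the character set is given: MySQL picks its default collation
-- (latin1_swedish_ci for latin1)
CREATE TABLE charset_only_demo (name VARCHAR(20)) CHARACTER SET latin1;

-- Only the collation is given: MySQL uses the character set that the
-- collation belongs to (utf8mb4 for utf8mb4_bin)
CREATE TABLE collation_only_demo (name VARCHAR(20)) COLLATE utf8mb4_bin;

-- Neither is given: the table inherits the database defaults
CREATE TABLE inherit_demo (name VARCHAR(20));

-- The Collation column shows what each table ended up with
SHOW TABLE STATUS LIKE '%demo';
```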

Frequently Asked Questions

What is a collation in the database?

A collation in a database is a set of rules that determines how data is sorted and compared. It applies to text data (strings). Each collation belongs to a specific character set.

What is collation in MySQL used for?

Collations in MySQL are used to determine the rules for sorting and comparing character strings. They make it possible to sort and compare text values correctly in different languages, since different languages have different character sets and collations.

Can we set collation at table level?

Yes, we can set the collation for a table. We can either set it when creating the table with CREATE TABLE or change it later with ALTER TABLE.

What is the default collation in MySQL?

MySQL 5.7 and earlier use latin1 and latin1_swedish_ci as the default character set and default collation respectively. MySQL 8.0 and later use utf8mb4 and utf8mb4_0900_ai_ci.

Conclusion

In this article, we learned about collations in MySQL, which are used to sort and compare character strings. We also learned how to define character sets and collations at the server, database, table, and column levels.
