Introduction
YAML is a human-readable data serialization format that is commonly used in configuration files and for exchanging data between systems. You can parse YAML in most programming languages using built-in or third-party libraries.
In this article, you will learn how to read YAML from a file using Python.
Let’s get started.
What is YAML?
YAML stands for “YAML Ain’t Markup Language,” a human-readable data serialization format commonly used for data exchange, scripting, and configuration files. There are some alternatives to YAML as well, such as JSON and XML; in the next section, we will compare them.
Differences between JSON, XML, and YAML
Let’s take a look at some of the key differences between JSON, XML, and YAML.
| Feature | YAML | XML | JSON |
|---|---|---|---|
| Readability | Very readable, human-friendly syntax. | Verbose and less readable because of tags. | Readable, but less concise than YAML. |
| Comments | Supports comments with #, on their own line or at the end of a line. | Supports comments with <!-- -->, though they can clutter the document. | Does not support comments. |
| Metadata | Can include metadata such as tags and directives within the document. | Can carry metadata in attributes and processing instructions. | No dedicated syntax for metadata. |
| Schema Validation | Limited built-in schema validation. | Can use DTD, XSD, or other validation tools. | No built-in schema validation; JSON Schema is available as a separate standard. |
| Data Types | Supports a wide range of native data types. | No native data types; all content is text and requires additional parsing. | Limited native data types (string, number, boolean, null, array, object); others require additional parsing. |
Now we will look at the same data represented in all three formats to illustrate the differences between them.
YAML
person:
  name: John Doe
  age: 30
  email: john@example.com
  phone: 123-456-7890
As you can see, YAML is significantly easier to read in comparison to XML and JSON.
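For comparison, the JSON and XML forms of the same record can be generated with Python's standard library. This is only a minimal sketch: the helper names to_json and to_xml are illustrative, and the XML layout (one child element per field) is an assumption, since XML allows several reasonable ways to encode the same data.

```python
import json
import xml.etree.ElementTree as ET

# The same record as the YAML sample above, written as a Python dictionary.
person = {
    "person": {
        "name": "John Doe",
        "age": 30,
        "email": "john@example.com",
        "phone": "123-456-7890",
    }
}


def to_json(data):
    """Render the record as indented JSON text."""
    return json.dumps(data, indent=2)


def to_xml(data):
    """Render the record as XML, using one child element per field."""
    root_name, fields = next(iter(data.items()))
    root = ET.Element(root_name)
    for key, value in fields.items():
        # XML has no native numbers, so every value is stored as text.
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(person))
print(to_xml(person))
```

Note that the XML version stores the age as text, so a reader of that document has to convert it back to a number.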
In the following sections, you will learn how to read YAML data in Python using PyYAML.
Installing PyYAML
We must use a third-party library called PyYAML as Python lacks built-in support for YAML. This library lets you parse YAML files and convert them into Python data structures.
You can install PyYAML with pip or pip3 using one of the following commands:
Bash
pip3 install pyyaml
pip install pyyaml
On successful installation, pip reports that PyYAML was installed successfully.
On Linux distributions, you can instead install PyYAML through your system package manager; on Debian and Ubuntu, for example, the package is named python3-yaml.
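A quick way to confirm that the installation worked is to import the package and print its version. This is a minimal sketch; the function name installed_pyyaml_version is illustrative and not part of PyYAML.

```python
import yaml


def installed_pyyaml_version():
    """Return the version string of the PyYAML package that Python imports."""
    return yaml.__version__


print("PyYAML version:", installed_pyyaml_version())
```

If the import fails with ModuleNotFoundError, PyYAML was most likely installed for a different Python interpreter than the one running the code.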
Before getting started, we will create a demo YAML file.
Creating a YAML File
YAML files use the “.yaml” or “.yml” extension; create a “data.yaml” file and add the following content.
YAML
name: John Doe
age: 30
email: johndoe@example.com
interests:
  - Reading
  - Hiking
  - Cooking
For this example, we have used Python notebooks in VS Code. You may use other software such as Jupyter. Also, make sure to set the Python kernel correctly.
Reading YAML
Create a notebook cell and paste the following code inside it.
Python
import yaml

# Open the YAML file and parse it into Python data structures
with open('data.yaml', 'r') as yaml_file:
    yaml_data = yaml.safe_load(yaml_file)

print(yaml_data)
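In the code snippet above, we have used the safe_load function, which is equivalent to calling load with the SafeLoader, to convert the contents of our YAML file into the corresponding Python data structures. When you run the notebook cell with the demo file shown earlier, you should see the following output.
{'name': 'John Doe', 'age': 30, 'email': 'johndoe@example.com', 'interests': ['Reading', 'Hiking', 'Cooking']}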
Always ensure that your YAML files are correctly formatted and indented, as YAML relies on it for structuring the data.
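To see what happens when the indentation is wrong, the sketch below parses one well-formed and one badly indented document and catches the resulting error. The helper name parse_or_report and both sample documents are illustrative; yaml.YAMLError is the base class of the errors PyYAML raises while parsing.

```python
import yaml

GOOD_YAML = """
person:
  name: John Doe
  age: 30
"""

# The last line is indented by one space instead of two, so it lines up
# with neither "person" nor "name".
BAD_YAML = """
person:
  name: John Doe
 age: 30
"""


def parse_or_report(text):
    """Return (data, None) on success or (None, error message) on failure."""
    try:
        return yaml.safe_load(text), None
    except yaml.YAMLError as exc:
        return None, str(exc)


data, error = parse_or_report(GOOD_YAML)
print("Parsed:", data)

data, error = parse_or_report(BAD_YAML)
print("Could not parse YAML:", error)
```

Catching yaml.YAMLError rather than a bare Exception keeps unrelated problems, such as a missing file, from being reported as YAML syntax errors.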
Types of Loaders Supported by the load() Function
The load() function takes a Loader argument that selects how the document is parsed. The available loaders include the following:
SafeLoader: This loader only recognizes standard YAML tags and cannot construct arbitrary objects, making it safe to use with untrusted data.
BaseLoader: This loader can only construct basic Python objects such as lists, dictionaries, and Unicode strings.
Loader: This loader supports all predefined tags and can construct an arbitrary Python object, making it unsafe to use with untrusted data.
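The difference between the loaders is easiest to see by parsing the same text twice. The following sketch compares SafeLoader and BaseLoader on a small sample document invented for this illustration.

```python
import yaml

# A small sample document used only for this comparison.
document = "name: John Doe\nage: 30\nsubscribed: true\n"

# SafeLoader converts scalars to the matching Python types.
safe = yaml.load(document, Loader=yaml.SafeLoader)

# BaseLoader keeps every scalar as a string.
base = yaml.load(document, Loader=yaml.BaseLoader)

print(safe)
print(base)
```

With SafeLoader, age becomes an int and subscribed becomes a bool; with BaseLoader, both remain strings, so any conversion is left to your own code.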
In the next section, you will learn about the basics of YAML.
Basics of YAML
The following are some basics of YAML:
Indentation
YAML uses indentation to indicate nested elements. Indentation must be made of spaces, not tabs, and it is important to use it consistently because it determines the structure of your data.
Scalar
Scalars are single values such as strings, numbers, or booleans. Strings are usually written without quotation marks; however, you can quote a value when it would otherwise be read as a different type or contains special characters.
Example
YAML
name: John Doe
age: 30
email: johndoe@example.com
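Quoting matters because PyYAML infers the type of an unquoted scalar from its content. The sketch below, using keys made up for this illustration, loads the same values with and without quotes.

```python
import yaml

# Keys are illustrative; each pair shows a value without and with quotes.
document = """
age: 30
age_text: "30"
active: yes
answer: "yes"
"""

values = yaml.safe_load(document)

for key, value in values.items():
    # Show each value together with the Python type it was converted to.
    print(key, "->", repr(value), type(value).__name__)
```

Here the unquoted 30 is loaded as an int and the unquoted yes as a bool, while the quoted versions stay strings.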
List
List elements are represented by hyphens followed by a space. Each element can be of different types, such as scalars or other data structures.
Example
YAML
interests:
  - Reading
  - Hiking
  - Cooking
Dictionary
This data structure is used for storing data in the form of key-value pairs.
Example
YAML
person:
  name: John Doe
  age: 30
Comment
You can create comments in a YAML file using the ‘#’ symbol, similar to Python.
Example
YAML
name: John Doe  # This is a comment
age: 30
email: johndoe@example.com
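The building blocks above map directly onto Python types when loaded with PyYAML. The following sketch combines them in one sample document (assembled here from the earlier examples) and inspects the result.

```python
import yaml

document = """
name: John Doe  # comments are discarded by the parser
age: 30
interests:
  - Reading
  - Hiking
  - Cooking
person:
  name: John Doe
  age: 30
"""

data = yaml.safe_load(document)

print(type(data["name"]).__name__, data["name"])            # scalar string
print(type(data["age"]).__name__, data["age"])              # scalar number
print(type(data["interests"]).__name__, data["interests"])  # YAML list
print(type(data["person"]).__name__, data["person"])        # YAML dictionary
```

Scalars become str or int values, the list becomes a Python list, the nested mapping becomes a dict, and the comment does not appear in the loaded data at all.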
The following section will teach you how to read key-value pairs from a YAML file.
Reading Values and Keys From a YAML File
Replace the contents of the data.yaml file we created earlier with the following YAML.
YAML
- name: John Doe
  age: 30
  email: john@example.com
  address:
    street: 123 Main St
    city: Anytown
    country: USA
- name: Jane Smith
  age: 25
  email: jane@example.com
  address:
    street: 456 Elm St
    city: Another City
    country: Canada
- name: Alex Johnson
  age: 28
  email: alex@example.com
  address:
    street: 789 Oak Rd
    city: Someplace
    country: Australia
Create a new cell in your Python notebook and paste the following code.
Python
import yaml

with open('data.yaml', 'r') as yaml_file:
    yaml_data = yaml.load(yaml_file, Loader=yaml.SafeLoader)

# Each top-level item is a dictionary describing one person
for obj in yaml_data:
    for key, value in obj.items():
        print(key, ":", value)
    print()  # blank line between records
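In the code snippet above, the outer loop iterates over the list of records, and the inner loop iterates over the key-value pairs of each record. When you run the notebook cell, each record is printed one key-value pair per line, with the nested address shown as a dictionary and a blank line between records.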
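Nested values can also be read directly by chaining keys instead of looping over every pair. The sketch below uses a shortened inline copy of the records so that it runs without a file; the helper name cities_by_name is illustrative.

```python
import yaml

# A shortened copy of the records above, kept inline for this sketch.
document = """
- name: John Doe
  address:
    city: Anytown
    country: USA
- name: Jane Smith
  address:
    city: Another City
    country: Canada
"""

people = yaml.safe_load(document)


def cities_by_name(records):
    """Map each person's name to the city stored in the nested address."""
    return {record["name"]: record["address"]["city"] for record in records}


print(cities_by_name(people))
print(people[0]["address"]["country"])
```

Indexing with square brackets raises KeyError when a key is missing; record.get("address", {}) is a common alternative when some records may be incomplete.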
Frequently Asked Questions
What is data serialization?
Data serialization is the process of converting complex data structures into a format that can be easily stored or transmitted and later reconstructed. Some common data serialization formats are JSON, XML, and YAML.
Is YAML better than JSON?
YAML is better than JSON in terms of readability and simpler syntax. However, JSON has wider support across different programming languages due to its consistency and strict specification.
How to define types in YAML?
In YAML, data types are inferred from the content and structure of your data. However, you can explicitly set a type by prefixing a value with ‘!!’ followed by a type name such as int, float, str, or bool.
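As a small illustration of explicit tags with PyYAML, the sketch below loads one inferred value and three tagged ones. The key names are made up for this example.

```python
import yaml

# Key names are illustrative; the tags override the inferred type.
document = """
inferred: 30
as_text: !!str 30
as_float: !!float 3
as_int: !!int "42"
"""

values = yaml.safe_load(document)

for key, value in values.items():
    # Show each value together with the Python type it was converted to.
    print(key, "->", repr(value), type(value).__name__)
```

The !!str tag keeps 30 as a string, !!float turns 3 into a float, and !!int converts the quoted "42" into an integer.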
Conclusion
In this article, you learned how to read data from a YAML file using Python, some basics of YAML, and some differences between YAML, JSON, and XML.