Table of contents
1. Introduction
2. Reference Counting
3. Example 2: Reference Counting with Cyclic Reference
4. Example 3: Using the sys.getrefcount() function
5. Garbage Collection
6. Generational Garbage Collection
7. Automatic Garbage Collection of Cycles
8. Manual Garbage Collection
   8.1. Triggering garbage collection
   8.2. Disabling and re-enabling garbage collection
   8.3. Checking the number of objects in each generation
9. Forced Garbage Collection
10. Disabling Garbage Collection
11. Interacting with Python Garbage Collector
   11.1. Enabling and disabling the garbage collector
   11.2. Forcing garbage collection
   11.3. Inspecting garbage collector settings
   11.4. Setting garbage collector thresholds
12. Advantages and Disadvantages
   12.1. Advantages
   12.2. Disadvantages
13. Frequently Asked Questions
   13.1. Can I completely disable garbage collection in Python?
   13.2. How often does the garbage collector run in Python?
   13.3. Can I force garbage collection to run at a specific point in my code?
14. Conclusion
Last Updated: Jul 22, 2024

Python Garbage Collection

Author Gaurav Gandhi

Introduction

Python is a powerful programming language used by developers worldwide. One of the key features that makes Python so efficient is its automatic memory management. Python uses a technique called garbage collection to free up memory that is no longer being used by the program. 

In this article, we will explore what garbage collection is, how it works in Python, and some examples to illustrate its usage. We will cover topics such as reference counting, generational garbage collection, and manual garbage collection. 

Reference Counting

In Python, every object has a reference count, which keeps track of how many references point to that object. When an object's reference count drops to zero, it means that the object is no longer being used by the program, and Python automatically frees up the memory occupied by that object. This process is known as reference counting.

For example:

# Example 1: Simple Reference Counting
a = [10, 20, 30]  # Creates a list object; 'a' is the first reference to it
b = a  # Creates a second reference to the same list
print(a)
print(b)

del a  # Removes the name 'a'; the list survives because 'b' still refers to it
print(b)

del b  # Removes the last reference; the count drops to zero and the list is freed

Output

[10, 20, 30]
[10, 20, 30]
[10, 20, 30]


In this example, we create a list and assign it to the variable `a`. We then create another reference `b` that points to the same object. When we delete the reference `a` using the `del` keyword, the object still exists because it has another reference `b` pointing to it. However, when we delete the reference `b`, the object's reference count drops to zero, and CPython frees the memory occupied by the object immediately, without waiting for the cyclic garbage collector. A list is used here rather than a small integer such as 10 because CPython caches small integers and never frees them, which would hide the effect.
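One way to watch this happen is to attach a weak reference, which does not raise the reference count, and let its callback record the moment the object is freed. The sketch below is illustrative only: the `Payload` class and the `events` list are names invented for this example, and the immediate callback relies on CPython's reference counting.

```python
import weakref


class Payload:
    """Hypothetical placeholder class; plain ints and lists cannot be weakly referenced."""


events = []                      # filled in by the weak reference callback

first = Payload()                # one reference to the new object
second = first                   # a second reference to the same object
watcher = weakref.ref(first, lambda ref: events.append("freed"))

del first                        # one reference remains, nothing is freed
events_after_first_del = list(events)

del second                       # last reference gone, CPython frees the object now
events_after_second_del = list(events)

print(events_after_first_del)    # []
print(events_after_second_del)   # ['freed']
```

The callback fires during the second `del`, not at some later collection, which is the behavior reference counting is meant to provide.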

Example 2: Reference Counting with Cyclic Reference

In some cases, objects can have cyclic references, meaning they refer to each other, creating a loop. Reference counting alone cannot handle cyclic references, and that's where Python's garbage collector comes into play. Here's an example:

# Example 2: Reference Counting with Cyclic Reference

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
# Create nodes with cyclic reference
node1 = Node(1)
node2 = Node(2)
node1.next = node2
node2.next = node1
# Delete references
del node1
del node2


In this example, we define a `Node` class that represents a node in a linked list. Each node has a `data` attribute and a `next` attribute pointing to the next node. We create two nodes, `node1` and `node2`, and make them refer to each other, creating a cyclic reference.

When we delete the references `node1` and `node2`, the objects still have references to each other, preventing them from being garbage collected through reference counting alone. Python's garbage collector identifies such cyclic references and frees up the memory accordingly.
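The behavior described above can be checked with a weak reference used as a probe. This is a minimal sketch, assuming CPython; the `probe` variable and the temporary `gc.disable()` call (used only so that an automatic collection cannot interfere with the demonstration) are additions for illustration.

```python
import gc
import weakref


class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


gc.disable()                     # keep automatic collections out of the demo

node1 = Node(1)
node2 = Node(2)
node1.next = node2
node2.next = node1               # the two nodes now form a cycle
probe = weakref.ref(node1)       # lets us test whether node1 still exists

del node1
del node2
alive_after_del = probe() is not None       # the cycle keeps both nodes alive

found = gc.collect()                        # run the cyclic collector by hand
alive_after_collect = probe() is not None   # the cycle has been reclaimed

gc.enable()
print(alive_after_del, alive_after_collect)  # True False
```

`found` holds the number of unreachable objects the collector discovered; its exact value depends on the Python version, so the sketch does not print it.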

Example 3: Using the sys.getrefcount() function

 

The `sys` module provides a function called `sys.getrefcount()` that returns the reference count of an object. Here's an example:

import sys

# Example 3: Using the sys.getrefcount() function
a = [1, 2, 3]  # A fresh list, so nothing else holds a reference to it
print(sys.getrefcount(a))

b = a  # Second reference to the same list
print(sys.getrefcount(a))

del b  # Back to a single named reference
print(sys.getrefcount(a))

Output (typical CPython result; exact values are an implementation detail and can differ between versions)

2
3
2


In this example, we import the `sys` module to use the `getrefcount()` function. We create a list, bind it to the variable `a`, and print its reference count using `sys.getrefcount(a)`. The initial reference count is 2 because the object is referenced by the variable `a` and also, temporarily, by the argument passed to `getrefcount()`. A fresh list is used because shared objects such as small integers are referenced from many places inside the interpreter and report much larger counts.

Next, we create another reference `b` pointing to the same object as `a`. When we print the reference count again, it increments to 3 because now the object is referenced by `a`, `b`, and the argument passed to `getrefcount()`.

Finally, we delete the reference `b`, and the reference count decreases back to 2.

Note that `sys.getrefcount()` is helpful for learning and debugging, but the value it reports is a CPython implementation detail that always includes the temporary reference held by the call itself, so production code should not depend on it.
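Since the absolute number is implementation-dependent, comparing counts before and after an operation tends to be more informative than reading a single value. The following sketch assumes CPython; `data` and `holder` are names made up for the illustration.

```python
import sys

data = ["payload"]                    # a fresh list object
before = sys.getrefcount(data)

holder = {"key": data}                # the dict now holds one more reference
after_store = sys.getrefcount(data)

holder.clear()                        # dropping the dict entry releases it again
after_clear = sys.getrefcount(data)

print(after_store - before)           # 1
print(after_clear - before)           # 0
```

Working with differences sidesteps the extra temporary reference, because it is included in every measurement.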

Garbage Collection

In addition to reference counting, Python employs a more sophisticated garbage collection mechanism to handle objects that are no longer reachable but may still occupy memory. This is particularly useful in situations where objects have cyclic references or are part of complex data structures.

Python's garbage collector is an automatic memory management system that periodically identifies and frees up memory occupied by objects that are no longer in use. It supplements the reference counting mechanism to ensure efficient memory utilization.

The garbage collector in Python is based on the concept of reachability. An object is considered reachable if there is at least one reference path from a root object (such as a global variable or a local variable in the current scope) to that object. An object is unreachable when no such path exists, even if other unreachable objects still refer to it, as happens in a reference cycle. The garbage collector treats unreachable objects as eligible for collection and frees the memory associated with them.

In CPython, the collector does not run in a separate background thread. A collection is triggered automatically when the number of tracked allocations since the previous collection exceeds a threshold, and its behavior can be controlled and fine-tuned through the `gc` module, which we will explore in the subsequent sections.
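The cyclic collector only needs to watch objects that can hold references to other objects, since only those can take part in a cycle. A small sketch using `gc.is_tracked()` shows the distinction; the `Node` class is a hypothetical stand-in, and the results reflect CPython behavior.

```python
import gc


class Node:
    """Hypothetical container-like class used only for this check."""


tracked = {
    "int": gc.is_tracked(42),           # atomic value, cannot form a cycle
    "str": gc.is_tracked("text"),       # atomic value, cannot form a cycle
    "list": gc.is_tracked([]),          # container, may reference other objects
    "instance": gc.is_tracked(Node()),  # instances can hold references too
}

for kind, is_tracked in tracked.items():
    print(f"{kind}: {is_tracked}")
```

Untracked objects are still freed by reference counting; they are simply skipped by the cycle detector.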

Generational Garbage Collection

Python's garbage collector uses a technique called generational garbage collection to optimize the collection process. The idea behind generational garbage collection is that most objects have a short lifespan and are collected quickly, while a smaller number of objects survive longer.

Python divides objects into three generations based on their age and how many garbage collection cycles they have survived:

1. Generation 0 (Young Generation): This is where new objects are allocated. The garbage collector frequently checks and collects objects in this generation.

2. Generation 1 (Middle Generation): Objects that survive the garbage collection in Generation 0 are moved to Generation 1. The garbage collector checks this generation less frequently than Generation 0.

3. Generation 2 (Old Generation): Objects that survive the garbage collection in Generation 1 are moved to Generation 2. The garbage collector checks this generation even less frequently than Generation 1.

The garbage collector focuses its efforts on the younger generations, as they tend to contain a larger number of short-lived objects. By collecting and freeing up memory in the younger generations more frequently, the garbage collector can operate efficiently and minimize the impact on the program's performance.

When the garbage collector runs, it first checks the objects in Generation 0. If an object survives the collection, it is moved to Generation 1. Similarly, if an object survives the collection in Generation 1, it is moved to Generation 2. This process continues until an object is no longer reachable and can be safely collected.
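The per-generation activity can be inspected with `gc.get_stats()`, which returns one dictionary per generation. The sketch below is a rough illustration, not a benchmark; `collections_per_generation` is a helper name invented here, and the numbers printed will differ from run to run.

```python
import gc


def collections_per_generation():
    """Return how many times each generation has been collected so far."""
    return [entry["collections"] for entry in gc.get_stats()]


before = collections_per_generation()
gc.collect()                          # request a full collection
after = collections_per_generation()

print("before:", before)
print("after: ", after)
```

In a long-running program the first entry is normally far larger than the last, which reflects the collector spending most of its effort on young objects.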

Automatic Garbage Collection of Cycles

Python's garbage collector is capable of automatically detecting and collecting objects that are involved in reference cycles. A reference cycle occurs when a group of objects refer to each other, forming a circular reference pattern, but are unreachable from any root object.

The following example creates a reference cycle:

class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
# Create a reference cycle
node1 = Node(1)
node2 = Node(2)
node1.next = node2
node2.next = node1
# Delete the references
del node1
del node2

 

In this example, `node1` and `node2` refer to each other, creating a reference cycle. Even after deleting the references `node1` and `node2`, the objects themselves still exist in memory because they have a circular reference.

Python's garbage collector identifies such reference cycles and automatically collects the objects involved. CPython's cyclic collector does this without tracing from the roots: for every tracked container object it takes a copy of the reference count and subtracts the references that come from other tracked containers. Objects whose adjusted count stays above zero are referenced from outside the group, so they and everything reachable from them are kept.

Whatever is left can only be reached from inside the group and is therefore unreachable garbage. The collector breaks the cycles among these objects and frees the memory they occupy.
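To see the automatic collector at work without ever calling `gc.collect()`, one can create many small cycles and check afterwards how many of them have disappeared. This is a hedged sketch: the `Node` class, the `partner` attribute, and the loop size of 20,000 are arbitrary choices for the illustration, and the exact number reclaimed depends on the Python version and its thresholds.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.partner = None


gc.enable()                      # make sure automatic collection is on
probes = []

for _ in range(20_000):
    left, right = Node(), Node()
    left.partner, right.partner = right, left   # a two-object cycle
    probes.append(weakref.ref(left))            # probe does not keep 'left' alive
    # on the next iteration the names are rebound and the cycle becomes garbage

reclaimed = sum(1 for probe in probes if probe() is None)
print(f"{reclaimed} of {len(probes)} cycles were reclaimed automatically")
```

The most recent cycles are usually still alive at the end, either because `left` and `right` still name the last pair or because the next automatic collection has not been triggered yet.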

Manual Garbage Collection

Although Python's garbage collector automatically handles memory management in most cases, there may be situations where you want to manually trigger garbage collection or have more control over the garbage collection process. Python provides a way to manually initiate garbage collection through the `gc` module.

Let’s look at a few examples of how you can interact with the garbage collector manually:

1. Triggering garbage collection

import gc
# Manually trigger garbage collection
gc.collect()

 

By calling `gc.collect()`, you explicitly request a full collection of all generations. The call returns the number of unreachable objects it found.

2. Disabling and re-enabling garbage collection

import gc
# Disable automatic garbage collection
gc.disable()
# Perform some memory-intensive operations
# ...
# Re-enable automatic garbage collection
gc.enable()

 

You can use `gc.disable()` to temporarily turn off the automatic garbage collection, perform some memory-intensive operations, and then re-enable it with `gc.enable()`. This can be useful in scenarios where you want to have fine-grained control over when garbage collection occurs.

3. Checking the number of objects in each generation

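import gc

# Get the current collection counts for the three generations
generation_counts = gc.get_count()
print(generation_counts)

Output (the actual numbers vary from run to run):

(count0, count1, count2)


The `gc.get_count()` function returns the current collection counts as a tuple, not the number of live objects in each generation. In CPython's classic three-generation collector, `count0` is the number of tracked allocations minus deallocations since generation 0 was last collected, `count1` is the number of generation 0 collections since generation 1 was last collected, and `count2` is the number of generation 1 collections since generation 2 was last collected. Comparing these values with `gc.get_threshold()` shows how close each generation is to its next automatic collection.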
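The first value returned by `gc.get_count()` is driven by allocations of tracked objects, which the following sketch tries to make visible. It assumes CPython's default build; the figure of 500 lists is arbitrary, and the collector is paused only so that an automatic collection cannot reset the counter in the middle of the measurement.

```python
import gc

gc.disable()                          # pause automatic collections for the demo
gc.collect()                          # start from freshly reset counters
count_before = gc.get_count()[0]

kept = [[] for _ in range(500)]       # 500 new tracked lists, all still referenced

count_after = gc.get_count()[0]
growth = count_after - count_before
gc.enable()

print(f"count0 grew by {growth} after allocating {len(kept)} lists")
```

The growth is roughly the number of lists created, because each tracked allocation increments the counter and nothing was deallocated in between.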

Manual garbage collection should be used sparingly and only when necessary, as the automatic garbage collector is usually efficient in handling memory management. However, in certain cases, such as when dealing with large datasets or memory-intensive operations, manual control over garbage collection can be beneficial.

Forced Garbage Collection

In addition to manual garbage collection, you can force a collection at a chosen point so that unreachable objects are reclaimed right away instead of at the next automatic run. This can be useful in scenarios where you have objects with `__del__` methods or objects that hold external resources that need to be explicitly released.

To force garbage collection of an object, you can use the `gc.collect()` function along with the `del` statement. Here's an example:

import gc
class Resource:
    def __init__(self, name):
        self.name = name
        print(f"Resource {self.name} acquired")
    
    def __del__(self):
        print(f"Resource {self.name} released")
# Create a resource object
resource = Resource("My Resource")
# Perform some operations with the resource
# ...
# Drop the last reference; CPython calls __del__ at this point
del resource
# Only has an effect when unreachable reference cycles exist
gc.collect()

 

In this example, we define a `Resource` class that represents an object holding an external resource. The `__init__` method is called when the object is created, and the `__del__` method is called just before the object is destroyed.

After creating the `resource` object and performing some operations with it, we want to explicitly release the resource. We use the `del` statement to delete the reference to the `resource` object. Because no other reference exists and the object is not part of a cycle, CPython's reference counting destroys it immediately and `__del__` runs during the `del` statement. The `gc.collect()` call that follows has nothing left to do for this object.

An explicit `gc.collect()` matters when the object is caught in a reference cycle. In that case `del` leaves the reference count above zero, and `__del__` runs only once the cyclic collector reclaims the cycle.

It's important to note that forcing garbage collection should be used judiciously and only when necessary. In most cases, Python's automatic garbage collection mechanism is sufficient, and explicitly forcing garbage collection can introduce unnecessary overhead.
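The difference between an object freed by reference counting and one that needs the cyclic collector can be shown side by side. The sketch below adapts the `Resource` class with a hypothetical `owner` attribute, used only to create a self-reference, and records releases in a `released` list instead of printing; it assumes CPython and pauses the automatic collector so the timing is deterministic.

```python
import gc

released = []                         # names of resources whose __del__ has run


class Resource:
    def __init__(self, name):
        self.name = name
        self.owner = None             # hypothetical attribute used to build a cycle

    def __del__(self):
        released.append(self.name)


gc.disable()                          # keep automatic collections out of the demo

plain = Resource("plain")
del plain                             # no cycle: freed by reference counting
after_plain_del = list(released)

cyclic = Resource("cyclic")
cyclic.owner = cyclic                 # the object now refers to itself
del cyclic                            # the self-reference keeps it alive
after_cyclic_del = list(released)

gc.collect()                          # the cyclic collector reclaims it
after_collect = list(released)

gc.enable()
print(after_plain_del)                # ['plain']
print(after_cyclic_del)               # ['plain']
print(after_collect)                  # ['plain', 'cyclic']
```

Only the second object needed `gc.collect()`; for the first one, `del` alone was enough.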

Disabling Garbage Collection

In some situations, you may want to temporarily disable garbage collection to optimize performance or to have more control over memory management. Python allows you to disable the garbage collector using the `gc.disable()` function.

Let’s see an example of how you can disable garbage collection:

import gc
# Disable garbage collection
gc.disable()
# Perform memory-intensive operations
# ...
# Re-enable garbage collection
gc.enable()

 

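When you call `gc.disable()`, automatic cyclic garbage collection is turned off. Reference counting keeps working, so objects whose count reaches zero are still freed immediately; only unreachable reference cycles are left uncollected until you call `gc.collect()` or re-enable the collector.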

After disabling garbage collection, you can perform memory-intensive operations or tasks that require fine-grained control over memory management. However, it's important to remember that disabling garbage collection can lead to increased memory usage and potential memory leaks if not used carefully.

Once you have completed the memory-intensive operations, it's crucial to re-enable garbage collection using `gc.enable()`. This will resume the automatic garbage collection process and ensure that memory is properly managed.

Disabling garbage collection should be used sparingly and only when absolutely necessary. It's generally recommended to rely on Python's automatic garbage collection mechanism for most cases, as it is efficient and handles memory management effectively.
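Forgetting to re-enable the collector, for instance when an exception interrupts the work, is an easy mistake to make. One possible safeguard is a small context manager that restores the previous state on exit. The helper name `gc_paused` is invented for this sketch and is not part of the `gc` module.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Hypothetical helper: pause the cyclic collector and restore its prior state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:               # do not enable it if the caller had it off
            gc.enable()


gc.enable()
with gc_paused():
    enabled_inside = gc.isenabled()   # False while the block runs
    batch = [[i] for i in range(1_000)]

enabled_after = gc.isenabled()        # True again, even if the block had raised
print(enabled_inside, enabled_after)  # False True
```

Checking `gc.isenabled()` first means the helper also behaves sensibly when it is used inside code that has already disabled the collector.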

Here's an example that demonstrates the effect of disabling garbage collection:

import gc

# Disable automatic (cyclic) garbage collection
gc.disable()

# Create a large number of container objects
objects = [list(range(1000)) for _ in range(1000)]

# Check the collection counts; count0 grows because no collection has run
print(gc.get_count())

# Re-enable garbage collection
gc.enable()

# Trigger a full collection, which resets the counts
gc.collect()

# Check the collection counts again
print(gc.get_count())

Output (illustrative; exact values depend on the Python version and on what else has been allocated):

(count0, count1, count2), with count0 greater than 1000
(0, 0, 0), or values close to it

 

In this example, we disable garbage collection and create 1,000 lists. Each new list is a tracked allocation, so the first value reported by `gc.get_count()` grows by roughly 1,000 without triggering a collection. After re-enabling garbage collection and calling `gc.collect()`, the counts are reset to zero or close to it. Note that the lists themselves are not freed, because `objects` still refers to them; the counts measure allocations since the last collection, not live objects.

Interacting with Python Garbage Collector

Python provides several functions in the `gc` module that allow you to interact with the garbage collector and control its behavior. Let's look at some of these interactions:

1. Enabling and disabling the garbage collector

   - `gc.enable()`: Enables automatic garbage collection (default).

   - `gc.disable()`: Disables automatic garbage collection.

Example:

   import gc

   # Disable garbage collection
   gc.disable()

   # Perform some operations
   # ...

   # Enable garbage collection
   gc.enable()

2. Forcing garbage collection

   - `gc.collect()`: Manually triggers the garbage collector to perform a full collection of all generations.

Example:

   import gc
   # Force garbage collection
   gc.collect()

3. Inspecting garbage collector settings

   - `gc.get_threshold()`: Returns the current threshold values for each generation as a tuple (threshold0, threshold1, threshold2).

   - `gc.get_count()`: Returns the current collection counts as a tuple (count0, count1, count2). These are counters used to decide when each generation is collected next, not the number of live objects.

Example:

   import gc
   # Get the current threshold values
   threshold0, threshold1, threshold2 = gc.get_threshold()
   print(f"Threshold values: ({threshold0}, {threshold1}, {threshold2})")

   # Get the current collection counts
   count0, count1, count2 = gc.get_count()
   print(f"Collection counts: ({count0}, {count1}, {count2})")

4. Setting garbage collector thresholds

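   - `gc.set_threshold(threshold0, threshold1, threshold2)`: Sets the threshold values for each generation.

Example:

   import gc
   # Set new threshold values
   gc.set_threshold(1000, 15, 15)


The threshold values determine when the garbage collector triggers a collection in each generation. A generation 0 collection starts when tracked allocations minus deallocations exceed `threshold0`; generation 1 is collected after `threshold1` collections of generation 0, and `threshold2` plays roughly the same role for generation 2. Many CPython releases default to (700, 10, 10), but the defaults can differ between versions, so check `gc.get_threshold()` before adjusting them.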
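When experimenting with thresholds, it is usually worth saving the current values first so they can be restored afterwards. The sketch below uses arbitrary example values (1000 and 15) and only inspects the first two thresholds, on the assumption that the handling of the third may vary between CPython versions.

```python
import gc

original = gc.get_threshold()         # remember the interpreter's current settings

gc.set_threshold(1000, 15, 15)        # example values, not a recommendation
changed = gc.get_threshold()

gc.set_threshold(*original)           # put the original settings back
restored = gc.get_threshold()

print("original:", original)
print("changed: ", changed)
print("restored:", restored)
```

A higher `threshold0` makes young-generation collections less frequent at the cost of letting more garbage accumulate between runs.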

Advantages and Disadvantages

Python's garbage collection mechanism offers several advantages but also has some disadvantages. Let’s discuss both:

Advantages

1. Automatic Memory Management: Python's garbage collector automatically handles memory management, freeing developers from the burden of manually allocating and deallocating memory. This helps prevent common memory-related bugs, such as memory leaks and dangling pointers.


2. Simplified Programming: With automatic garbage collection, developers can focus on writing the core logic of their programs without worrying about low-level memory management details. This simplifies the programming process and improves productivity.


3. Prevention of Memory Leaks: The garbage collector helps prevent memory leaks by automatically identifying and freeing up memory that is no longer being used by the program. This ensures efficient memory utilization and prevents the program from consuming excessive memory over time.


4. Efficient Memory Usage: Python's garbage collector employs various techniques, such as reference counting and generational garbage collection, to optimize memory usage. It efficiently manages memory by collecting and freeing up objects that are no longer reachable, reducing the overall memory footprint of the program.

Disadvantages

1. Performance Overhead: The garbage collection process introduces some performance overhead. When the garbage collector runs, it needs to traverse the object graph, identify unreachable objects, and free up memory. This can lead to temporary pauses or slowdowns in the program's execution, especially when dealing with large datasets or memory-intensive operations.


2. Unpredictable Execution: The exact timing and frequency of garbage collection are not directly controlled by the programmer. Collections are triggered when allocation counts cross the generation thresholds, so the moment one happens depends on the program's object creation rate rather than on a point the programmer chose. This unpredictability can sometimes make it challenging to optimize performance-critical code.


3. Increased Memory Usage: In some cases, the garbage collector may not immediately collect and free up memory for objects that are no longer in use. This can result in temporarily increased memory usage until the next garbage collection cycle occurs. If the program creates a large number of short-lived objects, it can put pressure on the memory system.


4. Limited Control: While Python provides some control over the garbage collector through the `gc` module, the level of control is limited compared to manual memory management. In certain scenarios, such as real-time systems or low-level system programming, more fine-grained control over memory management may be required.
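The overhead mentioned in the first point can be measured rather than guessed. The `gc.callbacks` list accepts functions that are called at the start and end of every collection, which is enough for a rough pause timer. This is a sketch under simple assumptions: `record_pause` and `pauses` are names invented here, and the workload of 50,000 small lists is arbitrary.

```python
import gc
import time

pauses = []                           # (generation, seconds) for each collection
_started_at = None


def record_pause(phase, info):
    """Callback invoked by the collector with phase 'start' or 'stop'."""
    global _started_at
    if phase == "start":
        _started_at = time.perf_counter()
    elif phase == "stop" and _started_at is not None:
        pauses.append((info["generation"], time.perf_counter() - _started_at))
        _started_at = None


gc.callbacks.append(record_pause)
try:
    junk = [[i] for i in range(50_000)]   # allocation-heavy work
    gc.collect()                          # guarantees at least one recorded pause
finally:
    gc.callbacks.remove(record_pause)     # always unregister the callback

longest = max(seconds for _, seconds in pauses)
print(f"{len(pauses)} collections, longest pause {longest * 1000:.3f} ms")
```

Numbers gathered this way give a concrete basis for deciding whether tuning thresholds or pausing the collector is worth the extra complexity.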

Frequently Asked Questions

Can I completely disable garbage collection in Python?

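Yes, you can disable the cyclic garbage collector using gc.disable(). Reference counting continues to free objects as soon as their count reaches zero, but reference cycles are no longer reclaimed, so disabling it is not recommended for most cases because memory held by cycles can accumulate.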

How often does the garbage collector run in Python?

The frequency of garbage collection depends on how quickly the program allocates tracked objects and on the threshold values for each generation. In CPython, a collection is triggered when the allocation count for a generation exceeds its threshold, not by a timer or a background thread.

Can I force garbage collection to run at a specific point in my code?

Yes, you can manually trigger garbage collection by calling gc.collect(). However, it's generally best to let Python's automatic garbage collection handle memory management unless you have specific requirements.

Conclusion

In this article, we discussed Python's garbage collection mechanism, which plays a crucial role in automating memory management. We learned about reference counting, generational garbage collection, and how Python handles cyclic references. We also talked about manual garbage collection, forced garbage collection, and the ability to disable garbage collection. Furthermore, we looked into various ways to interact with the garbage collector using the gc module.
