1.
Introduction
2.
Mark-and-sweep algorithm
3.
Baker's mark-and-sweep algorithm
4.
Mark-and-compact algorithm
5.
5.1.
Which algorithm is used for garbage collection?
6.
Conclusion
Last Updated: Mar 27, 2024

# Trace-Based Garbage Collection

GAZAL ARORA
1 upvote

## Introduction

Garbage collection is a process of recovering pooled computer storage that has been used by a program but is no longer required by that application. This frees up storage for other applications.

Garbage collection involves the collection of garbage as it is being created; trace-based garbage collection involves periodically collecting garbage.

Trace-based garbage collectors will run at regular intervals to find unreachable objects (garbage) and regain their space. These intervals run when the free space in storage falls below a certain threshold or is depleted.

## Mark-and-sweep algorithm

The mark-and-sweep algorithm is a tracing garbage collector because it traces out the whole collection of objects that are directly or indirectly accessible to the program. The algorithm finds unreachable objects and adds them to a list of available space.

Input: Set of objects, a heap, and a free list with all unallocated chunks of the heap.

Output: A modified free list after all garbage has been removed.

Algorithm

1. An UnscannedList is initialized as an empty list. It will hold objects that have been reached, but whose successors have not been considered, i.e., their successors have not been scanned to check which objects are reachable through them.
2. Each object includes a reached-bit to identify whether it has been reached.
3. A list named Free will hold objects that are unknown to be free.

``````//Marking

1. Add every object referenced by the root set to Unscanned and set its reached-bit to 1.
2. while( empty set != Unscanned ){
3.     remove some object obj from Unscanned;
4.     for(each object obj' referenced in obj){
5.         if( reached-bit is 0 (obj' is unreached) ){
6              set the reached-bit of obj' to 1;
7.             place obj' in Unscanned;
}
}

//Sweeping

8. Free = empty set;
9. for( each chunk of obj in the heap){
10.    if( reached-bit is 0 (obj is unreached))
11.    else set reached-bit of obj to 0;
}

Let’s understand the algorithm.``````

Marking

It includes the initialization of UnscannedList by adding all objects referenced by the root set. Reached-bit for such objects is set to 1. We were next, looping while examining if each object obj was ever placed in the UnscannedList. This loop also includes another for-loop implementing scanning of object obj by examining obj' for which we find a reference within obj. If obj' has been reached, i.e., reached-bit is 1, nothing is done. Otherwise, we set its reached-bit to 1. Add obj' to UnscannedList.

Sweeping

The sweeping phase reclaims any space that has not been reached by the end of the marking phase. Objects on the free list are included in this. The technique sweeps across the entire heap because the unreached items set cannot be enumerated directly.

This includes the one-by-one addition of free and unreachable objects to the free list.

The set reached bits to 0 to maintain proper garbage collection preconditions for the next operation.

An UnscannedList containing four objects, the first of which corresponds to object obj. There are other three types of items that can be reached from o.

• An object that has already been scanned.
• An object of UnscannedList.
• An item that was earlier thought to be unreachable but now reachable.

## Baker's mark-and-sweep algorithm

Since there is no easy method to locate unreachable objects without examining the entire heap, the sweeping portion of the mark-and-sweep approach is expensive.

We optimize this by retaining a list of allocated objects and then taking the difference between the allocated and reached objects to identify the set of unreachable objects.

Input: Set of objects, a heap, and a free list with all unallocated chunks of the heap.

Output: A modified free list after all garbage has been removed.

Algorithm

``````//Marking

1. Scanned = empty set;
2. Unscanned = empty set;
3. move objects referenced by root set from Unreached to Unscanned.
4. while( empty set != Unscanned ){
5.     move some object obj from Unscanned to Scanned;
6.     for(each object obj' referenced in obj){
7.          if( reached-bit is 0 (obj' is unreached) ){
8.             move obj' from Unreached to Unscanned;
}
}
}

//Sweeping

8. Free = Free U(union) Unreached;
10. Unreached = Scanned;``````

Here, We used four lists, Free, Unreached, Unscanned, and Scanned, each holding objects in a state of the same name. We initialized Scanned to an empty list and Unscanned to contain objects reached from the root set.

Next, we implemented a basic marking algorithm. We used a for-loop to examine references on one unscanned object obj if any obj' s references have not yet been reached. Add obj' to the Unscanned state in that case. Then, we took objects in the Unreached list and deallocated their chunks by moving them into the Free list.

It is beneficial to combine adjacent free chunks into larger chunks since every time we return a chunk to the free list, we evaluate the chunks (left and right) and merge if one is free.

## Mark-and-compact algorithm

• We relocate collectors to shift reachable things around the heap to avoid memory fragmentation.
• Because the area occupied by a reachable object is usually smaller than the freed space, we can relocate all reachable objects to one end of the heap instead of freeing them individually after discovering holes, leaving the entire heap as one free chunk.
• The garbage collector has already evaluated every reference within the reachable objects at this point, so changing them to point to new locations isn't too difficult.
• This decreases fragmentation and allows huge objects to be stored in memory; also, because data only takes up a few cache lines and pages, relocation enhances a program's spatial and temporal locality by allocating newly created objects to the same cache line and page.
• For nearby chunks, prefetching gives an extra benefit. Instead of a free list, we only require a pointer free at the start of the one free block, which simplifies the data structure for keeping free space.

The algorithm is divided into three sections.

1. Marking is similar to mark-and-sweep.
2. In the second phase, the algorithm scans the allocated section of the heap and computes a new address for each of the reachable objects assigned from the low end of the head, ensuring there are no gaps between them. A NewLocation structure is used to store these addresses.
3. The references to copying objects are modified to point to their respective new locations, whose addresses are contained in the NewLocation structure.

Input: Set of objects, a heap, and a free list with all unallocated chunks of the heap.

Output: A modified free list after all garbage has been removed.

Algorithm

Data structures used:

1. UnscannedList.
2. Reached bits in all objects.
3. A free pointer marks the beginning of an unallocated region in the heap.
4. The table NewLocation (hash table, search tree) is used to implement setting new location(NewLocation(obj)) and getting value of NewLocation(obj) given an object obj both at O(1) time.

``````// marking.
1. Unscanned: A set of objects referenced by root set;
2. while( empty set != Unscanned){
3.     remove object obj from Unscanned;
4.     for(each object obj' referenced in o){
5.         if(obj' is unreached){
6.             mark obj' as reached;
7.             put obj' on list Unscanned;
}
}
}

// compute the new locations.
8. free: starting location of heap storage.
9. for(each chunk c in heap, from the low end){
10.     if(i is reached){
11.         newLocation(c) = free;
12.         free = free + sizeof(c);
}
}

// Redirect references and relocate reached objects.
13. for(each chunk c in heap, from the low end){
14.     if(c is reached){
15.         for(each reference c.r in c)
16.             c.r = NewLocation(c.r);
17.         copy c to NewLocation(c);
}
}
18. for(each reference r in root set)
19.     r = NewLocation(r);``````

Line by Line understanding of the algorithm.

• Line(1 - 7) is similar to the first basic algorithm.
• Line(8 -12) represents the second phase of the algorithm. It visits every chunk, and new addresses are assigned to these chunks in the same order as their old addresses.
• Line(13 - 17) represents the third phase of the algorithm where we visit reached objects and replace internal pointers of a reached object obj by the new values using the NewLocation table(Line(15 - 16)).
• Line(17), we move object obj with the new internal references to its new location.
• Line(18) and (19) retarget pointers in the root set elements, which are not heaped objects.

Conclusion

Objects are compacted in place using a mark-and-compact collector. The copying collector copies things from one region to another, freeing up space for relocation. This enables the movement of reachable objects as they are discovered.

• The complexity of the basic mark-and-sweep algorithm is proportional to the number of heap chunks.
• The complexity of Baker's mark-and-sweep algorithm is proportional to the number of items reached.
• The complexity of the basic mark-and-compact approach is proportional to the number of chunks in a heap plus the total size of objects reached.

Also check out - Phases of Compiler

### Which algorithm is used for garbage collection?

The mark-and-sweep algorithm is a tracing garbage collector because it traces out the whole collection of objects that are directly or indirectly accessible to the program. The algorithm finds unreachable objects and adds them to a list of available spaces.

## Conclusion

In this article, we learned about Trace-Based-garbage collection. We looked into the mark-and-sweep algorithm that is a tracing garbage collector because it traces out the collection of objects that are directly or indirectly accessible to the program.

We looked at the three variations of this algorithm.

• Basic mark-and-sweep algorithm.
• Baker's mark-and-sweep algorithm.
• Basic mark-and-compact algorithm.

We learned that the complexity of the basic mark-and-sweep algorithm is proportional to the number of heap chunks.

The complexity of Baker's mark-and-sweep algorithm is proportional to the number of items reached.

The complexity of the basic mark-and-compact approach is proportional to the number of chunks in a heap plus the total size of objects reached.