Table of contents
1.
Introduction
2.
Parallel and Concurrent Garbage Collection
3.
Partial Object Relocation
4.
Consecutive Collection for unsafe languages
5.
Frequently Asked Questions
5.1.
What is a Garbage Collection?
5.2.
Which algorithm is used to implement garbage collectors?
6.
Conclusion
Last Updated: Mar 27, 2024

Advanced Topics in Garbage Collection

Career growth poll
Do you think IIT Guwahati certified course can help you in your career?
Compiler Design

Introduction

Garbage Collection refers to the process of recovering the system memory held by a program that no longer needs this memory. Other programs can use the freed memory. Hence, the garbage collector identifies the objects in the programs that will not be required in the future and frees their space. The most popular algorithm used to implement garbage collectors is Mark and Sweep.

Parallel and Concurrent Garbage Collection

A Garbage Collector is said to be parallel if it uses multiple threads, whereas it is said to be concurrent if it works simultaneously with the mutator.

Let us first see the steps performed by an incremental analysis:-

  1. First of all, it finds the root set. This step is performed with all the mutators stopped.
  2. Then it interleaves the tracing of the reachable objects with the mutator's execution (s). During this period, whenever a mutator points from a Scanned object to an unreached object, the reference is remembered. We have few options regarding the granularity through which these references are remembered. For now, we will assume the card-based scheme, where the heap is divided into sections called "cards," and a bit map is maintained indicating which cards are dirty (have had one or more references within them rewritten).
  3. Again the mutator(s) are stopped to rescan all the cards that might hold references to the unreached objects.

 

The set of objects reached by the root set in the large multithreaded applications can be very large. Hence, it is not possible to take time and space to visit all such objects while all mutations cease. It is also possible that because of the large heap and a large number of mutation threads, we need to rescan many cards again after all the objects are scanned once. Hence, we should scan the cards in parallel, and the mutators should execute concurrently.

We can implement step (2) tracing in parallel by using the multiple garbage collecting threads concurrently with the mutator threads. This would trace most of the reachable objects.

We can implement step (2) by stopping the mutators and using the parallel threads to find all the reachable objects.

 

Concurrent garbage-collection algorithm

  1. First, the root set is scanned for each mutator thread. All the objects directly reachable from that thread are put into the Unscanned state. The easiest incremental approach to this step is to wait till a mutator thread calls up the memory manager and scan its own root set if it has not already been done. If any mutator threads have not called up the memory allocation function, but all the tracing has been done, then the current thread must be interrupted to get its root set scanned.
  2. Now the objects in the Unscanned state are scanned. We use a work queue of fixed-size work packets to support the parallel computation. Each of these holds several Unscanned objects. As soon as the Unscanned objects are discovered, they are placed in work packets. These work packets will be dequeued by the Threads looking for work and then trace the Unscanned objects therein. This strategy will spread the work evenly among the workers in this tracing process. If the system runs out of space and the space is not found to create such work packets, we simply mark the cards holding the objects to force them to be scanned. It is always possible because the bit array holding the marks for the cards has already been allocated.
  3. Then the objects in dirty cards are scanned. When there are no Unscanned objects left in the work queue and all the threads' root sets have been scanned, the cards are scanned again for the reachable objects. Dirty cards continue to be produced as long as the mutators execute. Hence, we need to stop the tracing process using some criteria. We can allow the cards to be Scanned again only once or a fixed number of times, or when the number of cards to be scanned is reduced to some threshold. As a result, the parallel and concurrent step mostly terminates before completing the trace, which is finished by the final step below.
  4. The final step guarantees to mark all the reachable objects as reached. The root sets of all the threads can be found quickly using all the processors in the system after stopping all the mutators. A small number of objects have to be placed in the Unscanned state because the reachability of most objects has been traced. All the threads then start tracing the rest of the reachable objects and rescanning all the cards.

 

Partial Object Relocation

As we know, fragmentation can be eliminated by copying or compacting the collectors, but still, the collectors have some overheads. A compacting collector is required to move all the objects and update all the references at the end of the garbage collection. As the tracing proceeds, a copying collector is responsible for figuring out where the reachable objects would go. If tracing is performed incrementally, we need to either translate the mutator's all the references or move every object and update all their references at the end. Both the options are very expensive, especially for the large heaps.

Instead of the above, we can use a copying generational garbage collector. It is very effective in collecting immature objects and reducing fragmentation but can be expensive when collecting them. We can use the training algorithm to limit the amount of mature data analyzed each time. However, the overhead of the training algorithm is quite sensitive to the size of the remembered sets for each partition.

There is one hybrid collection scheme that uses concurrent tracing to reclaim each of the unreachable objects, and at the same time, it moves only a single part of the objects. This method reduces the fragmentation without incurring the full cost of the relocation in each collection cycle.

  1. Before the tracing begins, choose a part of the heap that will be evacuated.
  2. As the reachable objects are marked, remember all the references pointing to objects in the designated area.
  3. When tracing is complete, sweep the storage in parallel to reclaim the space occupied by unreachable objects.
  4. Finally, evacuate the reachable objects occupying the designated area and fix the references to the evacuated objects.

Also see, Storage Organization

Consecutive Collection for unsafe languages

It is impossible to build a garbage collector that would work for all the C and C++ programs. Since we can always compute the address using the arithmetic operations, none of the memory locations in C and C++ can ever be unreachable. However, many C or C++ programs never fabricate the addresses in this manner. It has been seen that a conservative garbage collector (one that does not necessarily discard all the garbage) can be built to work well in practice for this class of the programs.

 

A conservative garbage collector assumes that the address cannot be fabricated, or the address of an allocated chunk of memory cannot be derived without an address pointing in the same chunk. We can find all the garbage in the programs satisfying those assumptions by treating as a valid address any bit pattern found anywhere in reachable memory, as long as that bit pattern may be construed as a memory location.

 

Here is how a conservative garbage collector works. First, the memory manager is modified to keep a data map of all the allocated chunks of memory. This map allows us to find the starting easily and ending boundary of the chunk of memory that spans a certain address. The tracing starts by scanning the program's root set to find any bit pattern that looks like a memory location without worrying about its type. By looking up these potential addresses in the data map, we can find the starting addresses of those chunks of memory that might be reached and place them in the Unscanned state. We then scan all the unscanned chunks, find more (presumably) reachable chunks of memory, and place them on the worklist until it becomes empty. After tracing, we sweep through the heap storage using the data map to locate and free all the unreachable chunks of memory.

You can also read about the memory hierarchy.

Frequently Asked Questions

What is a Garbage Collection?

Garbage Collection refers to the process of recovering the system memory held by a program that no longer needs this memory. Other programs can use the freed memory.

Which algorithm is used to implement garbage collectors?

The most popular algorithm used to implement garbage collectors is Mark and Sweep.

Conclusion

In this article, we have extensively discussed the following things:

  1. We first discussed what Garbage Collection is.
  2. Then we discussed the advanced topics of the Garbage Collection.

Recommended Reading:

Do check out The Interview guide for Product Based Companies as well as some of the Popular Interview Problems from Top companies like Amazon, Adobe, Google, Uber, Microsoft, etc. on Coding Ninjas Studio.

Also check out some of the Guided Paths on topics such as Data Structure and Algorithms, Competitive Programming, Operating Systems, Computer Networks, DBMS, System Design, etc. as well as some Contests, Test Series, Interview Bundles, and some Interview Experiences curated by top Industry Experts only on Coding Ninjas Studio.

Cheers!

Live masterclass