Saturday, May 3, 2008

How the Garbage Collector works - Part 1

How the Garbage Collector works - Part 2

The Garbage Collector (GC) can be considered the heart of the .NET Framework. It manages the allocation and release of memory for any .NET application. In order to create good .NET applications, we must know how the Garbage Collector (GC) works.

Basic rules

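  • The application cannot directly control the Garbage Collector (GC): the GC decides on its own when to run, and the application can at most request a collection (for example by calling GC.Collect()).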
  • All garbage-collectable objects are allocated from one contiguous range of address space and are grouped by age.
  • There are never any gaps between objects in the managed heap.
  • The order of objects in memory remains the order in which they were created.
  • The oldest objects are at the lowest addresses (managed heap bottom), while new objects are created at increasing addresses (managed heap top).
  • Periodically the managed heap is compacted: dead objects are removed and the live objects slide toward the low-address end of the heap, closing the gaps.

Determining which objects are dead

Don’t confuse the created objects with the references (pointers) that point to them! Let’s consider the following code sample:

string s1 = "STRING";
string s2 = "STRING";

Here, "s1" and "s2" are not the created string objects! The code above creates only one string object ("STRING"), because identical string literals are interned. You can read more about interned strings in this article: How to: Optimize the memory usage with strings. "s1" and "s2" are just two references (pointers) to the same string object.
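
The sharing described above can be checked with object.ReferenceEquals, which compares references rather than contents. The following is only a minimal sketch: the class and method names (InternDemo, LiteralsShareOneObject, RuntimeStringIsSeparate) are made up for this illustration, and it assumes the default runtime behavior in which string literals are interned.

```csharp
using System;

public static class InternDemo
{
    // True when both variables refer to the very same string object.
    public static bool LiteralsShareOneObject()
    {
        string s1 = "STRING";
        string s2 = "STRING";
        return object.ReferenceEquals(s1, s2);
    }

    // A string built at run time is a separate object, even with equal content.
    // string.Intern returns the reference held by the intern pool instead.
    public static bool RuntimeStringIsSeparate()
    {
        string literal = "STRING";
        string built = new string(new[] { 'S', 'T', 'R', 'I', 'N', 'G' });
        bool separateBefore = !object.ReferenceEquals(literal, built);
        bool sameAfterIntern = object.ReferenceEquals(literal, string.Intern(built));
        return separateBefore && sameAfterIntern;
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareOneObject());
        Console.WriteLine(RuntimeStringIsSeparate());
    }
}
```

Under the stated assumption, both calls should report that the literals share a single object, while the string assembled from characters is a second object until it is passed through string.Intern.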

The object remains on the heap for as long as it is referenced by any active code. Once nothing references it any more, it becomes eligible for collection, and the Garbage Collector (GC) reclaims its memory the next time it collects that part of the heap.

Even if one of the two references is set to null, the Garbage Collector (GC) still considers the "STRING" object alive, because the other reference still points to it.
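
To illustrate the rule that one remaining reference is enough to keep an object alive, here is a small sketch that uses a plain object instead of a string literal (so interning plays no part) and a WeakReference to observe the object without keeping it alive. The names ReachabilityDemo and SurvivesWhileReferenced are hypothetical, chosen only for this example.

```csharp
using System;

public static class ReachabilityDemo
{
    // Returns true if the object is still alive after one of its
    // two references has been cleared and a collection has run.
    public static bool SurvivesWhileReferenced()
    {
        object first = new object();
        object second = first;                   // second reference to the same object
        var tracker = new WeakReference(first);  // observes the object, does not keep it alive

        first = null;                            // drop one of the two references
        GC.Collect();                            // request a full collection
        GC.WaitForPendingFinalizers();

        bool alive = tracker.IsAlive;
        GC.KeepAlive(second);                    // make sure `second` counts as live up to this point
        return alive;
    }

    public static void Main()
    {
        Console.WriteLine(SurvivesWhileReferenced());
    }
}
```

The call to GC.KeepAlive is there so that the JIT compiler cannot treat "second" as dead before the check; without it, an optimized build would be free to let the object be collected earlier.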

The GC generations

All the living objects on the managed heap are divided into three groups by age. These groups are called "generations". Generations let the Garbage Collector (GC) search for dead objects in only part of the heap at a time (partial collections) instead of scanning the whole heap on every run, which improves collection performance.
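
As a rough illustration of partial collections, the sketch below requests a Gen0-only collection with GC.Collect(0) and uses GC.CollectionCount to see which generations were actually collected. The class and method names (PartialCollectionDemo, CountsAfterGen0Collect) are invented for this example, and the exact numbers can vary, since the runtime may run collections of its own at any time.

```csharp
using System;

public static class PartialCollectionDemo
{
    // Returns, per generation, how many collections ran while a
    // Gen0-only collection was requested.
    public static int[] CountsAfterGen0Collect()
    {
        int generations = GC.MaxGeneration + 1;
        int[] before = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            before[gen] = GC.CollectionCount(gen);

        GC.Collect(0);   // ask for a collection limited to generation 0

        int[] delta = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            delta[gen] = GC.CollectionCount(gen) - before[gen];
        return delta;
    }

    public static void Main()
    {
        int[] delta = CountsAfterGen0Collect();
        for (int gen = 0; gen < delta.Length; gen++)
            Console.WriteLine("Gen" + gen + " collections: " + delta[gen]);
    }
}
```

The Gen0 count should go up by at least one, while the counts for the older generations would normally stay unchanged, which is the point of a partial collection.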

Now let’s see what the Garbage Collector (GC) uses each generation for:

  • Generation 0 (Gen0) contains all the newly created objects and is located at the top of the heap zone (higher memory addresses). All the objects in Gen0 are considered short-lived, and the Garbage Collector (GC) expects them to die quickly so that their memory can be released. Because of this assumption, and because a Gen0 collection is the cheapest, the GC collects Gen0 most often.
  • Generation 1 (Gen1) contains the objects that were still alive when Gen0 was collected. Objects that survive a Gen0 collection are promoted from Generation 0 to Generation 1. Gen1 sits in the middle of the heap zone and is collected less often than Gen0. Gen1 collections are more expensive than Gen0 collections, so the GC avoids them unless they are really necessary.
  • Generation 2 (Gen2) contains the objects that were still alive when Gen1 was collected. These objects are considered long-lived, and collecting Gen2 is the most expensive operation. Because of this, the GC collects Gen2 only rarely. The Gen2 zone is located at the bottom of the managed heap zone (lowest memory addresses).
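
The promotion path described in the list above can be observed with GC.GetGeneration. This is just a sketch under the assumption that each forced collection promotes the surviving object by one generation, which is the usual behavior but not something the runtime guarantees; the names GenerationDemo and TrackGenerations are hypothetical.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of one long-lived object before any forced
    // collection, after the first one, and after the second one.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);   // freshly allocated
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);   // after surviving one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);   // after surviving two collections

        GC.KeepAlive(survivor);                 // keep the object reachable until here
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        Console.WriteLine(string.Join(", ", TrackGenerations()));
    }
}
```

On a typical run the object starts in Gen0 and moves to Gen1 and then Gen2, and it never moves back to a younger generation.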
