Is Garbage Collection a Failed Concept?

July 7th, 2008

In this article, I’ll discuss garbage collection and the value it brings to large-scale application development.

1990s Redux
There’s been a lot of noise about the next version of standard C++. It’s due in 2009 and has been named C++0x. (Where’s a brand-name expert when you need one?) C++ thought leaders have been let loose and are out in full force discussing its merits. Caught in the crossfire of this marketing blitz is a renewed criticism of Java, its performance and its garbage collection. Comparing Java to C++ feels so late 1990s that I am suddenly feeling the fear of the Y2K doomsday scenarios again.

Garbage collection bad, RAII good
One such example is an interview in which an internationally recognized C++ expert was asked whether garbage collection should be added to C++0x. His stunning reply was that garbage collection was a “failed and improperly implemented concept”. Paraphrased, his argument is that garbage collection is bad because it is non-deterministic and may not actually succeed. It deals with only one kind of resource: memory. All other resources, such as file handles and database connections, remain unaddressed. Garbage collection robs developers of the destructors they use to manage resources. In summary, garbage collection is a crutch for developers, and if we all just follow the resource acquisition is initialization (RAII) idiom, we’ll do just fine.

The point of my article is neither to dissect nor to critique the opinion expressed above, but rather to focus on general misconceptions about garbage collection.

How real should real-time be?
Garbage collection critics love to point out that GC is bad because it is non-deterministic. It is true that garbage collection removes some degree of application determinism. Since it is automated, collection can occur at any time, run for varying lengths of time and impede the application’s responsiveness. This can lead to broken functionality if the application is required to respond in real time. Garbage collection also does away with destructors. Without destructors, programmers cannot hook into resource deallocation and thus cannot precisely predict when a resource is released. Garbage collection therefore affects application determinism in two ways: responsiveness and resource management.

I have two points of contention with this. First, let’s be honest: how many applications have hard real-time requirements for responsiveness? The majority of applications either have no real-time requirements or have soft ones. The latter is exemplified by online stores, stock trading applications, and GUI-centric apps, where fast response times are necessary but the occasional processing delay is tolerated. Second, even with C++, application determinism can only be guaranteed in systems with no shared resources (such as embedded systems). Most applications run atop multi-process operating systems where CPU and I/O resources are shared. The illusion of application determinism quickly evaporates in these systems, where the application has no control over how resources are shared. In the end, giving up some degree of determinism in return for automated garbage collection is a trade I’d make any day. There is a big price to be paid in complexity to obtain that extra level of determinism promised by C++. More on this point later.

Give RAII a chance
Some believe that RAII is a better way to manage memory. This idiom, perhaps the most important in C++, is the technique of acquiring a resource in an object’s constructor and releasing it in that object’s destructor. Provided all classes are designed to conform to this pattern, life will be good.
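For readers who have not seen the idiom written down, here is a minimal sketch of it. The class ScopedResource, the event log and the helper functions are all hypothetical names made up for this illustration; the resource is a stand-in for something like a file handle or a database connection. The sketch assumes a C++11 compiler.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records acquire/release events so their order can be inspected.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a file handle or database connection.
class ScopedResource {
public:
    // Acquire the resource in the constructor.
    explicit ScopedResource(const std::string& name) : name_(name) {
        event_log().push_back("acquire " + name_);
    }
    // Release it in the destructor.
    ~ScopedResource() { event_log().push_back("release " + name_); }

    // Copying is disabled so the resource is released exactly once.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    std::string name_;
};

// Whether this returns normally or throws, both resources are
// released when the scope ends, in reverse order of acquisition.
void do_work(bool fail) {
    ScopedResource db("db");
    ScopedResource file("file");
    if (fail) throw std::runtime_error("work failed");
}

// Runs do_work and returns a copy of the recorded events.
std::vector<std::string> run_demo(bool fail) {
    event_log().clear();
    try {
        do_work(fail);
    } catch (const std::runtime_error&) {
        event_log().push_back("caught");
    }
    return event_log();
}
```

Calling run_demo(true) should record both releases before the "caught" entry, because the stack is unwound before the handler runs. That is the deterministic release RAII advocates have in mind.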

However, there is a big difference between programming by idiom and programming by compiler-enforced rules. An idiom is just a best-practice design pattern. As sound an idea as RAII is, it doesn’t scale well to large-scale development, where teams are composed of developers who vary in talent, experience and commitment to code quality. While I have no doubt that an all-star team composed of the likes of Bjarne Stroustrup and Scott Meyers would do just fine with this idiom, the rest of us have to deal with people who aren’t always experienced or, worse, aren’t always concerned about code quality. Idioms require human intelligence to be understood and human care to be enforced. They easily break down in large teams where a minority of people (the experienced, caring ones) fix the careless bugs created by the majority. Using automated garbage collection to abstract memory management allows the first category of folks to think about more important things and prevents the second category from making mistakes. The bottom line is that idioms don’t scale very well while compiler-enforced rules do. While it is true that garbage collection only addresses the memory resource, memory happens to be the most frequently used resource and the one most likely to produce hard-to-find bugs. Garbage collection almost eliminates an entire class of bugs.
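To make the idiom-versus-enforcement point concrete, here is a small sketch in which a counter stands in for real handles. Every name in it (open_handles, acquire_handle, HandleGuard and so on) is hypothetical and invented for this illustration. Both functions compile without complaint; only one of them follows the idiom. The sketch assumes a C++11 compiler.

```cpp
#include <stdexcept>

// Hypothetical resource API: a counter stands in for real handles.
int open_handles = 0;
void acquire_handle() { ++open_handles; }
void release_handle() { --open_handles; }

void validate(bool ok) {
    if (!ok) throw std::invalid_argument("bad input");
}

// Without the idiom: the release is skipped when validate throws.
void manual(bool ok) {
    acquire_handle();
    validate(ok);
    release_handle();
}

// With the idiom: the guard's destructor runs on every exit path.
class HandleGuard {
public:
    HandleGuard() { acquire_handle(); }
    ~HandleGuard() { release_handle(); }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;
};

void guarded(bool ok) {
    HandleGuard guard;
    validate(ok);
}

// Returns how many handles a single call left open.
int leaked_by(void (*fn)(bool), bool ok) {
    int before = open_handles;
    try {
        fn(ok);
    } catch (const std::invalid_argument&) {
        // Swallowed on purpose: only the handle count matters here.
    }
    return open_handles - before;
}
```

With this setup, leaked_by(manual, false) should report one handle left open while leaked_by(guarded, false) reports none. Nothing in the language stops a teammate from writing the first version, which is exactly the gap between an idiom and an enforced rule.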

Real programmers don’t need GC
There are some who still believe that garbage collection is a crutch and that real programmers don’t need it. Since there is no ISO-type standards body that can help us tell a real programmer from a fake one, we are out of luck. But let’s see if garbage collection can remove our dependency on real programmers for memory management.

Garbage collection abstracts away most of the brain power expended on memory management. This benefits everyone. While this doesn’t mean that developers never need to think about memory management, it does mean that most people can think about higher-level problems. At most, perhaps one person in every organization will still need to be sacrificed to the god of the heap, with the task of tuning the garbage collector by setting the right compaction parameters and nursery sizes.

Manually managing memory adds a level of complexity that requires human time and effort to overcome. In the end, the return on investment is not sufficient to justify its use for non-legacy applications.

Works except when it doesn’t
Some people still believe that there are inherent flaws with garbage collection and that it cannot reliably detect garbage.

Garbage collection has been researched extensively in academia since the 1960s. While implementation problems may have plagued early versions of Sun’s HotSpot JVM, for example, these were addressed long ago. There are no inherent flaws with automated collection or with its ability to detect garbage.

It may surprise you to learn that I like C++. I really do. But I like it in the same way I like assembler: because I want to understand what happens under the hood. When you look at it from a practical point of view, where business cycles drive software development, garbage collection is indispensable in large-scale development. Any talk of avoiding garbage collection at this point in software engineering’s evolution is just plain… garbage.