2010 June 21
It is high time that programming languages did away with null. A language that abstracts away the details of hardware, especially memory handling, does not need the null reference.[1] Null serves no real purpose. Instead, it creates serious problems. In order to make programs more robust, languages like Java, C# and Python do not provide direct access to memory locations. They do automatic memory management, which is good. Consider this excerpt from The Java Language Environment:
Most studies agree that pointers are one of the primary features that enable programmers to inject bugs into their code. Given that structures are gone, and arrays and strings are objects, the need for pointers to these constructs goes away. Thus, Java has no pointer data types.
.............
You no longer have dangling pointers and trashing of memory because of incorrect pointers, because there are no pointers in Java.
This implies that an object reference always refers to a valid memory location, ready to receive and respond to messages. The presence of null breaks this premise. Null is deemed to be part of the type system, but it cannot respond to messages![2] Null references can cause the same problems as dangling pointers. Null renders an object "dead" and is a rich source of bugs that crash systems. So programmers often take special steps to deal with it. As an example, look at the following Java class that represents a database connection pool. (Though we use the Java notation, the concept presented is equally valid for C#, Python, Ruby and most other popular languages that support Object Oriented Programming.)
class DbPool {
    // Returns a database connection from a pool.
    // If no connections are free, an attempt is made to
    // allocate a new connection and return it.
    // If this attempt also fails, null is returned.
    DbConnection getConnection() {
        if (hasFreeConnection()) {
            return nextFreeConnection();
        } else {
            try {
                return new DbConnection(this.settings);
            } catch (FailedToCreateDbConnection ex) {
                return null;
            }
        }
    }
}
As usual, a user of this library decided to ignore the documented warning:
try {
    dbPool.getConnection().query(sql);
} catch (DbException ex) {
    // handle ex
}
Because NullPointerExceptions[3] are unchecked, the compiler will not raise a flag here. This code may well pass unit tests, where the pool is rarely exhausted, but it is likely to fail in production once connections run out. The program can be made safer only by adding more scaffolding:
// To make the code safer, the programmer has to either check
// for null (as required by the DbPool documentation)
DbConnection c = dbPool.getConnection();
if (c != null) {
    try {
        c.query(sql);
    } catch (DbException ex) {
        // handle ex
    }
}

// or catch the NullPointerException
try {
    dbPool.getConnection().query(sql);
} catch (NullPointerException ex) {
    // handle ex
} catch (DbException ex) {
    // handle ex
}
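The failure mode is easy to reproduce. What follows is a minimal, self-contained sketch rather than the library itself: `StubDbException`, `StubDbConnection`, `ExhaustedPool` and `UncheckedNullDemo` are hypothetical stand-ins for the types used above. It suggests that a handler written only for the library's own exception does nothing to stop the NullPointerException.

```java
// Hypothetical stand-ins for the DbPool library types used in this article.
class StubDbException extends Exception {
    StubDbException(String message) { super(message); }
}

class StubDbConnection {
    void query(String sql) throws StubDbException {
        if (sql == null || sql.isEmpty()) {
            throw new StubDbException("empty query");
        }
    }
}

// A pool that has run out of connections and, as documented, returns null.
class ExhaustedPool {
    StubDbConnection getConnection() {
        return null;
    }
}

class UncheckedNullDemo {
    // Mirrors the careless caller: only the library exception is handled.
    static String runQuery(ExhaustedPool pool, String sql) {
        try {
            pool.getConnection().query(sql);
            return "ok";
        } catch (StubDbException ex) {
            return "handled: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        try {
            runQuery(new ExhaustedPool(), "SELECT 1");
        } catch (NullPointerException ex) {
            // The NullPointerException bypassed the StubDbException handler.
            System.out.println("NullPointerException escaped runQuery");
        }
    }
}
```

The compiler accepts `runQuery` without complaint, which is exactly the problem: nothing in its signature or body hints that a second, unchecked failure path exists.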
As we can clearly see, null is useful only for complicating matters. Life would be much simpler if a complex object, when uninitialized, could assume a default value. Java has implemented this concept partially, for fields of primitive types:
int i;                  // a field: implicitly initialized to 0.
int j = 10 * i;         // => 0, no exception, program just runs.
BankAccount account;    // a field: implicitly null!!
account.getBalance();   // A NullPointerException and a
                        // possible BANG! in Java,
                        // but almost surely a BANG! in C++, if account was
                        // declared a pointer. BTW, a language that exposes pointers
                        // may not be able to shed NULL.
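The difference can be checked with a small runnable sketch. Java gives default values to fields and array elements only; a local variable must be assigned before it is read, so the sketch uses static fields. `SketchAccount` and `FieldDefaultsDemo` are hypothetical names introduced purely for illustration.

```java
// Hypothetical account type, used only for this illustration.
class SketchAccount {
    private int balance;

    int getBalance() {
        return balance;
    }
}

class FieldDefaultsDemo {
    // Fields (unlike local variables) receive default values.
    static int count;               // defaults to 0
    static SketchAccount account;   // defaults to null

    // Returns true if calling a method on the default reference fails.
    static boolean dereferenceFails() {
        try {
            account.getBalance();
            return false;
        } catch (NullPointerException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(10 * count);          // prints 0
        System.out.println(account == null);     // prints true
        System.out.println(dereferenceFails());  // prints true
    }
}
```

The primitive field is immediately usable, while the reference field is a trap waiting for its first message.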
Now for some wishful thinking: suppose Java does not have the null reference, and has the same semantics for declaring primitives and complex types:
BankAccount account; // implicitly initialized using
// the default constructor.
account.getBalance(); // => 0
(You may have noticed that our new dialect of Java does not have the superfluous new keyword[4] either.)
As there is no null, getConnection() should either throw a checked exception or return a default object. Here we choose to return a default object[5]:
class DbPool {
    // In our dialect, implicitly initialized using the default constructor.
    private static final DbConnection defaultDbConnection;

    DbConnection getConnection() {
        if (hasFreeConnection()) {
            return nextFreeConnection();
        } else {
            try {
                return DbConnection(settings);
            } catch (FailedToCreateDbConnection ex) {
                return defaultDbConnection;
            }
        }
    }
}
Mutating operations on defaultDbConnection can be controlled with internal checks and exceptions. Or we can just return a new object each time. If constness is strictly enforced by the language, returning a static const value becomes more viable. This is possible to some extent in C++. We don't have to declare defaultDbConnection explicitly if the language itself provides an immutable default object and makes it accessible through a reference (similar to the this pointer). This reference can be called default:
class DbPool {
    // No explicit defaultDbConnection declaration is needed any more.
    DbConnection getConnection() {
        if (hasFreeConnection()) {
            return nextFreeConnection();
        } else {
            try {
                return DbConnection(settings);
            } catch (FailedToCreateDbConnection ex) {
                return default;
            }
        }
    }
}
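Until a language offers such a built-in default, a similar effect can be approximated by hand in ordinary Java, an approach commonly known as the Null Object pattern. The sketch below is one possible shape under stated assumptions, not the DbPool class from this article: `SketchDbException`, `SketchConnection`, `RealConnection`, `DefaultConnection`, `SketchPool` and `DefaultObjectDemo` are all hypothetical names. The default object is stateless, so a single shared instance is safe to return, and it reports failure through the same checked exception that callers already have to handle.

```java
// Hypothetical types sketching a hand-written default object.
class SketchDbException extends Exception {
    SketchDbException(String message) { super(message); }
}

interface SketchConnection {
    String query(String sql) throws SketchDbException;
}

class RealConnection implements SketchConnection {
    public String query(String sql) {
        return "result of " + sql;
    }
}

// Stateless default: always a valid receiver, never null.
final class DefaultConnection implements SketchConnection {
    static final DefaultConnection INSTANCE = new DefaultConnection();

    private DefaultConnection() {}

    public String query(String sql) throws SketchDbException {
        throw new SketchDbException("no database connection available");
    }
}

class SketchPool {
    private final boolean canConnect;  // stands in for real pool state

    SketchPool(boolean canConnect) {
        this.canConnect = canConnect;
    }

    SketchConnection getConnection() {
        if (canConnect) {
            return new RealConnection();
        }
        return DefaultConnection.INSTANCE;
    }
}

class DefaultObjectDemo {
    // The caller needs no null check: every path is a checked one.
    static String run(SketchPool pool, String sql) {
        try {
            return pool.getConnection().query(sql);
        } catch (SketchDbException ex) {
            return "handled: " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new SketchPool(true), "SELECT 1"));
        System.out.println(run(new SketchPool(false), "SELECT 1"));
    }
}
```

The caller's code is the same short try/catch that the careless library user wrote, only now the compiler enforces the one failure path that remains.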
Null is a "billion dollar mistake" as its inventor himself once admitted[6]. As long as language designers choose to repeat this mistake, we have to pay special attention to make our code "null safe".