VIJAY MATHEW

Dangerous Designs

2009 March 13

Programming language specifications often start with a design philosophy. Of all those I have read, I like that of the Scheme language the most. You can read it in the introduction of the Scheme standard, where it is stated in a single sentence:

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

The argument is that, with a very small number of rules for forming expressions and a minimal syntax, it is possible to support all possible programming paradigms. For instance, if the language supports higher-order functions, closures and dynamic typing, we can implement object-oriented programming without special language-level syntax. Tail-call optimization eliminates the need for special looping constructs.

But this ideology has failed to capture the imagination of the majority of professional programmers. Instead the following thinking seems to have imbued their minds:

The more features are piled on top of an already bogged-down language, the more powerful it becomes.

In this article I try to show that this argument is false and misleading. Adding more syntax and restrictive rules to a language that has a badly designed core will only make it weaker, and may even expose it to security risks. I will make my point with the help of a feature added to the Java programming language: inner classes.

Inner classes are Java's answer to Smalltalk's blocks and Scheme's closures. Look at the following code snippet:


public class OuterClass {
    // Inner class
    class AddN {
        AddN(int n) { _n = n; }
        int add(int v) { return _n + v; }
        private int _n;
    }
    public AddN createAddN(int var) {
        return new AddN(var);
    }
}

The method createAddN takes an integer n as its argument and returns an object that adds n to a value. That object is an instance of an inner class whose local state stores the value of n. Those who are familiar with Scheme or Common Lisp should be shouting by now: "Hey, this can be done more elegantly and compactly, like this:"

(define (addn n) (lambda (k) (+ n k)))

No new syntax to learn, no special rules to remember.

Agreed.

The problem is that Java does not have first-class functions and closures, and the JVM is not designed to support them. But they are really nice features and are the natural solution to many a programming problem. To approximate them, Java introduced a restrictive rule to the language. The following example will throw more light on this:


public class OuterClass {
    public int add10(final int n) {
        class Add10 {
            int add() { return 10 + n; }
        }
        return new Add10().add();
    }
} 

Here we have a method add10 that contains a local inner class, which is used to add the constant value 10 to a given value. But why is the parameter to add10 declared final? The problem is that the JVM has no idea of inner classes, so the Java compiler generates a separate class file for the inner class. Now how do we pass a local variable declared in the OuterClass.add10 method to the Add10 class? The compiler plays a trick here. You can find this out by decompiling the OuterClass$1Add10.class file. The compiler quietly adds a variable val$n to the Add10 class. When an instance of Add10 is created, the value of n is copied to val$n. Since the inner class works on a copy, there must be a guarantee that the original value of n will not change after the copy is made, because the JVM has no way to keep track of such changes. Requiring the programmer to declare the variable final is Java's way out of this problem.
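The copying described above can be sketched by hand. What follows is an approximation written in plain Java, not actual compiler output: the class names Add10Desugared and CaptureDemo are made up for illustration, and only the field name val$n mirrors what the decompiled class shows. The real generated class typically also keeps a reference to the enclosing instance, which is left out here.

```java
// Hand-written approximation of the class the compiler generates for the
// local class Add10. All names except val$n are hypothetical.
class Add10Desugared {
    // Copy of the captured local variable n.
    final int val$n;

    Add10Desugared(int n) {
        this.val$n = n; // the value of n is copied once, at construction time
    }

    int add() {
        return 10 + val$n; // reads the copy, never the original local
    }
}

class CaptureDemo {
    static int add10(final int n) {
        // The captured local is handed to the constructor as a plain argument.
        return new Add10Desugared(n).add();
    }

    public static void main(String[] args) {
        System.out.println(add10(5)); // prints 15
    }
}
```

Because add() only ever sees the copy, a later assignment to n in add10 would silently go unnoticed by the Add10 instance, which is the situation the final requirement rules out.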

An inner class should be able to read and write all variables of its enclosing class, regardless of their access modifiers. Otherwise, there is no point in making it an inner class in the first place! The following code demonstrates this. Here, the inner class copies the result of the computation to a private variable of the outer class:


public class OuterClass {
    public void add10(final int n) {
        class Add10 {
            void add() { k = 10 + n; }
        }
        new Add10().add();
    }
    private int k = 0;
}  

Now we face a new dilemma. If, for the JVM, OuterClass and Add10 are two unrelated classes, how is an instance of Add10 able to modify a private variable declared in OuterClass? The answer can be found by decompiling both the OuterClass.class and OuterClass$1Add10.class files. We see that the compiler has secretly placed a new method with package-level access in the OuterClass.class file:



static int access$002(OuterClass aOuterClass3, int int4) {
    aOuterClass3.k = int4;
    return int4;
}

Using this method, not just Add10, but any class in the package can see and modify the private variable OuterClass.k! If you generate a class file with appropriate byte code and place it in the same package as OuterClass.class, you can read from and write to its internal state using these secret access methods!
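The effect can be illustrated without generating byte code by hand. The sketch below is an emulation under stated assumptions, not a reproduction of compiler output: OuterSketch, accessK, currentK and Intruder are hypothetical names, and accessK is an ordinary hand-written method with the same shape (static, package-level access) as the generated access$002. Whether a given compiler emits such accessors depends on its version.

```java
// Stand-in for OuterClass after compilation. The accessor is written by hand
// to mimic the compiler-generated one; its name is hypothetical.
class OuterSketch {
    private int k = 0;

    // Static and package-private, like the generated accessor: it writes the
    // private field of the instance passed in and returns the new value.
    static int accessK(OuterSketch outer, int value) {
        outer.k = value;
        return value;
    }

    // Read-only view, present only so the demo can observe the result.
    int currentK() {
        return k;
    }
}

// Any other class in the same package can call the accessor.
class Intruder {
    static void overwrite(OuterSketch target) {
        OuterSketch.accessK(target, 42); // modifies the private field k
    }

    public static void main(String[] args) {
        OuterSketch outer = new OuterSketch();
        overwrite(outer);
        System.out.println(outer.currentK()); // prints 42
    }
}
```

Intruder never touches k directly, so the private modifier is formally respected, yet the field's value is entirely under its control.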

By adding a new feature, Java has broken one of the key premises that identify it as an Object Oriented language, i.e., retention and protection of local state. This may not be a security problem. No one should rely on Object Oriented abstractions for securing their data anyway! But this might still cause problems for certain types of software and is certainly a hole in the language.

Higher-order functions and closures are features to be desired in any modern programming language. Unfortunately, many 'modern' programming languages have so rigid a design that adding a new feature often breaks an existing, important one.

I think this is where comparatively simple languages like Common Lisp and Scheme shine. You can add new syntax, even whole new paradigms, without touching or spoiling the compiler or the runtime system. As an example, read about how the Common Lisp Object System (CLOS) is implemented.

Good languages let you write terse, clean code. Look at the Scheme code snippet I gave at the beginning of this article. Compare that with all the verbosity in Java, just to get the same result. Of course, you will get what you want, only with a fissure in your program!



Comments:

Seminoma: Right on. Groovy solves this with closures and less LOC. Less LOC = less bugs.
paulkingasert: In Groovy you could define and use a Groovy Closure like this:
    k = 4
    addn = { n -> n + k }
    assert addn(5) == 9
scolebourne: This article comes across rather confused I'm afraid. You start with: 'Programming language specifications often start with a design philosophy' and then go on to complain about the value of Java because it doesn't have functions or closures. For example, here is your use case: 'The method createAddN() takes an integer 'n' and return an object that adds 'n' to a value.' But you don't address the question of why on earth anyone would actually _want_ a function that adds 10 to a number. Why not just write n + 10? Remember, that Java developers are out there coding every day without finding a desperate problem with what they have. You are right in one thing that programming languages have a philosophy. Java has a philosophy that is OO based, and very much not functional based. Those trying to write functional code in Java are bound to get hurt. The answer is to not try and break the philosophy!
Vijay Mathew: Hi scolebourne,
>> This article comes across rather confused I'm
>> afraid. You start with:
>> Programming language specifications often start
>> with a design philosophy
>> and then go on to complain about the value of Java
>> because it doesn't have functions or closures.
No, I don't lose focus. First of all, let me be clear on this: My intention is not to trash Java. I use Java almost every day and I think it is a mature, secure platform to deliver cross-platform products.
My intention was to show that using a minimum of syntax and restrictive rules, it is possible to create a language as powerful as, or even better than, others with complicated syntax. It just happened that I chose the implementation of closures in two languages (Scheme and Java) to demonstrate this.
>> But you don't address the question of why on earth
>> anyone would actually _want_ a function that
>> adds 10 to a number.
That was just a small sample, intended to make the point clear. A longer, real-life example would have required me to explain that first, only to distract the readers from the main topic.
>> Java has a philosophy that is OO based,
I agree that the objective of Java designers was to create an Object Oriented language. The answer to whether they succeeded in implementing a pure OO language or not largely depends on how people define Object Oriented Programming. As per the definition of the man who invented the term OOP, Java is not a complete Object Oriented language. (Please see this page). For instance, the presence of primitive types and the facilities that let the programmer expose the internal state of an object makes Java a weak OO language.
>> and very much not functional based.
Then why do many want Java to bring in closures and first class functions? Look at these pages: http://www.javac.info/
http://blogs.sun.com/jag/entry/the_black_hole_theory_of I think even James Gosling is unhappy with inner classes!
Grant Rettke: Seems like a good reason for Java programmers to read the language spec; without it it is really anyone's guess as to what is happening to your code.
Günther Noack: I agree with your point that closures are probably *the* killer feature that eliminates the need for most other core language features.
However, I can totally understand scolebourne when saying that no one would use inner classes for this in Java. I don't think I'd believe in these ideas about language design after reading your article if I just had a background in a no-closures language.
It's hard to find an example that shows the power of closures to people who don't know them. I usually ask people how they'd do operations on a large set of objects that are contained in collection objects. Then I ask them to contemplate how often they do that in their projects. Finally, I show them something like this:
set.select({|x| x>10}).inject(0,{|a,b| a+b})
This does of course not show the full potential of closures, however it does make clear why one may want to pass around stuff like the function that calculates n+k. :)
Vijay Mathew:
  >> However, I can totally understand scolebourne when
  >> saying that noone would use inner classes for this
  >> in Java. I don't think I'd believe in these ideas  
  >> about language design after reading your article
  >> when I just had a background in a no-closures
  >> language.
As I said in my earlier comment, my intention was to show the power of simple and elegant language design without the distractions of an elaborate sample. So I chose a damn simple example.
>> set.select({|x| x>10}).inject(0,{|a,b| a+b})
I guess this is Haskell. The code is short and serves the purpose of displaying the elegance of higher-order functional programming. But the syntax might scare someone new to Haskell. I think the simple, one-line Scheme code snippet I gave is much more merciful to someone uninitiated.
Günther Noack: It's Ruby. Whether Scheme is easy to read or not probably depends on who you're going to show it to. People in my area mostly aren't familiar with Lisp syntax, so I usually choose not to scare them. ;)