Glenn Vanderburg

Static vs. Dynamic typing, part 3

17 May 2003

One more thing about the typing debate in today’s expert panel. Dave Thomas said, at one point, “Java and C++ have equated an object’s type with its class, and that’s wrong. The type of an object isn’t its class; it’s what the object can do.”

I agree with that, and I think it’ll be interesting to go revive a four-year-old piece I wrote as part of a WikiWikiWeb debate on the merits of multiple inheritance.

I think multiple inheritance is relatively unimportant and rarely useful, and I’m happy to be working in a language (Java) that does not support it. In part, I’ve come to believe that inheritance is given far too much importance by most OO languages, designs, developers, and pundits.

My thoughts about inheritance have been evolving since I learned Java. The revelations I’ve had may not seem like much to a Smalltalk programmer, but they represent a complete shift in my thinking.

Nearly every introduction to OO concepts I’ve ever read or seen has dealt with inheritance very early, and then moved on to a cursory discussion of “polymorphism” as a sort of nice side-effect of inheritance. Partly as a result of this (and partly because of the underlying misunderstanding) the word “inheritance” is usually used to refer to some combination of behavior inheritance and subtyping.

Java is the first language I have used that mostly separates those two concepts. Extending or implementing an interface represents a subtyping relationship, whereas extending a class represents the more traditional combination of subtyping and behavior inheritance. The process of using and designing with interfaces has brought subtyping out of the shadows and into the foreground of my thinking.
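A minimal sketch of that separation in Java. All of the names here (Greeter, FormalGreeter, TemplateGreeter, CasualGreeter) are hypothetical and chosen only for illustration. Implementing the interface gives a pure subtype relationship; extending the abstract class gives subtyping plus inherited behavior.

```java
import java.util.Arrays;
import java.util.List;

class SubtypingDemo {
    // Pure subtyping: the interface says what an object can do, not how.
    interface Greeter {
        String greet(String name);
    }

    // A subtype of Greeter that inherits no behavior from anyone.
    static class FormalGreeter implements Greeter {
        public String greet(String name) {
            return "Good day, " + name + ".";
        }
    }

    // Subtyping plus behavior inheritance: subclasses reuse greet().
    static abstract class TemplateGreeter implements Greeter {
        protected abstract String salutation();

        public String greet(String name) {
            return salutation() + ", " + name + "!";
        }
    }

    static class CasualGreeter extends TemplateGreeter {
        protected String salutation() {
            return "Hi";
        }
    }

    // Client code depends only on the type (Greeter), never on a class.
    static String greetAll(List<Greeter> greeters, String name) {
        StringBuilder sb = new StringBuilder();
        for (Greeter g : greeters) {
            sb.append(g.greet(name)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Greeter> greeters =
                Arrays.<Greeter>asList(new FormalGreeter(), new CasualGreeter());
        System.out.print(greetAll(greeters, "Ada"));
    }
}
```

The greetAll method works the same way for both classes, even though only one of them inherited any implementation.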

I’ve begun teaching the concepts of inheritance and polymorphism the other way ‘round: polymorphism and interfaces come first, as the fundamental issue, and behavior inheritance comes later, as a nice facility (more, but not much more, than a convenience) when two subtypes share significant portions of their behavior or implementation. It seems to work well. Many of the common inappropriate uses of inheritance never occur to programmers who learn it this way. I’ve found that the method also helps in the explanation of abstract classes and when to use them.

Consider a language that makes the separation even more clear: subtyping uses one mechanism, and behavior inheritance can be specified without also producing a subtype relationship. (I’m speaking hypothetically, but I won’t be surprised to learn that such a language actually exists.) Behavior inheritance becomes a kind of specialized composition or delegation facility. Some of the traditional objections and implementation complexities of MI disappear.
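Java has no such facility, but hand-written delegation approximates the idea. The sketch below is only an illustration under that assumption: the Stack interface and ListBackedStack class are hypothetical names. The class reuses ArrayList's behavior through a private field without becoming a subtype of List.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    // The type: what a stack can do.
    interface Stack<T> {
        void push(T item);
        T pop();
        boolean isEmpty();
    }

    // Reuses list behavior by delegation; it is a Stack, but not a List.
    static class ListBackedStack<T> implements Stack<T> {
        private final List<T> items = new ArrayList<>();

        public void push(T item) {
            items.add(item);
        }

        public T pop() {
            if (items.isEmpty()) {
                throw new IllegalStateException("pop on empty stack");
            }
            // remove(int) removes by index and returns the removed element.
            return items.remove(items.size() - 1);
        }

        public boolean isEmpty() {
            return items.isEmpty();
        }
    }

    public static void main(String[] args) {
        Stack<String> stack = new ListBackedStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop()); // prints "b"
        // List<String> list = stack;    // would not compile: no subtype relationship
    }
}
```

Because the reused behavior never shows up in the type, callers cannot insert into the middle of the stack the way they could if it had extended a list class.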

My conclusion, then, is that the strongly-typed OO community may have been going down a side path for most of its history, conflating two concepts that are actually separate. Perhaps, once we’ve corrected that misdirection, the time will come for MI to move back to center stage. For now, though, I’m pleased that Java avoids MI, if only because having to do without it has helped me to understand the fundamental issues more clearly. (From talking to colleagues, I believe it is having a similar effect on others, too.)

I realized almost immediately after posting that piece that Smalltalk and its ilk (including, for example, Ruby) have essentially the characteristic I was talking about, where subtyping and inheritance are separate concepts. With those languages in particular, that distinction is present because there is no subtyping at all … the notion really doesn’t exist in those languages, because typing as Java and C++ folks think of it doesn’t exist.

Static vs. Dynamic typing, part 2

17 May 2003

(Part 1)

The other assumption Ted Neward said he was questioning these days (at least, the other one I want to comment on) is that strongly typed languages are likely to be more efficient, because they give the compiler or VM more information that can be used to optimize things.

That may well be true. But I think the performance gap between statically and dynamically typed languages is surely narrowing, to the point where (combined with increases in hardware speed) it’s usually a non-issue.

Untyped languages can be really fast. Check out all the cool Smalltalk things in Alan Kay’s keynote at ETech. I’ve been trying to find the time to start serious work on an app I want to write. As part of trying to decide what language to write it in, I’ve written essentially the same program in ObC/Cocoa, Java, and RubyCocoa. The ObC version is fastest, but not by much, and the Java and RubyCocoa versions perform almost identically.

It’s also interesting to reflect that the technology in Hotspot that makes modern Java pretty fast originated in attempts to make Self run fast, and Self is as dynamically typed as a language can be.

The history of programming language optimization is mostly the same story repeating. Languages include features and constructs designed to help the runtime system be efficient. And then implementation technology advances, and those same things that once helped the compiler generate fast code are suddenly inhibiting that process. I wonder whether the same thing will happen with static typing?

(Part 3)

Static vs. Dynamic typing, part 1

17 May 2003

It’s expert panel time at the Rocky Mountain Software Symposium, and one of the first questions was “What do you guys think about the whole static/dynamic typing debate?” (I suspect he was a plant, because before the panel started the panelists had decided that they wanted to talk about that issue if they could.)

Ted Neward said a couple of interesting things. I don’t want to misrepresent him, because he ultimately said that he is now questioning all of the assumptions that he’s always believed about static typing. But I was interested in what he said those assumptions were, and I guess the purpose of this blog entry is to question those assumptions on Ted’s behalf.

One of the assumptions was that static typing really helps security analysis on platforms (like Java) with mobile (and therefore possibly untrustworthy) code. The VM (or interpreter, or what have you) is able to use the type safety to help enforce the security model.

When Java first hit the streets, mobile code was a hot topic. General Magic was promoting their Magic Cap environment, featuring mobile code (“agents”) heavily, and powered by a language called Telescript. Nathaniel Borenstein was researching “active mail”, sending active invitations and the like via email using a dialect of Tcl called Safe-Tcl. Someone (I can’t remember who at the moment) was developing roughly equivalent functionality in Perl (the Safe.pm module). Luca Cardelli at DEC was developing a beautiful and novel little language called Obliq.

All of those languages supported secure mobile code, and all of them used very different security models. My memory of Telescript is fuzzy, but I know for a fact that Java is the only one of the rest that is statically typed. And I remember from my evaluation at the time that Java and Telescript had the two most complex security models (and complexity is not a good thing in a security model).

Static typing is one of the tools you can use to build a security model, but there are many others.

(Part 2, part 3.)

iTunes 4, I18N, and Antialiasing

09 May 2003

Why does the new iTunes no longer use antialiased fonts (except when there are accented characters in the string)?

Update: Matt Brubeck (any relation to the Brubeck from the image above?) forwarded an explanation from John Gruber of Daring Fireball fame:

In short, iTunes 4 uses 9-point text for these lists, and Mr. Vanderburg has changed the default settings for anti-aliasing in the General panel of System Prefs such that 9-point text is not anti-aliased. Thus, what he is seeing is what he asked for.

I don’t remember asking for that, but I suppose I did. And with the new release, iTunes changed from 10-point to 9-point. But what really made the ugliness obvious was the occasional antialiased line, like the Béla Fleck CD above. What’s going on there? John continues:

However, Jaguar introduced a change in text rendering such that Unicode text strings are always anti-aliased, in every application, no matter what your pref settings are. That’s why text with accented characters is anti-aliased, but plain ASCII is not.

Thanks Matt, and John. Preferences setting changed; all better now. :-)

Update 2: Yes, Matt Brubeck is a distant cousin of the famous Dave. What a delightful coincidence! I blog a weird problem, just happening to use a Dave Brubeck CD in the screenshot. And one of my blog readers (there aren’t that many!), who happens to have the wherewithal to find out the answer to the problem, is a relative.

As Duncan says: I love blogspace!

Assertions and Tests

06 May 2003

Part 9 of Bill Venners’ interview with Dave and Andy talks about how to effectively use assertions in your code. The discussion reminded me of some thoughts that were bouncing around my head a few weeks ago about the relationship between assertions and tests.

(None of this is new or original; in fact, I’m sure I’ve read all of these thoughts somewhere before. But I think they’re worth repeating.)

In late March I was finally able to sit in on Mike Clark’s talk on test-driven development. It was a great talk, but there was one question from the audience that really bothered me, because I knew there was a flawed assumption in the question, and I felt that the answer should be obvious, but neither Mike nor I could see it at that moment.

The question was from a lady who apparently had a background in the Eiffel language. She talked about Eiffel’s design-by-contract constructs: assertions for preconditions, postconditions, and class invariants. And then: “Isn’t that a better approach, so that you would have the tests actually in the code?”

It’s an excellent question. And the answer (which became crystal clear to me 10 minutes after the talk was over) is simple: assertions are not tests.

So what are tests?

Tests involve two parts: behavior checks, and input data. Assertions can partially do the behavior checking, but they don’t supply the input data. And there are very good reasons for having both parts in one place.

From a unit-testing point of view, the assertions that matter most are the postconditions. They have the primary job of verifying that the methods they are attached to did the right thing. (Preconditions can be used for correctly handling invalid input, but there are some good reasons not to use them for that.)

Postconditions must be generalized: they must work for all possible inputs. In other words, the postcondition is a different (ideally more declarative) way of expressing the result of the same computation performed by the body of the method. Therefore, for a method that does fairly complicated things with its input, the postconditions must either:

1. depend on the same helper methods as the body of the method;
2. be just as complicated (and likely to contain bugs) as the body of the method; or
3. be just a sanity check rather than a full validation.

And in fact, the last is most common. Consider a method, byte[] md5(String text), that calculates an MD5 secure checksum on its input. That’s a complicated mathematical operation. So consider how you’d write the postcondition. Option 1 might be practical, but isn’t much help from a unit-testing perspective. Option 2 is not really practical. More than likely, you’ll fall back to option 3, just checking that the result is 16 bytes long (because all MD5 checksums are 128 bits long) or something similar.
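A rough sketch of what that sanity-check postcondition might look like in Java. The md5 signature comes from the text above; the use of java.security.MessageDigest, the UTF-8 encoding of the input, and the class name are assumptions made for the example. Note that Java's assert statement only runs when assertions are enabled (the -ea flag).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Postcondition {
    static byte[] md5(String text) {
        try {
            // Assumption: the text is hashed as UTF-8 bytes.
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));

            // Postcondition as a sanity check only: it confirms the shape of
            // the result (128 bits), not that the checksum is correct.
            assert digest.length == 16 : "MD5 digest must be 16 bytes";

            return digest;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5("abc").length); // prints 16
    }
}
```

A method that returned sixteen zero bytes for every input would satisfy this postcondition just as well, which is the point: the assertion cannot stand in for a test.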

Unit tests, on the other hand, just have to check particular inputs and outputs for something like this. So you can supply canned input and test the results against precalculated checksums. You might calculate checksums for the test data using different tools, so that your test is more for interoperability than correctness (you’re trusting those other implementations to be correct). It might be practical to hand-calculate an MD5 checksum for a small input to round out the test.
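Here is a sketch of such a test, written as a plain Java program so that it stands alone; a real project would more likely use a framework such as JUnit. The expected values are the published MD5 test-suite checksums from RFC 1321, so this check leans on that reference rather than on hand calculation. The class and helper names are hypothetical, and the md5 helper assumes UTF-8 input.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5KnownAnswerTest {
    // Method under test (assumption: input is hashed as UTF-8 bytes).
    static byte[] md5(String text) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Renders bytes as lowercase hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Canned inputs paired with precalculated checksums (RFC 1321 test suite).
    static final String[][] VECTORS = {
        {"", "d41d8cd98f00b204e9800998ecf8427e"},
        {"a", "0cc175b9c0f1b6a831c399e269772661"},
        {"abc", "900150983cd24fb0d6963f7d28e17f72"},
    };

    // The test supplies both the input data and the behavior check.
    static void runTests() {
        for (String[] v : VECTORS) {
            String actual = hex(md5(v[0]));
            if (!v[1].equals(actual)) {
                throw new AssertionError(
                        "md5(\"" + v[0] + "\") = " + actual + ", expected " + v[1]);
            }
        }
    }

    public static void main(String[] args) {
        runTests();
        System.out.println("all MD5 known-answer checks passed");
    }
}
```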

So what are assertions?

So what are assertions, then? What are they good for?

Assertions serve three primary purposes:

They serve as extremely basic tests, catching basic error conditions during development.
They are internal documentation, helping readers to see what the developer understood about the code as it was being written.
They are a fail-fast mechanism, ensuring that errors are reported close to where they occur.
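As a small illustration of the second and third purposes, here is a sketch using Java's assert statement. The percentOf helper is hypothetical, and the assertions only run when the JVM is started with -ea.

```java
class FailFastDemo {
    // Hypothetical helper: what percentage of 'whole' is 'part'?
    static int percentOf(int part, int whole) {
        // Documents what the developer assumed about the arguments, and
        // (with -ea) reports a violation here rather than somewhere downstream.
        assert whole > 0 : "whole must be positive: " + whole;
        assert part >= 0 && part <= whole : "part out of range: " + part;

        int result = part * 100 / whole;

        // A basic sanity check on the result, not a full validation.
        assert result >= 0 && result <= 100 : "percentage out of range: " + result;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(percentOf(1, 4)); // prints 25
    }
}
```

None of these assertions supplies input data, so on their own they exercise nothing; they only speak up when some caller, or some test, drives the method.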

Those things can still be very valuable, but in my opinion the least valuable role of assertions is the testing role. Assertions cannot and should not be viewed as comprehensive tests.