I wish to reply to Richard M. Stallman's recent criticism of Tcl, and his
suggestion that programmers not use it. In doing so, I neither support
nor reject the use of Tcl: it is a useful, but flawed, tool for building
certain kinds of applications; it is asinine to invalidate the positive
experiences many programmers have had with the system. It is equal
folly to embrace it as a panacea without investigating its weaknesses,
especially when a plethora of alternatives now exists. To those who
enjoy Scheme's minimalist syntax, I wholeheartedly co-endorse his
embracing of the language.
This commentary presents some technical points on the benefits and
detractions of Tcl, and offers some insights as to the merits of
languages like Tcl (in comparison to Scheme). I will be presenting
similar arguments in the upcoming USENIX Symposium on Very High Level
Languages, and a paper on the subject is online [1]. A more precise
description of Tcl semantics and implementation may be found in [2].
As a student of John Ousterhout's and as a student of programming languages
and their implementations, I've spent a lot of time over the past two years
contemplating Tcl, its semantics and its performance. I've presented papers
on related work at both of the Tcl Workshops. It is from this
standpoint that I add my comments to the flurry of responses that RMS's
statement has generated.
> Why you should not use Tcl
> Richard Stallman, GNU Project
>
> ... intro deleted...
> Tcl was not designed to be a serious programming language. It was
> designed to be a "scripting language", on the assumption that a
> "scripting language" need not try to be a real programming language.
> So Tcl doesn't have the capabilities of one. It lacks arrays; it
> lacks structures from which you can make linked lists. It fakes
> having numbers, which works, but has to be slow. Tcl is ok for
> writing small programs, but when you push it beyond that, it becomes
> insufficient.
Tcl lacks pointers or pointer-like references; for example, there is no way
to pass by reference in Tcl. Tcl offers pass by name, so users not
wishing to _copy_ an array's contents when passing to a subroutine are
encouraged to either make the array a global variable (!)
or to pass its _string_name_ (the variable's string name, as it appears
in the source text of the caller procedure!) and have the callee use
"upvar" or "uplevel" to jump back into the caller's scope and grab a
reference to that variable and bind it to a local name in its own scope. In
other words, Tcl relies on a form of dynamic scoping to achieve
reference passing.
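
As a minimal sketch of the by-name idiom just described (the procedure and
variable names are made up for illustration), the caller hands over the
array's name and the callee uses "upvar" to alias it:

```tcl
# sum_array is a hypothetical helper: it receives the array's *name* and
# links a local alias to the variable living in the caller's scope.
proc sum_array {arrayName} {
    upvar 1 $arrayName arr      ;# "arr" now refers to the caller's array
    set total 0
    foreach key [array names arr] {
        incr total $arr($key)
    }
    return $total
}

set prices(apple) 3
set prices(pear)  4
set prices(plum)  5
puts [sum_array prices]         ;# pass the name "prices", not $prices; prints 12
```

Nothing is copied, but the callee only works if the string it receives
happens to name a variable visible one level up.
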
Tcl also lacks closures of any kind; it is impossible to "wrap" a section
of code and data together into some meaningful unit, especially a
stateful one (again, without use of global variables). The idea of
passing closures as arguments to functions, so popular in languages that
offer this, is unavailable to Tcl.
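
The usual workaround looks something like the following sketch (all names
here are hypothetical): the state a closure would have captured is parked
in a global array and looked up by a string handle.

```tcl
# A "counter object" without closures: state lives in a global array,
# keyed by a name the caller chooses.
proc counter_new {name} {
    global counter_state
    set counter_state($name) 0
    return $name
}

proc counter_next {name} {
    global counter_state
    incr counter_state($name)   ;# incr returns the new value
}

set c [counter_new hits]
counter_next $c
puts [counter_next $c]          ;# prints 2
```

The handle is just the string "hits"; any code that knows the name of the
global array can read or clobber the state.
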
Tcl lacks automatic memory management facilities. To reclaim the memory
associated with a binding, you have to explicitly "unset" it, altering
your namespace. Tcl programs do not exhaust memory because they tend to
be small and because for simple, reusable data structures, the Berkeley
Tcl implementation automatically reclaims values. Better stated:
memory management is not a difficult problem in a language that won't let
you use memory in very interesting ways.
Tcl is very slow. Tcl stores lists (its stand-in for arrays) as strings,
making the equivalent of vector-ref take time linear in the number of
elements. Tcl's associative arrays ("array variables") are implemented
as hash tables, but the only key type is strings, making this facility
very slow as well. Tcl passes data back to C as strings, and doesn't
cache their native-typed values across computations; when counting from
1 to 10,000, >90% of the time was spent converting from strings to
integers and vice versa. These problems are solvable, as my MS thesis
suggests, but John (and hence Sun) has no intention of implementing a
faster Tcl interpreter for at least a year.
Here are some performance numbers I've made on my DEC Alpha (figure 100
million instructions per second, or 0.01 usec per instruction):

  operation          language   expression        time
  scalar access      Tcl        set a             13 usec
                     Tcl        $a                9 usec
                     C          a                 can be optimized away, else 0.01 usec
  list access        Tcl        lindex $L1 200    500 usec
  (short elements)   C          a[199]            0.02 usec

More performance numbers can be found in the MS thesis, although their C
equivalents are not shown.
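
For readers who want to take comparable measurements themselves, a rough
sketch using Tcl's built-in "time" command follows. The list length and
iteration count are arbitrary choices of mine, and the results depend
entirely on the machine and the interpreter version, so don't expect to
reproduce the figures above.

```tcl
# Build a list of 500 short elements.
set L1 {}
for {set i 0} {$i < 500} {incr i} {
    lappend L1 $i
}
set a 0

# "time script count" runs the script count times and reports the
# average cost per iteration.
puts "set a 1        : [time {set a 1} 1000]"
puts "set b \$a       : [time {set b $a} 1000]"
puts "lindex \$L1 200 : [time {lindex $L1 200} 1000]"
```
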
Tcl also has some serious design flaws (not unlike most languages, I
suppose). For example, it inherits C's arithmetic semantics, leading to
behavior like:
    1234567890 * 1234567890     = 304084036             (Linux PC)
    1234567890 * 1234567890     = 1524157875019052100   (DEC Alpha)
    1234567890.0 * 1234567890.0 = 1.52527e+18           (Linux PC)
    1234567890.0 * 1234567890.0 = 1.52416e+18           (DEC Alpha)
The argument in favor of correct mathematics is not simply a
beauty-contest argument; it follows directly from Tcl's claim to be a
"Very High Level Language" (as per its appearance in the upcoming USENIX
VHLL Symposium). If you want to abstract away the machine from the end
user, it's wise not to propagate its internal word size to users trying
to build user interfaces for scientific applications!
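
The 32-bit integer result quoted above is exactly what word-size
wraparound predicts: it is the true product with everything above the low
32 bits thrown away. The sketch below checks this; it assumes an
interpreter whose integers do not overflow (Tcl 8.5 or later, which is
not the Tcl discussed in this post).

```tcl
# Assumption: Tcl 8.5 or later, where expr integers are arbitrary precision.
set exact   [expr {1234567890 * 1234567890}]
set wrapped [expr {$exact % (1 << 32)}]     ;# keep only the low 32 bits

puts $exact      ;# 1524157875019052100, the 64-bit (DEC Alpha) answer
puts $wrapped    ;# 304084036, the 32-bit (Linux PC) answer
```
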
To be fair, Tcl does largely what it was intended to do. John Ousterhout's
code is among the cleanest, best-written code I have ever had the
pleasure of browsing, and it represents a terrific implementation of the
given set of ideas, born out of writing (and rewriting) many "mini"
languages, none of which did everything he wanted. Tcl is a miracle
solution for certain classes of problems, such as configuration,
simple control, and other add-ons to much larger bodies of existing code.
For prototyping small systems with user interfaces, it is fantastic.
Research labs will adore Tcl, much as they have embraced Unix (as a
hackable OS), C (allowing the programmer to do _anything_), Perl (as a
Unix sysadmin tool second to none), etc.
> Tcl has a peculiar syntax that appeals to hackers because of its
> simplicity. But Tcl syntax seems strange to most users. If Tcl does
> become the "standard scripting language", users will curse it for
> years--the way people curse Fortran, MSDOS, Unix shell syntax, and
> other de facto standards they feel stuck with.
A stronger argument can be made: Tcl supports a single level of
substitution, confusing nearly all new users of the language.
While it is generally seen as unreasonable to foist parenthesized forms
(and hence prefix-notation mathematics, etc.) on the programmers of the
world, Tcl's system is hardly a cure:
    "expr $a+5" is not the same as "expr {$a+5}"
If a is set to "45" (the string four-five), then both work. If a is
itself a math expression such as "4+5", the former works where the latter
raises an error. Similar bizarrities can be found throughout the language.
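
A short sketch of the two forms side by side, with a set to the string
"4+5" (the variable names are only for illustration):

```tcl
set a 4+5

# Unbraced: the Tcl parser substitutes $a first, so expr is handed the
# text "4+5+5" and evaluates it as one expression.
puts [expr $a+5]                        ;# prints 14

# Braced: expr receives the literal text $a+5 and substitutes $a itself,
# as a single operand. "4+5" is not a number, so this raises an error.
if {[catch {expr {$a+5}} msg]} {
    puts "braced form failed: $msg"
}
```
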
Furthermore, Tcl doesn't really eliminate prefix notation. For example, to
assign a variable, you have to say:
    set a 5
instead of the more obvious
    a = 5
Of course, the second word ("a") is just a string; it's not really a
binding or a variable. Thus you can accidentally say,
    set $a 5
and this assigns to whichever variable is named by the
_value_ of the variable "a" (assume it exists). If the current value of
"a" were "b", then this would do something very weird:
    set b 5
    set a b
    #...many lines later...
    set $a c    ;# overwrites b with "c"; a itself is untouched
Tcl uses two different namespaces for procedures and data. You cannot use
"set" to define a procedure, for example. Thus you can have variables
with the same names as procedures.
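
A tiny sketch of the two namespaces coexisting (the name "size" is
arbitrary):

```tcl
proc size {} { return "procedure result" }
set size 42

puts [size]     ;# command lookup: calls the procedure
puts $size      ;# variable lookup: prints 42
```
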
More generally, you're allowed to give variables and procedures almost any
name, including names beginning with arbitrary punctuation and/or numbers.
For example:
    proc 5 {a} {...}
    # you forgot a backslash at the end of this next line.
    # and tcl tries to call the function "5" with the value "6".
    myFun blah blah blah ................................... blah blah
    5 6
Since this is obviously horrible coding style and not at all valuable in
practice, this can only serve to confuse programmers with rare but
bizarre errors.
> For these reasons, the GNU project is not going to use Tcl in GNU
> software. Instead we want to provide two languages, similar in
> semantics but with different syntaxes. One will be Lisp-like, and one
> will have a more traditional algebraic syntax. Both will provide
> useful data types such as structures and arrays. The former will
> provide a simple syntax that hackers like; the latter will offer
> non-hackers a syntax that they are more comfortable with.
Scheme as a Replacement for Tcl
----------------------------------------
I personally don't see Scheme as a replacement for Tcl in the long run.
Scheme lacks the following features found in Tcl:
- Algol-like syntax.
- variable traces (handy in debugging and in some applications).
- implicit conversion from one data type to another. If you pass an integer
  to a routine demanding a float, it works because all ints are also floats.
  The same is true in reverse for a subset of all floats, and errors are
  only generated for non-integer values. This feature generalizes to
  lists, hash tables, etc.
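
To illustrate the variable traces mentioned in the list above, here is a
minimal sketch that logs every write to a variable. It assumes the
"trace add variable" spelling found in later Tcl releases (8.4 and up);
older interpreters spell the same request "trace variable speed w
log_write". The procedure and variable names are hypothetical.

```tcl
# Called by the interpreter after each write to the traced variable.
# name1 is the variable's name; name2 is the element name for arrays.
proc log_write {name1 name2 op} {
    upvar 1 $name1 var
    global trace_log
    lappend trace_log "$name1 set to $var"
}

set trace_log {}
set speed 0
trace add variable speed write log_write

set speed 10
set speed 20
puts [join $trace_log "\n"]
```

This prints "speed set to 10" and "speed set to 20", one per line; the
initial "set speed 0" happens before the trace is installed and is not
logged.
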
Scheme is not the only possible replacement language; ML works just as
well, depending on your stance on dynamic vs. static typing.
Again, more elaborate arguments may be found in the online papers,
and I'll be presenting related commentary at the upcoming USENIX VHLL
conference. I should add, however, that I'm sufficiently convinced that
this is the "right way to do things" that I've devoted part of the past
year to designing and implementing Rush, a language that trivially
compiles into Scheme, but offers these features. Without attempting
heroics, Rush clocked in at 50-300x faster than Tcl.
Rush is still "in the lab", but STk (as Stallman points out) is a viable
alternative to Tcl for those not scared off by parentheses.
Adam Sah
PhD student
UC Berkeley Dept. of Computer Science
asah@cs.berkeley.edu
References
--------------
[1] A New Architecture for the Implementation of Scripting Languages.
to appear in USENIX Symposium on Very High Level Languages.
ftp://ginsberg.cs.berkeley.edu:/pub/papers/asah/rush-vhll94*
[2] TC: An Efficient Implementation of the Tcl Language
UC Berkeley Technical Report #UCB/CSD-94-812
ftp://ginsberg.cs.berkeley.edu:/pub/papers/asah/msthesis*