Optimizing Python Generator Functions in the AST

April 19th, 2008

I know I’ve written quite a few Python articles of late, so it may be frustrating for some of you to see yet another Python-specific post. There is a reason for it! First, Python is perhaps my favourite scripting language (Ruby’s a close second, if only because I’ve had some bad experiences with Ruby bindings to C libraries in the past – probably related to language popularity at the time). Second, the AST optimization patch I’ve been working on has meant I’ve been pretty immersed in the source code of the Python compiler for the last few weeks.

Anyway, this post is a direct result of some work I’ve been doing with the AST optimization patch. Specifically, I’ve been seeing test_generators fail with a couple of specific optimizations.

In the current Python trunk, code like this:

def mygenerator1():
  if 0:
    yield 5
  return

or this:

def mygenerator2():
  return
  yield 7

will result in the function being treated as a generator, even though the generator will never actually yield anything. That’s fine, but what happens when we throw the AST optimizer into the mix? Suddenly mygenerator1 becomes:

def mygenerator1():
  return

Similarly, mygenerator2 becomes:

def mygenerator2():
  return

What effect does this have when we actually call our generator functions? Here’s an example of what happens with mygenerator1 before the optimizer does its work:

>>> mygenerator1()
<generator object at 0xb7d2bc2c>
>>>

And here's what happens after the AST optimization patch is applied:

>>> mygenerator1()
>>>

As you can see, mygenerator1 is no longer a generator function with the AST optimizer in effect. Ouch.

In order to try and find a way to make optimization of generator functions a possibility in the AST optimizer, I went digging through the Python source code to investigate how and when an ordinary function becomes a generator function. The answer is a little complicated, but I'll do my best to summarize it all here.

It all begins in Python/compile.c's PyAST_Compile function. Before any code generation takes place, PySymtable_Build (see Python/symtable.c) is called to generate a symbol table from the AST being compiled.

PySymtable_Build in turn visits every node in the AST to construct the symbol table for compilation. Among the visitor functions is symtable_visit_expr, which is where a function is initially marked as a generator function if it is found to contain a "Yield" node. This is done by setting the ste_generator field of the current symtable_entry:

           case Yield_kind:
                if (e->v.Yield.value)
                    VISIT(st, expr, e->v.Yield.value);
                st->st_cur->ste_generator = 1;
                // ... code omitted ...

st_cur is effectively the symbol table entry for the current local namespace, and the "yield" statement can only appear inside a function. So here, when we're marking st_cur as a generator, we're effectively saying "this function is a generator!". This is only the first step in the making of a generator function.

Later on in the compile process (see Python/compile.c), we hit the compile_function function, which is responsible for generating the Python bytecode for a FunctionDef AST node. compile_function calls assemble() to build the code object that will be executed when this function is called. The assemble() function calls the makecode() function which in turn calls compute_code_flags(). In compute_code_flags, we find this:

    if (ste->ste_type == FunctionBlock) {
        if (!ste->ste_unoptimized)
            flags |= CO_OPTIMIZED;
        if (ste->ste_nested)
            flags |= CO_NESTED;
        if (ste->ste_generator)
            flags |= CO_GENERATOR;
    }

The result of this function will then be used as the co_flags for the PyCodeObject generated by the assemble() function. So, as we can see here, if the ste_generator field is set by PySymtable_Build for our function, our corresponding PyCodeObject is going to have its CO_GENERATOR flag set.
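The flag is visible from Python itself, which makes for a quick sanity check. Here's a minimal sketch, assuming a modern Python 3 interpreter where the code object lives in `__code__` and the flag constant is exposed as `inspect.CO_GENERATOR`; the helper name `is_generator_code` is made up for illustration:

```python
import inspect


def mygenerator1():
    if 0:
        yield 5
    return


def mygenerator2():
    return
    yield 7


def plain():
    return


def is_generator_code(func):
    """Return True if the function's code object carries CO_GENERATOR."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


for func in (mygenerator1, mygenerator2, plain):
    print(func.__name__, is_generator_code(func))
```

Both functions with an unreachable yield still report the flag, while the plain function does not.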

However, this still doesn't explain the behaviour we see when we actually call a generator function. Ordinary functions will just pass back whatever value is given to "return" (or Py_None, in the event that no "return" is explicitly given, or if no value is given to "return"). In the case of a generator function, however, the call will return a new "generator" object. So where does this generator come from?

Inside Python/ceval.c we find PyEval_EvalCodeEx. This function will eventually be called either by run_mod or run_pyc_file in Python/pythonrun.c or by exec_statement inside Python/ceval.c. PyEval_EvalCodeEx will step into itself recursively until it hits the execution frame of our generator function. Like the symtable_entry for the Symtable, the "frame" can be thought of as a sort of "local namespace" for the code object being executed at runtime. This is a simplified explanation, but it should be enough to make the concepts presented here easy enough to follow.

The frame for the function being executed has a code object associated with it, which in turn has the co_flags that were set for the function at compile time. Thus, PyEval_EvalCodeEx merely needs to test for the existence of the CO_GENERATOR flag:

    if (co->co_flags & CO_GENERATOR) {
        /* Don't need to keep the reference to f_back, it will be set
         * when the generator is resumed. */
        Py_XDECREF(f->f_back);
        f->f_back = NULL;

        PCALL(PCALL_GENERATOR);

        /* Create a new generator that owns the ready to run frame
         * and return that as the value. */
        return PyGen_New(f);
    }

The call to PyGen_New() seen here is how our generator function is truly differentiated from an ordinary function: the resulting PyObject will be a generator instance (see Objects/genobject.c). This will eventually bubble up to your Python code and, whenever we call the next() method or pass the generator off to a for loop, the tp_iternext method on the generator will be called to execute the function and yield the next available value. Once a StopIteration exception is raised or the function returns (which, within the generator object, results in a StopIteration being raised anyway), the generator will have ceased execution.
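Seen from the Python side, that behaviour looks roughly like this. This is a small sketch assuming Python 3 (where the builtin next() replaces the old next() method); `empty_generator` is just an illustrative name:

```python
import types


def empty_generator():
    return
    yield 7  # never reached, but still makes this a generator function


gen = empty_generator()

# Calling the function does not run its body; it hands back a generator object.
print(isinstance(gen, types.GeneratorType))

# The body only runs once we ask for a value. It returns straight away,
# which surfaces as StopIteration.
try:
    next(gen)
except StopIteration:
    print("exhausted immediately")

# A for loop (or list()) swallows the StopIteration for us.
print(list(empty_generator()))
```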

*exhales* So that's how generator functions work. :)

Despite all this analysis, the only work-around I can see is doing a full scan of the FunctionDef body for Yield nodes. If we find a Yield anywhere in the body, then rather than replacing an "if" node with a "pass", we might just insert an unreachable "yield" somewhere to force the compiler to produce a proper generator function. This isn't really an ideal solution, but it would require the least number of changes to the existing code base. Another option involving a bit more work would introduce an annotated AST: we can mark FunctionDef nodes as "generators" at the AST level.
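To give a feel for the "full scan" idea, here's a rough sketch written against Python 3's `ast` module rather than the C-level AST the patch actually works on. The names `contains_yield` and `_SCOPE_NODES` are my own for illustration, and skipping nested scopes wholesale is a simplification:

```python
import ast

# Nodes that open a new scope: a yield inside them belongs to the inner scope.
_SCOPE_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda, ast.ClassDef)


def contains_yield(func_node):
    """Return True if func_node's own body (ignoring nested scopes) has a yield."""
    stack = list(func_node.body)
    while stack:
        node = stack.pop()
        if isinstance(node, (ast.Yield, ast.YieldFrom)):
            return True
        if isinstance(node, _SCOPE_NODES):
            continue  # simplification: do not look inside nested scopes at all
        stack.extend(ast.iter_child_nodes(node))
    return False


source = """
def gen():
    if 0:
        yield 5
    return

def outer():
    def inner():
        yield 1
    return inner
"""

results = {node.name: contains_yield(node) for node in ast.parse(source).body}
print(results)
```

An optimizer could run a check like this before touching a function body and only then decide whether it needs to preserve generator-ness.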

In any case, the ultimate solution to this requires a broader discussion than what I can cover here. The exploration into how Python "knows" a function is actually a generator function was quite interesting, nonetheless. Hopefully it's interesting to somebody else out there. :)

Categories: Software Development | No Comments

The Internals of Python’s IMPORT_NAME Bytecode

April 14th, 2008

This was originally planned as a response to this post by Paul Bonser, but grew a little unwieldy (and his comment submission form seems to be broken?).

Effectively, Paul was (somewhat sleepily) mulling over the workings of the IMPORT_NAME bytecode. This bytecode is generated in response to Python code like the following:

import sys

And also for:

from foo import bar, baz

You’ll have to see the original post for the actual bytecode generated for this code, but Paul was asking why the latter syntax generates an IMPORT_NAME bytecode instruction which seems to do nothing at all with the fromlist, then generates additional IMPORT_FROM bytecodes that fetch the fromlist attributes from the parent module.
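If you'd rather poke at the bytecode yourself, something like the following should do. It's a sketch assuming Python 3's `dis.get_instructions`; exact instruction listings differ between interpreter versions, so it only picks out the import-related opcodes. Note that compiling doesn't execute the import, so `foo` doesn't need to exist:

```python
import dis

code = compile("from foo import bar, baz", "<example>", "exec")

# Keep only the import-related instructions and the names they refer to.
import_ops = [
    (instr.opname, instr.argval)
    for instr in dis.get_instructions(code)
    if instr.opname.startswith("IMPORT_")
]

for opname, name in import_ops:
    print(opname, name)
```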

The documentation for __import__ somewhat solves this mystery:

Note that even though locals() and ['eggs'] are passed in as arguments, the __import__() function does not set the local variable named eggs; this is done by subsequent code that is generated for the import statement. (In fact, the standard implementation does not use its locals argument at all, and uses its globals only to determine the package context of the import statement.)

Essentially, when the IMPORT_NAME is executed for ‘from foo import bar, baz’, the __import__ builtin is called with the fromlist and a few other arguments (namely the globals() and locals() from the current frame of execution) to provide custom import handling for your Python programs. For example, you may want to prevent users of your program from writing scripts that import certain modules. I imagine Google’s App Engine might be using something like this to prevent access to certain evil or unavailable modules (but that’s just a wild, unfounded guess).

The code in Python/ceval.c for IMPORT_NAME seems to back this up (I’ve annotated the code with a few comments):


        case IMPORT_NAME:
            w = GETITEM(names, oparg);
            /* 1. LOCATE THE __import__ BUILTIN */
            x = PyDict_GetItemString(f->f_builtins, "__import__");
            if (x == NULL) {
                PyErr_SetString(PyExc_ImportError,
                        "__import__ not found");
                break;
            }
            Py_INCREF(x);
            v = POP();
            u = TOP();
            /* 2. BUILD THE LIST OF ARGUMENTS FOR __import__ USING THE fromlist, globals() AND locals() */
            if (PyInt_AsLong(u) != -1 || PyErr_Occurred())
                w = PyTuple_Pack(5,
                        w,
                        f->f_globals,
                        f->f_locals == NULL ?
                          Py_None : f->f_locals,
                        v,
                        u);
            else
                w = PyTuple_Pack(4,
                        w,
                        f->f_globals,
                        f->f_locals == NULL ?
                          Py_None : f->f_locals,
                        v);
            Py_DECREF(v);
            Py_DECREF(u);
            if (w == NULL) {
                u = POP();
                Py_DECREF(x);
                x = NULL;
                break;
            }
            READ_TIMESTAMP(intr0);
            v = x;
            /* 3. CALL __import__ WITH THE module name, fromlist, globals() AND locals() */
            x = PyEval_CallObject(v, w);
            Py_DECREF(v);
            READ_TIMESTAMP(intr1);
            Py_DECREF(w);
            SET_TOP(x);
            if (x != NULL) continue;
            break;

So this answers the question of why IMPORT_NAME needs the fromlist in the first place: it is merely passed along to __import__ to make it available to custom import handling code. But why aren’t the fromlist attributes added to the namespace inside IMPORT_NAME? I’m guessing it was a design decision: we already have an opcode for adding elements to the namespace, so why have a special case for imports? Of course the details may be more involved than that, but it’s the most obvious explanation I can think of.
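You can watch the fromlist arrive at __import__ by temporarily swapping in a wrapper. This is a minimal sketch assuming Python 3, where the builtin lives in the `builtins` module and takes a fifth `level` argument; `logging_import` and `seen` are illustrative names only:

```python
import builtins

seen = []
original_import = builtins.__import__


def logging_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Record the module name and fromlist, then defer to the real __import__."""
    seen.append((name, tuple(fromlist or ())))
    return original_import(name, globals, locals, fromlist, level)


builtins.__import__ = logging_import
try:
    from os import sep, linesep
finally:
    # Always restore the real builtin, even if the import fails.
    builtins.__import__ = original_import

print(seen)
```

A wrapper like this could just as easily raise ImportError for module names it doesn't like, which is the sort of custom import handling described above.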

In any case, thanks for the thought-provoking post, Paul!

UPDATE: Seems my comment made it through to his blog after all. Sorry for the double-up!

Categories: Compilers, Python, Software Development | 2 Comments

Python 2.6a2: Compile ASTs from within Python code

April 4th, 2008

I’m not going to go into too much depth because Georg Brandl has already covered it, but it’s an interesting topic. I couldn’t help but write a little entry about it. :)

A new alpha of Python has just been released, including a patch I wrote for compiling Python Abstract Syntax Trees down to bytecode. This means it’s now possible to manipulate ASTs from within your Python program, which lets you do all sorts of crazy things – like this, for example.

Piping this little program into itself yields the following:

$ ./python wacky.py <wacky.py
Bwahaha! I was once an Assign node!
Bwahaha! I was once an Assign node!

Neat huh?
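For the curious, the general idea can be sketched along these lines. To be clear, this is not the original wacky.py: it's a rough reconstruction assuming Python 3's `ast` module (which postdates 2.6a2), and the class name `AssignReplacer` is invented for the example:

```python
import ast


class AssignReplacer(ast.NodeTransformer):
    """Swap every assignment statement for a print() call."""

    def visit_Assign(self, node):
        call = ast.Call(
            func=ast.Name(id="print", ctx=ast.Load()),
            args=[ast.Constant(value="Bwahaha! I was once an Assign node!")],
            keywords=[],
        )
        return ast.copy_location(ast.Expr(value=call), node)


source = "x = 1\ny = 2\n"
tree = AssignReplacer().visit(ast.parse(source))
ast.fix_missing_locations(tree)

# compile() accepts the modified AST directly and produces a code object.
exec(compile(tree, "<wacky>", "exec"))
```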

Anyway, since then – on Neal Norwitz’s advice – I’ve started on an experimental patch for what I hope will one day be an optimizer for Python ASTs. Even though it’s early days, the possibilities offered by optimizing at the AST level are very interesting. For example, the (dirty, filthy, ugly, hack of a) patch I’m working on at the moment has support for optimizing this code:

if 1:
    'true'
else:
    'false'

Down to this (remember, no bytecode has been generated yet):

'true'

Very, very exciting stuff.

Anyway, I have a train to catch. More on this when I have more to show!

Categories: Compilers, Python, Software Development | No Comments

When Speed Matters

April 3rd, 2008

I’d be the first to tell you that, when confronted with a task that requires a degree of automation, you should reach for a scripting language like Python, Ruby, Perl, Bash or some mix of the four. However, I recently had a problem that involved changing the permissions on a large number of files.

My initial approach used a small (10 lines or so) Bash script to traverse the filesystem hierarchy and change the permissions based on whether we were processing a file or a directory. The resulting script ran over more than 3000 files and directories in about a minute and a half. Which wasn’t exactly slow, but I had to run this script about twenty to thirty times throughout the day and it felt very unproductive to have to wait for a whole minute and a half before I could continue my work. So I did something crazy: I rewrote that little Bash program in C.

The resulting C program – of maybe 75 lines of code – finished in 1.3 seconds.

Initially I thought I had made a mistake, so I checked the processed files. They were all exactly as I expected them to be. I was astounded: I could execute this program almost 70 times before my old Bash solution finished even once! This decision made my day – the rest of my afternoon was much more productive. I even felt happier to know I wasn’t wasting so much time on something trivial and secondary to the actual task at hand.

Again, I’m not one to go preaching about how important performance is with respect to any given language – often it’s much, much easier to write a few Python/Ruby/Perl scripts and push data between them with some Bash glue. In this particular case, however, the choice of a lower-level language was clearly a massive win.

Even in retaining the pragmatic perspective, it really makes you wonder just how much time is lost to “inefficient” software stacks…

Categories: C, Linux, Software Development | 4 Comments

On Bloom Filters

March 15th, 2008

I’ve got to admit that I don’t come from a hard core CS background (and I’m sure it shows in some of my articles on more complicated topics :P ), but I know a little bit about data structures and algorithms. While I’m no expert on the topic of data structures, I was surprised that – up until now – I had never even heard of Bloom filters.

What are Bloom filters?
Let’s say you have a set of millions of records in a database. Running a query over such a database can be quite time consuming. It would be great if we had a cheap way to determine if a given record is likely to be within the database without necessarily performing a query. Bloom filters to the rescue!

Rather than executing a complex database query, we can ask our Bloom filter whether or not a certain record exists. It will then respond with either a definitive “no” (i.e. the record is definitely not in the database) or a “maybe”. In the event our Bloom filter says “maybe”, we must (unfortunately) query the database to determine whether or not the record really exists.

While this may sound like double the workload for our poor application, querying the Bloom filter is likely to be very fast. The goal of the Bloom filter here is to reduce the number of times we actually hit the database by cheaply ruling out records that are definitely not in our data set. Bloom filters are also small in terms of memory usage, requiring nothing more than a simple bitfield.

A quick overview

Bloom filters primarily consist of a bitfield of size m and one or more (call it k) hash functions. The classic Bloom filter provides two main operations:

  1. add, which is used to add new records to the Bloom filter; and
  2. query, which is used to determine whether a record is likely to be found in the data set.

Starting out with a zero-value bitfield, each time we call add, each of our hash functions is applied to the input record in turn to produce a set of k indices (one for each hash function). For each one of these indices, we set the corresponding bit in the bitfield.

To query this filter, we again generate k indices using our hash functions on the input record. Rather than setting the bits in the bitfield, we instead check each bit. If any one of the bits is not set, then there is zero chance our input record exists in the data set (a definitive “no”). Otherwise, if all k bits are set, we have a “maybe”.
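The add and query operations described above can be sketched in a few lines of Python. This is not the C reference implementation mentioned further down; deriving the k hash functions by salting SHA-256 is an assumption of mine, picked purely for convenience:

```python
import hashlib


class BloomFilter:
    """Classic Bloom filter: an m-bit bitfield and k hash functions."""

    def __init__(self, m, k):
        self.m = m      # number of bits in the bitfield
        self.k = k      # number of hash functions
        self.bits = 0   # a Python int doubles as an arbitrary-size bitfield

    def _indices(self, record):
        # Simulate k different hash functions by salting one hash with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{record}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, record):
        for index in self._indices(record):
            self.bits |= 1 << index

    def query(self, record):
        # False is a definitive "no"; True only ever means "maybe".
        return all((self.bits >> index) & 1 for index in self._indices(record))


bf = BloomFilter(m=1024, k=3)
bf.add("alice")
bf.add("bob")
print(bf.query("alice"))  # added records always come back as "maybe"
print(bf.query("carol"))  # most likely a "no", though false positives can happen
```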

Problems with the classic implementation

There’s one glaring problem with the classic Bloom filter: there are no deletions! If records in your source data set can be deleted, your Bloom filter may need to be regenerated (depending on your accuracy needs). There exists a variation on the classic filter which uses an array of counters instead of a bitfield. The add operation becomes a counter increment. This allows for a delete operation which, predictably, becomes a decrement of the counter. This variant trades memory for flexibility.
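A rough sketch of that counting variant follows, again in Python and again using salted SHA-256 as a stand-in for k independent hash functions (my assumption, not something the classic description prescribes):

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant using m counters so that records can be deleted."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _indices(self, record):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{record}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, record):
        for index in self._indices(record):
            self.counters[index] += 1

    def delete(self, record):
        """Only valid for records that were previously added."""
        for index in self._indices(record):
            if self.counters[index] > 0:
                self.counters[index] -= 1

    def query(self, record):
        return all(self.counters[index] > 0 for index in self._indices(record))


cbf = CountingBloomFilter(m=1024, k=3)
cbf.add("alice")
print(cbf.query("alice"))
cbf.delete("alice")
print(cbf.query("alice"))
```

The memory trade-off is plain to see: every position now costs a whole counter instead of a single bit.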

A reference implementation

I’m uploading a C reference implementation of a Bloom filter with 1 (super naive) hash function and a 32-bit-long bitfield. Hopefully there is somebody out there who finds it informative.

Download my crummy (but hopefully instructive!) reference implementation here.

WARNING I wrote this implementation from a quick read of the Wikipedia page, so if there are any CS geeks out there who can see I’m clearly doing something wrong, please tell me how wrong I am :)

Categories: C, Software Development | 3 Comments

Fixing Rails Nested Forms (or: HashWithIndifferentAccess is evil)

March 12th, 2008

I’ve moved over to Shine’s Ruby on Rails team and, as such, have been exposed to a whole lot of Rails code over the past few weeks. Something I ran into a few weeks back was this bug, relating to a parsing bug in nested forms. I’ve submitted a series of patches, the last of which I hope to see in HEAD sometime over the next few days.

Anyway, the actual bug was caused by weirdo (and non-Hash-like) semantics in HashWithIndifferentAccess that cause certain types (Hash and Array) to be stored as a copy of the value passed in. The following code will give you an idea of the crazy logic that might result from such behavior:

def mktom(); {:name => 'Tom', :age => 23}; end

tom = mktom
people = {:tom => tom}
tom[:name] = 'Thomas'
puts people[:tom][:name] # displays 'Thomas' - cool!
puts "The same object" if tom.object_id == people[:tom].object_id # displays 'The same object'

tom = mktom
people = {:tom => tom}.with_indifferent_access
tom[:name] = 'Thomas'
puts people[:tom][:name] # displays 'Tom' - wtf!
puts "Not the same object!" if tom.object_id != people[:tom].object_id # displays 'Not the same object!'

The lesson learned from tracking down and fixing this bug?

  • Do not treat HashWithIndifferentAccess as though it were any old Hash. It has different semantics to Hash and may store copies of the keys and/or values instead of the original objects.
  • Do not nest HashWithIndifferentAccess instances in Hash instances unless you enjoy headaches. It’s so easy to think that HashWithIndifferentAccess is semantically identical to Hash (especially thanks to Hash#with_indifferent_access), but this assumption was the cause of two nasty, hard-to-find bugs in this instance.
  • When providing an API virtually identical to (and easily mistakable for) core Ruby classes, ensure that semantics are consistent with convention.

I’m not sure why they didn’t just override the reader for #[]. Maybe there’s more to that than I realize. Anyway, I’ve spent enough time thinking about this for tonight, time for bed. Sorry for the blog drought, I’m hoping to finalize a few more installments of the Ocaml series along with the final part of my Scala/BCEL parser tutorial over the next few months. Stay tuned. :)

UPDATE Since I wrote this post, I’ve written a plugin that replaces Rails’ parameter parsing implementation to make dealing with complex nested forms easier still. Refer to Taking the Pain Out of Complex Forms in Rails.

Categories: Ruby, Software Development | No Comments

Don’t Just Read Other People’s Code: Understand It!

January 30th, 2008

Becoming a Better Programmer
The Internet’s software development stratosphere is forever spewing forth lists of N steps to becoming a better programmer. Among those touting their personal take on the road to code nirvana, I’ve noticed a single step that seems to be almost ubiquitous: read other people’s code. Since reading the perfectly distilled common sense of Steve McConnell’s Code Complete a few years back, I’ve always considered this to be a stylistic thing: when you read other people’s code, you do so to improve your own aesthetic.

McConnell’s book encouraged readers to (and I’m paraphrasing here) take what they found in the code of others and evaluate it. I took that at face value at the time (and I have ever since), and used others’ code as a judge of good style. If somebody’s code was easy to read, I’d integrate elements of their style into my own. If the code was … uh … labyrinthine … well, I’d tend to lean away from the styles in play. Easy.

The Goal is to Grok
Some time ago it struck me that maybe, after all this time, I’ve been quite wrong about what I should be focusing on when looking at others’ code. I think the one thing I’ve dearly undervalued in all my years of programming is this: reading code is just a step on the long path to understanding it. That is, don’t strive to merely “read” the individual statements in a source file. Develop a general understanding of how the system fits together as a whole and at multiple levels of granularity. My narrow-minded approach was never going to be nearly as effective as truly understanding more complex code bases.

Style is doubtlessly important, but there’s more to be gained from reading code than merely what works stylistically. There is really so, so much more to take from the nitty gritty of the implementation details of even the most poorly constructed software systems. By all means, discard the stylistic abortions of programmers who don’t care for aesthetics or code of a language which doesn’t necessarily lend itself to the problem at hand. But damn it, play with it. Become familiar with how all the little ugly parts of a system fit together. Step through the interesting bits in a debugger, watch the data move around, watch the state changes. Read the documentation back to front. Know the libraries, know the subsystems. Come to know the code, the software and the domain at a level you never thought possible when you first started playing with it. Understand it! Again, I guarantee you’ll learn as much from all the bad code you encounter as you do from the good – but only if you take the time to fit the pieces together.

Finding direction
Having said that, slogging through code for software you find mundane or boring is going to both test your patience and limit your potential for learning something new. Pick an open-source project (or two) that interests you and spend a few weeks or months tinkering with it. For example, you might be interested to learn how Pidgin/Gaim protocols and plugins are handled internally, or how JEdit’s editor component is rendered. How does Glade work? Delve into the source code behind your favourite libraries like wxWidgets, GTK+ and Qt/KDE and see how user interface libraries are built. How do Mono’s C# compiler and Python’s internals tick? How do you implement a robust HTTP server like Apache?

There’s a lot of code out there, so much to learn and a lot of smart people. Don’t be afraid to get your hands dirty with non-trivial projects, even when it’s initially frustrating to understand the basics of an unfamiliar domain. Understanding complex software systems will make you a better programmer and help you grow as a professional.

Categories: Software Development | 1 Comment

Wordpress upgrade and the Spotlight theme

January 24th, 2008

I just upgraded my wordpress installation and set up a new theme. Please let me know if you encounter any issues with the site.

Categories: Uncategorized | No Comments

GTK Hello World in Six Different Languages

January 23rd, 2008

I’m still somewhat in holiday mode, so this entry is probably going to feel a little cheap for those of you following my more technical posts. I’m a big fan of GTK+ for user interfaces. If you don’t have the option or desire to use the Java platform and Swing, GTK+ is one of the better cross-platform user interface toolkits out there.

It’s high-level enough that it is easy to build quick, effective GUIs but low-level enough not to get in your way when you need to start messing around at the pixel level. GUIs can be built by hand using code, or designed using Glade and exported to an XML document to be loaded at runtime by any GTK+ binding. Currently it runs on Windows and Linux (the toolkit actually has its roots in GNOME) and has bindings for most popular programming languages. The only major downer is that Mac users are left out in the cold unless they go to the (herculean) effort of getting X11 up and running.

All the little differences between the various GTK+ bindings out there tend to get my goat when I move from one language to another. You would think that a GTK+ example in C could easily be translated to other languages without referencing documentation right? For one reason or another, the GTK+ bindings for other languages tend to diverge from the pleasant consistency of the C API to varying degrees. This post is all about those little differences that crop up even in the simplest applications: GTK’s take on “Hello World” in a few different languages.

C

C, being the language GTK+ is actually written in, is probably the most consistent with function naming across the different objects … the GTK_* and G_* macros are quite ugly though.

#include <gtk/gtk.h>

int main (int argc, char **argv) {
  GtkWidget *window;
  gtk_init(&argc, &argv);

  window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
  gtk_window_set_title (GTK_WINDOW (window), "Hello, World");
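  /* G_CALLBACK casts the handler to the generic GCallback type g_signal_connect expects */
  g_signal_connect (G_OBJECT (window), "delete-event", G_CALLBACK (gtk_main_quit), NULL);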
  gtk_widget_show_all (window);

  gtk_main();
  return 0;
}

C#/Mono

GTK# adds stuff like the Application object and delegates to produce a “Hello World” example I had to go digging around in documentation for. Not too bad on the whole, though.

using Gtk;
using GtkSharp;

public class HelloWorld {
  public static void Main(string[] args) {
    // GTK# must be initialised before any widgets are created
    Application.Init();
    Gtk.Window window = new Gtk.Window("Hello, World");
    window.DeleteEvent += delegate { Application.Quit(); };
    window.ShowAll();
    Application.Run();
  }
}

Ocaml

The Ocaml bindings diverge from the original C API much more so than the other languages listed here, most likely due to design decisions that had to be made to provide a C binding to a functional language. My only real grumble is the inconsistency with window#event#connect vs window#connect. I’m guessing there was a technical reason for that, but it still irks me every time I see it.

let delete_event evt = false

let destroy () = GMain.Main.quit ()

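let main () =
  (* initialise GTK first; not needed if you link against gtkInit.cmo *)
  let _ = GtkMain.Main.init () in
  (* GWindow.window is a function and needs its unit argument *)
  let window = GWindow.window () in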
  let _ = window#set_title "Hello, World" in
  let _ = window#event#connect#delete ~callback:delete_event in
  let _ = window#connect#destroy ~callback:destroy in
  let _ = window#show () in
  GMain.Main.main ()
;;

let _ = main () ;;

Perl

Although I find writing Perl to be painful for everything but processing text files in a terminal, I found the Perl GTK bindings to be relatively straightforward.

use strict;
use Gtk2 '-init';

my $window = Gtk2::Window->new;
$window->set_title("Hello, World");
$window->signal_connect('delete-event', sub { Gtk2->main_quit; });
$window->show_all;
Gtk2->main;

Python

I’m more familiar with PyGTK than with other bindings, so this was a snap. Why they chose GtkObject.connect over GtkObject.signal_connect is a mystery and the pygtk.require(’…’) crap is a little weird, but aside from that there should be nothing surprising here (this is a good thing!).

import pygtk
pygtk.require('2.0')
import gtk

window = gtk.Window()
window.set_title('Hello, World')
window.connect('delete-event', gtk.main_quit)
window.show_all()
gtk.main()

Ruby

RubyGNOME provides a GTK+ binding for Ruby. Take an almost 1:1 port of the C API, take away the ugly casting macros, mix in closures for handling signals and Ruby really is one of the nicest ways to get intimate with GTK+.

require 'gtk2'

window = Gtk::Window.new
window.title = 'Hello, World'
# the signal name contains a hyphen, so pass it as a string rather than a bare symbol
window.signal_connect('delete-event') { Gtk.main_quit }
window.show_all

Gtk.main

That’s all for now. There are many more language bindings for GTK+ out in the wild for languages like Lisp/Scheme, C++, Haskell and Erlang. If you’re looking around for a GUI toolkit, be sure to give GTK a go. There’s plenty of documentation available for all the bindings listed here, often with some very detailed and easy to follow tutorials.

UPDATE: Miguel and she suggested some changes to the C#/Mono and RubyGNOME examples.
UPDATE 2: Aristotle suggested an easier way to initialize the Perl GTK bindings.

Categories: C, GTK, Ocaml, Python, Ruby, Software Development | 11 Comments

HTML Sucks for Rich Web Applications

December 2nd, 2007

This post started out as a reply to a work colleague’s article – Ben Teese’s CSS Layout Sucks for Panel-Based Web Apps – but it eventually expanded out into such a large little rant that I thought it would be better suited to my blog. So here it is.

Trying to Make CSS Suck Less

Ben, since I come from a web development background … I hear ya. At my last job, XHTML and CSS were our primary building blocks for web sites and applications. By the end of my time there, we got pretty good at building CSS layouts – but there was always some little exception, some little element that didn’t *quite* look right due to some weird CSS nuance. Worse, client change requests could often be downright painful for exactly the reasons you mention: recalculating measurements, layout “knowledge” scattered throughout the stylesheet(s), etc. And trying to make it work in every version of Firefox, IE, Safari and Opera? No deal. Not without many, many painful hacks. Despite being a standard for *years*, most vendors’ implementations of CSS are half-assed at best and will remain so for the foreseeable future.

Regarding the DRY problems mentioned with CSS: a half-baked idea of mine was to use $TEMPLATE_LANGUAGE to generate CSS files as part of the build process to alleviate some of the pain. I never actually got around to this myself, but others seem to be scared of this approach. Why? If your template engine provides you with variables, conditionals, loops and a means for performing calculations in your CSS templates why aren’t you generating your CSS using Velocity, Erb or even PHP!?
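To make the idea a bit more concrete, here's roughly what I have in mind, sketched with Python's built-in `string.Template` instead of Velocity, Erb or PHP. The selectors, variable names and measurements are all invented for the example:

```python
from string import Template

# One place to define the layout; the measurements are derived, not repeated.
CSS_TEMPLATE = Template("""\
#sidebar { float: left; width: ${sidebar_width}px; }
#content { margin-left: ${content_margin}px; color: ${text_color}; }
""")


def render_css(sidebar_width, gutter, text_color):
    """Fill in the stylesheet, computing dependent measurements once."""
    return CSS_TEMPLATE.substitute(
        sidebar_width=sidebar_width,
        content_margin=sidebar_width + gutter,
        text_color=text_color,
    )


css = render_css(sidebar_width=200, gutter=20, text_color="#333")
print(css)
```

Change the sidebar width in one place and the content margin follows along at build time, rather than being recalculated by hand across the stylesheet.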

HTML and Rich UI Don’t Mix

All gripes aside, when it comes to the web I remain a somewhat stalwart supporter of CSS and XHTML – if only because tables clutter up HTML like a dog. However, I do think that the HTML hemisphere of the web world is starved for choice: both tables and CSS suck in their own hairy little warty ways. Tables suck because they’re awful to maintain and effectively render screen-readers useless. CSS sucks because it’s error-prone and void of expressive syntax & semantics.

The web world should be paying a lot of attention to rich client solutions such as Flex and Silverlight. Flex would be my first choice for a client platform if I had to start building a web application tomorrow. In fact, I’d go so far as to argue that the decision between CSS/tables for web applications is moot – the real problem is the current trend of using HTML as a medium for rich web applications. GWT is a perfect example of the ridiculous lengths we’re going to in order to just make things work. It’s a godawful hack around a limited technology.

If your requirements permit it, don’t try to make your HTML jump through hoops – opt for Flex or Silverlight instead if you plan on having a truly interactive UI. If you’re stuck with HTML, inexpressive CSS stylesheets can be augmented with a template language to reduce the amount of knowledge being spread throughout your code.

Categories: Software Development | 7 Comments