Python Language Internals: From Source to Execution

December 10th, 2008

Last week I gave a presentation in Sydney for the Open Source Developer’s Conference on the internals of the Python compiler. Now that the conference is over, I figure there’s no better place for the original paper than right here on my blog.

Thanks again to my employer, Shine for all their wonderful support with this. Thanks too to the folks who reviewed my paper before it was submitted and my colleagues at Shine who endured a painful test run that ran well over time! ;)

OSDC 2008 was a whole lot of fun. If you happened to miss the conference, I strongly recommend you make it along next year. It’s well worth the trip.

Download Python Language Internals: From Source to Execution, and a corresponding patch against Python 2.6 beta 3 implementing the “unless” statement. If you want to test the patch, you also be aware of the nasty little circular dependency that may cause you some grief — apply the latest (non-review) patch from the tracker before you start the build and you should be fine.

Categories: Compilers, Python, Software Development | 1 Comment

Speaking at OSDC 2008

August 15th, 2008

I’ve been offered a position to speak at this year’s Open Source Developers’ Conference on the internals of the Python compiler. I’ll be introducing the audience to the basic workings of the Python bytecode compiler, demonstrating how one would add a new construct to the language.

If you’re at the conference and interested in messing with the guts of a real compiler but not quite sure where to start, please come along. The only real prerequisite is a sound understanding of the C language.

The conference runs from the 2nd to the 5th of December. Registration should open some time in October if you’re interested in tagging along.

Now, back to my paper …

OSDC 2008: Sydney

Categories: Compilers, Python, Software Development | 2 Comments

The Internals of Python’s IMPORT_NAME Bytecode

April 14th, 2008

This was originally planned as a response to this post by Paul Bonser, but grew a little unwieldy (and his comment submission form seems to be broken?).

Effectively, Paul was (somewhat sleepily) mulling over the workings of the IMPORT_NAME bytecode. This bytecode is generated in response to Python code like the following:

import sys

And also for:

from foo import bar, baz

You’ll have to see the original post for the actual bytecode generated for this code, but Paul was asking why the latter syntax generates an IMPORT_NAME bytecode instruction which seems to do nothing at all with the fromlist, then generates additional IMPORT_FROM bytecodes that fetch the fromlist attributes from the parent module.

The documentation for __import__ somewhat solves this mystery:

Note that even though locals() and ['eggs'] are passed in as arguments, the __import__() function does not set the local variable named eggs; this is done by subsequent code that is generated for the import statement. (In fact, the standard implementation does not use its locals argument at all, and uses its globals only to determine the package context of the import statement.)

Essentially, when the IMPORT_NAME is executed for ‘from foo import bar, baz’, the __import__ builtin is called with the fromlist and a few other arguments (namely the globals() and locals() from the current frame of execution) to provide custom import handling for your Python programs. For example, you may want to prevent users of your program from writing scripts that import certain modules. I imagine Google’s App Engine might be using something like this to prevent access to certain evil or unavailable modules (but that’s just a wild, unfounded guess).

The code in Python/ceval.c for IMPORT_NAME seems to back this up (I’ve annotated the code with a few comments):


        case IMPORT_NAME:
            w = GETITEM(names, oparg);
            /* 1. LOCATE THE __import__ BUILTIN */
            x = PyDict_GetItemString(f->f_builtins, "__import__");
            if (x == NULL) {
                PyErr_SetString(PyExc_ImportError,
                        "__import__ not found");
                break;
            }
            Py_INCREF(x);
            v = POP();
            u = TOP();
            /* 2. BUILD THE LIST OF ARGUMENTS FOR __import__ USING THE fromlist, globals() AND locals() */
            if (PyInt_AsLong(u) != -1 || PyErr_Occurred())
                w = PyTuple_Pack(5,
                        w,
                        f->f_globals,
                        f->f_locals == NULL ?
                          Py_None : f->f_locals,
                        v,
                        u);
            else
                w = PyTuple_Pack(4,
                        w,
                        f->f_globals,
                        f->f_locals == NULL ?
                          Py_None : f->f_locals,
                        v);
            Py_DECREF(v);
            Py_DECREF(u);
            if (w == NULL) {
                u = POP();
                Py_DECREF(x);
                x = NULL;
                break;
            }
            READ_TIMESTAMP(intr0);
            v = x;
            /* 3. CALL __import__ WITH THE module name, fromlist, globals() AND locals() */
            x = PyEval_CallObject(v, w);
            Py_DECREF(v);
            READ_TIMESTAMP(intr1);
            Py_DECREF(w);
            SET_TOP(x);
            if (x != NULL) continue;
            break;

So this answers the question of why IMPORT_NAME needs the fromlist in the first place: it is merely passed along to __import__ to make it available to custom import handling code. But why aren’t the fromlist attributes added to the namespace inside IMPORT_NAME? I’m guessing it was a design decision: we already have an opcode for adding elements to the namespace, so why have a special case for imports? Of course the details may be more involved than that, but it’s the most obvious explanation I can think of.

In any case, thanks for the thought-provoking post, Paul!

UPDATE: Seems my comment made it through to his blog after all. Sorry for the double-up!

Categories: Compilers, Python, Software Development | 2 Comments

Python 2.6a2: Compile ASTs from within Python code

April 4th, 2008

I’m not going to go into too much depth because Georg Brandl has already covered it, but it’s an interesting topic. I couldn’t help but write a little entry about it. :)

A new alpha of Python has just been released, including a patch I wrote for compiling Python Abstract Syntax Trees down to bytecode. This means it’s now possible to manipulate ASTs from within your Python program, which lets you do all sorts of crazy things – like this, for example.

Piping this little program into itself yields the following:

$ ./python wacky.py <wacky.py
Bwahaha! I was once an Assign node!
Bwahaha! I was once an Assign node!

Neat huh?

Anyway, since then – on Neal Norwitz’s advice – I’ve started on an experimental patch for what I hope will one day be an optimizer for Python ASTs. Even though it’s early days, the possibilities offered by optimizing at the AST level are very interesting. For example, the (dirty, filthy, ugly, hack of a) patch I’m working on at the moment has support for optimizing this code:

if 1:
    'true'
else:
    'false'

Down to this (remember, no bytecode has been generated yet):

'true'

Very, very exciting stuff.

Anyway, I have a train to catch. More on this when I have more to show!

Categories: Compilers, Python, Software Development | No Comments

GTK Hello World in Six Different Languages

January 23rd, 2008

I’m still somewhat in holiday mode, so this entry is probably going to feel a little cheap for those of you following my more technical posts. I’m a big fan of GTK+ for user interfaces. If you don’t have the option or desire to use the Java platform and Swing, GTK+ is one of the better cross-platform user interface toolkits out there.

It’s high-level enough that it is easy to build quick, effective GUIs but low-level enough not to get in your way when you need to start messing around at the pixel level. GUIs can be built by hand using code, or designed using Glade and exported to an XML document to be loaded at runtime by any GTK+ binding. Currently it runs on Windows and Linux (the toolkit actually has its roots in GNOME) and has bindings for most popular programming languages. The only major downer is that Mac users are left out in the cold unless they go to the (herculean) effort of getting X11 up and running.

All the little differences between the various GTK+ bindings out there tend to get my goat when I move from one language to another. You would think that a GTK+ example in C could easily be translated to other languages without referencing documentation right? For one reason or another, the GTK+ bindings for other languages tend to diverge from the pleasant consistency of the C API to varying degrees. This post is all about those little differences that crop up even in the simplest applications: GTK’s take on “Hello World” in a few different languages.

C

C, being the language GTK+ is actually written in, is probably the most consistent with function naming across the different objects … the GTK_* and G_* macros are quite ugly though.

#include <gtk/gtk.h>

int main (int argc, char **argv) {
  GtkWidget *window;
  gtk_init(&argc, &argv);

  window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
  gtk_window_set_title (GTK_WINDOW (window), "Hello, World");
  g_signal_connect (G_OBJECT (window), "delete-event", gtk_main_quit, NULL);
  gtk_widget_show_all (window);

  gtk_main();
  return 0;
}

C#/Mono

GTK# adds stuff like the Application object and delegates to produce a “Hello World” example I had to go digging around in documentation for. Not too bad on the whole, though.

using Gtk;
using GtkSharp;

public class HelloWorld {
  public static void Main(string[] args) {
    Gtk.Window window = new Gtk.Window();
    window.Title = "Hello, World";
    window.DeleteEvent += delegate { Application.Quit(); };
    window.ShowAll();
    Application.Run();
  }
}

Ocaml

The Ocaml bindings diverge from the original C API much more so than the other languages listed here, most likely due to design decisions that had to be made to provide a C binding to a functional language. My only real grumble is the inconsistency with window#event#connect vs window#connect. I’m guessing there was a technical reason for that, but it still irks me every time I see it.

let delete_event evt = false

let destroy () = GMain.Main.quit ()

let main () =
  let window = GWindow.window in
  let _ = window#set_title "Hello, World" in
  let _ = window#event#connect#delete ~callback:delete_event in
  let _ = window#connect#destroy ~callback:destroy in
  let _ = window#show () in
  GMain.Main.main ()
;;

let _ = main () ;;

Perl

Although I find writing Perl to be painful for everything but processing text files in a terminal, I found the Perl GTK bindings to be relatively straightforward.

use strict;
use Gtk2 '-init';

my $window = Gtk2::Window->new;
$window->set_title("Hello, World");
$window->signal_connect('delete-event', sub { Gtk2->main_quit; });
$window->show_all;
Gtk2->main;

Python

I’m more familiar with PyGTK than with other bindings, so this was a snap. Why they chose GtkObject.connect over GtkObject.signal_connect is a mystery and the pygtk.require(’…’) crap is a little weird, but aside from that there should be nothing surprising here (this is a good thing!).

import pygtk
pygtk.require('2.0')
import gtk

window = gtk.Window()
window.set_title('Hello, World')
window.connect('delete-event', gtk.main_quit)
window.show_all()
gtk.main()

Ruby

RubyGNOME provides a GTK+ binding for Ruby. Take an almost 1:1 port of the C API, take away the ugly casting macros, mix in closures for handling signals and Ruby really is one of the nicest ways to get intimate with GTK+.

require 'gtk2'

window = Gtk::Window.new
window.title = 'Hello, World'
window.signal_connect(:delete-event) { Gtk.main_quit }
window.show_all

Gtk.main

That’s all for now. There are many more language bindings for GTK+ out in the wild for languages like Lisp/Scheme, C++, Haskell and Erlang. If you’re looking around for a GUI toolkit, be sure to give GTK a go. There’s plenty of documentation available for all the bindings listed here, often with some very detailed and easy to follow tutorials.

UPDATE: Miguel and she suggested some changes to the C#/Mono and RubyGNOME examples.
UPDATE 2: Aristotle suggested an easier way to initialize the Perl GTK bindings.

Categories: C, GTK, Ocaml, Python, Ruby, Software Development | 11 Comments

Verlet Integration: A Little Physics Demo in Python

September 28th, 2007

Just before I crawl into bed, here’s a little demo based on the first fifth of Thomas Jakobsen’s Advanced Character Physics. I don’t really have the mathematical background to talk about verlet at any length other than to say it’s very interesting. :)

Verlet demo screenshot

The screenshot doesn’t really do the neat “physics” justice – it’s a cheap yet interesting effect.

In a nutshell, velocity verlet is an alternative to Euler (i.e. new_position = current_position + velocity * elapsed). In verlet, velocity is implicitly calculated based on the last position of an entity such that when elapsed is fixed:

new_position = 2 * current_position - last_position + acceleration * elapsed * elapsed

This equation tends to change the way you do pretty much everything related to your game physics. Rather than simply setting your game object’s velocity, you might push it instead via a force or hack the position/last position to achieve the same effect. I’m finding the leap a little daunting myself.

I actually touched on verlet briefly years ago using DirectX thanks to one of my more interesting lecturers at QANTM. Hope you don’t consider the Googling of your name to be excessively creepy, Christian. It’s been a few years. ;)

The supporting code is a little heavy for such a little demo – I plan to put it to wider use, so please excuse the mess for now.

You’ll need a reasonably recent version of Pygame and PyOpenGL installed in order to run this demo. Once it’s up and running, click your mouse on the window to watch some OpenGL triangles drop from the sky. Click, drag and release to fling triangles … almost like you were throwing them yourself.

Hopefully I’ll get a chance to expand on this among the thousand other things I want to do with my free time (not the least of which is writing more Ocaml tutorials)!

Download the Python verlet demo

Categories: Graphics Programming, OpenGL, Python, Software Development | 5 Comments

Passing Data Between GTK Applications With GtkClipboard

June 27th, 2007

GtkClipboard is a wonderful (if somewhat recent) addition to GTK. As you would expect, it allows users to pass data between your application and other GTK applications via the X11 clipboard (and vice versa). A common past annoyance with the clipboard under linux is that data stored disappeared from the clipboard when the application which set the data terminated. No longer so, thanks to gtk_clipboard_store and the GNOME Clipboard Daemon.

Now, tutorials on GtkClipboard are a little sparse and the documentation seems to leave a few questions unanswered, so here’s a quick and dirty primer on passing text data between your GTK applications using GtkClipboard.

Python

import pygtk
pygtk.require('2.0')
import gtk

# get the clipboard
clipboard = gtk.clipboard_get()

# set the clipboard text data
clipboard.set_text('Hello!')

# make our data available to other applications
clipboard.store()

Ruby

require 'gtk2'

# initialize Ruby's GTK bindings
Gtk.init

# get the clipboard
clipboard = Gtk::Clipboard.get(Gdk::Selection::CLIPBOARD)

# set the text. Ruby-Gnome2 also provides a text= setter
clipboard.set_text('Hello, World')

# make the clipboard data available to external applications
clipboard.store

C

#include <gtk/gtk.h>
#include <gdk/gdk.h>
#include <string.h>

int main (int argc, char **argv) {
    const char *message = "Hello, World";

    /* initialize GTK */
    gtk_init (&argc, &argv);

    /* set the clipboard text */
    gtk_clipboard_set_text(gtk_clipboard_get(GDK_SELECTION_CLIPBOARD), message, strlen(message));

    /* store the clipboard text */
    gtk_clipboard_store(gtk_clipboard_get(GDK_SELECTION_CLIPBOARD));

    return 0;
}

You can also use gtk_clipboard_set_image (for Ruby and Python, there are equivalents) to pass GdkPixbuf data to the clipboard. Check the GTK/PyGTK/Ruby-Gnome2 documentation for more details.

It’s actually all quite easy, but I thought it might be nice to see the code in practice.

Categories: C, GTK, Linux, Python, Ruby, Software Development | 3 Comments