Enterprise LAMP

Python News: Python 3.2 alpha 1 released

The first alpha release of Python 3.2 has been released for testing.

Dave Beazley: Yieldable Threads (Part 1)

Disclaimer: This whole post is one big thought experiment. It might, in fact, be a really dumb idea. Don’t say that you weren’t warned! – Dave.

Introduction

I’ll admit that I’m an unabashed fan of Python generator functions–especially when applied to problems in data processing (e.g., setting up processing pipelines, cranking on big datasets, etc.). Generators also have a rather curious use in the world of concurrency–especially in libraries that aim to provide an alternative to threading (i.e., tasklets, greenlets, coroutines, etc.). Just in case you missed them, I’ve given past PyCon tutorials on both generators and coroutines.

A common theme surrounding the use of generators and concurrency is that you can define functions that seem to operate as “tasks” without using any system threads (sometimes this approach is known as microthreading or green threading). Typically this is done by playing clever tricks with I/O operations. For example, suppose you initially had the following function that served a client in a multithreaded server.

def serve_client(c):
    request = c.recv(MAXSIZE)        # Read from a socket
    response = processing(request)   # Process the request
    c.send(response)                 # Send on a socket
    c.close()

With the assistance of a coroutine or tasklet library, you might be able to get rid of threads and rewrite the function slightly, using yield statements like this (keep in mind this is a high-level overview–the actual specifics might vary):

def serve_client(c):
    request = yield c.recv(MAXSIZE)        # Read from a socket
    response = processing(request)         # Process the request
    yield c.send(response)                 # Send on a socket
    c.close()

If you’ve never seen anything like this before, it will probably make your head spin (see my coroutines tutorial from PyCon’09). However, the gist of the idea is that the yield statements cause the function to suspend execution at points where I/O operations might block. Underneath the covers, the I/O request operations (recv, send, etc.) are handled by a scheduler that uses nonblocking I/O and multiplexing (e.g., select(), poll(), etc.) to drive the execution of multiple generator functions at once, giving the illusion of concurrent execution. To be sure, it’s a neat trick and it works great–well, so long as the processing() operation in the middle is well behaved.
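
To make the suspend-and-resume idea concrete, here is a minimal sketch of a round-robin scheduler driving two generators. It is a toy under stated assumptions: the names countdown and run_all are made up for illustration, there is no socket I/O, and a real coroutine library would wait on select() or poll() instead of simply cycling through the tasks.

```python
from collections import deque

def countdown(name, n, log):
    # A "task": does a little work, then yields control to the scheduler
    while n > 0:
        log.append((name, n))
        yield
        n -= 1

def run_all(tasks):
    # Toy scheduler: resume each generator in turn until all have finished
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)              # run the task up to its next yield
        except StopIteration:
            continue                # task finished; do not reschedule it
        ready.append(task)          # task suspended; put it back in line

log = []
run_all([countdown("a", 2, log), countdown("b", 2, log)])
print(log)
```

The log shows the two tasks interleaving even though no threads are involved. If either task did something slow between yields, the other would stall for the duration, which is the limitation discussed next.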

Sadly, it’s not generally safe to assume that the processing step will play nice. In fact, a major limitation of coroutine (and event-driven) approaches concerns the handling of processing steps that block or consume a large number of CPU cycles. This is because any kind of blocking causes the entire program and all “tasks” to grind to a halt until that operation completes. This becomes a major concern if you are going to use other programming libraries as most code has not been written to operate in such a non-blocking manner. In fact, libraries based on polling and non-blocking I/O typically take great pains to work around this limitation (for instance, consider the difficulty of performing a blocking third-party library database query in this environment).

Threads : I’m Not Dead Yet

A simple solution that almost always eliminates the problem of blocking is to program with threads. Yes, threads–everyone’s favorite public enemy number one. In fact, threads work great for blocking operations. Not only that, the resulting threaded code tends to have readable and comprehensible control-flow (e.g., organized as a logical sequence of steps as opposed to being fragmented across dozens of asynchronous callbacks and event handlers). Frankly, I suspect that most Python programmers would prefer to use threads if it weren’t for the simple fact that their performance on CPU-bound processing sucks (damn you GIL!). However, I digress–I’ve already said more than enough about that.

A Different Premise (Thought Experiment)

Generator and event-based alternatives to threads are usually based on a premise that thread programming should be avoided. However, all of these thread alternatives are also strongly based on reimplementing the one part of thread programming that actually works reasonably well–handling of blocking I/O.

As a thought experiment, I got to wondering–why do thread alternatives fix what isn’t broken about threads? If you really wanted to fix threads, wouldn’t you want to address the part of thread programming that actually is broken? In particular, the poor execution of CPU-intensive work.

Yielding CPU-intensive work

Generator-based alternatives to threads use the yield statement to have I/O operations carried out elsewhere (by a scheduler sitting behind the scenes). However, what if you fully embraced threads, but applied that same idea to CPU intensive processing instead of I/O? For example, consider a threaded function that looked like this:

def serve_client(c):
    request = c.recv(MAXSIZE)                # Read from a socket
    response = yield processing, (request,)  # Request processing (somehow)
    c.send(response)                         # Send on a socket
    c.close()

In this code, yield is not used to perform non-blocking I/O. Instead, it’s used to “punt” on CPU-intensive processing. For example, instead of directly calling the processing() function above, the yield statement merely spits it out to someone else. In a sense, the thread is using yield to say that it does NOT want to do that work and that it wants to suspend until someone else finishes it.

Some careful study is probably required, but just to emphasize, the above generator freely performs blocking I/O operations (something threads are good at), but kicks out problematic CPU-intensive processing using yield. It’s almost the exact opposite of what you normally see with generator-based microthreading.
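
As a rough illustration of what the yield hands back, the following sketch steps through such a generator by hand. FakeConnection and the body of processing() are hypothetical stand-ins (a real server would use a socket); only the shape of serve_client() follows the function above.

```python
class FakeConnection:
    # Hypothetical stand-in for a socket, just enough for this demo
    def __init__(self, data):
        self.data = data
        self.sent = []
        self.closed = False
    def recv(self, maxsize):
        return self.data[:maxsize]
    def send(self, payload):
        self.sent.append(payload)
    def close(self):
        self.closed = True

def processing(request):
    # Placeholder for CPU-intensive work
    return request.upper()

def serve_client(c):
    request = c.recv(1024)                    # Blocking I/O done directly
    response = yield processing, (request,)   # Punt the CPU work
    c.send(response)
    c.close()

conn = FakeConnection(b"hello")
gen = serve_client(conn)
func, args = next(gen)        # Runs up to the yield; we receive the work
result = func(*args)          # "Someone else" performs the work
try:
    gen.send(result)          # Resume the generator with the result
except StopIteration:
    pass                      # The generator ran to completion
print(conn.sent, conn.closed)
```

Whoever holds the generator decides where and how func(*args) actually runs, which is the hook the rest of the post builds on.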

A Yieldable Thread Object

To run such a function as a thread, you need to have a little bit of extra runtime support. The following code defines a new thread object that knows how to utilize our new use of yield:

# ythr.py
# Author : David Beazley
#
# A yieldable thread object that offloads CPU intensive work
# to a user-defined apply function

from threading import Thread
from types import GeneratorType

# Compatibility function (for Python 3)
def apply(func,args=(),kwargs={}):
    return func(*args,**kwargs)

class YieldableThread(Thread):
    def __init__(self,target,args=(),kwargs={},cpu_apply=None):
        Thread.__init__(self)
        self.__target = target
        self.__args = args
        self.__kwargs = kwargs
        self.__cpu_apply = cpu_apply if cpu_apply else apply

    # Run the thread and check for generators (if any)
    def run(self):
        # Call the specified target function
        result = self.__target(*self.__args,**self.__kwargs)
        # Check if the result is a generator.  If so, run it
        if isinstance(result, GeneratorType):
            genfunc   = result    # generator function to run
            genresult = None      # last result to send into the generator
            while True:
                try:
                    # Run to the next yield and get work to do
                    work = genfunc.send(genresult)
                    # Execute the work using the user-defined apply function
                    genresult = self.__cpu_apply(*work)
                except StopIteration:
                    break

The key to this implementation is the bottom part of the run() method that checks to see if the target function produced a generator. If so, the run method manually steps through the generator (using its send() method). The yielded results are assumed to represent CPU-intensive functions that need to execute. For each of these, the work is passed to a user-supplied apply function (the __cpu_apply attribute). By default, this function is set to apply() which makes the thread run the work as if yield wasn’t used at all. However, as we’ll see shortly, there are many different things that can be done by supplying a different implementation.
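
The stepping loop can also be tried on its own, without any threads. The sketch below is an assumption-laden illustration rather than part of ythr.py: drive(), timing_apply(), square() and task() are names invented here. It shows a generator that yields two pieces of work and a custom apply function that records how long each one took.

```python
import time

def drive(gen, cpu_apply):
    # Step a generator to completion, running each yielded
    # (func, args) pair through the supplied apply function
    result = None
    while True:
        try:
            work = gen.send(result)   # send(None) starts a fresh generator
        except StopIteration:
            return
        result = cpu_apply(*work)

timings = []

def timing_apply(func, args=(), kwargs=None):
    # Hypothetical apply function that times each unit of work
    start = time.time()
    try:
        return func(*args, **(kwargs or {}))
    finally:
        timings.append((func.__name__, time.time() - start))

def square(x):
    return x * x

outputs = []

def task():
    a = yield square, (3,)    # first unit of work
    b = yield square, (a,)    # second unit uses the first result
    outputs.append(b)

drive(task(), timing_apply)
print(outputs, [name for name, _ in timings])
```

One small design difference from run() above: the try block here only wraps send(), so a StopIteration raised inside the work itself would propagate instead of silently ending the loop.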

An Example

To explore this thread implementation, we first need a CPU-intensive function to work with. Here’s a trivial one just for the purpose of exploring the concept:

# A trivial CPU-bound function
def sumn(n):
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

This function just computes the sum of the first N integers in a really dumb way. Here is an example:

>>> sumn(25000)
312512500
>>> timeit('sumn(25000)','from __main__ import sumn',number=1000)
4.500338077545166
>>>

As you can see, summing the first 25000 integers 1000 times takes about 4.5 seconds (4.5 msec to do it just once). Remember that number–we’ll return to it shortly.
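
For anyone who wants to reproduce the measurement as a script, a minimal sketch follows. The repetition count of 100 is an arbitrary choice to keep the run short, and the timing will differ from machine to machine; the closed-form check n(n+1)/2 is just a sanity test on the result.

```python
from timeit import timeit

def sumn(n):
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

N = 25000
REPS = 100                                  # arbitrary; the post used 1000

# Sanity check against the closed form n*(n+1)/2
assert sumn(N) == N * (N + 1) // 2

elapsed = timeit(lambda: sumn(N), number=REPS)
per_call_ms = elapsed / REPS * 1000.0
print("%.3f msec per call" % per_call_ms)
```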

Next, we need to mix in some I/O. Let’s write a really simple multithreaded TCP server that turns the above function into an internet service. This code is just a standard threaded server that uses none of our magic (yet).

# sumserver.py
#
# A server that computes sum of n integers

from socket import *
from threading import Thread

# CPU-bound function
def sumn(n):
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

# Function that handles clients
def serve_client(c):
    n = int(c.recv(16))
    result = sumn(n)
    c.send(str(result).encode('ascii'))
    c.close()

# Plain-old threaded server
def run_server(addr):
    s = socket(AF_INET, SOCK_STREAM)
    s.setsockopt(SOL_SOCKET, SO_REUSEADDR,1)
    s.bind(addr)
    s.listen(5)
    while True:
        c,a = s.accept()
        thr = Thread(target=serve_client,args=(c,))
        thr.daemon = True
        thr.start()

if __name__ == '__main__':
    run_server(("",10000))

Finally, let’s write a test client program that can be used to make a bunch of requests and time the performance.

# sumclient.py
from socket import *
def run_client(addr,repetitions,n):
    while repetitions > 0:
        s = socket(AF_INET, SOCK_STREAM)
        s.connect(addr)
        s.send(str(n).encode('ascii'))
        resp = s.recv(128)
        s.close()
        repetitions -= 1

if __name__ == '__main__':
    import sys
    import time
    import threading

    ADDR = ("",10000)
    REPETITIONS = 1000
    N = 25000

    nclients = int(sys.argv[1])
    requests_per_client = REPETITIONS//nclients

    # Launch a set of client threads to make requests and time them
    thrs = [threading.Thread(target=run_client,args=(ADDR,requests_per_client,N)) for n in range(nclients)]
    start = time.time()
    for t in thrs:
        t.start()
    for t in thrs:
        t.join()
    end = time.time()
    print("%d total requests" % (nclients*requests_per_client))
    print(end-start)

This client simply initiates 1000 requests with the server, using different numbers of threads. Let’s run the server and try the client with different numbers of threads.

bash-3.2$ python sumclient.py 1         # 1 client thread
1000 total requests
4.34612298012
bash-3.2$ python sumclient.py 2         # 2 client threads
1000 total requests
7.81390690804
bash-3.2$ python sumclient.py 4         # 4 client threads
1000 total requests
9.5317029953
bash-3.2$ python sumclient.py 8         # 8 client threads
1000 total requests
10.2061738968
bash-3.2$

Observe that with only 1 client thread, the performance of the server is comparable with the performance of timeit(). Making 1000 requests takes about 4.3 seconds (in fact, it seems to be a little faster). However, if we start increasing the concurrency the performance degrades fast. With 4 client threads, the server is already running twice as slow. This is not a surprise–we already know that Python threads have problems with CPU bound processing.

A Modified Example (Using yield)

Let’s modify the server code to use our new YieldableThread object. Here is the code:

# ysumserver.py
#
# A server that computes sum of n integers (using yieldable threads)

from socket import *
from ythr import YieldableThread

# CPU-bound function (unmodified)
def sumn(n):
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

# Function that handles clients
def serve_client(c):
    n = int(c.recv(16))
    result = yield sumn, (n,)               # Notice use of yield
    c.send(str(result).encode('ascii'))
    c.close()

# Threaded server that uses yieldable threads. Note extra cpu_apply
# argument that allows a user-defined apply() function to be passed
def run_server(addr,cpu_apply=None):
    s = socket(AF_INET, SOCK_STREAM)
    s.setsockopt(SOL_SOCKET, SO_REUSEADDR,1)
    s.bind(addr)
    s.listen(5)
    while True:
        c,a = s.accept()
        thr = YieldableThread(target=serve_client,args=(c,),cpu_apply=cpu_apply)
        thr.daemon = True
        thr.start()

if __name__ == '__main__':
    run_server(("",10000))

Observe that this version of the code is only slightly modified.

By default, yieldable threads should have performance comparable to normal threads. Try the client again with this new server:

bash-3.2$ python sumclient.py 1
1000 total requests
4.95635294914
bash-3.2$ python sumclient.py 2
1000 total requests
7.82525205612
bash-3.2$ python sumclient.py 4
1000 total requests
9.25957417488
bash-3.2$ python sumclient.py 8
1000 total requests
9.95880198479

Yep, the same lousy performance as before. So, where is this going?

Add Some Special Magic

Recall that yieldable threads allow the user to pass in their own custom apply() function for performing CPU-bound processing. That’s where the magic enters the picture.

Let’s write a new apply function and try running the server again. Try this one:

# ysumserver.py
...
# A locked version of apply that only allows one thread to run
from threading import Lock
_apply_lock = Lock()
def exclusive_apply(func,args=(),kwargs={}):
    with _apply_lock:
         return func(*args,**kwargs)

if __name__ == '__main__':
    run_server(("",10000),cpu_apply=exclusive_apply)

Let’s try our client with this new server:

bash-3.2$ python sumclient.py 1
1000 total requests
4.55530810356
bash-3.2$ python sumclient.py 2
1000 total requests
5.75427007675
bash-3.2$ python sumclient.py 4
1000 total requests
5.75416207314
bash-3.2$ python sumclient.py 8
1000 total requests
5.81962108612
bash-3.2$

Wow! Look at the change for the threaded clients. When running with 8 threads, this new server serves requests about 1.7x faster. No code modifications were made to the server–only a different specification of the apply() function.

How is this possible you ask? Well, if you recall from my GIL talk, CPU-bound threads tend to fight with each other on certain multicore machines. By putting that lock in the apply function, threads aren’t allowed to fight anymore (only one gets to run CPU-intensive work at once). Again, keep in mind that the work in this example only takes about 4.5 milliseconds–we’re getting a nice speedup even though none of the threads are running in the apply function for very long.
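
The serializing effect of the lock can be checked in isolation. This sketch is only an illustration: tracked_work() and the counters are invented for the demo, and it verifies mutual exclusion, not the speedup, which depends on the interpreter version and the machine.

```python
import threading

_apply_lock = threading.Lock()

def exclusive_apply(func, args=(), kwargs=None):
    # Only one thread at a time may run CPU-bound work
    with _apply_lock:
        return func(*args, **(kwargs or {}))

active = 0              # threads currently inside tracked_work
max_active = 0          # highest value of active ever observed
counter_lock = threading.Lock()

def tracked_work(n):
    # Hypothetical workload that records how many threads overlap
    global active, max_active
    with counter_lock:
        active += 1
        max_active = max(max_active, active)
    total = sum(range(n + 1))
    with counter_lock:
        active -= 1
    return total

results = []

def worker():
    results.append(exclusive_apply(tracked_work, (1000,)))

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(max_active, len(results))
```

With the lock in place, max_active never exceeds 1, so the eight threads take turns on the CPU-bound part instead of contending for the interpreter.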

Here is another more interesting example. Let’s farm the CPU-intensive work out to a multiprocessing pool. Change the server slightly:

# ysumserver.py
...
if __name__ == '__main__':
    import multiprocessing
    pool = multiprocessing.Pool()
    run_server(("",10000),cpu_apply=pool.apply)

Now, let’s try our client again.

bash-3.2$ python sumclient.py 1
1000 total requests
4.50634002686
bash-3.2$ python sumclient.py 2
1000 total requests
2.29651284218
bash-3.2$ python sumclient.py 4
1000 total requests
1.45105290413
bash-3.2$ python sumclient.py 8
1000 total requests
1.59892106056
bash-3.2$

Hey, look at that–the performance is actually getting better! For example, the performance with 4 client threads is more than 3 times faster than with just one thread. This is because the CPU-intensive work is now being handled on multiple cores through the use of the multiprocessing module.
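
The reason pool.apply drops in so easily is that it takes the same (func, args) arguments as the default apply() and blocks until the result is ready. The sketch below shows only that calling convention. As an assumption made to keep it runnable anywhere, it uses multiprocessing.pool.ThreadPool, which exposes the same apply() method but runs work in threads, so it will not show the multicore speedup; run_work() is a made-up helper.

```python
from multiprocessing.pool import ThreadPool

def sumn(n):
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total

def run_work(work, cpu_apply):
    # Mirrors what the yieldable thread does with each yielded item
    func, args = work
    return cpu_apply(func, args)

# ThreadPool is a stand-in here; the post passes multiprocessing.Pool().apply
with ThreadPool(2) as pool:
    result = run_work((sumn, (1000,)), pool.apply)

print(result)
```

With a real multiprocessing.Pool, the function and its arguments must be picklable, and on some platforms the pool has to be created under an if __name__ == '__main__': guard, as the server code above does.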

Wrapping up (for now)

Since this post is already getting long, I’m going to wrap it up. However, let’s conclude by revisiting a previous bit of code. In our server, we defined a client handler function like this:

def serve_client(c):
    n = int(c.recv(16))
    result = yield sumn, (n,)
    c.send(str(result).encode('ascii'))
    c.close()

In this code, there are no dependencies on any libraries or special objects. In fact, all it does is spit out a bit of CPU-bound processing with the yield statement. Behind the scenes, the YieldableThread object is free to take this work and do whatever it wants to with it. For example, run it in a special environment, pass it to the multiprocessing module, send it out to the cloud, etc. I think that’s kind of cool.

Of course, at this point, you might be asking yourself, “what can possibly go wrong?” To answer that, you’ll have to wait for the next installment. However, as a preview, I’ll just say that the answer is “a lot!”

Postscript

All of my tests were performed using Python 2.7 running on a 4-core Mac Pro (2 x 2.66 GHz, Dual-Core Intel Xeon) running OS X version 10.6.4.

Although I’ve never seen generators used quite like this before, I don’t want to steal anyone’s thunder–if you are aware of prior work, please send me a link so I can post it here.

Additional Postscript

If you like messing around with concurrency, distributed systems, and other neat things, then you would probably like the Python Networks, Concurrency, and Distributed Systems course I’m running in Chicago.

Old code, new home – Travis Swicegood

Finally got around to converting some old code from SVN to Git and getting it up on GitHub. It’s like looking back through a time-warp actually, as most of the code hasn’t been touched since the summer of 2007.

Nearly all of the code is usable, but it’s all abandoned at this point. If there’s something there that strikes your fancy and you’d be interested in forking it into your own project, feel free.

There are still a few more to go, but you can start checking them out now at the Domain51 Github account. Just search for Domain51_ to filter the listing as they’re all named in the old PEAR-style package naming scheme.

Sylvain Hellegouarch: Integrating SQLAlchemy into a CherryPy application

Quite often, people come on the CherryPy IRC channel asking about the way to use SQLAlchemy with CherryPy. There are a couple of good recipes on the tools wiki but I find them a little complex to begin with. Through no fault of the recipes, many people don’t necessarily know about CherryPy tools and plugins at that stage.

The following recipe tries to be complete while staying as simple as possible, so that folks can get started with SQLAlchemy and CherryPy.

# -*- coding: utf-8 -*-
import os, os.path
 
import cherrypy
from cherrypy.process import wspbus, plugins
 
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column
from sqlalchemy.types import String, Integer
 
# Helper to map and register a Python class to a db table
Base = declarative_base()
 
class Message(Base):
    __tablename__ = 'message'
    id = Column(Integer, primary_key=True)
    value =  Column(String)
 
    def __init__(self, message):
        Base.__init__(self)
        self.value = message
 
    def __str__(self):
        return self.value.encode('utf-8')
 
    def __unicode__(self):
        return self.value
 
    @staticmethod
    def list(session):
        return session.query(Message).all()
 
 
class SAEnginePlugin(plugins.SimplePlugin):
    def __init__(self, bus):
        """
        The plugin is registered to the CherryPy engine and therefore
        is part of the bus (the engine *is* a bus) registry.
 
        We use this plugin to create the SA engine. At the same time,
        when the plugin starts we create the tables into the database
        using the mapped class of the global metadata.
 
        Finally we create a new 'bind' channel that the SA tool
        will use to map a session to the SA engine at request time.
        """
        plugins.SimplePlugin.__init__(self, bus)
        self.sa_engine = None
        self.bus.subscribe("bind", self.bind)
 
    def start(self):
        db_path = os.path.abspath(os.path.join(os.curdir, 'my.db'))
        self.sa_engine = create_engine('sqlite:///%s' % db_path, echo=True)
        Base.metadata.create_all(self.sa_engine)
 
    def stop(self):
        if self.sa_engine:
            self.sa_engine.dispose()
            self.sa_engine = None
 
    def bind(self, session):
        session.configure(bind=self.sa_engine)
 
class SATool(cherrypy.Tool):
    def __init__(self):
        """
        The SA tool is responsible for associating a SA session
        to the SA engine and attaching it to the current request.
        Since we are running in a multithreaded application,
        we use the scoped_session that will create a session
        on a per thread basis so that you don't worry about
        concurrency on the session object itself.
 
        This tool binds a session to the engine each time
        a request starts and commits/rollbacks whenever
        the request terminates.
        """
        cherrypy.Tool.__init__(self, 'on_start_resource',
                               self.bind_session,
                               priority=20)
 
        self.session = scoped_session(sessionmaker(autoflush=True,
                                                  autocommit=False))
 
    def _setup(self):
        cherrypy.Tool._setup(self)
        cherrypy.request.hooks.attach('on_end_resource',
                                      self.commit_transaction,
                                      priority=80)
 
    def bind_session(self):
        cherrypy.engine.publish('bind', self.session)
        cherrypy.request.db = self.session
 
    def commit_transaction(self):
        cherrypy.request.db = None
        try:
            self.session.commit()
        except:
            self.session.rollback()
            raise
        finally:
            self.session.remove()
 
 
 
 
class Root(object):
    @cherrypy.expose
    def index(self):
        # print all the recorded messages so far
        msgs = [str(msg) for msg in Message.list(cherrypy.request.db)]
        cherrypy.response.headers['content-type'] = 'text/plain'
        return "Here are your list of messages: %s" % '\n'.join(msgs)
 
    @cherrypy.expose
    def record(self, msg):
        # go to /record?msg=hello world to record a "hello world" message
        m = Message(msg)
        cherrypy.request.db.add(m)
        cherrypy.response.headers['content-type'] = 'text/plain'
        return "Recorded: %s" % m
 
if __name__ == '__main__':
    SAEnginePlugin(cherrypy.engine).subscribe()
    cherrypy.tools.db = SATool()
    cherrypy.tree.mount(Root(), '/', {'/': {'tools.db.on': True}})
    cherrypy.engine.start()
    cherrypy.engine.block()

The general idea is to use the plugin mechanism to register functions on an engine basis and enable a tool that will provide an access to the SQLAlchemy session at request time.
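
For readers new to the plugin side, the publish/subscribe pattern behind the 'bind' channel can be sketched without CherryPy at all. Everything below is a toy stand-in: ToyBus and ToyEnginePlugin are invented names, a dict plays the role of the session, and the string "fake-engine" plays the role of the SQLAlchemy engine. It is not how CherryPy implements its bus, only the shape of the interaction.

```python
class ToyBus:
    # Minimal publish/subscribe registry (toy, not CherryPy's bus)
    def __init__(self):
        self.listeners = {}

    def subscribe(self, channel, callback):
        self.listeners.setdefault(channel, []).append(callback)

    def publish(self, channel, *args):
        # Call every listener on the channel and collect the results
        return [cb(*args) for cb in self.listeners.get(channel, [])]

class ToyEnginePlugin:
    # Owns the "engine" and answers requests on the 'bind' channel
    def __init__(self, bus):
        self.engine = None
        bus.subscribe("start", self.start)
        bus.subscribe("bind", self.bind)

    def start(self):
        self.engine = "fake-engine"     # stands in for create_engine(...)

    def bind(self, session):
        session["bind"] = self.engine   # stands in for session.configure()

bus = ToyBus()
plugin = ToyEnginePlugin(bus)
bus.publish("start")                    # engine startup

session = {}                            # stands in for the scoped session
bus.publish("bind", session)            # what the tool does per request
print(session)
```

The point of the indirection is that the tool never touches the engine directly: it only publishes on a channel, and whichever plugin is subscribed supplies the engine.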

Mike Driscoll: A py2exe tutorial – Build a Binary Series!

I received a request to create an article on how to use py2exe and wxPython to create an executable. I decided to do a series on packaging instead. It is my intention to go over the major Windows binary building utilities and show you, dear reader, how to use them to create a binary that you can distribute. Once those articles are done, I’ll show how to use Inno and NSIS. To kick things off, we’ll go over how to use py2exe, probably the most popular of the Windows executable packages.

Let’s Get Started

For this tutorial, we’re going to use a wxPython script that doesn’t do anything. This is a contrived example, but we’re using wx to make it more visually interesting than just doing a console “Hello World” program. Note also that I am using py2exe 0.6.9, wxPython 2.8.11.0 and Python 2.6. Here’s what the end product should look like when run:

Now that we know what it looks like, here’s a look at the code:

import wx
 
########################################################################
class DemoPanel(wx.Panel):
    """"""
 
    #----------------------------------------------------------------------
    def __init__(self, parent):
        """Constructor"""
        wx.Panel.__init__(self, parent)
 
        labels = ["Name", "Address", "City", "State", "Zip",
                  "Phone", "Email", "Notes"]
 
        mainSizer = wx.BoxSizer(wx.VERTICAL)
        lbl = wx.StaticText(self, label="Please enter your information here:")
        lbl.SetFont(wx.Font(12, wx.SWISS, wx.NORMAL, wx.BOLD))
        mainSizer.Add(lbl, 0, wx.ALL, 5)
        for lbl in labels:
            sizer = self.buildControls(lbl)
            mainSizer.Add(sizer, 1, wx.EXPAND)
        self.SetSizer(mainSizer)
        mainSizer.Layout()
 
    #----------------------------------------------------------------------
    def buildControls(self, label):
        """"""
        sizer = wx.BoxSizer(wx.HORIZONTAL)
        size = (80,40)
        font = wx.Font(12, wx.SWISS, wx.NORMAL, wx.BOLD)
 
        lbl = wx.StaticText(self, label=label, size=size)
        lbl.SetFont(font)
        sizer.Add(lbl, 0, wx.ALL|wx.CENTER, 5)
        if label != "Notes":
            txt = wx.TextCtrl(self, name=label)
        else:
            txt = wx.TextCtrl(self, style=wx.TE_MULTILINE, name=label)
        sizer.Add(txt, 1, wx.ALL, 5)
        return sizer
 
 
 
########################################################################
class DemoFrame(wx.Frame):
    """
    Frame that holds all other widgets
    """
 
    #----------------------------------------------------------------------
    def __init__(self):
        """Constructor"""
        wx.Frame.__init__(self, None, wx.ID_ANY,
                          "Py2Exe Tutorial",
                          size=(600,400)
                          )
        panel = DemoPanel(self)
        self.Show()
 
#----------------------------------------------------------------------
if __name__ == "__main__":
    app = wx.App(False)
    frame = DemoFrame()
    app.MainLoop()

This is fairly straightforward, so I’ll leave it to the reader to figure out. This article is about py2exe after all.

The py2exe setup.py file

The key to any py2exe script is the setup.py file. This file controls what gets included or excluded, how much we compress and bundle, and much more! Here is the simplest setup that we can use with the wx script above:

from distutils.core import setup
import py2exe
 
setup(windows=['sampleApp.py'])

As you can see, we import the setup method from distutils.core and then we import py2exe. Next we call setup with a windows keyword parameter and pass it the name of the main file inside a python list object. If you were creating a non-GUI project, then you would use the console key instead of windows. To run this, open up a command prompt and navigate to the appropriate location. Then type “python setup.py py2exe” to run it. This is what I got when I first ran it:

It looks like wxPython requires the “MSVCP90.dll” and Windows can’t find it. A quick Google search yielded the consensus that I needed the “Microsoft Visual C++ 2008 Redistributable Package”, found here. I downloaded it, installed it and tried py2exe again. Same error. This would have probably worked had I been using Visual Studio to create an exe of a C# program. Anyway, the trick was to search the hard drive for the file and then copy it to Python’s DLL folder, which on my machine was found at the following location: C:\Python26\DLLs (adjust as necessary on your machine). Once the DLL was in the proper place, the setup.py file ran just fine. The result was put into a “dist” folder which contains 17 files and weighs in at 15.3 MB. I double-clicked the “sampleApp.exe” file to see if my shiny new binary would work and it did! In older versions of wxPython, you would have needed to include a manifest file to get the right look and feel (i.e., the themes), but that was taken care of in 2.8.10 (I think) as was the side-by-side (SxS) assembly manifest file that used to be required.

Note that for non-wxPython scripts, you will probably still need to mess with the SxS manifests and all the hoops that includes. You can read more about that in the py2exe tutorial.

Creating an Advanced setup.py File

Let’s see what other options py2exe gives us for creating binaries by creating a more complex setup.py file.

from distutils.core import setup
import py2exe
 
includes = []
excludes = ['_gtkagg', '_tkagg', 'bsddb', 'curses', 'email', 'pywin.debugger',
            'pywin.debugger.dbgcon', 'pywin.dialogs', 'tcl',
            'Tkconstants', 'Tkinter']
packages = []
dll_excludes = ['libgdk-win32-2.0-0.dll', 'libgobject-2.0-0.dll', 'tcl84.dll',
                'tk84.dll']
 
setup(
    options = {"py2exe": {"compressed": 2,
                          "optimize": 2,
                          "includes": includes,
                          "excludes": excludes,
                          "packages": packages,
                          "dll_excludes": dll_excludes,
                          "bundle_files": 3,
                          "dist_dir": "dist",
                          "xref": False,
                          "skip_archive": False,
                          "ascii": False,
                          "custom_boot_script": '',
                         }
              },
    windows=['sampleApp.py']
)

This is pretty self-explanatory, but let’s unpack it anyway. First we set up a few lists that we pass to the options parameter of the setup function.

  • The includes list is for special modules that you need to specifically include. Sometimes py2exe can’t find certain modules, so you get to manually specify them here.
  • The excludes list is a list of which modules to exclude from your program. In this case, we don’t need Tkinter since we’re using wxPython. This list of excludes is what GUI2Exe will exclude by default.
  • The packages list is a list of specific packages to include. Again, sometimes py2exe just can’t find something. I’ve had to include email, PyCrypto, or lxml here before. Note that if the excludes list contains something you’re trying to include in the packages or includes lists, py2exe may continue to exclude them.
  • dll_excludes – excludes dlls that we don’t need in our project.

In the options dictionary, we have a few other options to look at. The compressed key tells py2exe whether or not to compress the zipfile, if it’s set. The optimize key sets the optimization level: zero is no optimization and 2 is the highest. The bundle_files key controls whether dlls are bundled into the zipfile or the exe. Valid values for bundle_files are:

  • 3 = don’t bundle (default)
  • 2 = bundle everything but the Python interpreter
  • 1 = bundle everything, including the Python interpreter

A couple of years ago, when I was first learning py2exe, I asked on their mailing list what the best option was because I was having issues with bundle option 1. I was told that 3 was probably the most stable. I went with that and stopped having random problems, so that’s what I currently recommend. If you don’t like distributing more than one file, zip them up or create an installer.

The only other option I use in this list is the dist_dir one. I use it to experiment with different build options or to create custom builds when I don’t want to overwrite my main good build. You can read about all the other options (including ones not even listed here) on the py2exe website.

By setting optimize to 2, we can reduce the size of the folder by about one megabyte. There’s a thread on the wxPython mailing list about reducing the size of the results more, but I was unable to make it work. You can read about it here.
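
If you build several variants, it can help to generate the options dictionary from a small function that rejects values outside the ranges described above. This is only a sketch: make_py2exe_options is a hypothetical helper of my own, not part of py2exe, and it covers just the four keys discussed here. It does not import py2exe, so it can be tried on any machine.

```python
def make_py2exe_options(bundle_files=3, optimize=2, compressed=True,
                        dist_dir="dist"):
    # Hypothetical helper: validates the values discussed in the text
    if bundle_files not in (1, 2, 3):
        raise ValueError("bundle_files must be 1, 2 or 3")
    if optimize not in (0, 1, 2):
        raise ValueError("optimize must be 0, 1 or 2")
    return {
        "py2exe": {
            "compressed": compressed,
            "optimize": optimize,
            "bundle_files": bundle_files,
            "dist_dir": dist_dir,
        }
    }

# A separate output folder so an experiment does not overwrite the good build
options = make_py2exe_options(dist_dir="dist_experimental")
print(options["py2exe"])

# In setup.py this would be passed along as:
#   setup(options=options, windows=['sampleApp.py'])
```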

Update (08/03/2010)

One of my readers, ProgMan, played with a couple compression programs to see if he could make the results smaller. Here are the results:

Wrapping Up

You now know the basics for creating binaries with py2exe. I hope you have found this helpful for your current or future projects. If so, let me know in the comments!


Rakudo Star, for early adopters of Perl 6, now available

By Patrick Michaud, release manager for Rakudo Perl 6

On behalf of the Rakudo and Perl 6 development teams, I’m happy to announce the July 2010 release of “Rakudo Star”, a useful and usable distribution of Perl 6. The tarball for the July 2010…

Perl 6 screencast – part 5 – hashes

Direct link to the Perl 6 screencast about hashes

See more Perl 6 entries.

Perl 6 Code examples

Hashes in Perl 6 are denoted using the % sign:
Creating a hash

my %h = "Foo" => 1, "Bar" => 2;

printing it out for debugging purposes…

Jonathan Hartley: Flying High: Hobbyist OpenGL from Python

This is a transcript-from-memory (what I wish I’d said) of the talk I just gave at EuroPython 2010, for which I owe a debt of gratitude to Richard Jones for his last-minute moral support while wrestling with projectors and refresh rates; and to the whole team of hard-working volunteers, especially John Pinner & Richard Taylor, who gave so much time and effort to make EuroPython in the UK brilliant once again.

The demonstrated code is available via Mercurial, from http://code.google.com/p/flyinghigh-opengl-from-python


With this talk I want to give an overview of creating 3D graphics in OpenGL from Python. Instead of covering topics already covered by a thousand OpenGL tutorials, I want to shift attention towards some ideas of how to generate the input to your renderer – how to algorithmically create geometry. I’ll show that with just a paltry few hundred lines of relatively simple code, you can generate some interestingly chunky shapes – virtual sculptures, if you will. Since this talk has the word hobbyist in the title, I want to emphasise how easy this is, and I want to have some fun with the pretty pictures.

Out of interest, how many people here are already expert OpenGL users? (A few hands hesitantly go up, then some think about it and go down again.) Err, I mean, how many have already used OpenGL to do anything at all? (About half the people raise their hands.) Alright, well, I want you all to leave here enthused to go generate your own images or animations or games.

Inspirations

As the field of computer graphics advances, there’s an understandable tendency towards more photorealism. This is laudable, but I also feel that the effort expended on achieving this technical goal is often undertaken without considering whether photorealism is the best aesthetic choice for a particular project.

In the past, games and other applications adopted particular visual styles out of technical necessity. As is often the case, these restrictions resulted in a diverse blossoming of creative ideas, producing an enormous set of distinctive visual styles and experiences.

Non-photo-realistic Quake

Crucially, the most successful and memorable examples of these were projects that found ways to work in harmony with the restrictions of the medium, rather than attempting to gloss over them.

Rez HD

Advances in computing power and technique provide modern games and applications with a far wider range of options in how to present themselves visually, and yet the greater proportion of them seem content with a conventional and unimaginative ‘near-photorealistic’ appearance. This disappoints me, because I feel that projects that opt for a more highly stylised look, when appropriately chosen, can create vastly more striking and memorable artistic experiences. This is true in movies and all kinds of art.

Waking Life

As an amateur graphics programmer, I don’t have large resources nor much experience to throw at the problem, so my options and my abilities are limited. But, like a good artist, I believe it should still be possible to create things that are both strikingly beautiful and highly functional, either by working with the restrictions of the medium, or by finding creative ways to exploit or extend them.

Love

In particular, the kind of minimal, clean-lined aesthetic that amateur OpenGL programs take on by default is useful for its crisp precision, as in charting and visualisation tools. But above that, I love these programs for their stark minimalism, their clean lines and homogeneous fields of colour.

Tron Legacy

I wish more professional game developers had an incentive to aim for less conventional aesthetics – whether they be deliberately retro, or else striking out in some new direction of their own. It’s that brave minority of projects which do this which form my inspiration.

Starting Point

I’m assuming we already have a minimal OpenGL application, that:

  • Opens a window
  • Provides an OpenGL context for us to render to
  • Sets appropriate 3D projection matrix
  • Sets the initial modelview matrix state based on the position and orientation of a ‘camera’ object
  • Calls our empty ‘draw’ function once per monitor refresh.

This results in a blank screen, at 60fps. Here’s a screenshot, so you can see exactly what it’s doing:

A blank screen

I’m using pyglet & PyOpenGL for this, but this isn’t important. Any framework that provides the above abilities, such as PyGame, along with bindings to OpenGL, will be just fine. Whichever framework you use, this minimal application might take on the order of about 150 lines of code, and is covered in countless tutorials all over the web.

From here on in I plan to show (or at least describe) pretty much all of the code that I add on top of this minimal OpenGL loop.

Goal

To begin with, I’m going to lead you as quickly as I can through a Shape class, which models 3D shapes in a way useful for the creation of geometry, and then a Glyph class that converts these geometries into arrays for OpenGL. Finally these arrays get passed into a Render class, which simply calls glDrawElements to render them.

Our Goal

Once the above infrastructure is in place, we can have some fun generating interesting shapes to make pretty pictures with. The conventional way to provide geometry to your OpenGL code is by loading your data from files. Today though, I want to stick with generating geometry from code, to see where that leads.

Modelling Polyhedra

A polyhedron is a 3D shape with flat faces and straight edges. We can model coloured polyhedra using a simple Shape class:

from collections import namedtuple

Vec3 = namedtuple('Vec3', 'x y z')
Color = namedtuple('Color', 'r g b a')

class Shape(object):

    def __init__(self, vertices, faces, face_colors):
        # list of Vec3s
        self.vertices = vertices

        # list of faces, each face is a list of indices into 'vertices'
        self.faces = faces

        # List of colors, one per face
        self.face_colors = face_colors

An instance of this class, for example, might represent a yellow cube, or a tetrahedron with green and black faces, or any other coloured polyhedron we can imagine.

To demonstrate how the classes Shape, Glyph and Render hang together, let’s examine an even simpler example, a red triangle joined to a yellow square:

Red Triangle & Yellow Square

You can see this geometry features five vertices (v0 to v4), which are used by the two faces. This might be represented by an instance of Shape:

v0 = Vec3( 1,  1, 0)
v1 = Vec3( 1, -1, 0)
v2 = Vec3(-1, -1, 0)
v3 = Vec3(-1,  1, 0)
v4 = Vec3( 1,  0, 2)

red = Color(255, 0, 0, 255)
yellow = Color(255, 255, 0, 255)

shape = Shape(
    vertices=[v0, v1, v2, v3, v4],
    faces=[
        [2, 3, 4],    # f0, triangle
        [0, 1, 2, 3], # f1, square
    ],
    face_colors=[red, yellow],
)

The integers in the ‘faces’ member are indices into the vertices list. So the triangular face, for example, is formed by linking vertices 2, 3 and 4.

Step 1. Creating Ctypes Vertex Arrays

In order to render our Shape, we need to convert it to some ctypes arrays that OpenGL will eat:

  • glvertices – an array of GLfloats (three for each vertex)
  • glindices – an array of GLubytes (one for each index of each face)
  • glcolors – an array of GLubytes (four for each vertex)

To generate glvertices, we need to dereference the indices in Shape.faces, to produce a new list of vertices, rearranged into the order they are going to be drawn:

Step 1. Dereference indices

The most visible aspect of this change is that the vertices are re-ordered, such that the indices now simply read ‘0, 1, 2, 3, 4, 5…’. However, that isn’t actually necessary. The important part of this transformation is that vertices which are re-used are now duplicated in the vertex list. For example v0 now occurs twice. As a result of this vertex duplication, one of the two instances of ‘0’ in the faces lists now instead reads ‘3’ (referencing the new second copy of v0).

This duplication of vertices is required, because when v0 is used for the first time, it is as part of the red triangle, and when it is used the second time it is as part of the yellow square. The color of the vertex changes from one occurrence to the next. All the attributes of a vertex (position, color, texture co-ords, normals, etc) are treated as an atomic unit, so whenever any attribute changes, as the color is changing here, the vertex position needs to be redundantly specified again, so as to create a new unique vertex with its own unique attribute values. Even if the color of v0 in our example was identical for each use, we will see later that other vertex attributes such as surface normals will still differ. Don’t sweat trying to eliminate these redundancies. They are necessary, unless every single attribute of the re-used vertex (including surface normals) is identical.

The code in Glyph.get_glverts() performs this dereferencing step:

class Glyph(object):

    def get_glverts(self, shape, num_glverts):
        glverts = chain.from_iterable(
            shape.vertices[index]
            for face in shape.faces
            for index in face
        )
        ArrayType = GLfloat * (num_glverts * 3)
        return ArrayType(*glverts)

This uses a generator to produce the vertices in the order that we need them. ‘ArrayType’ shows the standard idiom to create a ctypes array – we take the datatype of the array elements, in this case GLfloat since our vertex positions consist of three floats, and multiply it by the required length of the array. This yields a new array type. The final return statement instantiates this array type using the data supplied by the glverts generator.
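
To make the dereferencing and the ctypes idiom concrete, here is a minimal standalone sketch using the triangle-and-square data from the Shape example. It is written for Python 3 and assumes ctypes.c_float as a stand-in for GLfloat, so that it runs without any OpenGL binding installed; it is not the Glyph class itself.

```python
from collections import namedtuple
from ctypes import c_float
from itertools import chain

Vec3 = namedtuple('Vec3', 'x y z')

vertices = [
    Vec3(1, 1, 0), Vec3(1, -1, 0), Vec3(-1, -1, 0),
    Vec3(-1, 1, 0), Vec3(1, 0, 2),
]
faces = [[2, 3, 4], [0, 1, 2, 3]]

# One output vertex per face index, so shared vertices appear more than once.
num_glverts = sum(len(face) for face in faces)

# Dereference every index, then flatten each Vec3 into its x, y, z components.
flat = chain.from_iterable(
    vertices[index] for face in faces for index in face
)

# c_float stands in for GLfloat here (an assumption made for this sketch).
ArrayType = c_float * (num_glverts * 3)
glvertices = ArrayType(*flat)
```

Five input vertices become seven output vertices (21 floats), because the two vertices shared between the faces are each emitted twice.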

Step 2. Creating Ctypes Index Arrays

The second job Glyph has to do is create a ctypes indices array, which is derived from the Shape’s faces. In doing this, it has to break the Shape’s faces down into individual triangles.

Step 2. Tessellate indices

The vertex list is unchanged by this step, and the first face – the triangle – is also unchanged. The second face, the square, has been broken into two triangles.

There are well-known algorithms for breaking an arbitrary polygon down into individual triangles. Using the utility functions found in the GLU library, this can be done in about 150 lines of Python. But in the interests of keeping it simple, I decided to restrict our code to just handling convex faces. Tessellating these faces can be done using a considerably simpler algorithm:

def tessellate(face):
    '''
    Break the given face into triangles.
    e.g. [0, 1, 2, 3, 4] ->
    [[0, 1, 2], [0, 2, 3], [0, 3, 4]]
    Does not work on concave faces.
    '''
    return (
        [face[0], face[i], face[i + 1]]
        for i in xrange(1, len(face) - 1)
    )

We again use a generator, to simply join up the face’s first vertex with all the other vertices, like this:

Tessellation of convex faces
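
As a quick check of the fan approach, the sketch below restates tessellate for Python 3 (range instead of xrange) and returns a list rather than a generator, purely so the results are easy to inspect. A convex face with n vertices always yields n - 2 triangles.

```python
def tessellate(face):
    """Fan-triangulate a convex face, given as a sequence of vertex indices."""
    return [
        [face[0], face[i], face[i + 1]]
        for i in range(1, len(face) - 1)
    ]

triangle = tessellate([2, 3, 4])        # already a triangle: unchanged
square = tessellate([0, 1, 2, 3])       # splits into two triangles
pentagon = tessellate([0, 1, 2, 3, 4])  # splits into three triangles
```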

Now that we have our tessellate function, Glyph can create the glindices array in much the same way as it generated glvertices. I wasn’t smart enough to write this as a generator first time around (I presume it would require more than one generator to do it; anyone?), so I’m needlessly creating an in-memory copy of the sequence. It turns out I need to take its length right afterwards anyway, so what the heck:

class Glyph(object):

    def get_glindices(self, faces):
        glindices = []
        face_offset = 0
        for face in faces:
            indices = xrange(face_offset, face_offset + len(face))
            glindices.extend(chain(*tessellate(indices)))
            face_offset += len(face)
        ArrayType = GLubyte * len(glindices)
        # ctypes arrays take their elements as separate arguments
        return ArrayType(*glindices)

This is more complex than get_glverts because it performs both of the transformations described in steps 1 and 2, but it’s still pretty straightforward. Note that the type of the index array will have to change from GLubytes to GLushorts (or GLuints) if the number of vertices rises above 256 (or 65,536).
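
Here is a worked example of both points, as a Python 3 sketch that runs without OpenGL. The ctypes types c_ubyte, c_ushort and c_uint stand in for GLubyte, GLushort and GLuint, and the helper index_type is hypothetical, not part of the original Glyph class.

```python
from ctypes import c_ubyte, c_ushort, c_uint
from itertools import chain

def tessellate(face):
    """Fan-triangulate a convex face given as a sequence of indices."""
    return [
        [face[0], face[i], face[i + 1]]
        for i in range(1, len(face) - 1)
    ]

def get_indices(faces):
    """Renumber each face's vertices consecutively, then tessellate."""
    indices = []
    face_offset = 0
    for face in faces:
        local = range(face_offset, face_offset + len(face))
        indices.extend(chain.from_iterable(tessellate(local)))
        face_offset += len(face)
    return indices

def index_type(num_vertices):
    """Hypothetical helper: narrowest unsigned type that can index the vertices."""
    if num_vertices <= 256:
        return c_ubyte
    if num_vertices <= 65536:
        return c_ushort
    return c_uint

faces = [[2, 3, 4], [0, 1, 2, 3]]
indices = get_indices(faces)
num_glverts = sum(len(face) for face in faces)

ArrayType = index_type(num_glverts) * len(indices)
glindices = ArrayType(*indices)
```

For the triangle and square this gives the indices 0, 1, 2 for the triangle and 3, 4, 5, 3, 5, 6 for the two halves of the square. If the element type changes, the type constant passed to glDrawElements has to change with it (GL_UNSIGNED_SHORT or GL_UNSIGNED_INT instead of GL_UNSIGNED_BYTE).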

Step 3. Creating Ctypes Color Arrays

Finally, we need an array of vertex colors. This is the simplest of the lot, generated by repeating the face_color for each face, once per vertex:

class Glyph(object):

    def get_glcolors(self, faces, face_colors, num_glvertices):
        glcolors = chain.from_iterable(
            repeat(color, len(face))
            for face, color in izip(faces, face_colors)
        )
        ArrayType = GLubyte * (num_glvertices * 4)
        # flatten each Color into r, g, b, a and pass them as separate arguments
        return ArrayType(*chain.from_iterable(glcolors))
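
The same step can be tried in isolation with the sketch below, written for Python 3 (zip instead of izip) and assuming ctypes.c_ubyte as a stand-in for GLubyte. It uses the two faces and colors from the earlier Shape example.

```python
from collections import namedtuple
from ctypes import c_ubyte
from itertools import chain, repeat

Color = namedtuple('Color', 'r g b a')
red = Color(255, 0, 0, 255)
yellow = Color(255, 255, 0, 255)

faces = [[2, 3, 4], [0, 1, 2, 3]]
face_colors = [red, yellow]
num_glvertices = sum(len(face) for face in faces)

# Repeat each face's color once per vertex of that face...
per_vertex = chain.from_iterable(
    repeat(color, len(face))
    for face, color in zip(faces, face_colors)
)
# ...then flatten each Color into its four components.
flat = chain.from_iterable(per_vertex)

# c_ubyte stands in for GLubyte here (an assumption made for this sketch).
ArrayType = c_ubyte * (num_glvertices * 4)
glcolors = ArrayType(*flat)
```

The first three vertices (the triangle) come out red and the remaining four (the square) yellow, 28 bytes in all.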

First Light

It might seem like a teensy bit of a slog to get here, but it hasn’t been more than sixty lines of code, and now we’re in a position to pass our ctypes arrays into OpenGL’s glDrawElements. This happens in our Render.draw() method:

class Render(object):

    def draw(self, world):
        for item in world:
            glVertexPointer(3, GL_FLOAT, 0, item.glyph.glvertices)
            glColorPointer(4, GL_UNSIGNED_BYTE, 0, item.glyph.glcolors)

            # TODO: handle the item's position and orientation

            glDrawElements(
                GL_TRIANGLES,
                len(item.glyph.glindices),
                GL_UNSIGNED_BYTE,
                item.glyph.glindices
            )

This is canonical OpenGL render code, so I’m not going to dissect it, but now we get some actual visible output:

Red triangle, yellow square

Hooray! \o/ We can move our camera position around, and view this 3D object from different angles.

There’s a minor wrinkle here that I’m glossing over. I’ve turned on backface culling, so the triangle and square aren’t visible if we view them from the back. For all our future examples I plan on using closed polyhedra, so we won’t be able to see the ‘backs’ of the faces – those will be on the inside of the polyhedron.

The Fun Stuff

So now we’ve got all our infrastructure in place, we can start creating factory functions to churn out some Shapes. Let’s start with something straightforward, a tetrahedron (triangle-based pyramid):

def Tetrahedron(edge, face_colors=None):
    size = edge / sqrt(2)/2
    vertices = [
        (+size, +size, +size),
        (-size, -size, +size),
        (-size, +size, -size),
        (+size, -size, -size),
    ]
    faces = [ [0, 2, 1], [1, 3, 0], [2, 3, 1], [0, 3, 2] ]
    return Shape(vertices, faces, face_colors)

Which produces:

A tetrahedron
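
The size calculation works because the four vertices are alternating corners of a cube of side 2 * size. Any two of them differ in exactly two coordinates, so they are 2 * size * sqrt(2) apart, which equals edge. The sketch below checks this numerically; tetrahedron_vertices and distance are hypothetical helpers written for this check, not part of the original code.

```python
from itertools import combinations
from math import sqrt

def tetrahedron_vertices(edge):
    """Same vertex layout as the Tetrahedron factory."""
    size = edge / sqrt(2) / 2
    return [
        (+size, +size, +size),
        (-size, -size, +size),
        (-size, +size, -size),
        (+size, -size, -size),
    ]

def distance(a, b):
    """Euclidean distance between two points given as tuples."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Measure all six edges of a tetrahedron requested with edge length 3.
edge_lengths = [
    distance(a, b)
    for a, b in combinations(tetrahedron_vertices(3.0), 2)
]
```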

Then a cube factory:

from itertools import product, repeat

def Cube(edge, face_colors=None):
    e2 = edge / 2.0  # float division: under Python 2, 1 / 2 == 0
    verts = list(product(*repeat([-e2, +e2], 3)))
    faces = [
        [0, 1, 3, 2], # left
        [4, 6, 7, 5], # right
        [7, 3, 1, 5], # front
        [0, 2, 6, 4], # back
        [3, 7, 6, 2], # top
        [1, 0, 4, 5], # bottom
    ]
    return Shape(verts, faces, face_colors)

The six faces are quite evident, but the use of itertools.product to produce the list of vertices perhaps deserves a bit of exposition. It’s an inspired tip from ΤΖΩΤΖΙΟΥ. Just to spell it out in longhand:

>>> from itertools import repeat, product
>>> list(product(*repeat([-1, +1], 3)))
[(-1, -1, -1), (-1, -1, 1), (-1, 1, -1), (-1, 1, 1),
 (1, -1, -1), (1, -1, 1), (1, 1, -1), (1, 1, 1)]

So there are the eight vertices of the cube, and that gets us the following:

A cube
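
Because backface culling is turned on, the order of the indices within each face matters. Assuming OpenGL’s default convention that counter-clockwise faces are front-facing, each face must wind counter-clockwise when seen from outside the shape. The sketch below is one way to check that for a convex shape centred on the origin; faces_wind_outward and its helpers are hypothetical names, not part of the original code.

```python
from itertools import product, repeat

def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def faces_wind_outward(verts, faces):
    """True if every face's normal points away from the origin."""
    for face in faces:
        v0, v1, v2 = (verts[i] for i in face[:3])
        normal = cross(sub(v1, v0), sub(v2, v0))
        corners = [verts[i] for i in face]
        centroid = [sum(axis) / len(corners) for axis in zip(*corners)]
        if dot(normal, centroid) <= 0:
            return False
    return True

verts = list(product(*repeat([-0.5, +0.5], 3)))
faces = [
    [0, 1, 3, 2], [4, 6, 7, 5], [7, 3, 1, 5],
    [0, 2, 6, 4], [3, 7, 6, 2], [1, 0, 4, 5],
]
all_outward = faces_wind_outward(verts, faces)
```

For the cube’s six faces the check passes. Reversing the index order of any face makes it fail, and under the assumed convention that face would then be culled when viewed from outside.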

We can add a few more vertices and faces, to make ourselves a truncated cube:

A truncated cube

Once we’ve got truncated cubes, we might as well add one last face to form the entrance:

A truncated cube with entrance

There’s nothing to stop us adding several of these shapes into the world at once, but since we haven’t yet moved any of them away from the origin, they just sit there, embedded within one another:

A cube and tetrahedron interpenetrate

A truncated cube with two tetrahedrons

Moving objects around

In our earlier Render.draw() method, we left a ‘TODO’ comment in place, to note that we weren’t yet handling item positions and orientations. Here’s what Render.draw looks like when we fill that code in:

class Render(object):

    def draw(self, world):
        for item in world:
            glVertexPointer(3, GL_FLOAT, 0, item.glyph.glvertices)
            glColorPointer(4, GL_UNSIGNED_BYTE, 0, item.glyph.glcolors)

            glPushMatrix()
            glTranslatef(*item.position)
            glMultMatrixf(item.orientation.matrix)

            glDrawElements(
                GL_TRIANGLES,
                len(item.glyph.glindices),
                GL_UNSIGNED_BYTE,
                item.glyph.glindices
            )
            glPopMatrix()

Again, this is very standard OpenGL usage. To set an item’s position attribute, I’m going to use a bit of code that I already snuck into the demo without telling you about. It’s the code that moves the camera around in space. Here is a simplified version, class Orbit, which returns a new position each time it is called. The locus of this position is an orbit around the origin:

class Orbit(object):
    def __init__(self, distance, speed, phase=None):
        self.distance = distance
        self.speed = speed
        if phase is None:
            phase = random.uniform(0, 2 * math.pi)
        self.phase = phase

    def __call__(self, time):
        bearing = time * self.speed + self.phase
        x = self.distance * math.sin(bearing)
        z = self.distance * math.cos(bearing)
        return Vec3(x, 0, z)

The actual camera uses a slightly longer version I call WobblyOrbit (not shown), which operates in exactly the same way.  Any ‘mover’ class, i.e. one that returns a Vec3 position when called, can be used to move the camera, or any other item, around in space:

class GameItem(object):
    def __init__(self, **kwargs):
        self.__dict__.update(**kwargs)

world.add( GameItem(
    shape=Cube(1, repeat(red)),
    mover=Orbit(distance=20, speed=4),
) )

# then, in world.update():
for item in self.items:
    # items created without a mover simply stay where they are
    mover = getattr(item, 'mover', None)
    if mover:
        item.position = mover(self.time)
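
To see what a mover such as Orbit hands back, the sketch below samples one for three consecutive frames. It restates Orbit for Python 3 and, as an assumption made for reproducibility, defaults phase to zero instead of a random value.

```python
import math
from collections import namedtuple

Vec3 = namedtuple('Vec3', 'x y z')

class Orbit(object):
    def __init__(self, distance, speed, phase=0.0):
        self.distance = distance
        self.speed = speed
        self.phase = phase  # fixed here; the original picks a random phase

    def __call__(self, time):
        bearing = time * self.speed + self.phase
        x = self.distance * math.sin(bearing)
        z = self.distance * math.cos(bearing)
        return Vec3(x, 0, z)

mover = Orbit(distance=20, speed=4)

# Positions for three consecutive frames at 60 frames per second.
positions = [mover(frame / 60.0) for frame in range(3)]
```

Every sampled position lies in the y = 0 plane, 20 units from the origin, starting on the positive z axis.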

Similarly, we can spin items using ‘spinner’ classes, that tweak an item’s orientation as time goes by.
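
The spinner classes themselves aren’t shown, so the following is only a guess at what one might look like. SpinY and Orientation are hypothetical names. The one thing taken from the code above is that Render.draw() passes item.orientation.matrix to glMultMatrixf, which expects 16 floats in column-major order.

```python
import math

class Orientation(object):
    """Hypothetical holder for a column-major 4x4 rotation matrix."""
    def __init__(self, matrix):
        self.matrix = matrix

class SpinY(object):
    """Hypothetical spinner: rotates about the y axis at a constant rate."""
    def __init__(self, speed):
        self.speed = speed  # radians per second

    def __call__(self, time):
        angle = time * self.speed
        c, s = math.cos(angle), math.sin(angle)
        # Each row below is one *column* of the rotation matrix.
        return Orientation([
            c,   0.0, -s,  0.0,
            0.0, 1.0, 0.0, 0.0,
            s,   0.0, c,   0.0,
            0.0, 0.0, 0.0, 1.0,
        ])

spinner = SpinY(speed=math.pi / 2)
at_rest = spinner(0.0).matrix       # identity: no rotation yet
quarter_turn = spinner(1.0).matrix  # 90 degrees about the y axis
```

It would be driven from world.update() in the same way as a mover, for example item.orientation = item.spinner(self.time). Depending on the OpenGL binding in use, the list may need converting to a ctypes float array before it is passed to glMultMatrixf.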

With these all in place, we can now add many Shapes to the world, each moving and rotating independently:

Several independently positioned and oriented shapes

Next week: Composite Shapes…

This is all great as far as it goes, but it turns out we have a performance problem. Adding more than about 450 shapes at once starts to slow things down below 60fps. (This is all on my trusty 2005-era Thinkpad T60 laptop.) The bottleneck turns out to be in our Render.draw() loop. Each of those OpenGL functions is a wrapper around the OpenGL C library, and calling across the Python / C boundary like this incurs a per-function-call overhead. A second looming problem is that creating more interesting shapes is going to become more onerous and error-prone, as we create longer and more complex lists of vertices and faces in our code.

One partial solution to both these problems is to use composite shapes, in which we can compose many copies of our basic shapes into one single, more complex shape. This will allow us to use algorithmic means to produce more fun geometry, and will also help us draw more complex shapes, composed of many simpler shapes, without requiring several separate OpenGL function calls for each of the simple shapes.
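
As a rough illustration of the idea, and not necessarily how Part 2 goes about it, the sketch below merges several Shapes into one by offsetting each child’s vertices and renumbering its face indices. The function composite and its (shape, offset) argument format are assumptions made for this example.

```python
from collections import namedtuple
from itertools import islice

Vec3 = namedtuple('Vec3', 'x y z')
Color = namedtuple('Color', 'r g b a')

class Shape(object):
    def __init__(self, vertices, faces, face_colors):
        self.vertices = vertices
        self.faces = faces
        self.face_colors = face_colors

def composite(children):
    """Merge (shape, offset) pairs into one Shape (hypothetical helper)."""
    vertices, faces, face_colors = [], [], []
    for shape, offset in children:
        base = len(vertices)  # index of this child's first vertex
        vertices.extend(
            Vec3(v[0] + offset[0], v[1] + offset[1], v[2] + offset[2])
            for v in shape.vertices
        )
        faces.extend([base + index for index in face] for face in shape.faces)
        # islice copes with endless color iterators such as repeat(red)
        face_colors.extend(islice(shape.face_colors, len(shape.faces)))
    return Shape(vertices, faces, face_colors)

red = Color(255, 0, 0, 255)
tri = Shape(
    vertices=[Vec3(0, 0, 0), Vec3(1, 0, 0), Vec3(0, 1, 0)],
    faces=[[0, 1, 2]],
    face_colors=[red],
)
pair = composite([(tri, (0, 0, 0)), (tri, (5, 0, 0))])
```

The merged shape can then be converted by Glyph and drawn with a single glDrawElements call, rather than one call per child.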

On to Part 2 >>

Features in PHP trunk: Array dereferencing – Johannes Schlüter

I was writing about new features in the upcoming PHP version (5.4, 6.0?) before. Today’s topic reads like this in the NEWS file:

- Added array dereferencing support. (Felipe)

Now you might wonder what this typically short entry means. When doing OO-style PHP you might make use of a syntax feature which one might call “object dereferencing”, which looks like this:

<?php
class Foo {
    public function bar() { }
}

function func() {
    return new Foo();
}

func()->bar();
?>

So one can chain method calls or property access. Now for a long time people requested the same thing for array offset access. This was often rejected due to uncertainties about memory issues, as we don’t like memory leaks. But after proper evaluation Felipe committed the patch which allows you to do things like

<?php
function foo() {
    return array(1, 2, 3);
}
echo foo()[2]; // prints 3
?>

Of course this also works with closures:

<?php
$func = function() { return array('a', 'b', 'c'); };
echo $func()[0]; // prints a
?>

And even though the following example is stupid, I might accept this feature as one of the few places where it is ok to use references in PHP:

<?php
$data = array('me', 'myself', 'you');
function &get_data() {
    return $GLOBALS['data'];
}
get_data()[2] = 'I'; // $data will now contain 'me', 'myself' and 'I'
?>

Wonderful, isn’t it? If you want to test it please take a look at the recent snapshots for PHP trunk and send us your feedback! Please mind that all features in PHP trunk may or may not appear in the next major PHP release.

New Site is Live – Travis Swicegood

This may be premature, but it looks like I’m live with the new site design and new blog engine. The design is html5 (i.e., it looks great to me in Chrome, not sure what it’ll be like elsewhere) and the new engine is jekyll.

What does this mean for you, my loyal reader? Not much, really. I believe my port is transparent.

Actually, the only problem I’m seeing right now is related to disqus: some of my comments that I know are imported are not showing up yet. I just dumped nearly 2,000 comments into their system for this blog, so my guess is that it’s a caching issue and they’ll catch up. The comment count numbers are correct inside their admin interface, so I know the comments are in their system somewhere. :-)

If anything looks out of whack, please let me know.
