Andy Todd: Generating HTML versions of reStructuredText files
I wanted to quickly and easily convert a series of reStructuredText documents into HTML equivalents. For reasons too dull to discuss here I couldn’t just use rst2html.py and didn’t want to go to the trouble of remembering enough bash syntax to write a shell script.
So I thought that since docutils is written in Python it would only take a moment or two to knock up a script to do what I needed. Well, yes and no. The script itself is fairly simple:
    import glob

    from docutils import core

    def convert_files(name_pattern):
        # Convert every file matching the pattern, e.g. '*.rst'
        for file_name in glob.glob(name_pattern):
            source = open(file_name, 'r')
            # Swap the four-character '.rst' suffix for '.html'
            file_dest = file_name[:-4] + '.html'
            destination = open(file_dest, 'w')
            core.publish_file(source=source, destination=destination, writer_name='html')
            source.close()
            destination.close()
The most useful line is the one where I call core.publish_file. But it wasn’t immediately obvious from the docutils documentation what series of incantations would achieve my desired results. Luckily, after some time spent perusing the documents I came across this dissection of rst2html.py. This, in turn, led me to the description of the Docutils Publisher, which lists the convenience functions available to work with the engine.
The end result isn’t particularly elegant but it does get the job done and I thought I would share it in case anyone else has a similar need in the future.
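A slightly more defensive variant, as a sketch only: it derives the output name with os.path.splitext instead of slicing off the last four characters, and passes paths so that docutils opens and closes the files itself. The names html_name and convert_files_by_path are hypothetical helpers made up for this example; source_path and destination_path are keyword arguments accepted by core.publish_file.

```python
import glob
import os


def html_name(file_name):
    """Return file_name with its extension replaced by .html (hypothetical helper)."""
    root, _ext = os.path.splitext(file_name)
    return root + '.html'


def convert_files_by_path(name_pattern):
    """Convert every file matching name_pattern to HTML and return the files written."""
    # Imported here so the naming helper works even without docutils installed.
    from docutils import core

    written = []
    for file_name in sorted(glob.glob(name_pattern)):
        file_dest = html_name(file_name)
        # Passing paths lets docutils open and close the files itself.
        core.publish_file(source_path=file_name,
                          destination_path=file_dest,
                          writer_name='html')
        written.append(file_dest)
    return written
```

Calling convert_files_by_path('*.rst') would then write one .html file next to each matching source file.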
Konrad Delong: 5 things you can do with a Python list in one line
This is directly inspired by an excellent post by Drew Olson, 5 things you can do with a Ruby array in one line. When reading it, I couldn’t help but think of the Python versions (and how I like them more :>). So here it is:
- Summing elements.
  Ruby: puts my_array.inject(0){|sum,item| sum + item}
  Python: sum(my_list)
- Double every item.
  Ruby: my_array.map{|item| item*2 }
  Python: [2 * x for x in my_list]
- Finding all items that meet your criteria.
  Ruby: my_array.find_all{|item| item % 3 == 0 }
  Python: [x for x in my_list if x % 3 == 0]
- Combine techniques.
  Ruby: my_array.find_all{|item| item % 3 == 0 }.inject(0){|sum,item| sum + item }
  Python: sum(x for x in my_list if x % 3 == 0)
- Sorting.
  Ruby: my_array.sort
        my_array.sort_by{|item| item*-1}
  Python: sorted(my_list)
          sorted(my_list, reverse=True)
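To try the Python side in an interpreter, here are the five one-liners run against a sample list. The list contents are example values chosen for this sketch, not taken from either post.

```python
my_list = [3, 1, 4, 1, 5, 9, 2, 6]  # sample data, chosen for illustration

total = sum(my_list)                                  # 1. summing elements
doubled = [2 * x for x in my_list]                    # 2. double every item
threes = [x for x in my_list if x % 3 == 0]           # 3. items meeting a criterion
threes_total = sum(x for x in my_list if x % 3 == 0)  # 4. combined techniques
ascending = sorted(my_list)                           # 5. sorting
descending = sorted(my_list, reverse=True)

print(total)         # 31
print(doubled)       # [6, 2, 8, 2, 10, 18, 4, 12]
print(threes)        # [3, 9, 6]
print(threes_total)  # 18
print(ascending)     # [1, 1, 2, 3, 4, 5, 6, 9]
print(descending)    # [9, 6, 5, 4, 3, 2, 1, 1]
```

Note that sorted() returns a new list and leaves my_list untouched, much like Ruby's sort.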
Brian Harring: So which python version you want?
For those interested, a kvm instance of an lxc setup w/ python 2.4, 2.5, 2.6, 2.7 (snapshot), 3.1, 3.2 (snapshot), and 2.6-unladen-swallow is now propagating out through gentoo mirrors.
PyPy Development: PyPy in Google’s Summer of Code 2010
Good news everyone.
This year, thanks to Google’s generosity and PSF support, we got two and a
half students for PyPy’s summer of code. We didn’t cut any students, but one
of the projects is a joint project of PyPy and numpy. Hereby I present
descriptions, in my own words with my own opinions and in arbitrary order. For
more details please follow links to particular blogs.
Jason Creighton: 64bit JIT backend for PyPy
Intel 64bit (and I mean x86_64) compatibility for JIT has been one of the top
requested features (along with GIL removal). While GIL removal is not really an
easy task, having our JIT emit 64bit assembler is sort of easy, thanks to our
JIT backend abstraction. It will likely be faster, thanks to the abundance of
registers.
Bartosz Skowron: Fast ctypes for PyPy
A historically weak point of PyPy has been compatibility with extension modules. We
have progressed quite a bit in recent years, first introducing ctypes for
pypy then progressing towards CPython extension modules. However, ctypes is
well known to be slow (and it’s even slower on PyPy), and writing CPython
extension modules is ugly and will only ever work through a compatibility layer
that keeps it slow. What happens if we try to apply JIT technology to
ctypes? Maybe we can compile calls to C code from Python as direct calls in
compiled assembler? Why not?
This project will look at how the JIT technology can be employed to do some
sort of FFI. There is no guarantee we’ll get super-fast ctypes as a result,
but it’s good to see progress in that area.
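For readers who have not used ctypes, this is roughly what a call from Python into C looks like. It is only an illustration of the kind of call the project hopes to speed up, not PyPy code; it assumes a POSIX system where the C library can be located, and the choice of strlen is arbitrary.

```python
import ctypes
import ctypes.util

# Locate the C library. On POSIX, CDLL(None) falls back to the symbols
# already loaded into the current process if find_library finds nothing.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Arguments are converted at call time, on every call.
length = libc.strlen(b"pypy")
print(length)  # 4
```

The argument conversion happens at run time on each call, which is the sort of overhead a JIT could in principle compile away.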
Dan Roberts: Numpy in PyPy
This is a joint project of numpy and PyPy. The main objective is to bring
numpy to PyPy, possibly fast. The official mentor for this project is
Stefan van der Walt from the numpy community. During the initial meeting it was
agreed that probably the best way to go would be to support the original numpy
with CPython extension compatibility and then provide a minimal native numpy
framework for pypy. The former would retain full compatibility, while the
latter would have JIT integration, in line with our previous
numeric experiments. There would be an explicit interface for converting
one array to another for convenience.
Overall, I’m very happy to see so much support for PyPy from SoC. I hope all
three proposals will be successful!
Cheers,
fijal & pypy team.
Python 411 Podcast: Open Allure
This podcast is an interview and wide-ranging discussion with John Graves about his Open Allure open source project, a web-based educational tool written in Python with a fascinating user interface utilizing gestures, voice recognition, and text-to-speech. Open Allure is his PhD project in New Zealand and John is heading to Silicon Valley this summer as an intern at Ray Kurzweil’s Singularity University. He is a man of many interests and the discussion hits on many Pythonic topics, the philosophy of open source, key technological tipping points, and much more. I have been delinquent in producing podcasts lately but this is a good one. John has a lot to say on many topics covered in previous Python411 podcasts.
Tarek Ziade: Distutils2 vs Pip
Note: if you are not familiar with PEP 345, you might want to read it to understand this entry. It adds for instance “Requires-Dist” that is similar to setuptools’ install_requires and provides a standard for dependencies description.
The GSOC has started and we are already working on a lot of packaging tasks. The main difficulty is to make sure each student works without overlapping with others and never gets blocked. That’s why we will have weekly meetings with (almost) everyone. In parallel, the nice posse from the Montreal user group is organizing Distutils sprints quite often now. That means we now have significant manpower for Distutils, and things are starting to speed up.
There’s one controversial topic, though, that we need to straighten out: do we want to add an installer in Distutils2? And since Distutils2’s goal is to be back in the stdlib for Python 3.2, that means: do we want to add an installer in the stdlib?
My answer so far is Yes. And that’s what I’ll be working on unless someone is able to change my mind.
What is Distutils2 ?
Let me first explain what the Distutils2 project is, and what we want it to provide. Like its predecessor, Distutils2 wants to provide two things:
- a toolbox for third-party packaging tools, whether they are simple installers or full-featured package managers (PyPM, Pip, Enthought Installer, etc.). This toolbox will include (if it does not already) reference implementations of PEP 345, PEP 376 and PEP 386. In other words, if you want to create the next killer packaging system, you can use modules like distutils2.version (PEP 386) or distutils2.metadata (PEP 345) to build it, without depending on the “everything is a command” philosophy of Distutils.
- a standalone tool that can be used to install or remove distributions. That’s what Distutils is and that’s what we want to provide in the future in Distutils2: the ability to install projects (and therefore their dependencies, since this is a new metadata field we added in PEP 345).
The controversy is about the second point. It’s controversial to provide a script that installs dependencies via PyPI in Distutils2 because some projects like Pip already provide this feature.
Our current packaging ecosystem explained
A few years ago, before Setuptools added the ability to install dependencies via easy_install, installing a distribution of a given project was as simple as running python setup.py install. This installed the distribution on the target system, in the proper locations defined by the install command. That’s it.
Setuptools grew organically on the top of Distutils to provide new metadata like the “install_requires” field, that lists dependencies. Setuptools provided two things:
- A new install command that triggers the installation of dependencies by reading the setuptools-specific “install_requires” metadata, fetching the dependencies from PyPI and installing them recursively.
- An easy_install script that can be used to install a distribution located at PyPI. That’s just a bootstrap on top of the new install command. In other words, it grabs the archive at PyPI, unpacks it, and runs “python setup.py install” on it.
In other words, your Python project setup.py is the installer itself because when you use setuptools, it calls its specific install command and triggers the installation chain.
That’s when the mess started: people who didn’t have setuptools installed couldn’t install projects that were using it, of course. So the solution provided was an ez_setup.py script that you have to include in your project and run when setup.py is used, to be able to run your installation. In other words, your setup.py is bootstrapping the installation and use of setuptools. And that turned out to be really messy, since Setuptools has its own way of installing things. I hope I don’t sound harsh here: Setuptools is the best thing that happened to packaging in years, and a lot of our current work is to bring its features back into the “main stream”.
The result is that you, as an end user, do not control which installer is going to be used, and you end up with a site-packages that has projects installed in different ways by different installers.
I am strongly against this behavior because of the mess it creates. In my opinion a python source distribution should not embed an installer and force its usage like this. We need to separate concerns: a python source project should be a dumb container with the code, and with some metadata.
Then Pip showed up.
Pip is an installer script that grabs the project you want to install and runs “python setup.py install” on it. That’s all it does when the project is a plain Distutils one. When it encounters a Setuptools project, it blocks the installation of the project’s dependencies that I described earlier, and installs it like a simple Distutils project. Then it analyzes its dependencies and installs each one of them separately.
That’s really the way to go because it breaks what setuptools is enforcing: projects are not installing other projects in the process anymore. And in the long term, it will allow us to get rid of setup.py (but that’s another blog post). And I hope Pip will soon be able to install Distutils2 projects, because Distutils2 provides unified metadata (distutils+setuptools -> PEP 345).
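The idea that the installer, rather than each project’s setup.py, walks the dependency metadata can be sketched in a few lines. This is not how Pip is implemented; the INDEX dictionary is a made-up stand-in for metadata fetched from PyPI, the project names are invented, and version specifiers are ignored.

```python
# Hypothetical metadata: project name -> list of Requires-Dist names.
# Version specifiers are left out to keep the sketch short.
INDEX = {
    "webapp": ["framework", "templating"],
    "framework": ["templating", "utils"],
    "templating": ["utils"],
    "utils": [],
}


def install_order(project, index):
    """Return projects in an order where dependencies come before dependents."""
    order = []
    visiting = set()

    def visit(name):
        if name in order:
            return  # already scheduled
        if name in visiting:
            raise ValueError("dependency cycle at %r" % name)
        visiting.add(name)
        for dep in index[name]:
            visit(dep)
        visiting.discard(name)
        order.append(name)

    visit(project)
    return order


print(install_order("webapp", INDEX))
# ['utils', 'templating', 'framework', 'webapp']
```

Each project then gets installed exactly once, by one installer, and no project installs another as a side effect.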
Distutils2 vs Pip
So as I said before: it’s controversial to provide a script that installs dependencies via PyPI into distutils2 because Pip already provides this feature.
But one Distutils2 goal (like Distutils) is to provide a command to install a distribution on your system so it works. And the concept of “Distribution” has evolved, thanks to PEP 345. This means that it needs to install dependencies now, exactly the way Pip does.
We could just tell people to install Pip on top of the stdlib. But the goal is to provide in the stdlib a working packaging environment that provides a minimum set of features. The goal is to have something that works when you install Python 3.2, like what was provided when distutils was brought in (i.e. batteries included).
Mac OS X includes easy_install, so I don’t see any good reason not to include a package installer in the Python stdlib itself. At least we will have control over which script gets installed by default with Python.
That’s why I have proposed to include Pip in Distutils2, but Ian and Carl seem a bit reluctant for various reasons. One of them is that having Pip included in the stdlib would slow down their work. I don’t think this is true as long as it’s included carefully. If Distutils2 allows its installer to be replaced through configuration by another one, then Pip can have new releases independently from the version included in the stdlib, and people can upgrade their system without having to wait for the next Python release.
In any case, we are working on the various bits that make up an installer in Distutils2 during GSOC, since one of the goals of the project, as I said earlier, is to provide a toolbox. So if the merge does not occur, it’s likely that we will start an installer/uninstaller script in Distutils2, and it will look a lot like Pip, I guess.
EDIT: to make things clearer, when I am saying that both projects should merge, I am only referring to the raw “install with dependencies” features in Pip, and not all the other features.
Links for 2010-05-31
Amazon.com: Western Digital WD TV Live Network-ready HD Media Player
: $99, 10-watt, fanless device to stream HD1080p video, in pretty much any format, from a network server to your TV. crazy. quite competitive with the Acer Aspire Revo; downside: l…
URL Sentences – Chris Shiflett
Two and a half years ago, I was helping Jon Tan redesign a web site. We share an affinity for organization and structure, but we also like to experiment with new ideas.
One morning via Skype, I shared a crazy idea that I wasn’t entirely sure of yet, trusting Jon to tell me if it was a bad idea.
What if we make every URL a sentence?
Before he could respond, I pasted in some examples I had been playing with to help clarify what I meant:
- /is (About)
  - /is/here (Contact)
  - /is/hiring
  - /is/chris-shiflett
- /does (Work)
  - /does/web-design
- /helps (Clients)
  - /helps/digg
- /thinks (Planet)
  - /thinks/about (Tags)
  - /thinks/about/oscon
- /remembers (Timeline)
  - /remembers/2008 (Archive)
- /writes (Books)
  - /writes/essential-php-security
  - /writes/http-developers-handbook
- /has (Site Map / Search)
  - /has?php
  - /has/colophon
  - /has/accessibility
- /shares (Feeds)
  - /shares/news
  - /shares/planet
  - /shares/everything
- /presents (Talks)
These URLs still adhere to a basic — albeit shallow — hierarchy to help keep things organized, but instead of the usual about, work, and clients, I used verbs like is, does, and helps. I was pleasantly surprised to hear Jon liked the idea. He noted some limitations, like the challenge of avoiding awkward wording when the hierarchy was deep, but he thought it was worth trying to map out the entire site to see if we could make it work.
Because the site was fairly small, it turned out well. As I noted previously, this approach isn’t appropriate for all sites, but it can give URLs a voice of their own. (I don’t use URL sentences on shiflett.org.) It can also help you organize your pages. For example, if a page can’t fit neatly into a sentence that starts with example.org is…, then it probably doesn’t belong in the about section of the example.org site.
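The “does it fit in a sentence?” test is easy to automate as a sanity check while mapping out a site. The sketch below is written in Python purely for illustration; the function names are invented for this example and nothing here comes from the site described in the post.

```python
def url_to_sentence(domain, path):
    """Turn a verb-first path such as /is/hiring into a readable sentence."""
    words = []
    for segment in path.strip("/").split("/"):
        if segment:
            # Hyphens separate words inside a single path segment.
            words.extend(segment.split("-"))
    return " ".join([domain] + words)


def belongs_in(section_verb, path):
    """Check that a page path sits under the given top-level verb."""
    segments = path.strip("/").split("/")
    return segments[0] == section_verb


print(url_to_sentence("example.org", "/is/hiring"))     # example.org is hiring
print(url_to_sentence("example.org", "/helps/digg"))    # example.org helps digg
print(belongs_in("is", "/does/web-design"))             # False
```

Reading the generated sentences aloud is a quick way to spot pages that sit awkwardly in the hierarchy.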
There are other ways to make sentences with URLs, especially if your domain name can be used as a verb. And you don’t mind.
Using verbs (present tense) as the top-level hierarchy is just one example.
There have been other uses of URL sentences over the years:
- Jon collaborated with Jon Gibbins on a really neat site for Denna Jones that uses URL sentences and other interesting innovations. (Pages like the colophon do not, but the primary ones do.)
- Clearleft use URL sentences in their latest redesign. Paul Lloyd discusses this and more in a related post about URLs.
- Ann McMeekin cleverly uses URL sentences to indicate categories for her blog. Some posts she considers; others she shares. A full list of categories is available in the sidebar.
- Cameron Koczon used URL sentences when redesigning Jessica Hische’s site. She chose verbs like typographizes and designifies to add a bit of her personality to the mix.
- Martin Geber used past tense verbs as his top-level hierarchy, creating URLs that align with the idea that his site is a personal archive of thoughts, memories, and the like. He writes more about the inspiration for the site. Thanks for the nod, Martin!
Truncated by Planet PHP, read more at the original (another 1636 bytes)
MongoDB: A first look – Travis Swicegood
The entire subject of two talks and mentioned in several others, MongoDB was
definitely generating buzz at TekX this year. It’s long been in favor in the tech
community in Lawrence and has been used for some data crunching for a few
projects at the local paper. Even with all of this exposure, I’ve yet to sit
down and actually explore it.
That changed Friday afternoon while I sat at O’Hare waiting on my flight back
to Lawrence (which subsequently got canceled). I installed Mongo earlier in
the week and opened up a bunch of tabs on the various intros and tutorials
available on the Mongo wiki. The rest of this article is a mix of
stream of consciousness as I played around with Mongo for the first time and some
of my reflections from this past week.
Note on typefaces
I use both Mongo and mongo throughout this article. The first, the
title-case Mongo refers to the software as a whole. Whenever you see mongo
with a lowercase and in monospace, it’s referring to the Mongo client program
you run from the command line.
Installation
On a Mac, it’s a breeze. I use Homebrew to manage software on my Mac, so a
quick brew install mongodb was all I needed and a minute later I was ready to
go.
Starting Up the Server
Mongo is run by the mongod process. I don’t know if it’s pronounced
mongo-d or mon-god though. It’s a fun play on words if the latter is the
case.
Brew includes a basic configuration to get up and running, so I use that inside
a screen instance so I can leave it running in the background while I use the
mongo tool to interact with it.
Interacting with Mongo
I started out with the basic tutorial to get going. It looks like that
needs some love though. It shows the version in the startup as 0.9.8.
Homebrew ships with 1.4.2 and I did find a few things that were out of date.
No, I haven’t been a good open source community member and submitted fixes
yet.
The first thing that’s different from a traditional RDBMS with Mongo is that
you don’t have to explicitly create a database. Pretty straightforward: from
within mongo, type use <database>. This creates a brand new database for
you and you’re off. For the examples below, I’m using use mydb to select
mydb as my database.
It’s kind of nice to just be able to connect and go, but it feels odd. Not
good or bad, just odd. Sort of like the first time you run git checkout
inside a repository to switch branches when you’re used to Subversion.
The shell feels like a JavaScript console. I don’t have access to the source
code in my off-line mode, so I can’t confirm whether it actually is one. The syntax seems
remarkably similar, so it’s at least JavaScript inspired.
Adding Records
Mongo stores documents, not rows of columns. This distinction allows Mongo to
ignore schema—continuing the theme of leaving it up to the developer.
Those documents can be made up of any number of key-value pairs that look remarkably like
JSON. Need to store a new data point? Just add it as a field to a document
and you’re set.
Here’s an example inspired by Mongo’s tutorial for adding a few records:
> person = {name: "Travis Swicegood"}
> city = {city: "Lawrence", state: "KS"}
> db.things.save(person)
> db.things.save(city)
Here I created two new objects with various data attached to them, then saved
them both inside the things collection. A collection in Mongo is like a table
in the SQL world. You don’t have to create a collection; you just reference
it on the db object, and you’re set.
Comparing this to the equivalent code for a relational database, I’ve got to say I love this. No
boilerplate code to get going. I didn’t have to create a database or any
tables. I just started using them. This appeals to my
laziness—err, I mean desire for efficiency, but also looks very promising
to teach someone new. Every abstract idea you can remove is one less potential
stumbling block for someone starting out.
Back to the data I entered. Notice that the two documents don’t share any fields.
Collections inside Mongo are made up of a series of keys and values; they
can be whatever you want them to be. This is perfect for lazy migrations:
migrating the data as it’s requested instead of doing it all at once. ming,
a Python wrapper around Mongo
Truncated by Planet PHP, read more at the original (another 8258 bytes)
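The lazy-migration idea mentioned above can be sketched without a database at all. In this Python sketch plain dicts stand in for Mongo documents and a dict keyed by id stands in for a collection; the field names, the schema field, and the helper functions are all made up for illustration and are not part of Mongo or ming.

```python
# Plain dicts stand in for Mongo documents; a dict keyed by id stands in
# for a collection. Field names and version numbers are invented here.
things = {
    1: {"name": "Travis Swicegood"},                       # old shape, no version
    2: {"schema": 2, "first": "Ada", "last": "Lovelace"},  # already current
}


def migrate(doc):
    """Upgrade one document to the current shape and return it."""
    if doc.get("schema", 1) < 2:
        # Version 2 splits the single name field into first and last.
        first, _, last = doc.pop("name").partition(" ")
        doc.update({"schema": 2, "first": first, "last": last})
    return doc


def fetch(collection, doc_id):
    """Read a document, migrating and storing it back only when requested."""
    doc = migrate(collection[doc_id])
    collection[doc_id] = doc
    return doc


print(fetch(things, 1))
# {'schema': 2, 'first': 'Travis', 'last': 'Swicegood'}
```

Documents nobody asks for stay in their old shape, which is only workable because the store does not enforce a schema.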
Patrick Stinson: Mac OS X: No timed semaphore waits between processes
There are a few very clunky things that the average developer might run into when trying to use IPC primitives on OS X.
For one, there are gaping holes in the documentation: some of the functions aren’t documented at all. Even a Google search won’t turn anything up.
Second, it’s extremely hard to figure out how to do a timed wait on a semaphore shared between processes. There is no timed wait implementation for the System V semaphores created using semget(), and while the native mach semaphores do include a timed wait implementation, it’s too hard to figure out how to share one between processes.
What’s the deal Apple? Why am I forced to read off-topic documentation in detail just to get a timed wait between processes? When I realized I was reading and re-reading about bootstrap contexts and ports in the Kernel Programming Guide, I knew I’d gone too far.
Backing up, all I’m trying to do is signal my daemon when a message is ready, and have the daemon signal my parent process when the request is complete. Considering the response time will always be very small, I’d like to have a timeout on both sides to detect when either process has crashed.
I’ve tried installing a SIGALRM handler which works, but that’s process-wide and extremely clunky when all I want is a timed wait.
Simple enough? Apparently not…
What’s the deal Apple?
=================
30 minutes pass…
=====================
Sometimes all it takes is writing about a problem to help you solve it. Here’s what I found, after reading all the mach documentation and the Jack source code (Thank you, once again, Paul):
It is possible to register a native unnamed mach semaphore (created with semaphore_create()) with a name that another process can use to attach to the same semaphore and do a timed wait (using semaphore_timedwait()). What you have to do is acquire the bootstrap context of the current process and register the semaphore with a name there so that another process that you start can see it. A bootstrap context is like a scope or namespace, and the context in question is the login context, which means that all processes that your user starts use that namespace.
I created some example code that shows how to create a semaphore and do a timed wait.
/** parent.cpp: Create and register a named semaphore, and wait for child.cpp
    to attach to and signal it, allowing this process to terminate. */
#include <mach/mach.h>
#include <mach/semaphore.h>
#include <servers/bootstrap.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

void sig(int){}

int main()
{
  semaphore_t sem;
  mach_port_t task = mach_task_self();
  mach_port_t boot_port;
  kern_return_t err;

  // Look up the bootstrap (login) context of this process.
  err = task_get_bootstrap_port(task, &boot_port);
  if(err != KERN_SUCCESS)
  {
    printf("BOOTSTRAP: %s\n", mach_error_string(err));
    exit(1);
  }

  // Create the unnamed mach semaphore with an initial count of 0.
  err = semaphore_create(task, &sem, SYNC_POLICY_FIFO, 0);
  if(err != KERN_SUCCESS)
  {
    printf("semaphore_create: %s\n", mach_error_string(err));
    exit(1);
  }
  printf("Created semaphore\n");

  // Publish the semaphore under a name other processes can look up.
  err = bootstrap_register(boot_port, "pksem", sem);
  if(err != KERN_SUCCESS)
  {
    // printf("bootstrap_register: %s\n", mach_error_string(err));
    switch(err)
    {
      case BOOTSTRAP_SUCCESS :
        /* service not currently registered, "a good thing" (tm) */
        break;
      case BOOTSTRAP_NOT_PRIVILEGED :
        /* already exists */
        printf("bootstrap_register(): bootstrap not privileged\n");
        break;
      case BOOTSTRAP_SERVICE_ACTIVE :
        printf("bootstrap_register(): bootstrap service active\n");
        break;
      default :
        printf("bootstrap_register() err = %s\n", mach_error_string(err));
        break;
    }
  }

  printf("semaphore_wait()\n");
  // semaphore_wait(sem);

  printf("semaphore_timedwait()\n");
  const int ms = 1750000;  // timeout in milliseconds
  mach_timespec_t ts;
  ts.tv_sec = ms / 1000;
  ts.tv_nsec = (ms % 1000) * 1000000;
  bool wait = true;
  while(wait)
  {
    err = semaphore_timedwait(sem, ts);
    switch(err)
    {
      case KERN_SUCCESS:
        printf("signaled\n");
        wait = false;
        break;
      case KERN_OPERATION_TIMED_OUT:
        printf("timed out\n");
        wait = false;
        break;
      case KERN_ABORTED:
        printf("caught signal, trying again\n");
        break;
      default:
        // Unexpected error: stop instead of retrying forever.
        printf("default: %s\n", mach_error_string(err));
        wait = false;
        break;
    };
  }
}

/** child.cpp: Attach to the semaphore by name and release it. */
#include <mach/mach.h>
#include <mach/semaphore.h>
#include <servers/bootstrap.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
  semaphore_t sem;
  kern_return_t err;
  mach_port_t boot_port;

  err = task_get_bootstrap_port(mach_task_self(), &boot_port);
  if(err != KERN_SUCCESS)
  {
    printf("task_get_bootstrap_port(): %s\n", mach_error_string(err));
    exit(1);
  }

  // Find the semaphore the parent registered under "pksem".
  err = bootstrap_look_up(boot_port, "pksem", &sem);
  if(err != KERN_SUCCESS)
  {
    printf("bootstrap_look_up(): %s\n", mach_error_string(err));
    exit(1);
  }

  semaphore_signal(sem);
  printf("success\n");
}
Unfortunately I can’t find any documentation for the semaphore functions along with mach_task_self(), task_get_bootstrap_port(), bootstrap_register(), bootstrap_look_up(). In fact, bootstrap_register() is deprecated! Unbelievable.
But, as far as I know, using these native unnamed mach semaphores is faster than the System V semaphores created with semget() and managed via semctl(). The native mach semaphores also go away when you kill the process that created them. That means I can get rid of all of my code to manage and clean up orphaned semaphores based on key files on the disk. What a waste of time that was…
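For comparison, the request/reply handshake with a timeout on both sides looks like this in Python’s multiprocessing module, whose Semaphore objects can be shared with child processes and accept a timeout on acquire(). This is a sketch of the pattern only, not a port of the mach code above: the function names are made up, and a thread stands in for the daemon process purely to keep the example self-contained.

```python
import multiprocessing
import threading

# multiprocessing.Semaphore objects can be handed to child processes;
# a thread plays the daemon here only to keep the sketch self-contained.
request_ready = multiprocessing.Semaphore(0)
request_done = multiprocessing.Semaphore(0)


def daemon_loop(timeout):
    """Wait for one request, then signal completion. Returns False on timeout."""
    if not request_ready.acquire(timeout=timeout):
        return False  # the parent never signalled; it may have crashed
    request_done.release()
    return True


def send_request(timeout):
    """Signal the daemon and wait for its reply. Returns False on timeout."""
    request_ready.release()
    return request_done.acquire(timeout=timeout)


worker = threading.Thread(target=daemon_loop, args=(2.0,))
worker.start()
replied = send_request(timeout=2.0)
worker.join()

# With nobody on the other side, the timed wait gives up instead of hanging.
timed_out = not request_done.acquire(timeout=0.1)
print(replied, timed_out)  # True True
```

In a real two-process setup the same two semaphores would be passed to a multiprocessing.Process as arguments instead of being shared with a thread.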
