Posts Tagged development

Jinja2 Bytecode Cache

I have been a bit busy and thus away from TurboGears development for a while, but I see the 2.2 release shaping up really well; props to Michael and Alessandro for their work.

Now, looking at the part I’m most familiar with (the Jinja2 support code), I have been thinking for a while about the fact that Jinja2 does not have bytecode caching enabled by default. In fact, there is no support for it at all in the tg2 code. Genshi does have caching available, but Jinja2 is lacking in this area, even though enabling the cache is actually very simple.

I ran a few tests with the latest TurboGears 2.2rc2 with Jinja2 bytecode caching enabled, using both the included filesystem-based cache and a custom (simple) in-memory cache. As the benchmark graph shows, the gains from enabling the cache are a bit small, around 4 more requests per second. Still, if you are looking to squeeze performance out of tg2 this could be a good option, as it is practically free. The cache is not really intrusive: if I change a template it is reloaded automatically, so it doesn’t affect the normal development workflow.

Jinja2 Bytecode Benchmark: Cached vs Uncached

I should, however, run some additional tests under gunicorn or uwsgi to see whether enabling the bytecode cache brings more substantial benefits there.
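
For reference, here is a minimal sketch of what a simple in-memory bytecode cache for Jinja2 can look like. It is not the exact code used in the benchmark: the class name `MemoryBytecodeCache` and the `hello.html` template are made up for illustration, and how the environment gets wired into tg2 is left out. It only relies on Jinja2's `BytecodeCache` base class and the `bytecode_cache` argument of `Environment`.

```python
from jinja2 import BytecodeCache, DictLoader, Environment


class MemoryBytecodeCache(BytecodeCache):
    """Hypothetical cache that keeps compiled bytecode in a per-process dict."""

    def __init__(self):
        self._store = {}

    def load_bytecode(self, bucket):
        # Called before compiling: restore the bytecode if we already have it.
        data = self._store.get(bucket.key)
        if data is not None:
            bucket.bytecode_from_string(data)

    def dump_bytecode(self, bucket):
        # Called after compiling: remember the bytecode for the next load.
        self._store[bucket.key] = bucket.bytecode_to_string()


cache = MemoryBytecodeCache()
env = Environment(
    loader=DictLoader({"hello.html": "Hello {{ name }}!"}),
    bytecode_cache=cache,
)
rendered = env.get_template("hello.html").render(name="TurboGears")
print(rendered)
```

The filesystem variant that ships with Jinja2 plugs in the same way, by passing `jinja2.FileSystemBytecodeCache(directory)` as `bytecode_cache` instead.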

Programming in C

Currently I don’t program in C as much as I used to, mainly because I try to choose the right tool for the job, and because when I do program in C I end up spending much of the time fine-tuning the code (to me it is like a craft). I don’t like optimizations that hurt the readability of the code much, or other “underhanded tricks”, but I do care about memory usage / performance trade-offs, memory leaks, etc.

Last weekend, while configuring a mail server, I remembered a small pet project I had a while ago to build a mail server. It was written in C++ and had a few interesting bits of code here and there; I still remember benchmarking it and getting around 20 to 48 messages per second.

In the end I decided to run an experiment and build a simple SMTP server in pure C, using Redis to manage a queue between it and a mail drop server, which parses the email and drops it in the correct Maildir. I did some benchmarking and got up to 54 messages per second, not bad given that I skipped some optimizations, such as using a modified tree structure to parse SMTP commands.

A further optimization was cutting down the number of write calls. For clarity’s sake the SMTP server wrote each reply in 2 calls, one sending the status and another the line terminator (CRLF). Moving both into a single call made an immense difference: the benchmark is now around 1100 messages per second (using 10 threads and a 1 KB message payload). Or rather, having 2 write calls slowed the whole process considerably: it meant 2 syscalls and 2 round trips, and potentially slowed the client’s parser too.
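
To make the difference concrete, here is a small sketch of the two reply styles. It is written in Python rather than C to keep it short, and it is not the server's code: the function names and the `socketpair` used as a stand-in for a real client connection are assumptions for illustration only.

```python
import socket


def reply_two_writes(sock, status):
    """Original approach: status text and CRLF go out as two separate writes."""
    sock.sendall(status.encode("ascii"))
    sock.sendall(b"\r\n")


def reply_single_write(sock, status):
    """Optimized approach: build the whole reply first, then write it once."""
    sock.sendall(status.encode("ascii") + b"\r\n")


def read_line(sock):
    """Read from the socket until a full CRLF-terminated line has arrived."""
    data = b""
    while not data.endswith(b"\r\n"):
        chunk = sock.recv(1024)
        if not chunk:
            break
        data += chunk
    return data


# A connected socket pair stands in for the server/client connection.
server, client = socket.socketpair()
try:
    reply_two_writes(server, "220 ready")
    first = read_line(client)
    reply_single_write(server, "250 OK")
    second = read_line(client)
finally:
    server.close()
    client.close()

print(first, second)
```

Both functions put the same bytes on the wire; the single-write version just hands them to the kernel in one syscall, so the client is never left waiting on a reply that is missing only its terminator.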

I could probably get even further improvements by implementing the SMTP PIPELINING extension. The maildrop server, on a single thread, could keep up with the SMTP server, so I could probably squeeze a bit more performance out of the whole mail pipeline.

In the end the experiment was meant to produce very lean, basic (but functional) and high-performing servers, while having fun! I was very careful with memory management and the code does not seem to leak any, although valgrind still identifies some small leaks related to the hiredis client and pthreads. The memory used by the SMTP server with 10 threads stays at a maximum of 760 KB as measured by RSS (ps -a -o comm,rss); the maildrop server uses only 504 KB.

Static site from template files

Recently, while working on a full-blown TurboGears-based CMS for our main site, I got the idea of writing a script to generate a simple static site from a series of Jinja2 templates. Simple enough, huh? At first I looked for a few alternatives, but I quickly gave up and ended up working on something of my own: I deemed the task mundane and simple, so writing a script seemed faster than finding one and figuring out how it works.

Long story short, it wasn’t really that simple. There are many things to consider, such as paths: if you want to preview the output locally, relative paths will not work very well. I added a switch to my script to transform paths and a function to use in the templates. Soon I found myself adding more things like this, such as a variable holding the name of the template, so that a menu template can check which page is the current one and act accordingly.

So far it is not that complex, but it is a pretty handy script that you could easily modify on your own. I posted the code on GitHub:
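
As a rough illustration of the kind of helpers described above (not the actual code from the script), a path function and a current-page variable could be built like this. The names `relative_url`, `page_context`, `current_page` and `url` are hypothetical; in a real setup the returned dict would be passed to the template's `render()` call.

```python
import posixpath


def relative_url(current_page, target):
    """Return the path to target as seen from the directory of current_page."""
    start = posixpath.dirname(current_page) or "."
    return posixpath.relpath(target, start)


def page_context(template_name):
    """Build the per-page variables a template (e.g. a menu) could use."""
    return {
        # Lets a menu template compare each entry against the page being built.
        "current_page": template_name,
        # Lets templates write url('css/site.css') regardless of nesting depth.
        "url": lambda target: relative_url(template_name, target),
    }


ctx = page_context("docs/install.html")
nested_link = ctx["url"]("css/site.css")
top_link = page_context("index.html")["url"]("css/site.css")
print(nested_link, top_link)
```

Because the links are computed relative to each output page, the generated files can be opened straight from disk for a preview, whatever directory they end up in.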

https://github.com/clsdaniel/templatesite

As a next step I would like to add a daemon or monitor mode that rebuilds the output when it detects a change in the files. Support for managing other files is also needed, opening the browser to preview after a build would be nice, etc.

Python for quick data processing

Lately I had a client ask me to work on a tool to check an email account, look for certain XML files (which could be inside a zip file), extract certain info and push it to the database.

Usually this client has all its software on the .NET Framework, whose standard library does not include a POP3 client; you have to go for a 3rd-party one. While looking around for options I thought to myself “If I did this in Python I would have easily done all that by now”, so I gave it a shot: poplib to access the mailbox, the email module to parse messages and extract files, ZipFile for compressed files and lxml.etree for XML processing. Most of these modules are in the Python Standard Library and are well tested. In a few minutes I had the script done; what was left was database access, GUI and reporting.

For the database they use a combination of Advantage DB and MS SQL Server. The table I needed to access was on the MS SQL Server, and fortunately SQLAlchemy works with it quite well. For GUI elements I used PySide, which is working quite well right now on Windows.

The only thing missing now is reporting. I would usually go for ReportLab + tRML and PDF output, but this being a desktop application it needed to be a bit different. I looked at options for Qt (and PySide): NCReport looked nice but is a bit pricey and has no Python bindings. Most reporting tools are more or less standalone, you cannot plug them into your application and feed them custom data, so applications such as JasperReports or Eclipse BIRT are out the window.

Unfortunately there are no high-level interfaces or toolkits for reporting under Qt + Python, which strikes me as a downside. Having to work your way through by hand is a no-no at this level: it just wastes time, and the results are usually not very reusable or flexible. I tried QtWebKit, which would be almost perfect if it could do multi-page printing easily, but no. I may end up just building a simple report in VS.NET and launching a separate app for reporting.

Guess it’s time to roll our own Qt-based reporting engine, which could be plausible. The QGraphics classes seem to be fit for this kind of job, but there are many missing pieces that would need to be filled in. A layout engine is sort of there, but I don’t see multi-page rendering support; you can render Qt widgets, but the result really looks out of place, kind of like printing a screenshot for a report.

In conclusion, I guess I could have done all that in C#, but I find Python more convenient (as I’m way more proficient in it than in C# + .NET). I guess a combination of the two will do for now.
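
To give an idea of how little code this takes, here is a sketch of the mail-to-XML part, not the client's actual script. It swaps lxml.etree for the standard library's xml.etree.ElementTree so it runs without extra packages, and everything specific in it is made up for illustration: the function names, the host and credential parameters, and the `invoice.xml` sample with its `total` element.

```python
import io
import poplib
import xml.etree.ElementTree as ET
import zipfile
from email import message_from_bytes
from email.message import EmailMessage


def fetch_raw_messages(host, user, password):
    """Yield each message in a POP3 mailbox as raw bytes (hypothetical account)."""
    conn = poplib.POP3_SSL(host)
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()
        for number in range(1, count + 1):
            _response, lines, _octets = conn.retr(number)
            yield b"\r\n".join(lines)
    finally:
        conn.quit()


def iter_xml_documents(raw_message):
    """Yield (filename, root element) for XML attached directly or inside a zip."""
    message = message_from_bytes(raw_message)
    for part in message.walk():
        name = part.get_filename()
        if not name:
            continue  # not an attachment
        payload = part.get_payload(decode=True)
        if name.lower().endswith(".xml"):
            yield name, ET.fromstring(payload)
        elif name.lower().endswith(".zip"):
            with zipfile.ZipFile(io.BytesIO(payload)) as archive:
                for member in archive.namelist():
                    if member.lower().endswith(".xml"):
                        yield member, ET.fromstring(archive.read(member))


# Build a sample message in memory so the parsing can be tried without a mailbox.
zipped = io.BytesIO()
with zipfile.ZipFile(zipped, "w") as archive:
    archive.writestr("invoice.xml", "<invoice><total>42.50</total></invoice>")

sample = EmailMessage()
sample["Subject"] = "Invoice"
sample.set_content("See attachment.")
sample.add_attachment(zipped.getvalue(), maintype="application",
                      subtype="zip", filename="invoice.zip")

found = [(name, root.findtext("total"))
         for name, root in iter_xml_documents(sample.as_bytes())]
print(found)
```

Against a real mailbox, the output of `fetch_raw_messages()` would be fed into `iter_xml_documents()` one message at a time, and the extracted values pushed to the database from there.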
