Django, Memcached and the EZTV twitter feed

Back in the day I wrote an HTML page that used JavaScript, AJAX and JSONP to parse the eztv-it Twitter timeline into something useful (it can be found here: http://eztv-mirror.appspot.com/). Unfortunately it does not work anymore.

As a fun project I considered implementing a replacement in Django, using the Django caching framework and some “nice to know” libraries (e.g. requests).

The main problem with the old version is that it does not receive a response from the Twitter REST API. When you hit the URL directly, however, you do get a response. I suspect the problem has to do with the callback used to wrap the content on the server side not working correctly (not wrapping the JSON response in the callback function or similar).

Using the Python requests library I was able to get the data I needed. The data can be found at the following URL:
https://api.twitter.com/1/statuses/user_timeline.json?screen_name=eztv_it

Since this project is about familiarising myself with a lot of the commonly used “nice to know” libraries, here is the list (from requirements.txt):

  • Django==1.5.1 – Obviously :)
  • python-dateutil==2.1 – dateutil has a nice parser method that attempts to parse the date using common formats; this way I most likely won’t notice if Twitter changes their datetime format.
  • pytz==2013b – time zone definitions; Django uses it internally for time zone support (if available).
  • requests==1.2.0 – used to get the data from Twitter, it’s a wrapper around the urllib/urllib2/httplib mess.
  • python-memcached==1.51 – One of the two popular memcached bindings for Python.
  • django-memcached==0.1.2 – Django bindings to the python memcached module (above).

Most of the above is for convenience: it reduces the number of lines required, which makes the code easier to read and reduces the chance of bugs.
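To illustrate what the date parsing has to cope with, here is a minimal sketch that uses only the standard library instead of dateutil. It assumes the created_at format the Twitter API returned at the time (something like "Wed Aug 27 13:08:45 +0000 2008"); the helper name parse_created_at is made up for this example.

```python
from datetime import datetime, timezone

# Assumed format of the "created_at" field,
# e.g. "Wed Aug 27 13:08:45 +0000 2008".
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a created_at string into a timezone-aware datetime in UTC."""
    return datetime.strptime(value, TWITTER_FORMAT).astimezone(timezone.utc)


print(parse_created_at("Wed Aug 27 13:08:45 +0000 2008").isoformat())
# prints 2008-08-27T13:08:45+00:00
```

A fixed format string like this breaks as soon as the format changes, which is exactly the situation the dateutil parser is meant to smooth over.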

In Django you define the cache in the settings module using something similar to the following (the example is a local memcached, over TCP):

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

The backend can be switched to something else that implements the Django cache interface without much fuss. You can also use multiple caching backends simultaneously.

The Django caching framework is extremely easy to use. Here’s my take on it (from app/eztv.py):

from django.core.cache import cache
...
# Key of the cache.
CACHE_ENTRY_NAME = 'eztv_cache'
# Timeout of the cache (in seconds)
CACHE_ENTRY_TIMEOUT = 600
...
def update_cache():
    ...
    data = list(yield_data())

    cache.set(CACHE_ENTRY_NAME, data, CACHE_ENTRY_TIMEOUT)
    return data

def get_cache():
    ...
    _cache = cache.get(CACHE_ENTRY_NAME)

    # If cache is empty or outdated, update it.
    if _cache is None:
        _cache = update_cache()

    return _cache

As can be seen, using the Django cache framework takes a simple import (after defining the backend in settings), and it works along the lines of a key/value store where the elements can have a timeout.
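To make the key/value-with-timeout idea concrete without needing a Django project, here is a minimal sketch of such a store. TinyCache and its clock parameter are hypothetical names for this illustration; it only mimics the get/set calls used above and is not how Django or memcached implement it.

```python
import time


class TinyCache:
    """Toy in-memory key/value store where every entry has a timeout."""

    def __init__(self, clock=time.time):
        self._clock = clock    # injectable, so expiry can be simulated
        self._entries = {}     # key -> (expires_at, value)

    def set(self, key, value, timeout):
        self._entries[key] = (self._clock() + timeout, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # The entry is outdated: drop it and act as if it was never set.
            del self._entries[key]
            return default
        return value


now = [0.0]                            # fake clock, in seconds
cache = TinyCache(clock=lambda: now[0])
cache.set('eztv_cache', ['episode'], 600)
print(cache.get('eztv_cache'))         # still fresh: ['episode']
now[0] = 601.0                         # pretend 601 seconds have passed
print(cache.get('eztv_cache'))         # expired: None
```

An expired entry looks exactly like a missing one, which is why get_cache() above only needs to check for None.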

The backend can even be memory-only (normally only for development) or simply use files (as keys) on the filesystem.
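As a sketch of what several backends side by side could look like in the settings module: the backend paths below are the ones that ship with Django, while the aliases 'local' and 'files' and the directory are made up for this example.

```python
CACHES = {
    # Shared memcached instance, as in the example above.
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    },
    # Per-process memory cache (hypothetical alias), handy for development.
    'local': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
    },
    # One file per key in a directory (hypothetical alias and path).
    'files': {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/var/tmp/django_cache',
    },
}
```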

One of the caveats of the method I use is of course what happens when the cache is outdated and needs to be refreshed: this takes several seconds, and the request that triggers the refresh has to wait for it. One way to solve this is by using a cron job for updating the cache. This however brings more complexity and more dependencies to the system as a whole.

It could be interesting to try sending 100 requests in quick succession right after the cache has become outdated or gone missing. I do not know if this is a problematic case that Django solves for me by keeping track of get/set calls to the cache per request, or if it’s a problem I have to solve myself. It is however easy to test.
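If it turns out that nothing coordinates the refresh, one common way to handle it within a single process is to let only one thread refresh while the others wait. The sketch below is self-contained and uses a plain dict in place of the Django cache; slow_update, get_data and update_calls are hypothetical names, and a lock like this does not help when the site runs as several server processes.

```python
import threading
import time

_store = {}                       # stand-in for the real cache
_refresh_lock = threading.Lock()
update_calls = 0                  # how many times the slow refresh ran


def slow_update():
    """Pretend to fetch and parse the feed, which takes a while."""
    global update_calls
    update_calls += 1             # only ever called while holding the lock
    time.sleep(0.05)
    return ['episode 1', 'episode 2']


def get_data():
    data = _store.get('eztv_cache')
    if data is None:
        with _refresh_lock:
            # Check again: another thread may have refreshed while we waited.
            data = _store.get('eztv_cache')
            if data is None:
                data = slow_update()
                _store['eztv_cache'] = data
    return data


# Simulate 100 requests arriving while the cache is empty.
threads = [threading.Thread(target=get_data) for _ in range(100)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(update_calls)               # prints 1
```

Without the lock and the second check, every thread that sees an empty cache would start its own refresh.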

Git repository can be found here: https://bitbucket.org/dennishedegaard/eztv/
A running site can be found here: http://ez.dhedegaard.dk/

Raspberry Pi

I finally got my Raspberry Pi in the mail. It’s a small, extremely cheap ARM-based computer; I see it as a fun gadget to mess around with. The official operating system for it is Raspbian, a modified version of the Debian GNU/Linux operating system, which means the software is something I already know very well.

Here’s a pic of the system up and running:

[Image: IMG_20130402_182958]

And here’s a screenshot of the desktop running. Raspbian uses a slightly modified LXDE desktop:

[Image: raspberrypi]

One of the immediate annoyances is the need for 700 mA at 5 volts. The maximum current from a USB 2.0 port is 500 mA at 5 volts. 700 mA at 5 volts means the peak power draw is around 3.5 watts. When the system is idle it seems to use around 2 watts. This is very impressive for a computer running a modern operating system.
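The numbers follow from power = voltage × current. A tiny sketch of the arithmetic (the function name watts is made up for this example):

```python
def watts(volts, milliamps):
    """Electrical power in watts: P = V * I."""
    return volts * milliamps / 1000.0


pi_peak = watts(5, 700)     # what the Pi may draw at most
usb2_max = watts(5, 500)    # what a USB 2.0 port may supply
print(pi_peak, usb2_max, pi_peak - usb2_max)   # prints 3.5 2.5 1.0
```

So a plain USB 2.0 port falls about one watt short of the peak requirement.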

zram and the memory-problem with virtualization

I like to run a lot of VMs: whenever I start work on a non-trivial project I usually make a VM to isolate the system a bit.

One of the problems with virtualization is the increase in memory usage. One of the ways I’ve tried to counter this is by using KSM (Kernel Samepage Merging), which merges pages in memory containing the same data. My server has 8 GB of memory, and this saves me about 1 GB on average.

A friend of mine keeps talking about how awesome zram is, so I gave it a shot. What it does is allocate memory to a compressed block device, which can then be used for swap. Swapping to and from compressed memory is much faster than traditional swapping to disk. One of the problems with zram is the obvious processing overhead caused by the constant compression and decompression of pages going to and from the swap.

Here’s a short explanation of the things I did to enable zram.

Upgrading Debian

I like to run stable software (especially for the system my hypervisor (KVM) lives on), and it’s hard to get more stable than Debian stable (CentOS anyone?). This means running a 2.6.32 kernel (in the case of squeeze/6.0). Since wheezy (or 7.0) is currently in RC1 I decided to upgrade. Needless to say this went smoothly. I now run a 3.2 kernel.

Modprobing the module

On Debian, all I need to do to load the module at boot is to add “zram” to /etc/modules. One parameter you’d want to give the module when loading it is zram_num_devices, which tells zram how many devices you want. Usually you want as many zram devices as there are CPUs on the system.

On Debian this is done by making a file in /etc/modprobe.d and entering something like the following (in my case 2 cores):

options zram zram_num_devices=2

Initializing the SWAP on boot

Since the content of DRAM (Dynamic RAM) is lost when the modules are unpowered, you need to make new swap every time you boot. There are lots of nice init scripts out there to do this for you. I like to do it myself; my solution (albeit primitive) is to put the details in /etc/rc.local. First I tell the zram devices how many bytes I want in each, then I make the swap and then I enable it. Details below:

# zram swap
# expects zram modprobed with zram_num_devices=2
# allocates 4gb mem to zram, in 2 separate swap partitions.
echo 2147483648 > /sys/block/zram0/disksize
echo 2147483648 > /sys/block/zram1/disksize
mkswap /dev/zram0
mkswap /dev/zram1
swapon -p 100 /dev/zram0
swapon -p 100 /dev/zram1

I use 4 GB of my memory as zram because my machine has a total of 8 GB of memory. Some scripts use 100% of the memory as zram.
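The byte counts written to disksize above are simply half of the memory split over two devices. A small sketch of that calculation, where zram_disksizes and its parameters are hypothetical names and the memory size is passed in rather than read from the system:

```python
GIB = 1024 ** 3


def zram_disksizes(total_bytes, fraction, devices):
    """Split a fraction of total memory evenly over a number of zram devices."""
    per_device = int(total_bytes * fraction) // devices
    return [per_device] * devices


sizes = zram_disksizes(8 * GIB, 0.5, 2)
for index, size in enumerate(sizes):
    # Prints the same echo lines as used in /etc/rc.local above.
    print('echo %d > /sys/block/zram%d/disksize' % (size, index))
```

With 8 GiB of memory, a fraction of 0.5 and 2 devices this gives 2147483648 bytes per device, matching the values above.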

Some statistics

I’ve located a script on the old zram site that prints some statistics about the zram devices at runtime. It can be found here:

http://compcache.googlecode.com/git/sub-projects/scripts/zram_stats

As always YMMV, and I’ll be tweaking my setup over time as I run into new bottlenecks.

Making jpg transparent with PIL

A fun experiment: how hard is it to convert a jpg (or anything else) to a transparent png using PIL (Python Imaging Library)?

Not very hard, it seems. The source can be found here:

https://bitbucket.org/dennishedegaard/transparentpng/src/194d8e5f1dad1491259ad697e9085f3726445b9b/transparentpng.py?at=master

Later commits add a web interface for GAE, which can be found here: http://transparentpng.appspot.com/
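One common approach to this kind of conversion, which is not necessarily what the linked script does, is to pick a colour and give every pixel close to it an alpha value of 0. The sketch below works on plain (R, G, B) tuples so that it runs without PIL; make_transparent, key and tolerance are made-up names for this illustration.

```python
def make_transparent(pixels, key=(255, 255, 255), tolerance=10):
    """Turn (R, G, B) pixels into (R, G, B, A), hiding pixels close to key."""
    result = []
    for r, g, b in pixels:
        # A pixel is "close" if every channel is within tolerance of the key.
        close = all(abs(c - k) <= tolerance for c, k in zip((r, g, b), key))
        result.append((r, g, b, 0 if close else 255))
    return result


print(make_transparent([(255, 255, 255), (250, 248, 255), (10, 20, 30)]))
# prints [(255, 255, 255, 0), (250, 248, 255, 0), (10, 20, 30, 255)]
```

With PIL, the input could come from an image's getdata() after convert('RGB'), and the result could be written into a new 'RGBA' image with putdata() before saving it as a png.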

Long-polling chat application in Django

Back when I was a student I messed around with websockets (back then it was Grizzly on GlassFish). Nowadays most of my development is done in Python. The nice thing about a websocket is that it behaves like a TCP socket where both parties can send data, but over HTTP. This allows the server to send data to the client without the client being the active party.

Long-polling gets a similar effect using ordinary AJAX requests. The basic “use case” is described below.

  1. The client initiates an ajax-request to the server.
  2. The server checks to see if it has something to return; if it has, it returns it and we go back to step 1.
  3. If the server has nothing to return, it waits for a period of time (in my case 20 seconds), checking now and again if it has something to return. If it finds something, it returns it and we go to step 1.
  4. If the 20 seconds pass and the server still does not have anything to send, the connection is closed (i.e. the server returns 200 OK or similar), and the client makes a new connection from step 1.
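The server side of steps 2 to 4 can be sketched as a simple loop. This is a framework-free illustration, not the code from my repository: long_poll, fetch_new and the sleep/clock parameters are hypothetical names, and fetch_new stands in for whatever looks up new chat messages.

```python
import time


def long_poll(fetch_new, timeout=20.0, interval=0.5,
              sleep=time.sleep, clock=time.time):
    """Return new items as soon as there are any, or [] once timeout passes."""
    deadline = clock() + timeout
    while True:
        items = fetch_new()
        if items:
            return items     # steps 2 and 3: there is something to return
        if clock() >= deadline:
            return []        # step 4: give up, the client will reconnect
        sleep(interval)      # step 3: wait a bit before checking again


# Simulate a message that only shows up on the third check.
pending = [[], [], ['hello']]
print(long_poll(lambda: pending.pop(0), timeout=1.0, interval=0.01))
# prints ['hello']
```

The client simply calls the endpoint again as soon as it gets a response, whether that response was empty or not.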

This method brings a lot of overhead; on the plus side, it is supported reasonably well by all browsers that can make AJAX requests.

My implementation of long-polling is done in Django. It is focused on keeping the model clean, the JavaScript tight and the long-polling technique robust. I have tested it on IE 6, 7, 9 and 10 as well as Firefox and Chrome.

A running version can at the time of writing be found here:

http://wc.dhedegaard.dk/

The source can be found here:

https://bitbucket.org/dennishedegaard/webchat

I will most likely try to put it up in the cloud (i.e. GAE, which supports Django 1.4 these days) to make sure it stays up.

Getting CentOS 6 to play nice with a serial port

Serial ports might seem like ancient technology nowadays, but they still have their uses. I virtualize everything on my server, using KVM together with libvirt. This means I usually use virt-manager for managing my VMs, which is nice when you have a Linux environment on the same LAN as the server. However, if you’re doing it over the net it takes a long time to do anything (especially in the VNC client in virt-manager). A text-only serial console needs far less bandwidth.

Most of my new VMs these days are CentOS 6 machines. To enable them to send output to tty0 as well as ttyS0, do the following (in /boot/grub/grub.conf):

Add the following to make GRUB use both tty0 and ttyS0:

serial --unit=0 --speed=19200
terminal --timeout=8 console serial

To tell the kernel that it should also send its output to the serial port, append the following to the kernel parameters:

console=tty0 console=ttyS0,19200n8

In CentOS 6, when the last console statement is to a ttyS, CentOS automatically spawns a getty on the serial port (as explained in /etc/init/serial.conf):

# On boot, a udev helper examines /dev/console. If a serial console is the
# primary console (last console on the commandline in grub),  the event
# 'fedora.serial-console-available <port name> <speed>' is emitted, which
# triggers this script. It waits for the runlevel to finish, ensures
# the proper port is in /etc/securetty, and starts the getty.

In Debian I’ve been doing this for years by adding/changing the following in /etc/default/grub (and running update-grub afterwards):

GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --speed=9600 --unit=0 --word=8 --parity=no --stop=1"

GRUB_CMDLINE_LINUX_DEFAULT="quiet console=tty0 console=ttyS0,9600n8"

And uncommenting ttyS0 in /etc/inittab:

T0:23:respawn:/sbin/getty -L ttyS0 9600 vt100

Slashdot Quotes database

At the bottom of the Slashdot website there is a quote that changes from time to time (according to my tests, every hour). I’ve collected a database of the quotes, which is currently at 2381 entries.

I’ve implemented a nice graphical interface for searching the database in Django; the database runs on PostgreSQL.

The site can be found here: http://sd.dhedegaard.dk/

I’ve also implemented a REST-like interface that responds to a GET request on /json/random (for random quotes) and /json/latest (for the latest quotes). Both URLs take a count parameter, which is currently capped at 200 entries returned per request.
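Capping a query parameter like this is a one-liner once the value has been parsed. A minimal sketch, where clamp_count is a made-up name and the default of 1 and the lower bound of 1 are assumptions rather than what the site necessarily does:

```python
MAX_COUNT = 200   # the cap mentioned above


def clamp_count(raw, default=1):
    """Parse the count query parameter and keep it between 1 and MAX_COUNT."""
    try:
        count = int(raw)
    except (TypeError, ValueError):
        # Missing or non-numeric parameter: fall back to the default.
        return default
    return max(1, min(count, MAX_COUNT))


print(clamp_count('5'), clamp_count('1000'), clamp_count(None), clamp_count('abc'))
# prints 5 200 1 1
```

In a Django view the raw value would typically come from request.GET.get('count').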

Game of Life

I’ve spent the day implementing Conway’s Game of Life in JavaScript and HTML5 using the canvas element.

Here’s an example of how it looks:

It’s been tested on the following browsers:

  • Google Chrome
  • Firefox
  • Internet Explorer 9

I’ve implemented different interesting patterns as well as a “random” feature. Suggestions and bugfixes are welcome.
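For readers who have not seen the rules before, here is a minimal sketch of a single generation. It is written in Python on an unbounded grid rather than in JavaScript on a canvas, so it is an illustration of the rules and not the code from the repository; the function name step is made up.

```python
from collections import Counter


def step(live):
    """Compute the next generation from a set of live (x, y) cells."""
    # Count, for every cell, how many live neighbours it has.
    neighbours = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell lives with exactly 3 neighbours, or with 2 if it was alive.
    return {cell for cell, n in neighbours.items()
            if n == 3 or (n == 2 and cell in live)}


blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))   # prints [(1, 0), (1, 1), (1, 2)]
```

The blinker flips between a horizontal and a vertical bar, so applying step twice gives back the starting pattern.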

Feel free to browse the source: https://bitbucket.org/dennishedegaard/gameoflife.js

You can try it out here: http://p.dhedegaard.dk/gameoflife

Snake game in HTML5

I spent my Sunday implementing a snake game in JavaScript with the drawing done in an HTML5 canvas element. I’ve tested it in Firefox 3.6 on Lucid as well as Chrome 15 and Firefox 8.0.1 on Mint Isadora.

Any information on whether it works in Internet Explorer is appreciated, since I do not have the opportunity to test it myself :)

Here’s the link: http://p.dhedegaard.dk/snake/

Feel free to look in the source and find all my bugs :)

Sierpinski for android 2.2

It’s been a while since I last got my hands on an Android device. Like many others of my kind I love messing around with new technology, so I decided to make a basic application to get some hands-on experience with Android and the Android SDK. A long time ago I made a JavaBean that draws Sierpinski triangles, so porting that to Android seemed like a nice app. This also proved very useful, since I knew the algorithm was correct.
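For reference, the usual recursive way to build a Sierpinski triangle is to split a triangle at the midpoints of its sides and recurse into the three corner triangles. The sketch below shows that idea in Python; it is not the Java code from the app, and the function name sierpinski is made up.

```python
def sierpinski(a, b, c, depth):
    """Return the list of small triangles after depth subdivisions."""
    if depth == 0:
        return [(a, b, c)]

    def mid(p, q):
        return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

    ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
    # Recurse into the three corner triangles; the middle one stays empty.
    return (sierpinski(a, ab, ca, depth - 1) +
            sierpinski(ab, b, bc, depth - 1) +
            sierpinski(ca, bc, c, depth - 1))


triangles = sierpinski((0.0, 0.0), (1.0, 0.0), (0.5, 1.0), 4)
print(len(triangles))   # prints 81
```

Each level triples the number of triangles, so a depth of 4 gives 3^4 = 81 triangles to draw.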

The application is free and on the android market, it can be found here: https://market.android.com/details?id=org.dhedegaard.sierpinski2

From this little project, which took ~1½ days, I learned about the basic UI, how to draw on a “Canvas” and how to maintain global state in an “Application”. Moreover I got to a point where I had to extend my application to two Activities and thereby got an idea of what an “Intent” is.

All ideas, bugs and suggestions are as always welcome. Expect more applications in the future; Android seems like a very nice way to develop with a stricter MVC pattern compared to Swing for the desktop.