Kiwix is a special piece of software. Special because it's difficult to define:
- A Desktop software for browsing offline content with ZIM on Mac, Windows
- A server that serves ZIM content on those platforms plus ARM Linux.
- A library of ZIM files for popular content: Wikipedias, Wikileaks,
- Very few developers (most of the time it's 2).
- Very large (and growing) number of users.
- “Small” code base (about 50,000 lines of code).
So far, all of this has been maintained by hand: the ZIM files are created
through a complex procedure of scripts and mirror setups, the releases are
built manually on all platforms, etc.
You get the picture: it's difficult to keep up with improving the software, fixing
bugs, generating new content ZIM files, building and testing the software on
Our first step in the right direction was getting the translations done on
it allowed us to ship Kiwix in 80+ languages.
Thanks to sponsorship by Wikimedia CH,
we decided to first tackle the build problem as it's the most annoying.
Kiwix is released for the following targets:
- Mac OS X 10.6+ Intel Universal (Intel 32b, Intel 64b)
- Linux 32b “static” (no dependencies)
- Linux 64b static
- Sugar .xo for OLPC
- Windows 32b.
- Armel5 (kiwix-serve only)
- Source code.
- Debian wheezy package 32b.
- Debian wheezy package 64b.
- Linux 32/64b with dependencies (used to be a PPA for Ubuntu until they
Given that only reg has a Mac, that the Windows setup for building Kiwix
is complicated, and that both Kelson and reg use Linux, testing and
distributing new versions of the code is very difficult.
This is not a unique problem; most large multi-platform software projects face
the same issue, and we did nothing but imitate them: we deployed a build farm.
The solution looks like the following diagram: a bunch of VirtualBox VMs, a
QEMU one for ARM, and Buildbot on all of them.
As you can see, the Buildbot master controls all the slaves, which create
their own builds and send them to the web server's repository.
A compile farm is a
set of servers, each building one platform or target of the software. To manage
those, [a large number of
After some research, we chose Buildbot because:
- It seemed easy to install
- It looked very powerful
- Its documentation was clear.
- It's written in Python (including the configuration file).
The deal-maker was really the tutorial on the
website, which allowed us to picture the required steps without actually getting
our hands dirty.
The Python configuration file is a great feature as it allows a very
flexible configuration without a dedicated syntax.
Buildbot is divided into two programs:
- the master which holds the whole configuration (the only file you care
- slaves, which only need to run the slave software (Python). Those are
Kiwix already rents a very powerful server in a data center for serving
downloads. We used it to hold
All the build slaves (except for the ARM target, which is not supported by
VBox) are VirtualBox virtual machines with:
- 512MB RAM
- 20GB HD
- 2 NIC: NAT for accessing Internet (dhcp) ; Host-only for buildbot (Fixed
The OS X VM has 1GB RAM and a 40GB HDD.
All the VMs were installed through VRDP (a VNC-like
protocol), which we used until the network was configured and SSH access enabled.
See also: VM
QEMU was required to get an
armel VM. We used aurel32's Debian
Note: To make it easier to SSH into, halt, and start
the VMs, we wrote a
wrapper script around VirtualBox.
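To give an idea of what such a wrapper can look like, here is a minimal Python sketch. It is an illustration only, not the actual script: the VM names, user name, and host-only addresses in `VMS` are placeholders, and `command_for` and `run` are hypothetical helper names. It relies only on the standard `VBoxManage startvm` and `VBoxManage controlvm` subcommands.

```python
import subprocess

# Hypothetical VM table: names, users, and host-only addresses are placeholders.
VMS = {
    "linux32": {"user": "build", "host": "192.168.56.101"},
    "win32": {"user": "build", "host": "192.168.56.102"},
}


def command_for(action, vm):
    """Return the argv list for one action on one VM."""
    if vm not in VMS:
        raise KeyError("unknown VM: %s" % vm)
    if action == "start":
        # Boot the VM without opening a GUI window on the host.
        return ["VBoxManage", "startvm", vm, "--type", "headless"]
    if action == "halt":
        # Simulate a press of the power button so the guest can shut down cleanly.
        return ["VBoxManage", "controlvm", vm, "acpipowerbutton"]
    if action == "ssh":
        info = VMS[vm]
        return ["ssh", "%s@%s" % (info["user"], info["host"])]
    raise ValueError("unknown action: %s" % action)


def run(action, vm):
    """Execute the command for this action and return its exit status."""
    return subprocess.call(command_for(action, vm))
```

Keeping the command construction separate from its execution makes the mapping easy to check without touching a real VM; `run("start", "linux32")` would then boot the placeholder VM.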
Configuring and running
Configuring buildbot is pretty straightforward once you know what you want
to do. Configuration is composed of the following components:
- Slave definitions (name, login, password)
- Builders: targets composed of steps (commands) executed on a slave.
- Schedulers: Triggers for when to start builders.
- Status: What to do with output of builders.
The hard part is defining the builders as this is where you indicate how to
retrieve source code, launch your configure script, compile, and transfer your
build somewhere else.
Take a look at ours as an example:
We don't use any advanced feature so it's easy to understand. We chose:
- fixed time daily to run our builds (at night – server time)
- builds (tarball, etc) are uploaded to the server's /var/www/ for direct web
Buildbot handles the transfer of files between master and slave.
- We can trigger builds at any time from the web interface.
- We list build results on the web page and by mail in a dedicated
- Builds are announced and controllable by the IRC bot.
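To make the relationship between the four components more concrete, here is a small plain-Python model of them. It deliberately does not use the Buildbot API, so it is not a working master configuration: the class names `Slave`, `Builder`, and `Nightly`, the `check` helper, and every value (slave name, password, `SVN_URL`, the 02:00 schedule) are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class Slave:
    name: str
    password: str


@dataclass
class Builder:
    name: str
    slave: str  # name of the slave that runs the steps
    steps: list = field(default_factory=list)  # commands, executed in order


@dataclass
class Nightly:
    builders: list  # names of the builders to trigger
    at: time        # fixed daily time, in server time


# Placeholder values throughout; none of these come from the Kiwix setup.
slaves = [Slave("linux32", "secret")]
builders = [
    Builder(
        name="kiwix-linux32",
        slave="linux32",
        steps=[
            "svn checkout SVN_URL .",  # retrieve the source code
            "./configure",             # launch the configure script
            "make",                    # compile
        ],
    )
]
schedulers = [Nightly(builders=["kiwix-linux32"], at=time(2, 0))]
status = ["web", "mail", "irc"]  # where build results get reported


def check(slaves, builders, schedulers):
    """Return a list of configuration errors (empty when consistent)."""
    errors = []
    slave_names = {s.name for s in slaves}
    builder_names = {b.name for b in builders}
    for b in builders:
        if b.slave not in slave_names:
            errors.append("builder %s uses unknown slave %s" % (b.name, b.slave))
        if not b.steps:
            errors.append("builder %s has no steps" % b.name)
    for s in schedulers:
        for name in s.builders:
            if name not in builder_names:
                errors.append("scheduler targets unknown builder %s" % name)
    return errors
```

The point of the model is the dependency chain: schedulers refer to builders by name, and builders refer to slaves by name, so a typo in either place is a typical source of configuration errors.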
Although it's simple, it takes a lot of tweaks and tests (fortunately that's
easy) to get the configuration just right. You need a proper, documented
build mechanism for all your targets; otherwise you'll probably go
crazy. We finished a complete rewrite of our autotools Makefiles for all
platforms before we set up Buildbot. It sounds dumb but it's
- Every day, a new
release for all targets is available for download, properly named with the
SVN revision and the date.
- Ability to fire a build at any time from the Web UI.
- Kiwix to be integrated into Debian sid in the coming days (and thus in the
next stable release).
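The naming scheme mentioned in the first item (SVN revision plus date) could be produced by something like the sketch below. The exact file name pattern, the `build_name` function, and the default extension are assumptions for illustration; the real names may be laid out differently.

```python
from datetime import date


def build_name(target, revision, day, ext="tar.bz2"):
    """Compose a file name carrying the SVN revision and the build date."""
    return "kiwix-%s-r%d-%s.%s" % (target, revision, day.isoformat(), ext)


# Example with made-up values:
print(build_name("linux32", 1234, date(2012, 1, 31)))
# prints "kiwix-linux32-r1234-2012-01-31.tar.bz2"
```

Using the ISO date format keeps the files sorted chronologically in a plain directory listing.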
Things you should know:
If you intend to reproduce this, here are a few things we've learned and want to
- Installing OS X on non-Mac hardware is tricky: you need a recent Intel CPU
(with VT-x support) but not too recent, otherwise your OS X install DVD won't
know about it (and will refuse to install).
- On OS X, Apple packages (Mac OS X updates, Xcode) have an expiration period.
If you install an Xcode version, say, 2 years after it was
released, the installer will fail with no useful feedback. This is because the
package's signature is too old. You can still install it by
unpacking and repacking the packages.
- SSH to your QEMU VM goes through a QEMU proxy, so you ssh to localhost on a
- VRDP requires a good connection if you intend to do a lot of configuration
inside Windows (384k clearly is a pain!).
- Buildbot slaves freeze frequently. We're not sure why, but sometimes a slave
fails to answer build requests and stays attached doing nothing. As a
workaround, we delete and recreate the buildbot slave folder daily in a cron job.
- Buildbot slaves sometimes have network issues. We're not sure if it's
related to Buildbot, VBox or something else, but the slave frequently
can't check out the source tree or can't download our dependencies from the
- The Windows slave frequently loses its connection to the master. It might
just be a Windows configuration issue.
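The daily delete-and-recreate workaround for frozen slaves can be sketched as follows. This is an illustration under assumptions, not the actual cron job: it assumes the `buildslave` command-line tool of the Buildbot 0.8 series (`stop`, `create-slave`, `start`), and the function names, base directory, master address, slave name, and password are all placeholders.

```python
import shutil
import subprocess


def reset_commands(basedir, master, name, password):
    """Return the buildslave commands to run around the directory wipe."""
    stop = ["buildslave", "stop", basedir]
    create = ["buildslave", "create-slave", basedir, master, name, password]
    start = ["buildslave", "start", basedir]
    return stop, create, start


def reset_slave(basedir, master, name, password):
    """Stop the slave, wipe its directory, recreate it, and start it again."""
    stop, create, start = reset_commands(basedir, master, name, password)
    # Ignore the exit status: the slave may be hung or already stopped.
    subprocess.call(stop)
    # Drop all slave state, including any stale build directories.
    shutil.rmtree(basedir, ignore_errors=True)
    # Recreate an empty slave directory, then start the fresh slave.
    subprocess.check_call(create)
    subprocess.check_call(start)
```

A crontab entry on each slave would then call `reset_slave` once a day, ideally shortly before the nightly builds are scheduled, so that every run starts from a freshly created slave.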
- Improve our wrapper script to handle VRDP access to VMs by controlling
- Add SSH to the Windows slave so we can do basic tests in console.
- Investigate the network/slaves problems so that it works 24/7.
- Automate & build a similar platform for the creation of ZIM files so we
can focus only on code thereafter.