RPM Fusion Mirrorlist Server

On the last day of last year (2009-12-31) both of RPM Fusion’s mirrorlist servers were unreachable most of the time. The problem started at 00:53 (UTC) and lasted until at least 16:00 (UTC). Both mirrorlist servers were on the same network, and the router for that network broke down. Had the link to our provider failed, the router would have had a backup route to stay online, but this time the failure actually hit the single point of failure, and everything was offline. See the provider's error report (in German).

I was never happy that both mirrorlist servers were running in the same network, and I especially wanted to get the mirrorlist server off my mirror server. Thanks to Patrick I now have access to another VM at a different provider, where I am running a new mirrorlist server instance. It does not require much in terms of resources and bandwidth, but having root access makes everything so much easier.

RPM Fusion’s mirrorlist servers now run on two dedicated VMs at two different providers, which should protect the service from failures like the one on 2009-12-31.

Storage Trouble

In the night from Friday to Saturday a disk (slot 7) in our external RAID, which holds most of the mirror server data, failed and was marked as BAD. Not really a big problem, yet. The hot spare drive was activated and the rebuild started; it finished about 24 hours later. On Sunday (around 16:00) another drive (slot 5) failed, and we immediately started to sync all the data to another box in case yet another drive goes offline, which would mean complete data loss. All the data on that RAID is (only) mirrored data, but re-syncing the 9 TB we currently have would probably take a few weeks. Unfortunately the sync to the other box will also take a few days to finish, so it is still possible that we might lose a lot. We are waiting for the replacement disks, which have been promised for Monday (today), but as the rebuild needs over 24 hours there is still a chance of data loss.

Update (2009-12-14 23:20): The replacement disks have arrived and after more than twelve hours 25% of the array has been rebuilt.

Update (2009-12-15 11:00): After more than 24 hours 58% of the array has been rebuilt. It seems to rebuild faster during the night.
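
The two updates above allow a rough guess at when the rebuild might finish. This is only a minimal sketch: it assumes the rebuild keeps the average rate seen between the two updates, which is unlikely to hold exactly since the rebuild seems to run faster at night. The function name is made up for this example.

```python
from datetime import datetime, timedelta

def estimate_finish(t1, p1, t2, p2):
    """Linearly extrapolate the time at which progress reaches 100%.

    t1, t2 are datetimes of two progress reports; p1, p2 are the
    percentages reported at those times (p2 must be larger than p1).
    """
    if p2 <= p1:
        raise ValueError("progress must increase between the two reports")
    rate = (p2 - p1) / (t2 - t1).total_seconds()  # percent per second
    remaining_seconds = (100.0 - p2) / rate
    return t2 + timedelta(seconds=remaining_seconds)

# Values taken from the two updates above.
first_report = datetime(2009, 12, 14, 23, 20)
second_report = datetime(2009, 12, 15, 11, 0)
eta = estimate_finish(first_report, 25.0, second_report, 58.0)
```

With these numbers the estimate lands roughly 15 hours after the second update, so in the early hours of 2009-12-16, provided the rate really stays constant.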

Back In School

Not really back in school, but it has now been more than a week since I started my new job at my old university in Esslingen at the beginning of December 2009. After only 11 months at my previous workplace (Matrix Vision) I am now working for the faculty of Information Technology.

I will be responsible for the setup and installation of the university's new cluster. The cluster will be part of the bwGRiD, will have around 1500 cores, and is currently being installed. It is partly water-cooled; the racks were delivered and installed a few days ago. The cluster is from NEC and we are expecting the servers to be delivered in the next few days. The cluster will be running Scientific Linux.

I am now in the same building as my mirror server. This might be a good thing, because now I am much closer to the hardware and can act faster if something unexpected happens… It might also be a bad thing, because now I am much closer and can experiment with things I would not do if I were not in the same building.

First Text Then HTML

I finally have mutt configured in such a way that it first tries to display the plain text part of a mail and only the HTML part if there is no plain text available. For years I had mutt configured to display HTML mails using lynx but it was displaying the HTML part even if there was plain text available.

To display HTML mails I was using auto_view text/html in my .muttrc, as described everywhere, with the following corresponding entry in my .mailcap:

text/html;      lynx -dump %s; copiousoutput; nametemplate=%s.html

The problem with this setup is that it displays the HTML part of a mail even if there is a plain text part available. So I had auto_view text/html disabled most of the time and edited the configuration file manually to enable it again for the rare cases in which I received an HTML-only mail.

But as this is mutt and almost everything can be configured I finally searched and found a solution:

auto_view text/html
alternative_order text/plain text/html

If the message has a plain text part and an HTML part, mutt shows me the plain text part, but if there is only an HTML part available I get the HTML converted to plain text. Exactly what I always wanted.

Lightning Talk

Last Thursday (2009-08-13) I gave a lightning talk about the mirror server I am maintaining. According to the description on Wikipedia, lightning talks are usually between 1 and 10 minutes long, with a 5 minute limit being common[¹]. In this session there were four talks of about 15 minutes each, plus an additional 5 to 10 minutes for questions. So they were a bit longer than the definition allows, but that was a good length for all four talks.

The lightning talks were organized by the Chaos Computer Club Stuttgart (CCCS). They organize a talk every month, and in the summer it is usually a lightning talk session. The talks given were:

I am very happy that I decided to give my talk and altogether it was a very nice event. There was also an audio recording (which unfortunately has not yet been released).

[¹] http://en.wikipedia.org/wiki/Lightning_Talk

More Mining

I am now doing even more data mining in the log files of our mirror server, and in the last few weeks I have added more graphical output to the information interface of our mirror server.

The first, pretty simple, new diagram is the Disk Usage By Mirrored Project. It is created daily from a du run over the whole mirror area and then visualized using the existing code to draw a pie chart. Fedora is the largest mirrored project because I count all the Fedora related projects I am mirroring (fedora, archive.fedoraproject.org, secondary.fedoraproject.org) as fedora. If this diagram is compared to the Overall Traffic Breakdown diagram, it is interesting to see that there is not much difference in the traffic generated by mirroring Fedora and Ubuntu, but the space required to mirror Ubuntu is much smaller. If I knew a lot more about Ubuntu I could probably start mirroring exotic Ubuntu things (just like I do with Fedora).
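
The aggregation step behind such a diagram could look roughly like the following. This is a minimal sketch and not the code running on the mirror: it assumes du-style input lines of the form size, tab, path with one line per top-level directory; the /srv/mirror paths and the sizes are made up for illustration. Only the grouping of the three Fedora directories is taken from the text above.

```python
from collections import defaultdict

# Directories that are counted as one project (from the text above).
ALIASES = {
    "archive.fedoraproject.org": "fedora",
    "secondary.fedoraproject.org": "fedora",
}

def usage_by_project(du_lines):
    """Sum sizes per project from lines of 'size<TAB>path'."""
    totals = defaultdict(int)
    for line in du_lines:
        size, path = line.rstrip("\n").split("\t", 1)
        name = path.rstrip("/").split("/")[-1]  # last path component
        totals[ALIASES.get(name, name)] += int(size)
    return dict(totals)

def shares(totals):
    """Convert absolute sizes into percentages of the overall total."""
    overall = sum(totals.values())
    return {name: 100.0 * size / overall for name, size in totals.items()}

# Hypothetical sample input; sizes and paths are invented.
sample = [
    "600\t/srv/mirror/fedora",
    "300\t/srv/mirror/archive.fedoraproject.org",
    "100\t/srv/mirror/secondary.fedoraproject.org",
    "1000\t/srv/mirror/ubuntu",
]
totals = usage_by_project(sample)
percentages = shares(totals)
```

The percentages are what a pie chart would then be drawn from.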

In addition to the new pie charts I have also created world maps which show the client distribution around the world. The code to generate the maps is based on generate-worldmap.py. With its help I am now able to create a world map with the client distribution for all downloads. It shows that Europe is clearly the location of many (maybe even most) clients connecting to our mirror server, but also that there are connections coming from basically all over the world.

In addition to the overall map I am also creating a map for each mirrored project. It can be seen that there are mirrored projects (like Ubuntu, Fedora and OpenSUSE) which have pretty good redirectors, so that only clients in the vicinity are redirected to our mirror. There are also mirrored projects (dag) which do nothing like that, which results in connections from all over the world. And then there are projects which have a good redirector but do not have enough mirrors around the world, like fedoraproject. What is called fedoraproject on our mirror server is a mirror of archive.fedoraproject.org (Fedora 8 and older right now, which still seems to be downloaded).

Music Player

About two years ago I started to build a music player. I had an old notebook and a friend gave me his old CD player. The plan was to get the important parts of the notebook into the chassis of the CD player. That much was done about two years ago, and since then it had stayed half-finished. Last week I finally had enough motivation (and the other notebook acting as our music player using mpd was actually being used again), so the project is now finally finished. Before I started it looked something like this:

On the left side the board is mounted on two pieces of wood. I had to cut new holes in the back for all the connectors. Although it boots over the network and runs from NFS it needs a hard disk, because the BIOS takes 15 seconds longer if there is no hard drive. As I wanted a fast boot I had to connect something to it, and that is why I bought a 256MB flash drive with a 44-pin IDE connector.

All the cables which can be seen on the right side come from my idea to use the original buttons of the CD player chassis. The cables are connected to the keyboard controller, and they are a very good example of the fact that I am more of a software than a hardware guy. The plan was that certain combinations of the cables generate certain keyboard events (characters). That worked perfectly; I tried out many combinations and found enough keyboard events to connect all the cables to buttons of the CD player chassis. The problems started when I soldered all the cables to the board behind the buttons. When I powered it on again it was generating lots of keyboard events even if I was not pressing any button, probably because I had not considered that all the cables connected to ground pins were now connected together. So it was a bad idea and everything I had done was pretty much useless. I removed everything and connected it again, but this time I tested the result after every button.

The most important button was the power on button, which works. I have now only four working buttons which is far from what I wanted but better than no possibility to control anything directly at the chassis.

To use the characters coming from the buttons I took mingetty, removed most of the code and modified it to read just one character (readfromtty1.c) and return that immediately to a shell script wrapped around it. So if I now press the REPEAT button I get a message in the syslog saying “REPEAT pressed”.

I also had to cut two more holes into the CD player chassis on the left side for the CPU fan and for the audio connectors.

That was the last change needed to finally use it. If I now press the STANDBY/ON button, it starts playing music about 20 seconds later. I am using mpd so that I can control it from anywhere, and pressing the same button again shuts the system down in about 5 seconds. Without much work I could probably get it to boot faster, but right now I am using an only slightly modified Fedora 11 with a custom kernel, so I will probably leave it the way it is now.

In addition to the buttons and all the available mpd clients I can also control the music player using my phone. I have extended my phone2jabber script to not only send jabber notifications when somebody calls but also to play the next song when one of my unused MSNs is called.

Leonidas Traffic (Part 2)

In one of my last posts I wrote about how much Fedora related traffic we had on our mirror server during the Fedora 11 release. I got one huge comment from Jef with three questions which I am trying to answer now.

1) Assuming you could find an accurate count of EU mirrors on F10 release. Can you use the ratio of available mirrors then to available mirrors now to re-scale the activity…sort of like a mirror inflation correction to scale activity in terms of available bandwidth.

As the mirrorlist is very dynamic I do not think I can answer that. But if somebody has useful numbers on how many EU mirrors there were during the F10 release as well as during the F11 release, it can probably be done.

2) Can you trend the “shape” of the first week of F-11 compared to the first week “shape” of F-10 activity on your mirror?  Forget about absolute scales. Normalize each to the maximum associated with the first 24 hours of activity and see how the activity trends in time relative to that normalization. Does F-10 for example see the same second day uptick relative to the first day that you see in F-11?

That should be possible. After some gnuplot-ing I have the following diagrams:

Downloaded Data (Normalized)

This shows the data transferred for each release (normalized to the first day). It is important, however, to know that the first day of a Fedora release is not 24 hours, but only 8 hours on our mirror server as the release usually happens at 16:00 local time. Therefore I also made another diagram using the absolute numbers:

Downloaded Data (Absolute)

It can be seen that basically only the first day differs for some reason. The following days were pretty much the same, with just a bit less traffic than during the Fedora 10 release. So maybe this difference is related to my assumption that there are more mirrors in Europe. Although the amount of traffic does not differ that much, the mirror server was in a much better state during the whole release. In earlier releases the load was much higher and the HTTP server had no free connection slots available. This time the load was not really high, and after the first day it was always possible to make an HTTP connection (although it took longer than usual).

3) I’d be really interested to know if you could identify any upticks related to F-10 downloads further away from F-10 release that correlate with ambassador activity at an EU event.

No idea, but I have the amount of data downloaded per day for each mirrored project for at least the last year available here. So if there are certain dates it can be looked up.

For my own reference these are the gnuplot commands used to create the two diagrams:

gnuplot> set terminal png size 400,300
gnuplot> set output "absolute.png"
gnuplot> set xlabel 'Days Since Release'
gnuplot> set ylabel 'Terabytes'
gnuplot> plot 'F10' smooth csplines title 'Fedora 10', 'F11' smooth csplines title 'Fedora 11'
gnuplot> set output "normalize.png"
gnuplot> a=2.23
gnuplot> b=1.54
gnuplot> set ylabel 'Downloaded Data (Normalized)'
gnuplot> plot 'F10' using 1:($2/a) smooth csplines title 'Fedora 10', \
>'F11' using 1:($2/b) smooth csplines title 'Fedora 11'
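
The normalization done by the second plot command (dividing every value by the first day's value, b=1.54 for Fedora 11) can also be written out in a few lines of Python. This is just a sketch of the arithmetic: the function name is made up, and the four values are the per-day Fedora figures for 2009-06-09 to 2009-06-12 reported for the Fedora 11 release; the contents of the 'F10' and 'F11' data files themselves are not reproduced here.

```python
def normalize_to_first_day(daily_tb):
    """Divide every daily value by the value of the first day."""
    if not daily_tb:
        raise ValueError("need at least one value")
    first = daily_tb[0]
    if first == 0:
        raise ValueError("first day must be non-zero to normalize")
    return [value / first for value in daily_tb]

# Fedora 11 traffic in TB for the first four days of the release.
f11_tb = [1.54, 2.88, 1.62, 1.01]
f11_normalized = normalize_to_first_day(f11_tb)
```

The first entry is 1.0 by construction, and the second day comes out at about 1.87 times the first, which is the second day uptick asked about in question 2.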

Bounce/Forward All Mails From A MBOX

So I found out that emails which I expected to have been forwarded a long time ago from my home server to my normal email address were still sitting in the mbox on my home server, after more than six months. Instead of taking the easy way and copying the mbox to my real mail server, I tried to forward them all using SMTP. I remembered that there is a program called formail which should do exactly what I want.

To forward/bounce all mails from an mbox to my email address adrian@lisas.de I created a .procmailrc like this:

:0
*
! adrian@lisas.de

and then it took nothing more than cat /var/spool/mail/adrian | formail -s procmail, and all the mails were in the queue, ready to be delivered.

It took me a few attempts to get it right, so instead of forwarding/bouncing each mail once I managed to bounce each mail over 200 times, so that I had about 6000 new mails in my INBOX.
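
Given how easily a wrong attempt multiplies mail, counting the messages in the mbox beforehand gives a number to compare the mail queue against. The following is a minimal sketch using Python's standard mailbox module; the function name is made up, and this check is an illustration rather than something that was part of the procedure described above.

```python
import mailbox

def count_messages(path):
    """Return the number of messages in the mbox file at `path`."""
    # create=False: raise an error instead of creating a missing file.
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

Calling count_messages("/var/spool/mail/adrian") before running formail would show how many mails should end up in the queue; anything far above that number means something is looping.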

Leonidas Traffic

The Fedora 11 (Leonidas) release has been the Fedora release with the least pressure on our mirror server. This is probably due to the fact that there are more mirror servers in Europe than ever. In addition to the usual http/ftp/rsync traffic I had bittorrent running for the first time, but the bittorrent client was never using more than 50 MBit/s of the bandwidth (and that also dropped after the second day). Compared to the normal mirror traffic that is not really much. After running for about a week the bittorrent client has uploaded around 600 GB.

The Fedora traffic during the first few days after the release was of course higher than the average ~450 GB/day:

  • 2009-06-09: 1.54 TB
  • 2009-06-10: 2.88 TB
  • 2009-06-11: 1.62 TB
  • 2009-06-12: 1.01 TB
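
To put these numbers in relation to the usual ~450 GB/day, the factor over the average can be worked out per day. This is a small sketch of the arithmetic; it assumes decimal units (1 TB = 1000 GB), which the figures above do not state explicitly, and the names are made up for the example.

```python
AVERAGE_GB_PER_DAY = 450  # the "~450 GB/day" average mentioned above

# Fedora traffic in TB for the first days of the release (from the list above).
release_days_tb = {
    "2009-06-09": 1.54,
    "2009-06-10": 2.88,
    "2009-06-11": 1.62,
    "2009-06-12": 1.01,
}

def factor_over_average(tb, average_gb=AVERAGE_GB_PER_DAY):
    """Return how many times the usual daily volume `tb` terabytes is."""
    return tb * 1000 / average_gb  # assumption: 1 TB = 1000 GB

factors = {day: factor_over_average(tb) for day, tb in release_days_tb.items()}
```

Under that assumption the busiest day, 2009-06-10, comes out at a bit more than six times the usual Fedora volume.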

On the second day of the release almost 70% of the mirror traffic was Fedora related:

2009-06-10

And on the bandwidth graph it can also be seen when the bitflip was (around 15:00 local time) and when the release actually went live (16:00 local time):

2009-06-10

I do not know why the traffic dropped so significantly at around 00:00, but our mirror was probably dropped from the mirror list because the crawler (from MirrorManager) was no longer able to connect and verify that our mirror was up to date.