Michael Stapelberg’s Debian Blog

2014-02-19: Using your TPM for SSH authentication


A few weeks ago, Thomas Habets blogged about using your TPM (Trusted Platform Module) for SSH authentication. We worked together to get his package simple-tpm-pk11 into Debian, and it has just arrived in unstable :-).

Using simple-tpm-pk11, you can let your TPM generate a key, which you can then use for SSH authentication. The key never leaves the TPM, so it is safer than a key stored on the file system (e.g. ~/.ssh/id_rsa): file system access alone is no longer enough to steal it. Instead, an attacker needs remote code execution.

To use this software, first make sure your TPM is enabled in the BIOS. In my ThinkPad X200 from 2008, the TPM is called “Security Chip”.

Afterwards, claim ownership of your TPM using tpm_takeownership -z (from the tpm-tools package) and enter a password. You will not need to enter this password for every SSH authentication later (but you may choose to set a separate password for that).
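
Concretely, that step is (as far as I understand the tpm-tools documentation, the -z flag sets the SRK secret to the well-known all-zeroes value, which is what later lets SSH authentication proceed without a prompt; you are only asked for the owner password):

tpm_takeownership -z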

Then, install simple-tpm-pk11, create a key, set it as your PKCS11Provider and install the public key on the host(s) where you want to use it:

mkdir ~/.simple-tpm-pk11
# Ask the TPM to generate a key; my.key only stores a blob that the TPM itself can use.
stpm-keygen -o ~/.simple-tpm-pk11/my.key
# Point simple-tpm-pk11 at that key.
echo key my.key > ~/.simple-tpm-pk11/config
# Make OpenSSH use the PKCS#11 module for all hosts.
echo -e "\nHost *\n    PKCS11Provider libsimple-tpm-pk11.so" >> ~/.ssh/config
# Export the public key and install it on the remote host.
ssh-keygen -D libsimple-tpm-pk11.so | ssh shell.example.com tee -a .ssh/authorized_keys

You’ll now be able to ssh into shell.example.com without the private key ever touching your file system :-).

In case you have any feedback about/troubles with the software, please feel free to contact Thomas directly.

2014-01-17: debmirror 6x speed-up, or how to fill a 1 GBit/s Rackspace link


As explained in more detail in my last blog post, Rackspace is providing hosting for Debian Code Search. For those of you who don’t know, Rackspace is a cloud company that provides (among other services) a public cloud based on OpenStack. That means you can easily (and programmatically, if you want) bring up virtual servers and block storage volumes, configure the network between them, and so on.

As part of my initial performance experiments, I ran debmirror to clone a full Debian source mirror and noticed that this took about one hour. Given that the peak network and storage write rates I have observed in my benchmarks are much higher, I wondered why it took so long. About 43 GB (the size of the Debian sid sources) in 60 minutes means a download rate of ≈ 12 MB/s, roughly what a saturated 100 MBit/s connection delivers. Luckily, at least the Rackspace servers I looked at are connected to the internet with 1 GBit/s, so more should be possible. Note that these are the old Rackspace servers; there is a new generation of servers, which I have not yet tried, that apparently offers even higher performance.

After a brief look at the debmirror source code, I concluded that it only uses a single mirror and downloads files sequentially. There is some obvious potential for improvement here, and the fact that, within a couple of minutes, I could come up with a proof of concept written in Go to determine the files to download encouraged me to spend my Saturday on this “problem” :-).
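
A minimal sketch of that kind of proof of concept (my reconstruction, not the actual dcs-debmirror code): parse the uncompressed Sources file and print the path of every file the mirror must contain.

// Sketch in Go; assumes a gunzip’ed Sources file in the current directory.
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("Sources") // downloaded as sources.gz, then gunzip’ed
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	var dir string
	var files []string
	inFiles := false
	flush := func() {
		// The Directory field comes after Files in each stanza, so the
		// collected file names are printed at the end of the stanza.
		for _, name := range files {
			fmt.Println(dir + "/" + name)
		}
		dir, files, inFiles = "", nil, false
	}

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case line == "": // blank line: end of the current package stanza
			flush()
		case strings.HasPrefix(line, "Directory: "):
			dir = strings.TrimPrefix(line, "Directory: ")
			inFiles = false
		case line == "Files:":
			inFiles = true
		case inFiles && strings.HasPrefix(line, " "):
			// Continuation lines look like “ <md5sum> <size> <filename>”.
			if fields := strings.Fields(line); len(fields) == 3 {
				files = append(files, fields[2])
			}
		default:
			inFiles = false // some other field interrupts the Files list
		}
	}
	flush()
}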

About 7 hours later, my prototype had gone through various iterations, and the code could sustain about 115 MB/s of incoming bandwidth most of the time, as measured with dstat.

Another 3 hours later, the most obvious bugs were weeded out and the code successfully cloned an entire mirror for the first time. I verified its correctness by running debmirror afterwards: it downloaded only files that had actually arrived on the mirror in the meantime.

Features

  1. Parallel downloading from multiple mirrors.
    Of course, the mirrors are not guaranteed to be consistent, but that is a solvable problem: whenever a non-200 HTTP response is encountered, the file is put back into the queue and rescheduled onto any mirror. Eventually, it will be rescheduled onto the mirror from which the sources.gz was downloaded, and that mirror has to have it.
  2. HTTP keep-alive and pipelining.
    debmirror also uses HTTP keep-alive (I read that in the source, but did not verify it), yet it downloads files sequentially: it downloads one file, then sends the request for the next one. This means that after one file has been received, no data arrives until another round-trip has finished. For lots of small files (think of .dsc and .debian.tar.gz files or small orig tarballs), that really adds up.
    Therefore, I use HTTP pipelining: I send 99 requests and then just receive all the responses at full speed (see the sketch after this list). The magic number 99 is the lowest limit among all the mirrors I tested; nginx by default allows 100 requests.
  3. Request ordering.
    Instead of sending requests in random order, I sort the download queue by size and make sure that the first request on each connection is for a big file. The intention is that TCP adapts to this quickly and uses a bigger window size as early as possible, instead of ramping up slowly later on. Note that I have not actually measured the window sizes, so this is just a hunch.
    Another nice side-effect of the ordering is that the amount of data that is being sent through each connection is roughly equal. This means we don’t waste TCP connections by sending 99 requests for small files only.
  4. (Goroutines).
    I’m a bit hesitant to even write about it, so let’s make it quick: Perl (and thus debmirror) is typically single-threaded, whereas goroutines scale nicely on multi-core systems. I don’t think it makes a big difference on the machines I am running this on, as they have only two cores and are not maxed out on CPU usage at all. But maybe this will be more important for saturating 10 GBit/s? :-)
  5. (Mirror selection).
    It’s not a feature of the code, but certainly relevant: of course I selected mirrors that are connected to the internet with at least a Gigabit link, serve files reasonably fast and have low latency to Rackspace’s Chicago presence.
    Interestingly, while I used to think that universities have access to big uplinks and beefy machines, none of the mirrors that I use are located at a university. In fact, every mirror ending in .edu that I tried was really slow — most of them providing something like 2-4 MB/s.
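
To make the pipelining, request ordering and rescheduling ideas concrete, here is a condensed sketch in Go. It is not the actual dcs-debmirror source; the mirror name, the on-disk layout and the single-batch handling are assumptions for illustration.

package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
	"sort"
)

type file struct {
	path string // e.g. "pool/main/i/i3-wm/i3-wm_4.6-1.dsc"
	size int64
}

// fetchBatch writes all requests first, then reads the responses back to
// back, so the connection never sits idle for a round-trip between files.
func fetchBatch(mirror string, batch []file, retry chan<- file) error {
	conn, err := net.Dial("tcp", mirror+":80")
	if err != nil {
		return err
	}
	defer conn.Close()

	for _, f := range batch {
		fmt.Fprintf(conn, "GET /debian/%s HTTP/1.1\r\nHost: %s\r\n\r\n", f.path, mirror)
	}

	br := bufio.NewReader(conn)
	for _, f := range batch {
		resp, err := http.ReadResponse(br, nil)
		if err != nil {
			return err
		}
		if resp.StatusCode == http.StatusOK {
			dest := filepath.Join("mirror", f.path)
			os.MkdirAll(filepath.Dir(dest), 0755)
			if out, err := os.Create(dest); err == nil {
				io.Copy(out, resp.Body)
				out.Close()
			}
		} else {
			retry <- f // mirrors can be inconsistent: reschedule onto any mirror
		}
		io.Copy(io.Discard, resp.Body) // drain before reading the next response
		resp.Body.Close()
	}
	return nil
}

func main() {
	queue := []file{ /* filled by parsing sources.gz, see the earlier sketch */ }

	// Request ordering: biggest files first, so that the first transfer on
	// each connection gives TCP a chance to open up its window early.
	sort.Slice(queue, func(i, j int) bool { return queue[i].size > queue[j].size })

	retry := make(chan file, len(queue))
	n := len(queue)
	if n > 99 { // 99 = the lowest pipelining limit among the mirrors tested
		n = 99
	}
	if n > 0 {
		// The real program distributes batches like this one across many
		// goroutines and multiple mirrors; error handling is omitted here.
		fetchBatch("ftp.de.debian.org", queue[:n], retry)
	}
}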

Results

With the latest iteration of the code, I can clone an entire Debian source mirror with ≈ 43 GB of data in about 11.5 minutes, which corresponds to a ≈ 63 MB/s download rate:

GOMAXPROCS=20 ./dcs-debmirror -num_workers=20
68,24s user 384,90s system 66% cpu 11:25,97 total

It should be noted that the download rate is very low in the first couple of seconds since the sources.gz file is downloaded from a single mirror, then unpacked and analyzed.

The peak download rate is about 115 MB/s (= 920 MBit/s), which I think is reasonably close to what you can achieve with a Gigabit link. If the entire uplink were available to me at all times, the Rackspace hardware could saturate it easily, both in terms of reading from the network and writing to block storage. I tested this on an SSD volume, but I see about 113 MB/s throughput with the internal hard disk as well, so that should work, too.

There is another dstat screenshot of the final version (writing to disk).

Perhaps even more interesting to some readers is the time for an incremental update:

GOMAXPROCS=20 ./dcs-debmirror -tcp_conns=20
2013/10/13 11:09:55 Downloading 307 files (447 MB total)
…
2013/10/13 11:10:04 All 307 files downloaded in 9.105605931s. Download rate is 49.186567 MB/s
4,91s user 3,99s system 67% cpu 13,205 total

The wall clock time is higher than the time reported by the code because the code does not count the time for downloading and parsing the sources.gz file.

The entire program is about 400 lines (not SLOC) of Go code. It’s part of the Debian Code Search source. If you’re interested, you can read dcs-debmirror.go in your browser.

Conclusions

The outcome of this experiment is that I now know (and have shown!) that there are significantly more efficient ways of cloning a Debian mirror than what debmirror does. Furthermore, I have a good grasp on what kind of performance the Rackspace cloud offers and I am fairly happy with the numbers :-).

My code is useful to me in the context of Debian Code Search, but unless you need a sid-only, source-only mirror, it will not be directly useful to you. Of course you can take the ideas that I implemented and implement them elsewhere — personally, I don’t plan to do that.

If you have hardware, bandwidth and a use-case for 10 GBit/s+ mirroring, I’d like to hear about it! :-)

2013-12-25: Hosting for Debian Code Search provided by Rackspace


For a number of weeks now, I have been forwarding traffic sent to codesearch.debian.net to an instance of Debian Code Search running in Rackspace’s public cloud. An announcement of how they have been supporting the project, and what that means, is overdue.

Context

First of all, let’s provide some context. Debian Code Search was launched in November 2012, hosted entirely on my private server. Since DebConf 13, I have been in contact with DSA in order to get Code Search running on a DSA-provided machine and make it an official Debian service. Code Search runs best on flash storage, and only with adequate resources can we enable some use cases (e.g. package-level results) and make sure the existing use cases work at all (we currently have timeouts for some queries). Unfortunately, flash storage is scarce in Debian, hence we couldn’t get any for Code Search.

At some point, I got the suggestion to ask Rackspace for help, since they are quite friendly to the Open Source community. They agreed to help us out with a generous amount of resources in their public cloud offering! This means I can run Code Search at Rackspace the way it is meant to be run: with the index sharded onto 6 different machines and the source code being searched on fast flash storage.

Concerns

Now, when using third-party infrastructure, there are always two big concerns: proprietary infrastructure and vendor lock-in.

As for the proprietary aspect, Rackspace’s public cloud offering is based on OpenStack, which is FOSS. Granted, they have a couple of extensions that are not (yet?) released as part of OpenStack, but those are minor details, and any automation we use can trivially be ported to any other OpenStack offering. Who knows, maybe in the future we will use OpenStack in Debian, too?

Given the OpenStack situation, I am not concerned about vendor lock-in. However, to specifically address this concern and err on the side of caution, attention will be paid to keeping Code Search able to run on “non-cloud” infrastructure, so that we can run our own instance on DSA-provided hardware. My current intention is that this instance will be the fall-back we can use should there ever be problems with Rackspace. So far, however, I am very happy with Rackspace’s stability.

Future

There are a number of improvements waiting: for users in the United States, traffic first goes to my server in Europe and then back to Rackspace in the US. Eventually, we should get rid of this indirection. Perhaps we can also spin up another instance in Europe eventually. Furthermore, Rackspace recently announced new available hardware, which we are not yet making use of. I expect a big speed-up, but will have to do some careful benchmarking. Also, I have some interesting performance data, which will be shared in subsequent blog posts. Stay tuned :).

Conclusion

I’d like to thank Rackspace very much for their generous support of the Debian project and Code Search in particular. I hope you look forward to the upcoming posts with more details.

To be perfectly clear: I also thank DSA for their help, and hope they will continue working with me to get our independent (yet slower) instance of Code Search running on Debian hardware.

2013-11-27: let’s disable pdiffs already!


Richi’s post about the pdiff-by-default agony resonates with me a lot.

On EVERY Debian installation I have done in the last few years, without exception, I have turned off pdiffs. Even in all the oddball cases (Raspberry Pi, an account on a remote machine, …) where I don’t run my install-configs script, I have ended up turning off pdiffs eventually, because updating is just so insanely slow on modern internet connections. And by modern I even mean the DSL link my parents’ place has had for 14 years.
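
For reference, disabling pdiffs takes a single apt configuration directive, e.g. in a snippet like the following (the file name is my choice; any name under apt.conf.d works):

// /etc/apt/apt.conf.d/99disable-pdiffs
Acquire::PDiffs "false";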

Let’s disable pdiffs by default already.

2013-11-07: First Debian meetup in Zürich


Thanks to Axel Beckert (abe@), 12 people interested in Debian met last Tuesday in Zürich and celebrated the start of our monthly Debian meetup.

New faces are always very welcome. If you live in Zürich, or if you’re visiting, please feel free to attend our meetup — no registration necessary.

See the initial announcement, and subscribe to community@lists.debian.ch for updates.

Thanks to everyone for the nice evening, see you next time!



2013-08-30: How git performs when you throw all of Debian at it


During DebConf, Asheesh presented the idea of using git instead of the file system for storing the contents of Debian Code Search. The hope was that it would lead to fewer disk seeks and less data thanks to git’s delta encoding. Maybe the reduction would be big enough that enough data could be held in RAM to allow for fast retrieval.

Joey Hess helped me out with a couple of details: he revealed that git repack -a -d will lead to a single packfile that optimally contains HEAD, without caring about the history. He also showed me how to use git cat-file. We did a small-scale experiment and the results were promising. I told him I would run the experiment of committing the entire unpacked source mirror to git, and promised to follow up with the results, so here goes:

stapelberg@couper ~/unpacked master $ time cat <<EOT | git cat-file --batch >/dev/null
:linux_3.2.32-1/sound/pci/ice1712/aureon.c
:linux_3.2.32-1/sound/soc/codecs/sgtl5000.c
:linux_3.2.32-1/sound/pci/hda/patch_conexant.c
EOT
cat <<<''  0,00s user 0,00s system 62% cpu 0,006 total
git cat-file --batch > /dev/null  4,38s user 1,08s system 99% cpu 5,477 total

stapelberg@couper ~/unpacked master $ time cat linux_3.2.32-1/sound/pci/ice1712/aureon.c linux_3.2.32-1/sound/soc/codecs/sgtl5000.c linux_3.2.32-1/sound/pci/hda/patch_conexant.c >/dev/null
cat linux_3.2.32-1/sound/pci/ice1712/aureon.c   > /dev/null  0,00s user 0,00s system 69% cpu 0,006 total

stapelberg@couper ~/unpacked master $ time cat <<EOT | git cat-file --batch >/dev/null
:i3-wm_4.6-1/libi3/font.c
EOT
cat <<<':i3-wm_4.6-1/libi3/font.c'  0,00s user 0,00s system 51% cpu 0,008 total
git cat-file --batch > /dev/null  4,30s user 1,18s system 99% cpu 5,533 total

stapelberg@couper ~/unpacked master $ time cat i3-wm_4.6-1/libi3/font.c >/dev/null
cat i3-wm_4.6-1/libi3/font.c > /dev/null  0,00s user 0,00s system 0% cpu 0,004 total
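
For reference, the repository setup was roughly the following (my reconstruction from the description above, not necessarily the exact commands used):

cd ~/unpacked        # the unpacked Debian sid source mirror
git init
git add .
git commit -q -m 'import all of Debian sid'
git repack -a -d     # produce a single packfile that optimally contains HEAD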

Even after repeating those tests a couple of times to get a warm page cache, the result stays the same: git takes about 5 seconds to resolve the deltas in a large repository. Even if this were 5 seconds of startup time plus a very small amount of additional time per file, it would not be acceptable for our use case.

The conclusion is that git is clearly not suitable for this kind of usage, which is not surprising after having heard a couple of times that git does not scale :-). For the curious, the .git/objects directory is 29 GiB for roughly 140 GiB of source code, so the delta encoding is quite impressive in terms of saving space. However, keep in mind that the compressed (!) Debian source archive is about 35 GiB, so the savings are not that huge.

2013-08-16: My DebConf13 talks are online


I gave two talks at this year’s DebConf, both about systemd. A huge thanks goes to the video team for their excellent work and for putting up the videos so quickly! Find the recordings and slides here:

  1. Making your package work with systemd (508 MiB ogv) (Slides (≈ 230 KiB PDF))
  2. systemd myths debunked! (456 MiB ogv) (Slides (≈ 230 KiB PDF))

2013-08-08: Going to DebConf


I will arrive at DebConf 2013 on Sunday afternoon. In case you are interested in Go (the programming language), systemd, i3 or getting your package reviewed, please talk to me! :-)

Looking forward to meeting many of you in real life.

2013-07-31: dh-golang in unstable


Good news, everyone! dh-golang is now in Debian unstable. With this debhelper addon, packaging software written in Go is very simple.

Have a look at the example/ directory in dh-golang to see how it is meant to be used. Essentially, export the DH_GOPKG variable containing the canonical upstream location of the package (e.g. github.com/stapelberg/godebiancontrol) and then use dh $@ --buildsystem=golang --with=golang. That’s it!
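
Putting that together, a minimal debian/rules might look like this (a sketch following the instructions above, using the example import path; note that make requires a tab to indent the dh line):

#!/usr/bin/make -f

export DH_GOPKG := github.com/stapelberg/godebiancontrol

%:
	dh $@ --buildsystem=golang --with=golang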

Now, given that this package is very new and only two packages use it so far (both of which are not yet in Debian, but real soon™), there are most likely bugs and missing features. This is where you come in: when packaging software written in Go, please contact me and tell me whether dh-golang worked for you and what could be improved. In case something does not work, definitely report it. Even if everything worked, I’d be happy to have a look at your packaging before you upload.

Also see https://wiki.debian.org/MichaelStapelberg/GoPackaging.

2013-07-25: Looking for a USB WLAN stick


Posting this on behalf of a friend of mine in the hope that you can help:

I’ve failed several times now to find a suitable WLAN USB dongle that works out of the box on Debian testing. Manufacturers often change the chipset without changing the version number, and the product pages are incomplete or even state wrong information.

I turn to you, in the hopes that someone can give me helpful pointers. The dongle should have:

If you have any hints, please send them directly to breunig AT uni-hd DOT de. Thanks!

