A few weeks ago, Thomas Habets blogged about using your TPM (Trusted Platform Module) for SSH authentication. We worked together to get his package simple-tpm-pk11 into Debian, and it has just arrived in unstable :-).
Using simple-tpm-pk11, you can let your TPM generate a key, which you can then use for SSH authentication. This key never leaves the TPM, so it is safer than having your key on the file system: file system access is no longer enough to steal your key. Instead, an attacker would need remote code execution.
To use this software, first make sure your TPM is enabled in the BIOS. On my ThinkPad X200 from 2008, the TPM is called “Security Chip”.
Afterwards, claim ownership of your TPM using tpm_takeownership -z (from the tpm-tools package) and enter a password. You will not need to enter this password for every SSH authentication later (but you may choose to set a separate password for that).
Then, install simple-tpm-pk11, create a key, set it as your PKCS11Provider and install the public key on the host(s) where you want to use it:
mkdir ~/.simple-tpm-pk11
stpm-keygen -o ~/.simple-tpm-pk11/my.key
echo key my.key > ~/.simple-tpm-pk11/config
echo -e "\nHost *\n    PKCS11Provider libsimple-tpm-pk11.so" >> ~/.ssh/config
ssh-keygen -D libsimple-tpm-pk11.so | ssh shell.example.com tee -a .ssh/authorized_keys
You’ll now be able to ssh into shell.example.com without having the key for that on your file system :-).
In case you have any feedback about/troubles with the software, please feel free to contact Thomas directly.
As explained in more detail in my last blog post, Rackspace is providing hosting for Debian Code Search. For those of you who don’t know, Rackspace is a cloud company that provides (among other services) a public cloud based on OpenStack. That means you can easily (and programmatically, if you want) bring up virtual servers and block storage volumes, configure the network between them, etc.
As part of my initial performance experiments, I was running debmirror to clone a full Debian source mirror, and I noticed this took about one hour. Given that the peak network and storage write rates I have observed in my benchmarks are much higher, I wondered why it took so long. About 43 GB (that’s how big the Debian sid sources are) in 60 minutes means ≈ 12 MB/s download rate, which is roughly what a sustained 100 MBit/s connection delivers. Luckily, at least the Rackspace servers I looked at are connected with 1 GBit/s to the internet, so more should be possible. Note that these are the old Rackspace servers. There is a new generation of servers, which I have not yet tried, that apparently offer even higher performance.
After a brief look at the debmirror source code, I concluded that it can only use a single mirror and downloads files sequentially. There is some obvious potential for improvement here, and the fact that I could come up with a proof of concept in Go to determine the files to download within a couple of minutes encouraged me to spend my Saturday on this “problem” :-).
About 7 hours later, my prototype had gone through various iterations and the code could sustain about 115 MB/s incoming bandwidth for most of the time. Here is a screenshot of dstat measuring the performance:
Another 3 hours later, the most obvious bugs were weeded out and the code successfully cloned an entire mirror for the first time. I verified the correctness by running debmirror afterwards, and no files were downloaded that had not actually arrived on the mirror in the meantime.
Most of the mirrors I tried were really slow, providing something like 2-4 MB/s each.
With the latest iteration of the code, I can clone an entire Debian source mirror with ≈ 43 GB of data in about 11.5 minutes, which corresponds to a ≈ 63 MB/s download rate:
GOMAXPROCS=20 ./dcs-debmirror -num_workers=20
68,24s user 384,90s system 66% cpu 11:25,97 total
It should be noted that the download rate is very low in the first couple of seconds since the sources.gz file is downloaded from a single mirror, then unpacked and analyzed.
The peak download rate is about 115 MB/s (= 920 MBit/s), which is reasonably close to what you can achieve with a Gigabit link, I think. If the entire uplink were available to me at all times, the Rackspace hardware could saturate it easily, both in terms of reading from the network and in terms of writing to block storage. I tested this on an SSD volume, but I see about 113 MB/s throughput with the internal hard disk, so I think that should work, too.
There is another dstat screenshot of the final version (writing to disk).
Perhaps even more interesting to some readers is the time for an incremental update:
GOMAXPROCS=20 ./dcs-debmirror -tcp_conns=20
2013/10/13 11:09:55 Downloading 307 files (447 MB total) …
2013/10/13 11:10:04 All 307 files downloaded in 9.105605931s. Download rate is 49.186567 MB/s
4,91s user 3,99s system 67% cpu 13,205 total
The wall clock time is higher than the time reported by the code because the code does not count the time for downloading and parsing the sources.gz file.
The entire program is about 400 lines (not SLOC) of Go code. It’s part of the Debian Code Search source. If you’re interested, you can read dcs-debmirror.go in your browser.
The outcome of this experiment is that I now know (and have shown!) that there are significantly more efficient ways of cloning a Debian mirror than what debmirror does. Furthermore, I have a good grasp on what kind of performance the Rackspace cloud offers and I am fairly happy with the numbers :-).
My code is useful to me in the context of Debian Code Search, but unless you need a sid-only, source-only mirror, it will not be useful to you directly. Of course you can take the ideas that I implemented and implement them elsewhere; personally, I don’t plan to do that.
If you have hardware, bandwidth and a use-case for 10 GBit/s+ mirroring, I’d like to hear about it! :-)
For a number of weeks now, I have been forwarding traffic being sent to codesearch.debian.net to an instance of Debian Code Search running at Rackspace’s public cloud offering. I feel like it’s overdue to announce how they have been supporting the project and what that means.
First of all, let’s provide some context. Debian Code Search was launched in November 2012, hosted entirely on my private server. Since DebConf 13, I have been in contact with DSA in order to get Code Search running on a DSA-provided machine and make it an official Debian service. Code Search runs best on flash storage, and only with adequate resources can we enable some use cases (e.g. package-level results) and make sure the existing use cases work at all (we currently have timeouts for some queries). Unfortunately, flash storage is scarce in Debian, hence we couldn’t get any for Code Search.
At some point, I got the suggestion to ask Rackspace for help, since they are quite friendly to the Open Source community. They agreed to help us out with a generous amount of resources in their public cloud offering! This means I can run Code Search at Rackspace the way it is meant to be run: with the index sharded onto 6 different machines and the source code being searched on fast flash storage.
Now, when using third-party infrastructure, there are always two big concerns: proprietary infrastructure and vendor lock-in.
As for the proprietary aspect, Rackspace’s public cloud offering is based on OpenStack, which is FOSS. Granted, they have a couple of extensions that are not (yet?) released in OpenStack, but those are minor details and any automation that we use can trivially be ported to any other OpenStack offering. Who knows, maybe in the future we use OpenStack in Debian, too?
Given the OpenStack situation, I am not concerned about vendor lock-in. However, to specifically address this concern and err on the side of caution, I will keep Code Search able to run on “non-cloud” infrastructure, so that we can run our own instance on DSA-provided hardware. My current intention is that this instance will be the fall-back that we can use should there ever be problems with Rackspace. So far, however, I am very happy with Rackspace’s stability.
There are a number of improvements waiting: for users in the United States, traffic first goes to my server in Europe and then back to Rackspace in the US. Eventually, we should get rid of this indirection. Perhaps we can also spin up another instance in Europe eventually. Furthermore, Rackspace recently announced new available hardware, which we are not yet making use of. I expect a big speed-up, but will have to do some careful benchmarking. Also, I have some interesting performance data, which will be shared in subsequent blog posts. Stay tuned :).
I’d like to thank Rackspace very much for their generous support of the Debian project and Code Search in particular. I hope you look forward to the upcoming posts with more details.
Also, to be perfectly clear about it: I also thank DSA for their help, and hope they will continue working with me to have our independent (yet slower) instance of Code Search running on Debian hardware.
Richi’s post about the pdiff-by-default agony resonates with me a lot.
On EVERY Debian installation I have done in the last few years, without any exceptions, I have turned off pdiffs. Even in all the oddball cases (Raspberry Pi, account on a remote machine, …) where I don’t run my install-configs script, I have ended up turning off pdiffs eventually, because it is just so insanely slow on modern internet connections. And by modern I mean even the DSL link my parents’ place has had for 14 years.
Let’s disable pdiffs by default already.
Thanks to Axel Beckert (abe@), 12 people interested in Debian met last Tuesday in Zürich and celebrated the start of our monthly Debian meetup.
New faces are always very welcome. If you live in Zürich, or if you’re visiting, please feel free to attend our meetup — no registration necessary.
See the initial announcement, and subscribe to email@example.com for updates.
Thanks to everyone for the nice evening, see you next time!
During DebConf, Asheesh presented the idea of using git instead of the file system for storing the contents of Debian Code Search. The hope was that this would lead to fewer disk seeks and less data thanks to git’s delta encoding. Maybe the reduction would be big enough that enough data could be held in RAM to allow for fast retrieval.
Joey Hess helped me out with a couple of details: he revealed that using
git repack -a -d will lead to a single packfile that optimally
contains HEAD, not caring about the history. Also, he showed me how to use
git cat-file. We did a small-scale experiment and the results were
promising. I told him I would run the experiment of committing the entire unpacked source mirror to git and promised to follow up with the results, so here they are:
stapelberg@couper ~/unpacked master $ time cat <<EOT | git cat-file --batch >/dev/null
:linux_3.2.32-1/sound/pci/ice1712/aureon.c
:linux_3.2.32-1/sound/soc/codecs/sgtl5000.c
:linux_3.2.32-1/sound/pci/hda/patch_conexant.c
EOT
cat <<<''  0,00s user 0,00s system 62% cpu 0,006 total
git cat-file --batch > /dev/null  4,38s user 1,08s system 99% cpu 5,477 total

stapelberg@couper ~/unpacked master $ time cat linux_3.2.32-1/sound/pci/ice1712/aureon.c linux_3.2.32-1/sound/soc/codecs/sgtl5000.c linux_3.2.32-1/sound/pci/hda/patch_conexant.c >/dev/null
cat linux_3.2.32-1/sound/pci/ice1712/aureon.c > /dev/null  0,00s user 0,00s system 69% cpu 0,006 total

stapelberg@couper ~/unpacked master $ time cat <<EOT | git cat-file --batch >/dev/null
:i3-wm_4.6-1/libi3/font.c
EOT
cat <<<':i3-wm_4.6-1/libi3/font.c'  0,00s user 0,00s system 51% cpu 0,008 total
git cat-file --batch > /dev/null  4,30s user 1,18s system 99% cpu 5,533 total

stapelberg@couper ~/unpacked master $ time cat i3-wm_4.6-1/libi3/font.c >/dev/null
cat i3-wm_4.6-1/libi3/font.c > /dev/null  0,00s user 0,00s system 0% cpu 0,004 total
Even after repeating those tests a couple of times to get a warm page cache, the result stays the same: git takes about 5 seconds to resolve the deltas in a large repository. Even if this was 5 seconds startup time and a very small amount of additional time per file, it would not be acceptable for our use case.
The conclusion is that git is clearly not suitable for this kind of usage,
which is not surprising after having heard a couple of times that git does not
scale :-). For the curious, the
.git/objects directory is 29 GiB
for roughly 140 GiB of source code, so the delta encoding is quite impressive
in terms of saving space. However, keep in mind that the compressed (!) Debian
source archive is about 35 GiB, so the savings are not that huge.
I gave two talks at this year’s DebConf, both about systemd. A huge thanks goes to the video team for their excellent work and putting up the videos that quickly! Find the recordings and slides here:
I will arrive at DebConf 2013 on Sunday afternoon. In case you are interested in Go (the programming language), systemd, i3 or getting your package reviewed, please talk to me! :-)
Looking forward to meeting many of you in real life.
Good news, everyone! dh-golang is now in Debian unstable. With this debhelper addon, packaging software written in Go is very simple.
Have a look at the
example/ directory in dh-golang to see how it is meant to be used.
Essentially, export the DH_GOPKG variable containing the canonical upstream
location of the package (e.g. github.com/stapelberg/godebiancontrol)
and then use
dh $@ --buildsystem=golang --with=golang. That’s it!
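Put together, a minimal debian/rules using dh-golang could look like this (the DH_GOPKG value is the example import path from above; see the example/ directory in dh-golang for the authoritative version):

```make
#!/usr/bin/make -f
# Canonical upstream import path of the packaged Go code (example value).
export DH_GOPKG := github.com/stapelberg/godebiancontrol

%:
	dh $@ --buildsystem=golang --with=golang
```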
Now, given that this package is very new and only two packages use it so far (both of which are not yet in Debian, but real soon™), there are most likely bugs and missing features. This is where you come in: when packaging software written in Go, please contact me and tell me whether dh-golang worked for you and what could be improved. In case something does not work, definitely report it. Even if everything worked, I’d be happy to have a look at your packaging before you upload.
Also see https://wiki.debian.org/MichaelStapelberg/GoPackaging.
Posting this on behalf of a friend of mine in the hope that you can help:
I’ve failed several times now to find a suitable WLAN USB dongle that works out of the box on Debian testing. Often, manufacturers change the chipset without changing the version number, and product pages are incomplete or even state wrong information.
I turn to you, in the hopes that someone can give me helpful pointers. The dongle should have:
If you have any hints, please send them directly to breunig AT uni-hd DOT de. Thanks!