What happened in the reproducible builds effort this week:

Toolchain fixes

akira submitted a patch to make cdbs export SOURCE_DATE_EPOCH. She uploaded a package with the enhancement to the experimental “reproducible” repository.

Packages fixed

The following 16 packages became reproducible due to changes in their build dependencies: dracut, editorconfig-core, elasticsearch, fish, libftdi1, liblouisxml, mk-configure, nanoc, octave-bim, octave-data-smoothing, octave-financial, octave-ga, octave-missing-functions, octave-secs1d, octave-splines, valgrind.

The following packages became reproducible after getting fixed:

Some uploads fixed some reproducibility issues but not all of them:

In contrib, Dmitry Smirnov improved libdvd-pkg with 1.3.99-1-1.

Patches submitted which have not made their way to the archive yet:

  • #793705 on mime-support by akira: set the mtimes of all files which are modified during builds to the latest debian/changelog entry.
  • #793922 on polymake by Chris Lamb: sort Perl hash keys.
  • #793980 on lsof by Valentin Lorentz: remove extra information from the build system.
  • #793996 on at by Valentin Lorentz: use absolute values for chmod.
  • #794005 on python-dtcwt by Chris Lamb: use a constant seed for the random number generator.
  • #794011 on gzip by Valentin Lorentz: remove date from texinfo header.
  • #794014 on moin by Dhole: normalize the timezone and fix the timestamps of files compressed with zip.
  • #794106 on cryptsetup by Valentin Lorentz: use date from latest debian/changelog entry in manpage.
  • #794130 on coq by Valentin Lorentz: use C locale and UTC timezone when generating the build date.
  • #794225 on libsyncml by akira: export SOURCE_DATE_EPOCH.
  • #794239 on zipios++ by akira: export SOURCE_DATE_EPOCH.
  • #794247 on whizzytex by Dhole: remove current date from documentation.
  • #794248 on cortado by Dhole: allow the build date to be set externally and use time from the latest debian/changelog entry.
  • #794395 on classified-ads by Reiner Herrmann: remove timestamps from PNG images.
  • #794398 on clhep by Reiner Herrmann: sort source file list.
  • #794399 on parsec47 by Reiner Herrmann: use C locale when sorting source files.
  • #794400 on tumiki-fighters by Reiner Herrmann: use C locale when sorting source files.
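
Two techniques recur in the patches above: taking the build date from SOURCE_DATE_EPOCH instead of the wall clock, and sorting file lists independently of the locale. The following Python sketch shows the general idea only. The function names are hypothetical and it is not taken from any of the submitted patches.

```python
import os
import time


def build_date(environ=os.environ):
    """Return a build date string that does not depend on the wall clock
    when SOURCE_DATE_EPOCH is set (hypothetical helper, for illustration)."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    # Fall back to the current time only when no epoch was exported.
    seconds = int(epoch) if epoch is not None else int(time.time())
    # gmtime() ignores the local timezone; a purely numeric format
    # does not vary with the locale.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(seconds))


def sorted_sources(paths):
    """Sort file names by code point, so the order does not depend on
    LC_COLLATE (hypothetical helper, for illustration)."""
    return sorted(paths)
```

With SOURCE_DATE_EPOCH exported from the latest debian/changelog entry, two builds made at different times and in different timezones would embed the same date string.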

reproducible.debian.net

Four armhf build hosts were provided by Vagrant Cascadian and have been configured to be used by jenkins.debian.net. Work on including armhf builds in the reproducible.debian.net webpages has begun. So far the repository comparison page just shows us which armhf binary packages are currently missing in our repo. (h01ger)

The scheduler has been changed to re-schedule more packages from stretch than sid, as the gcc5 transition has started… This mostly affects build log age. (h01ger)

A new depwait status has been introduced for packages which can't be built because of missing build dependencies. (Mattia Rizzolo)

debbindiff development

Finally, on July 31st, Lunar released debbindiff 27, containing a complete overhaul of the code for the comparison stage. The new architecture is more versatile and extensible while minimizing code duplication. libarchive is now used to handle cpio archives and iso9660 images through the newly packaged python-libarchive-c. This should also help support a couple of other archive formats in the future. Symlinks and devices are now properly compared. Text files are compared as Unicode after being decoded, and encoding differences are reported. Support for SQLite3 and Mono/.NET executables has been added. Thanks to Valentin Lorentz, the test suite should now run on more systems. A small deficiency in unsquashfs has been identified in the process. A long-standing optimization is now performed on Debian packages: based on the content of the md5sums control file, we skip comparing files with matching hashes. This makes debbindiff usable on packages with many files. Fuzzy-matching is now performed for files in the same container (like a tarball) to handle renames. Also, for Debian .changes, listed files are now compared without looking at the embedded version number. This makes debbindiff a lot more useful when comparing different versions of the same package.
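
The md5sums optimization can be pictured with a small Python sketch. This is not debbindiff's actual code. The function names are hypothetical, and the sketch assumes the usual md5sums layout of one "<hash>  <path>" entry per line.

```python
def parse_md5sums(text):
    """Map each path to its hash; each line looks like '<md5>  <path>'."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, path = line.split(None, 1)
        sums[path.strip()] = digest
    return sums


def paths_to_compare(md5sums_a, md5sums_b):
    """Return the paths whose hashes differ, or which exist on one side only.

    Files with matching hashes are left out, so the expensive content
    comparison can be skipped for them.
    """
    a = parse_md5sums(md5sums_a)
    b = parse_md5sums(md5sums_b)
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

For two builds of a package with thousands of files where only a handful differ, only that handful would then need to be extracted and compared in detail.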

Based on the rearchitecturing, work has been done to allow parallel processing. The branch now seems to work most of the time. More testing needs to be done before it can be merged.

The current fuzzy-matching algorithm, ssdeep, has shown disappointing results. One important use case is being able to properly compare debug symbols. Their path is made using the Build ID. As this identifier is made with a checksum of the binary content, finding things like CPP macros is much easier when a diff of the debug symbols is available. The good news is that TLSH, another fuzzy-matching algorithm, has been tested with much better results. A package is waiting in NEW and the code is ready for it to become available.

A follow-up release 28 was made on August 2nd, fixing the content label used for gzip, bzip2 and xz files and an error on text files differing only in their encoding. It also contains a small code improvement on how comments on Difference objects are handled.

This is the last release named debbindiff. A new name has been chosen to better reflect that it is not a Debian-specific tool. Stay tuned!

Documentation update

Valentin Lorentz updated the patch submission template to suggest stating the kind of issue in the bug subject.

Some progress has been made on the Reproducible Builds HOWTO while preparing the related CCCamp15 talk.

Package reviews

235 obsolete reviews have been removed, 47 added and 113 updated this week.

42 reports for packages failing to build from source have been made by Chris West (Faux).

New issue added this week: haskell_devscripts_locale_substvars.

Misc.

Valentin Lorentz wrote a script to report which packages installed on a system have been tested as unreproducible. We encourage everyone to run it on their systems and give feedback!