The effect of Covid-19 on the Debian Med project

Abstract

Starting with the official COVID-19 hackathon, the Debian Med project received a lot of new contributions. We have new contributors, increased effort in bug fixing and autopkgtest writing, increased support from other teams in Debian such as ftpmaster, new communication channels (a weekly video conference) and broader attention in the world of bioinformatics. All in all, the COVID-19 "crisis" gave the Debian Med project a huge push forward.

This is a short report about the COVID-19 sprint, which turned from a one-week effort into a permanent high-activity phase of the Debian Med project.

Intro

  • Every crisis also has positive effects for some people or organisations.
  • It's very obvious that producers of respirator masks make more profit now.
  • It's also obvious that bioinformatics and epidemiological research face new challenges in the crisis.
  • Since Debian Med is active in the fields of bioinformatics and epidemiology, the project experienced a huge rise in awareness and contributions.
  • The effort started with a global sprint for Free Software in Bioinformatics.

Personal situation

  • Since I work in a medical institute in a group of epidemiologists, my paid workload increased sharply with the advent of COVID-19 in Germany.
  • At the same time the challenges in Debian Med increased drastically.
  • So I had to cut even more time from my private life, and I hereby want to repeat the "thanks a lot to my wife" from my talk at DebConf17. She tolerated that I even worked on Debian Med during the Easter holidays. This is definitely not normal and I love her a lot for this (but not only for this 😉).

Tackling a new task

  • As a physicist by profession who does not use all of the packages, it's frequently hard for me to know what to package next to give the best user experience.
  • I started asking members of the Debian Med team to categorise our packages, either existing in Debian or at least in preparation on Salsa, to find out which packages might help in researching COVID-19.
  • This list was implemented with the usual Blends technique as the task covid-19, which groups Debian packages by topic. It also features a bugs page, which enabled us to direct bug-hunting volunteers to the relevant packages.
  • Several people helped to assemble this list. It remains work in progress and needs further curation.
  • Another pretty comprehensive list was provided by Jun Aruga, who usually contributes to the Fedora Medical SIG. It compares which tools are packaged in Fedora, Debian and Conda. It is assembled according to the usage of tools in Nextflow pipelines, which are popular in sequencing viruses.
  • This list provided a great TODO list. We were able to add a lot of new packages and have started packaging most of what is still missing but harder to package.
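
Turning such a comparison into a TODO list is essentially a set difference. The following is only a minimal sketch in Python: the tool names, the `availability` mapping and the `missing_in` helper are invented for illustration and do not reflect the real comparison list or its format.

```python
# Hypothetical input: for each tool used in a pipeline, the set of
# distributions that already ship it. All names are placeholders.
availability = {
    "tool-a": {"debian", "fedora", "conda"},
    "tool-b": {"conda"},
    "tool-c": {"fedora", "conda"},
    "tool-d": {"debian"},
}


def missing_in(distro, availability):
    """Return the sorted names of tools not yet available in `distro`."""
    return sorted(
        tool for tool, distros in availability.items() if distro not in distros
    )


# Tools that would end up on the packaging TODO list for Debian.
todo = missing_in("debian", availability)
print(todo)
```

With the placeholder data above, the TODO list for Debian contains tool-b and tool-c.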

New contributors

  • With the announcement of the Debian @ COVID-19 Biohackathon a lot of new contributors showed up to help the Debian Med team.
  • Interestingly, many of them, like me, did not come from the field of bioinformatics or medicine and were really helpful anyway. Packaging and bug fixing are basically independent of the topic the software covers. That's a bit different for writing autopkgtests, which usually requires more insight into the topic.
  • For instance, there are a lot of relevant tools that use the machine learning framework TensorFlow, which is built using the Bazel build system. Bazel needs to be packaged for Debian first to enable us to build TensorFlow. Luckily, Olek Wojnar dedicated himself to working on the Bazel packaging. So it's very promising that we will see Bazel and TensorFlow in Debian in the foreseeable future. This would not have been possible without the COVID-19 sprint.
  • Similarly, we have some long-standing gaps in our tasks, like snpeff, that were blocked by missing dependencies. Pierre Gruet dedicated his time and Java knowledge to fixing this.
  • Another example is snakemake, which is quite popular in bioinformatics for defining workflows. The package was de facto orphaned inside the Debian Med team since the original Uploader went away, so I had to struggle hard to keep that package alive. In the Debian Med sprint Rebecca N. Palmer from Debian Science took over this task. Thanks a lot, Rebecca.
  • There are a lot of other examples. Last but not least, two GSoC students were accepted to work on Debian Med tasks, one of them titled "Packaging and Quality assurance of COVID-19 relevant applications". Both GSoC students (Pranav Ballaney and Nilesh Patra) were very active even before the official GSoC project had started.
  • Amongst those new contributors we do not want to forget old contributors like previous GSoC and Outreachy students and several long term contributors who increased their involvement.

Distributing tasks amongst contributors

  • I considered my main challenge to be the distribution of tasks to all those great contributors.
  • My personal agenda for this week became basically void since organising tasks and teaching newcomers took a lot of my time. But that's perfectly fine.
  • My habit of trying hard to ensure that nobody has to wait for a response from me before they can keep working was well received by the mentees. I established this habit in the Mentoring of the Month project as well. Teaching does not always succeed in winning a person as a long-term contributor, but teaching well increases the chances.

Support from ftpmaster

  • One special experience of the one-week sprint was the cooperation with ftpmaster to get new software accepted.
  • We set up a wiki page where we listed software that is relevant for COVID-19 research or is a precondition for such software.
  • It was a pure pleasure to observe that most of the listed packages were accepted (or at least processed, even if rejected) by ftpmaster.
  • A big thank-you goes to all members of the ftp team, who helped us a lot.

Licensing

  • In the sprint we also approached upstream of software with non-free licenses.
  • Very interesting was the now clarified license for r-cran-locfit which opened the gate for several other GNU R packages.
  • We made some progress on our SoftwareLiberation wiki page. In general, the dialogue with upstream under COVID-19 circumstances was not always successful, but it was more intense than usual.

New communication channels

  • Usually the main communication in the Debian Med team is via the mailing list.
  • I switched to "DebConf mode" and followed IRC more closely, but IRC is not really a new channel (except for me).
  • We have opened the Telegram channel that was used for GSoC and Outreachy communication to some more interested people. I admit I'm not a fan of closed and non-archived communication, but to some extent this helped to shorten the communication.
  • The main new channel was a weekly video meeting via Jitsi. It was a really great experience to see the day-to-day development, always a few new people, and a constructive discussion about the work that was done and the plans for the next day.

New QA tool in development

  • My original work plan, besides packaging new software like crazy, was to develop a new Blends tool that gives an overview of QA features per Blends task.
  • The idea is to put data like debci failures, testing migration issues etc. on one page for all packages of one task. So people who are interested in a certain task can immediately see where work is needed.
  • Due to the mentoring work as well as urgent packaging tasks, this work went slowly. The UDD query works and a very preliminary web page exists, but it needs a lot more work on content and layout.
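
The idea of one QA page per task can be illustrated with a small aggregation. This is only a rough sketch in Python, not the tool under development: the rows stand in for whatever the UDD query returns, and the field names, task names, package names and the `qa_overview` helper are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows, standing in for the result of a query that joins
# task membership with QA data. Field names and values are invented.
rows = [
    {"task": "covid-19", "package": "pkg-a", "debci": "fail", "migration_blocked": False},
    {"task": "covid-19", "package": "pkg-b", "debci": "pass", "migration_blocked": True},
    {"task": "covid-19", "package": "pkg-c", "debci": "pass", "migration_blocked": False},
    {"task": "other-task", "package": "pkg-d", "debci": "fail", "migration_blocked": True},
]


def qa_overview(rows):
    """Map each task to the packages that need work, with the reasons."""
    overview = defaultdict(dict)
    for row in rows:
        issues = []
        if row["debci"] == "fail":
            issues.append("debci failure")
        if row["migration_blocked"]:
            issues.append("testing migration issue")
        if issues:  # packages without issues are left off the page
            overview[row["task"]][row["package"]] = issues
    return dict(overview)


# Print one block per task, listing only packages where work is needed.
for task, packages in sorted(qa_overview(rows).items()):
    print(task)
    for package, issues in sorted(packages.items()):
        print(f"  {package}: {', '.join(issues)}")
```

Packages without any issue are deliberately omitted, so that somebody interested in one task sees only where work is needed.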

The one week sprint is not over

  • While the official sprint was just one week, several participants stayed in high-productivity mode.
  • I personally have had an output of about one new package per day over the last three weeks.
  • The GSoC students are contributing one autopkgtest or fixed bug per day.
  • We have a weekly video conference via Jitsi.

TODO

  • Quantification of the success in team metrics.
  • Finalising QA tools
  • More packaging
  • Keep on nagging upstreams for free licenses
  • Organise another sprint week (probably 2020-06-15 to 2020-06-22)