Debian GNU/Linux port for RISC-V 64-bit (riscv64)

This is a post describing my involvement with the Debian GNU/Linux port for RISC-V (unofficial and not endorsed by Debian at the moment) and announcing the availability of the repository (still very much WIP) with packages built for this architecture.

If not interested in the story but you want to check the repository, just jump to the bottom.

Roots

A while ago, mostly during 2014, I was involved in the Debian port for OpenRISC (or1k) ─ about which I posted (by coincidence) exactly 2 years ago.

The two of us working on the port stopped in August or September of that year, after knowing that the copyright of the code to add support for this architecture in GCC would not be assigned to the FSF, so it would never be added to GCC upstream ─ unless the original authors changed their mind (which they didn't) or there was a clean-room reimplementation (which didn't happen so far).

But a few other key things contributed to the decision to stop working on that port, which bear direct relationship to this story.

One thing that particularly influenced me to stop working on it was a sense of lack of purpose, all things considered, for the OpenRISC port that we were working on.

For example, these chips are sometimes used as part of bigger devices by Samsung to control or wake up other chips; but it was not clear whether there would ever be devices with OpenRISC as the main chip, specially in devices powerful enough to run Linux or similar kernels, and Debian on top. One can use FPGAs to synthesise OpenRISC or1k, but these are slow, and expensive when using lots of memory.

Without prospects of having hardware easily available to users, there's not much point in having a whole Debian port ready to run on hardware that never comes to be.

Yeah, sure, it's fun to create such a port, but it's tons of work to maintain and keep it up to-date forever, and with close to zero users it's very unrewarding.

Another thing that contributed to decide to stop is that, at least in my opinion, 32-bit was not future-proof enough for general purpose computing, specially for new devices and ports starting to take off on that time and age. There was some incipient work to create another OpenRISC design for 64-bits, but it was still in an early phase.

My secret hope and ultimate goal was to be able to run as much a free-ish computer as possible as my main system. Still today many people are buying and using 32-bit devices, like small boards; but very few use it as their main computer or servers for demanding workloads or complex services. So for me, even if feasible if one is very austere and dedicated, OpenRISC or1k failed that test.

And lastly, another thing happened at the time...

Enter RISC-V

In August 2014, at the point when we were fully acknowledging the problem of upstreaming (or rather, lack thereof) the support for OpenRISC in GCC, RISC-V was announced to the world, bringing along papers with suggestive titles such as Instruction Sets Should Be Free: The Case For RISC-V (pdf) and articles like RISC-V: An Open Standard for SoCs - The case for an open ISA in EE Times.

RISC-V (as the previous RISC-n marks) had been designed (or rather, was being designed, because it was and is an as yet unfinished standard) by people from UC Berkeley, including David Patterson, the pioneer of RISC computer designs and co-author of the seminal book “Computer Architecture: A Quantitative Approach”. Other very capable people are also also leading the project, doing the design and legwork to make it happen ─ see the list of contributors.

But, apart from throwing names, the project has many other merits.

Similarly to OpenRISC, RISC-V is an open instruction set architecture (ISA), but with the advantage of being designed in more recent times (thus avoiding some mistakes and optimising for problems discovered more recently, as technology evolves); with more resources; with support for instruction widths of 32, 64 and even 128 bits; with a clean set of standard but optional extensions (atomics, float, double, quad, vector, ...); and with reserved space to add new extensions in ordered and compatible ways.

In the view of some people in the OpenRISC community, this unexpected development in a way made irrelevant the ongoing update of OpenRISC for 64-bits, and from what I know and for better or worse, all efforts on that front stopped.

Also interestingly (if nothing else, for my purposes of running as much a free system as possible), was that the people behind RISC-V had strong intentions to make it useful to create modern hardware, and were pushing heavily towards it from the beginning.

And together with this announcement or soonly after, it came the promise of free-ish hardware in the form of the lowRISC project. Although still today it seems to be a long way from actually shipping hardware, at least there was some prospect of getting it some day.

On top of all that, about the freedom aspect, both the Berkeley and lowRISC teams engaged since very early on with the free hardware community, including attending and presenting at OpenRISC events; and lowRISC intended to have as much free hardware as possible in their planned SoC.

Starting to work on the Debian port, this time for RISC-V

So in late 2014 I slowly started to prepare the Debian port, switching on and off.

The Userland spec was frozen in 2014 just before the announcement, the Privilege one is still not frozen today, so there was no need to rush.

There were plans to upstream the support in the toolchain for RISC-V (GNU binutils, glibc and GCC; Linux and other useful software like Qemu) in 2015, but sadly these did not materialise until late 2016 and 2017. One of the main reasons for the delay was due to the slowness to sort out the copyright assignment of the code to the FSF (again). Still today, only binutils and GCC are upstreamed, and Linux and glibc depend on the Privilege spec being finished, so it will take a while.

After the experience with OpenRISC and the support in GCC, I didn't want to invest too much time, lest it all became another dead-end due to lack of upstreaming ─ so I was just cross-compiling here and there, testing Qemu (which still today is very limited for this architecture, e.g. no network support and very limited character and block devices) and trying to find and report bugs in the implementations, and send patches (although I did not contribute much in the end).

Incompatible changes in the toolchain

In terms of compiling packages and building-up a repository, things were complicate, and less mature and stable than the OpenRISC ones were even back in 2014.

In theory, with the Userland spec being frozen, regular programs (below the Operating System level) compiled at any time could have run today; but in practice what happened is that there were several rounds of profound ─or, at least, disrupting─ changes in the toolchain before and while being upstreamed, which made the binary packages that I had built to not work at all (changes in dynamic loader, registers where arguments are stored when jumping functions, etc.).

These major breakages happened several times already, and kind of unexpectedly ─ at least for the people not heavily involved in the implementation.

When the different pieces are upstreamed it is expected that these breakages won't happen; but still there's at least the fundamental bit of glibc, which will probably change things once again in incompatible ways before or while being upstreamed.

Outside Debian but within the FOSS / Linux world, the main project that I know of is that some people from Fedora also started a port in mid 2016 and did great advances, but ─from what I know─ they put the project in the freezer in late 2016 until all such problems are resolved ─ they don't want to spend time rebootstrapping again and again.

What happened recently on the Debian front

In early 2016 I created the page for RISC-V in the Debian wiki, expecting that things were at last fully stable and the important bits of the toolchain upstreamed during that year ─ I was too optimistic.

Some other people (including Debian folks) have been contributing for a while, in the wiki, mailing lists and IRC channels, and in the RISC-V project mailing lists ─ you will see their names everywhere.

However, due to the combination of lack of hardware, software not upstreamed and shortcomings of emulators (chiefly Qemu) make contributions hard and very tedious, nothing much happened recently visible to the outside world in terms of software.

The private repository-in-the-making

In late 2015 and beginning of 2016, having some free time in my hands and expecting that all things would coalesce quickly, I started to build a repository of binary packages in a more systematic way, with most of the basic software that one can expect in a basic Debian system (including things common to all Linux systems, and also specific Debian software like dpkg or apt, and even aptitude!).

After that I also built many others outside the basic system (more than 1000 source packages and 2000 or 3000 arch-dependent binary packages in total), specially popular libraries (e.g. boost, gtk+ version 2 and 3), interpreters (several versions of lua, perl and python, also version 2 and 3) and in general packages that are needed to build many other packages (like doxygen). Unfortunately, some of these most interesting packages do not compile cleanly (more because of obscure or silly errors than proper porting), so they are not included at the moment.

I intentionally avoided trying to compile thousands of packages in the archive which would be of nobody's use at this point; but many more could be compiled without much effort.

About the how, initially I started cross-compiling and using rebootstrap, which was of great help in the beginning. Some of the packages that I cross-compiled had bugs that I did not know how to debug without a “live” and “native” (within emulators) system, so I tried to switch to “natively” built packages very early on. For that I needed many packages built natively (like doxygen or cmake) which would be unnecessary if I remained cross-compiling ─ the host tools would be used in that case.

But this also forced me to eat my own dog food, which even if much slower and tedious, it was on the whole a more instructive experience; and above all, it helped to test and make sure that the the tools and the whole stack was working well enough to build hundreds of packages.

Why the repository-in-the-making was private

Until now I did not attempt to make the repository available on-line, for several reasons.

First because it would be kind of useless to publish files that were not working or would soon not work, due to the incompatible changes in the toolchain, rendering many or most of the packages built useless. And because, for many months now, I expected that things would stabilise and to have something stable “really soon now” ─ but this didn't happen yet.

Second because of lack of resources and time since mid 2016, and because I got some packages only compiled thanks to (mostly small and unimportant, but undocumented and unsaved) hacks, often working around temporary bugs and thus not worth sending upstream; but I couldn't share the binaries without sharing the full source and fulfill licenses like the GNU GPL. I did a new round of clean rebuilds in the last few weeks, just finished, the result is close to 1000 arch-dependent packages.

And third, because of lack of demand. This changed in the last few weeks, when other people started to ask me to share the results even if incomplete or not working properly (I had one request in the past, but couldn't oblige in time at the time).

Finally, the work so far: repository now on-line

So finally, with the great help from Kurt Keville from MIT, and Bytemark sponsoring a machine where most of the packages were built, here we have the repository:

The lines for /etc/apt/sources.list are:

 deb [ arch=riscv64 signed-by=/usr/share/keyrings/debian-keyring.gpg ] http://riscv.mit.edu/debian unstable main
 deb-src [ signed-by=/usr/share/keyrings/debian-keyring.gpg ] http://riscv.mit.edu/debian unstable main

The repository is signed with my key as Debian Developer, contained in the file /usr/share/keyrings/debian-keyring.gpg, which is part of the package debian-keyring (available from Debian and derivatives).

WARNING!!

This repository, though, is very much WIP, incomplete (some package dependencies cannot be fulfilled, and it's only a small percentage of the Debian archive, not trying to be comprehensive at the moment) and probably does not work at all in your system at this point, for the following reasons:

  • I did not create Debian packages for fundamental pieces like glibc, gcc, linux, etc.; although I hope that this happens soon after the next stable release (Stretch) is out of the door and the remaining pieces are upstreamed (help welcome).
     
  • There's no base system image yet, containing the software above plus a boot loaders and sequence to set up the system correctly. At the moment you have to provide your own or use one that you can find around the net. But hopefully we will provide an image soon. (Again, help welcome).
     
  • Due to ongoing changes in the toolchain, if the version of the toolchain in your image is very new or very old, chances are that the binaries won't work at all. The one that I used was the master branch on 20161117 of their fork of the toolchain: https://github.com/riscv/riscv-gnu-toolchain

The combination of all these shortcomings is specially unfortunate, because without glibc provided it will be difficult that you can get the binaries to run at all; but there are some packages that are arch-dependent but not too tied to libc or the dynamic loader will not be affected.

At least you can try one the few static packages present in Debian, like the one in the package bash-static. When one removes moving parts like the dynamic loader and libc, since the basic machine instructions are stable for several years now, it should work, but I wouldn't discard some dark magic that prevents even static binaries from working.

Still, I hope that the respository in its current state is useful to some people, at least for those who requested it. If one has the environment set-up, it's easy to unpack the contents of the .deb files and try out the software (which often is not trivial or very slow to compile, or needs lots of dependencies to be built first first).

... and finally, even if not useful at all for most people at the moment, by doing this I also hope that efforts like this spark your interest to contribute to free software, free hardware, or both! :-)