Surveys in Debian

This page lists surveys done by members of the debian project. It also gives some pointers to advice and explains why I care. Most of the survey discussions are posted to debian-project, and I try to contribute constructive criticism when possible. I'd welcome suggestions for the contents of this page.

Surveys by debian developers

  1. bits from the DMs/NMs/AMs?, Paul Wise (2008-03)
  2. The construction and deconstruction of the legitimacy of people and ideas in the Free Software Community - A study about Debian, Daniel Ruoso (2006-07, results, method, pt and partial en)
  3. Poll of debian-user readers about the GNU FDL, Brian Nelson (2005-04, results, some method)
  4. Debian usage survey, Enrico Zini (2004-04, results 2005-03)
  5. The debian-women supporters survey, Amaya (2004-07, some results, method)
  6. Strawpoll on proper usage of email address, Michael Banck (2004-01, results, method)

Surveys on debian, rather than by developers

  1. Academy of Management Journal, Oct 2007, Vol. 50 Issue 5, p1079-1106, 28p, Siobhán O'Mahony and Fabrizio Ferraro (summary by André Felipe Machado)
  2. Managing Open Source Software as an Integrated Part of Business, Niklas Vainio (2006-01, results to appear?)
  3. Political Motives of Developers for Collaboration on GNU/Linux, Tobias Escher (2004-07, results, method, non-free)
  4. FLOSSPOLS Second Survey, FLOSSPOLS (2005-03, results to appear?)
  5. YAQ - Yet Another Questionnaire, Frauke Lehmann (2004-07, results, method (in German-language dissertation), non-free)
  6. Short Survey, Biella Coleman (2003-05, in dissertation)
  7. Who is Doing it, Gregorio Robles (2001-07, results 2001-08, method, non-free)

Surveys on free software in general

  1. Open Source Case Studies, IDABC - Interchange of Data Between Administrations (viewed 2008-03)
  2. DebianInThePublicSector, W. Martin Borgert and Torsten Werner (viewed 2008-03)
  3. Free software in public administration, beta, Ciarán O'Riordan (viewed 2008-03)

I find it disappointing that non-developer surveys take a lot longer to publish results, that only about half seem to publish at all, and that non-free terms like CC-nd, -nc and -by are common among those which do. I have not listed projects which don't look like they'll ever publish.

Advice to surveyors

Because I criticise bad surveys, people expect me to give advice about how to write good surveys. I'm not sure why: there are more than enough elementary statistics texts which cover surveying far better than I can. It is not a subject which is hugely different for debian.

I like "Statistics Explained" by John Parry Lewis and Alasdair Traill, but I'm not sure how easy it is to get and it's rather big. Much easier to obtain is "Statistics Without Tears: A Primer for Non-Mathematicians" by Derek Rowntree. If you want something online, start with Some Aspects of Study Design from Gerard E. Dallal's "Little Handbook of Statistical Practice" and follow the link to the full thing when you're ready, but I strongly suggest getting "Tears" from your local library too. If you know of a free software work on the subject, please let me know and I'll promote it here.

A survey is evidence, not proof. You will probably not convince those who vehemently disagree with you, and how many "floating voters" you convince is correlated with how good your survey is. Basically, as with any other survey, the golden rules are:

  1. Privately, start the write-up before you start collecting data and define:
  2. Be as simple, clear and neutral as you can: in question phrasing and procedure details for a questionnaire, or in measurement methods for automated data collection, or so on;
  3. Be considerate: remember that respondents are giving time to your research;
  4. Carry out a pilot survey. Use it to debug the survey and to estimate the sample size you need to get the desired strength of outcome;
  5. Carry out the main survey;
  6. The first summary results should be a simple descriptive compilation of responses. Leave any commentary until the second part, then try to use any applicable methods to support or debunk your commentary;
  7. Publish as much as you can, so others can check, reproduce or extend your findings.

To recap: define, clear, considerate, pilot, survey, summary, publish.
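As one illustration of the pilot step, here is a minimal sketch of turning a pilot result into a rough sample size for a yes/no question. It is not taken from any of the texts above. It assumes a simple random sample, the usual normal approximation for a proportion and a 95% confidence level (z = 1.96). The function name and its parameters are hypothetical, chosen only for this example. Self-selected surveys, which are common on mailing lists, do not meet the random sampling assumption, so treat the answer as a lower bound at best.

```python
import math


def sample_size_for_proportion(pilot_p, margin, z=1.96, population=None):
    """Rough sample size needed to estimate a proportion.

    pilot_p    -- proportion answering "yes" in the pilot survey (0 to 1)
    margin     -- acceptable margin of error, e.g. 0.05 for +/- 5 points
    z          -- normal quantile for the confidence level (1.96 ~ 95%)
    population -- optional size of the whole group being surveyed; if
                  given, a finite population correction is applied

    Assumes simple random sampling; this is an illustration only.
    """
    if not 0.0 <= pilot_p <= 1.0:
        raise ValueError("pilot_p must be between 0 and 1")
    if margin <= 0.0:
        raise ValueError("margin must be positive")

    # Standard formula for a proportion: n = z^2 * p * (1 - p) / e^2
    n = (z ** 2) * pilot_p * (1.0 - pilot_p) / (margin ** 2)

    # Small groups need fewer responses: finite population correction.
    if population is not None:
        n = n / (1.0 + (n - 1.0) / population)

    # Always round up: a fraction of a respondent is not available.
    return math.ceil(n)


if __name__ == "__main__":
    # Pilot split 50/50, aiming for +/- 5 points at 95% confidence.
    print(sample_size_for_proportion(0.5, 0.05))                   # 385
    # Same target, but only about 1000 people could be asked.
    print(sample_size_for_proportion(0.5, 0.05, population=1000))  # 278
```

If the pilot is too small to trust, using 0.5 for the pilot proportion gives the most cautious (largest) answer.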

For more specific details, consider this advice from:

Who am I and why I care

I am a debian developer who studied degree-level statistics for about ten years. I have been trained to do surveys by public and private sector groups, and I trained others on and off for about five years. I'll link an example of my work here when it's online. These days, I earn a living offering web site visitor analysis, accessibility checking and market research, along with some programming.

Primarily, I want to help debian contributors carry out better surveys and so help manage their debian work better. The project is large and anecdotes don't always help. Some think that if you state an opinion, you must be ready to defend it forever, else it has less value: I reject that and believe good solid research will make collections of "ordinary" opinions defensible against this view.

Also, I find it really frustrating to see research projects surveying debian, but wasting our time. It is especially annoying when these projects are funded with large amounts of government money which could have gone to groups related to FSF Europe, but for politics.


Thanks to Andreas Schuldei for motivating me to work on this. Many thanks to Enrico Zini for commenting on an early draft and suggesting some "be considerate" advice. Thanks to all who commented on the announcement to debian-project. All errors are still probably mine, though.

You can copy this page under the same terms as my personal site.

17:00, 23 Apr 2008 MJR