Debian Developer Database Statistics

03.12.2006 at 13:02

I am currently writing a paper for school titled "Free and OpenSource Software in Switzerland", so I am interested in the FOSS community here in Switzerland. I decided to pull some statistics out of the Debian Developer Database; below is the Ruby script I wrote for that purpose. It queries the web interface over HTTPS, parses the returned HTML, and counts the developers per country. I plan to relate these counts to the population of the corresponding countries.

#!/usr/bin/env ruby

# requires libopenssl-ruby1.8
require 'net/https'

class DDDB #Debian Developer Database

private_class_method :new

public
	def self.get_developer_count(country = 'any')
		html = get_html_for_country(country)
		# the slash in the closing tag must be escaped inside a /.../ literal
		if html =~ /Number of entries matched: <b>([0-9]+)<\/b>/
			return $1.to_i
		end
		return 0
	end

	def self.get_developers(country = 'any')
		# captures: homepage URL (may be nil), name, mail address
		q = '<font size=+1>(?:<a href="(.*?)">)?(.*?)(?:</a>)?'
		q+= '</font> \(uid=.*?login:</b></td><td> '
		q+= '<a href="mailto:([a-z0-9.@]+)"'
		html = get_html_for_country(country)
		html.scan(/#{q}/m)
	end
	
	def self.get_countries
		http = Net::HTTP.new("db.debian.org", 443)
		http.use_ssl = true
		http.start { |conn|
			res = conn.get('/')
			# each match is a [country code, country name] pair
			res.body.scan(/<option value="([a-z]{2})">(.*$)/)
		}
	end

private
	@@data = {}

	def self.get_html_for_country(country)
		# fetch each country's result page only once and cache it
		unless @@data.key?(country)
			@@data[country] = get_html_for_query(
				"country=#{country == 'any' ? '' : country}"
			)
		end
		@@data[country]
	end

	def self.get_html_for_query(query)
		http = Net::HTTP.new("db.debian.org", 443)
		http.use_ssl = true
		http.start { |conn|
			res = conn.post(
				'/search.cgi',
				"#{query}&dosearch=Search..."
			)
			return res.body
		}
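	end

	# a bare `private` does not affect `def self.` methods, so hide them explicitly
	private_class_method :get_html_for_country, :get_html_for_query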
end

total = 0
data = {}

DDDB.get_countries().each do |code,country|
	count = DDDB.get_developer_count(code)
	if count > 0
		data[code] = { 'country' => country, 'count' => count }
		total += count
	end
end

print "Total Debian developers: ", DDDB.get_developer_count(), "\n"
print "Total Debian developers with country specified: ", total, "\n"

data.each_value do |entry|
	print entry['country'].ljust(35), entry['count'].to_s.rjust(5), ' ',
		"%.2f%%" % (entry['count'].to_f * 100 / total), "\n"
end

# Stop here; remove this line to also list the Swiss developers below.
exit 0

DDDB.get_developers('ch').each { |www,name,mail|
	print name.ljust(35),(" <"+mail+"> ").ljust(35)
	print www if !www.nil?
	print "\n"
}
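
The count extraction can be tried without touching db.debian.org at all. The following is only a sketch: the HTML fragment is a made-up sample based on the pattern the script looks for, and `parse_developer_count` is a hypothetical helper name, not part of the script above.

```ruby
# Hypothetical fragment; the real page layout on db.debian.org may differ.
SAMPLE_HTML = 'Number of entries matched: <b>42</b>'

# Return the match count found in html, or 0 when the marker is absent.
def parse_developer_count(html)
  # %r{...} avoids having to escape the slash in the closing tag
  match = html.match(%r{Number of entries matched: <b>([0-9]+)</b>})
  match ? match[1].to_i : 0
end

puts parse_developer_count(SAMPLE_HTML)
puts parse_developer_count('<html>no results</html>')
```

Using `%r{...}` instead of `/.../` sidesteps the escaping of `</b>` entirely.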

Oh well, after writing the script above I found out that I could simply have queried the LDAP directory directly. Sigh.
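
For the planned comparison against population, something along these lines might do. It is a rough sketch only: `per_million` is a hypothetical helper, and the country codes and figures are placeholders for illustration, not real statistics.

```ruby
# Developers per million inhabitants; nil when the population is unknown.
def per_million(developers, population)
  return nil if population.nil? || population <= 0
  developers * 1_000_000.0 / population
end

# Placeholder figures for illustration only, not real statistics.
counts      = { 'aa' => 30, 'bb' => 120 }
populations = { 'aa' => 7_500_000, 'bb' => 60_000_000 }

counts.each do |code, count|
  rate = per_million(count, populations[code])
  puts "#{code}  #{count.to_s.rjust(5)}  #{rate ? format('%.2f', rate) : 'n/a'}"
end
```

In the real script the `counts` hash would be filled from `DDDB.get_developer_count`, and the population figures would have to come from some external source.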
