For the past few months, we've been building Caliper to help you easily generate code metrics for your Ruby projects. We've recently added another dimension of metrics information: community statistics for all the Ruby projects that are currently in Caliper.
The idea of community statistics is two-fold. From a practical perspective, you can now compare your project's metrics to the community. For example, Flog measures the complexity of methods. Many people wonder exactly what defines a good Flog score for an individual method. In Jake Scruggs' opinion, a score of 0-10 is "Awesome", while a score of 11-20 is "Good enough". That sounds correct, but with Caliper's community metrics, we can also compare the average Flog scores for entire projects to see what defines a good average score.
To do so, we calculate the average Flog method score for each project and plot those averages on a histogram, like so:

Looking at the data, we see that a lot of projects have an average Flog score between 6 and 12 (the mean is 10.3 and the max is 21.3).
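To make the averaging and bucketing step concrete, here is a minimal Ruby sketch. It is not Caliper's actual code: the project names, the per-method scores, and the bucket width of 3 are all made up for illustration.

```ruby
# Hypothetical input: project name => per-method Flog scores.
# These numbers are invented for illustration, not taken from Caliper.
METHOD_SCORES = {
  "alpha" => [3.0, 4.0, 5.0, 24.0],
  "beta"  => [8.0, 10.0],
  "gamma" => [12.0, 14.0, 16.0]
}

# Mean of the per-method scores for a single project.
def average_score(scores)
  scores.sum / scores.size.to_f
end

# Group project averages into fixed-width buckets.
# Returns a hash mapping each bucket's lower bound to a project count.
def histogram(averages, bucket_width)
  counts = Hash.new(0)
  averages.each do |avg|
    lower = (avg / bucket_width).floor * bucket_width
    counts[lower] += 1
  end
  counts.sort.to_h
end

averages = METHOD_SCORES.transform_values { |scores| average_score(scores) }
buckets  = histogram(averages.values, 3)

# Print a crude text histogram, one row per bucket.
buckets.each { |lower, count| puts "#{lower}-#{lower + 3}: #{'#' * count}" }
```

Each project contributes exactly one data point (its own average), which is why the histogram says nothing about how individual method scores are spread within a project.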
If your project's average Flog score is 9, does that mean it has only "Awesome" methods, Flog-wise? Well, remember that we're looking at the average score for each project. I suspect that in most projects, lots of tiny methods are pulling down the average, but there are still plenty of big, nasty methods. It would be interesting to look at the community statistics for maximum Flog score per project or see a histogram of the Flog scores for all methods across all projects (watch this space!).
Since several of the metrics (like Reek, which detects code smells) have scores that grow in proportion to the number of lines of code, we divide the raw score by each project's lines of code. As a result, we can sensibly compare your project to other projects, no matter what the difference in size.
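As a rough illustration of that normalization, the sketch below divides a raw smell count by lines of code. The project data and the helper name `smells_per_line` are hypothetical; Caliper's real implementation may differ.

```ruby
# Hypothetical per-project data: raw Reek smell counts and lines of code.
PROJECTS = [
  { name: "small", smells: 20,  loc: 180 },
  { name: "large", smells: 500, loc: 4500 }
]

# Smells per line of code.
# Returns nil for an empty project so we never divide by zero.
def smells_per_line(smells, loc)
  return nil if loc.zero?
  smells.to_f / loc
end

PROJECTS.each do |project|
  density = smells_per_line(project[:smells], project[:loc])
  puts format("%-6s %.3f smells per line", project[:name], density)
end
```

The raw counts differ by a factor of 25, but both made-up projects come out at the same density, which is the comparison the normalized score is meant to allow.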
The second reason we're calculating community statistics is so we can discover trends across the Ruby community. For example, we can compare the ratio of lines of application code to test code. It's interesting to note that a significant portion of projects in Caliper have no tests, but that, for the projects that do have tests, most of them have a code:test ratio in the neighborhood of 2:1.

Other interesting observations from our initial analysis:
* A lot of projects (mostly small ones) have no Flay duplications.
* Many smaller projects have no Reek smells, but the average project has about 1 smell per 9 lines of code.
Want to do your own analysis? We've built a scatter plotter so you can see whether any two metrics are correlated. For instance, note the correlation between code complexity and code smells.
Here's a scatter plot of that data (zoomed in):
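If you would rather put a number on a relationship than eyeball it, one common choice is the Pearson correlation coefficient. The sketch below is our own illustration, not something the scatter plotter is known to compute, and the sample values are invented.

```ruby
# Pearson correlation coefficient between two equal-length numeric arrays.
# Returns a Float between -1 and 1, or nil when either series has no variance.
def pearson(xs, ys)
  raise ArgumentError, "series must have the same length" unless xs.size == ys.size

  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n

  # Sum of products of deviations, and sums of squared deviations.
  cov   = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }

  return nil if var_x.zero? || var_y.zero?
  cov / Math.sqrt(var_x * var_y)
end

# Made-up values: average Flog score and smells per line for five projects.
complexity = [6.0, 8.5, 10.0, 12.5, 15.0]
smells     = [0.05, 0.08, 0.11, 0.12, 0.17]

puts pearson(complexity, smells)
```

A value near 1 means the two metrics tend to rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.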
Over the coming weeks, we'll improve the graphs we have and add new graphs that expose interesting trends. But we need your help! Please let us know if you spot problems, have ideas for new graphs, or have any questions. Additionally, please add your project to Caliper so it can be included in our community statistics. Finally, feel free to grab the raw stats from our alpha API* and play around yourself!
* Quick summary: curl http://api.devver.net/metrics for JSON, curl -H 'Accept:text/x-yaml' http://api.devver.net/metrics for YAML. More details. API is under development, so please send us feedback!
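If you prefer Ruby to curl, the same two requests can be sketched with the standard library's Net::HTTP. The helper names `metrics_request` and `fetch_metrics` are ours, and because the shape of the response is not described here, the sketch returns the raw body rather than guessing at its fields.

```ruby
require "net/http"
require "uri"

METRICS_URI = URI("http://api.devver.net/metrics")

# Build a GET request for the metrics endpoint.
# :json uses the default representation; :yaml sets the Accept header
# from the curl example above.
def metrics_request(format = :json)
  request = Net::HTTP::Get.new(METRICS_URI)
  request["Accept"] = "text/x-yaml" if format == :yaml
  request
end

# Send the request and return the raw response body as a String.
def fetch_metrics(format = :json)
  Net::HTTP.start(METRICS_URI.host, METRICS_URI.port) do |http|
    http.request(metrics_request(format)).body
  end
end

# Uncomment to hit the live API:
# puts fetch_metrics(:yaml)
```

Building the request separately from sending it makes it easy to inspect the headers without touching the network.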

