• Improving Code using Metric_fu

    by Dan

    Often, when people see code metrics they think, "That is interesting, but I don't know what to do with it." I think metrics are great, but they become far more valuable when you can use them to improve your project's code. metric_fu provides a lot of metric information, which can be very useful. But if you don't know which parts of it are actionable, it's merely interesting instead of useful.

    One thing to keep in mind when looking at code metrics is that a single measurement may not be very interesting on its own. Looking at how a metric trends over time can give you more meaningful information. Showing this trending information is one of our goals with Caliper. Metrics can be a friend watching over the project, like a second set of eyes on how the code is progressing, alerting you to problem areas before they get out of control. When working with code over time, it can be hard to keep everything in your head (I know I can't). As the code base grows, it becomes difficult to keep track of all the places where duplication or complexity is building up. Addressing problem areas as code metrics reveal them keeps them from getting out of hand and makes future additions to the code easier.

    I want to show how metrics can drive changes and improve the code base by working on a real project. I figured there was no better place to start than pointing metric_fu at our own devver.net website source and fixing up some of the most notable problem areas. We have had our backend code under metric_fu for a while, but hadn't been following the metrics on our Merb code. This, along with some spiked features that ended up turning into Caliper, led to some areas getting a little out of control.

    Flay Score before cleanup

    When going through metric_fu, the first thing I wanted to work on was making the code a bit more DRY. The team and I were starting to notice a bit more duplication in the code than we liked. I brought up the Flay results for code duplication and found that four database models shared some of the same methods.

    Flay highlighted the duplication. Since we are planning on making some changes to how we handle timestamps soon, it seemed like a good place to start cleaning up. Below are the methods that existed in all four models. A third method 'update_time' existed in two of the four models.

      def self.pad_num(number, max_digits = 15)
        "%%0%di" % max_digits % number.to_i
      end
    
      def get_time
        Time.at(self.time.to_i)
      end
    

    Nearly all of our DB tables store time in a way that can be sorted with SimpleDB queries. We wanted to change our time to be stored as UTC in the ISO 8601 format. Before changing to the ISO format, it was easy to pull these methods into a helper module and include it in all the database models.

    module TimeHelper
    
      module ClassMethods
        def pad_num(number, max_digits = 15)
          "%%0%di" % max_digits % number.to_i
        end
      end
    
      def get_time
        Time.at(self.time.to_i)
      end
    
      def update_time
        self.time = self.class.pad_num(Time.now.to_i)
      end
    
    end
    

    Besides reducing the duplication across the DB models, it also made it much easier to include another time method, update_time, which was in two of the DB models. This consolidated all the DB time logic into one file, so changing the time format to UTC ISO 8601 will be a snap. While this is a trivial example of an obvious refactoring, it is easy to see how helper methods can end up duplicated across classes. Flay comes in really handy for pointing out duplication that creeps in over time.
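
    As a rough illustration of how a module shaped like TimeHelper can be mixed in, the sketch below adds an `included` hook so that the class-level pad_num comes along with the instance methods. The hook and the SampleRecord class are assumptions made for this example, not our actual models. It also shows why the padding matters: zero-padded strings sort in the same order as the numbers they represent.

```ruby
# Sketch only: the included hook and SampleRecord are assumptions for illustration.
module TimeHelper
  module ClassMethods
    # Zero-pad so that string comparison orders values like integer comparison.
    def pad_num(number, max_digits = 15)
      "%%0%di" % max_digits % number.to_i
    end
  end

  # Ruby calls this hook when the module is included; extend adds the class methods.
  def self.included(base)
    base.extend(ClassMethods)
  end

  def get_time
    Time.at(time.to_i)
  end

  def update_time
    self.time = self.class.pad_num(Time.now.to_i)
  end
end

# Hypothetical stand-in for one of the database models.
class SampleRecord
  include TimeHelper
  attr_accessor :time
end

record = SampleRecord.new
record.update_time
puts record.time.length                # 15
puts %w[9 10].sort.inspect             # ["10", "9"] (unpadded strings sort incorrectly)
puts [10, 9].map { |n| SampleRecord.pad_num(n) }.sort.map(&:to_i).inspect # [9, 10]
```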

    Flog gives a score showing how complex the measured code is. The higher the score, the greater the complexity. The more complex code is, the harder it is to read, and the more likely it is to contain defects. After removing some duplication from the DB models, I found that our worst database model based on Flog scores was our MetricsData model. It included a very high Flog score of 149 for a single method.

    File                      Total score  Methods  Average score  Highest score
    /lib/sdb/metrics_data.rb  327          12       27             149

    The method in question was extract_data_from_yaml. After a little refactoring, it went from a single method scoring 149 to a series of smaller methods, the highest-scoring being extract_flog_data! (33.6). The original method was doing too much work and was frequently being changed: it extracted the data from 6 different metric tools and created a summary of the data.

    The method went from a sprawling 42 lines of code to a cleaner and smaller method of 10 lines and a collection of helper methods that look something like the code below:

      def self.extract_data_from_yaml(yml_metrics_data)
        metrics_data = Hash.new {|hash, key| hash[key] = {}}
        extract_flog_data!(metrics_data, yml_metrics_data)
        extract_flay_data!(metrics_data, yml_metrics_data)
        extract_reek_data!(metrics_data, yml_metrics_data)
        extract_roodi_data!(metrics_data, yml_metrics_data)
        extract_saikuro_data!(metrics_data, yml_metrics_data)
        extract_churn_data!(metrics_data, yml_metrics_data)
        metrics_data
      end
    
      def self.extract_flog_data!(metrics_data, yml_metrics_data)
        metrics_data[:flog][:description] = 'measures code complexity'
        metrics_data[:flog]["average method score"] = Devver::Maybe(yml_metrics_data)[:flog][:average].value(N_A)
        metrics_data[:flog]["total score"]   = Devver::Maybe(yml_metrics_data)[:flog][:total].value(N_A)
        metrics_data[:flog]["worst file"] = Devver::Maybe(yml_metrics_data)[:flog][:pages].first[:path].fmap {|x| Pathname.new(x)}.value(N_A)
      end
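
    Devver::Maybe is our own wrapper for walking data that may be missing at any level. As a rough stand-in for readers who don't have it, Ruby 2.3 and later provide Hash#dig and Array#dig, which return nil as soon as a key is absent. The sample data, the N_A value, and the fetch_or_na helper below are all made up for illustration; this is a sketch of the idea, not our implementation.

```ruby
N_A = 'N/A' # assumed placeholder value; the real constant is not shown in this post

# Made-up sample data shaped like the metrics hash used above.
SAMPLE_METRICS = {
  flog: { average: 27.0, total: 327.0, pages: [{ path: '/lib/sdb/metrics_data.rb' }] }
}

# Hypothetical helper: walk nested hashes and arrays, falling back to N_A
# when any step along the way is missing.
def fetch_or_na(data, *keys)
  return N_A unless data.respond_to?(:dig)
  value = data.dig(*keys)
  value.nil? ? N_A : value
end

puts fetch_or_na(SAMPLE_METRICS, :flog, :average)           # 27.0
puts fetch_or_na(SAMPLE_METRICS, :flog, :pages, 0, :path)   # /lib/sdb/metrics_data.rb
puts fetch_or_na({}, :flog, :average)                       # N/A
```

    Note that dig raises a TypeError if an intermediate value is something like a String, so a wrapper such as Maybe is still the more forgiving option.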
    

    Churn gives you an idea of files that might be in need of a refactoring. Often if a file is changing a lot it means that the code is doing too much, and would be more stable and reliable if broken up into smaller components. Looking through our churn results, it looks like we might need another layout to accommodate some of the different styles on the site. Another thing that jumps out is that both the TestStats and Caliper controller have fairly high churn. The Caliper controller has been growing fairly large as it has been doing double duty for user facing features and admin features, which should be split up. TestStats is admin controller code that also has been growing in size and should be split up into more isolated cases.

    churn results

    Churn gave me an idea of where it might be worth focusing my effort. Diving into the other metrics made it clear that the Caliper controller needed some attention.
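
    To make the idea concrete: churn is essentially a count of how often each file shows up in the commit history. The sketch below tallies file names from text shaped like the output of `git log --name-only --pretty=format:`. The sample log and the churn_counts helper are assumptions for illustration, and this is not how the churn tool itself is implemented.

```ruby
# Hypothetical helper: count how many commits touched each file,
# returning [path, count] pairs with the most frequently changed first.
def churn_counts(log_text)
  counts = Hash.new(0)
  log_text.each_line do |line|
    path = line.strip
    counts[path] += 1 unless path.empty?
  end
  counts.sort_by { |path, count| [-count, path] }
end

# Made-up history: one group of file names per commit, separated by blank lines.
SAMPLE_LOG = <<~LOG
  app/controllers/caliper.rb
  app/views/layout/application.html.erb

  app/controllers/caliper.rb
  app/controllers/test_stats.rb

  app/controllers/caliper.rb
LOG

churn_counts(SAMPLE_LOG).each { |path, count| puts "#{count} #{path}" }
```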

    The Flog, Reek, and Roodi Scores for Caliper Controller:

    File                         Total score  Methods  Average score  Highest score
    /app/controllers/caliper.rb  214          14       15             42

    reek before cleanup

    Roodi Report
    app/controllers/caliper.rb:34 - Method name "index" has a cyclomatic complexity is 14.  It should be 8 or less.
    app/controllers/caliper.rb:38 - Rescue block should not be empty.
    app/controllers/caliper.rb:51 - Rescue block should not be empty.
    app/controllers/caliper.rb:77 - Rescue block should not be empty.
    app/controllers/caliper.rb:113 - Rescue block should not be empty.
    app/controllers/caliper.rb:149 - Rescue block should not be empty.
    app/controllers/caliper.rb:34 - Method name "index" has 36 lines.  It should have 20 or less.
    
    Found 7 errors.
    

    Roodi and Reek both tell you about design and readability problems in your code. The screenshot of our Reek 'code smells' in the Caliper controller should show how it had gotten out of hand. The code smells filled an entire browser page! Roodi similarly had many complaints about the Caliper controller. Flog was also showing the file was getting a bit more complex than it should be. After picking off some of the worst Roodi and Reek complaints and splitting up methods with high Flog scores, the code had become easily readable and understandable at a glance. In fact I nearly cut the Reek complaints in half for the controller.

    Reek after cleanup

    Refactoring one controller, which had been quickly hacked together and was growing out of control, brought it from a dizzying 203 LOC down to 138 LOC. The metrics drove me to refactor long methods (52 LOC => 3 methods, the largest being 23 LOC), rename unclear variables (s => stat, p => project), and move some helper methods out of the controller into the helper class where they belong. Yes, all these refactorings and good code designs can be done without metrics, but it is easy to overlook bad code smells when they start small. Metrics can give you an early warning that a section of code is becoming unmanageable and likely prone to higher defect rates. The smaller file was a huge improvement in terms of cyclomatic complexity, LOC, code duplication, and, more importantly, readability.

    Obviously I think code metrics are cool, and that your projects can be improved by paying attention to them as part of the development lifecycle. I wrote about metric_fu so that anyone can try these metrics out on their projects. I think metric_fu is awesome, and my interest in Ruby tools is part of what drove us to build Caliper, which is really the easiest way to try out metrics for your project. Currently, you can think of it as hosted metric_fu, but we are hoping to go even further and make the metrics clearly actionable to users.

    In the end, yep, this is a bit of a plug for a product I helped build, but it is really because I think code metrics can be a great tool to help anyone with their development. So submit your repo and give Caliper hosted Ruby metrics a shot. We are trying to make metrics more actionable and useful for all Ruby developers out there, so we would love to hear from you with any ideas about how to improve Caliper. Please contact us.

    Devver Caliper: Hosted metric_fu for your Ruby project.
    Get set up in under a minute

    Posted on October 27th, 2009 by Dan in Development, Devver, Hacking, Misc, Ruby, Testing, Tools.

  • Unit Testing Filesystem Interaction

    by Ben

    Like most Rubyists, I write unit tests to verify the non-trivial parts of my code. I also try to use mocks and stubs to stub out interactions with systems external to my code, like network services.

    For the most part, this works fine. But I've always struggled to find a good way to test interaction with the filesystem (which can often be non-trivial and therefore should be tested). On the one hand, the filesystem could be considered "external" and mocked out. But on the other hand, the filesystem is accessible when the tests run. In this way, the filesystem is sort of like a local database - it could be mocked out, but it doesn't have to be, and there are tradeoffs to both approaches.

    Over the past year or so, I've tried out a few approaches for testing interactions with the filesystem, each of which I'll explain below. Since none of the approaches met my needs, Avdi and I built a new testing library, which I'll introduce below.

    Mocking the file system.

    Sometimes, it is simplest to just mock the interaction with the filesystem. This works well for single calls to class methods like File.read or File.exist? (these examples use Mocha):

    File.stubs(:read).returns("file contents")
    File.stubs(:exist?).returns(true)
    

    However, this approach breaks down when you want to test more complex code, which, of course, is the code you're more likely to want to test thoroughly. For instance, imagine trying to set up mocks/stubs for the following method (which atomically rewrites the contents of a file):

    require 'fileutils'
    require 'tempfile'
    
    class Rewriter
    
      def rewrite_file!(target_path)
        backup_path = target_path + '.bak'
        FileUtils.mv(target_path, backup_path)
        Tempfile.open(File.basename(target_path)) do |outfile|
          File.open(backup_path) do |infile|
            infile.each_line do |line|
              outfile.write(yield(line))
            end
          end
          outfile.close
          FileUtils.cp(outfile.path, target_path)
        end
      rescue Exception
        if File.exist?(backup_path)
          FileUtils.mv(backup_path, target_path)
        end
        raise
      end
    
    end
    

    Now imagine setting up those same mocks/stubs for each of the five or so tests you'd want to write for that method. It gets messy.

    Even more importantly, mocking/stubbing out methods ties your tests to a specific implementation. For instance, if you use the above stub (File.stubs(:read).returns("file contents")) in your test and then refactor your implementation to use, say, File.readlines, you'll have to update your tests. No good.

    MockFS

    MockFS is a library that mocks out the entire filesystem. It allows you to write test code like this:

    require 'test/unit'
    require 'mockfs'
    
    class TestMoveLog < Test::Unit::TestCase
    
      def test_move_log
        # Set MockFS to use the mock file system
        MockFS.mock = true
    
        # Fill certain directories
        MockFS.fill_path '/var/log/httpd/'
        MockFS.fill_path '/home/francis/logs/'
    
        # Create the access log
        MockFS.file.open( '/var/log/httpd/access_log', File::CREAT ) do |f|
          f.puts "line 1 of the access log"
        end
    
        # Run the method under test
        move_log
    
        # Test that it was moved, along with its contents
        assert( MockFS.file.exist?( '/home/francis/logs/access_log' ) )
        assert( !MockFS.file.exist?( '/var/log/httpd/access_log' ) )
        contents = MockFS.file.open( '/home/francis/logs/access_log' ) do |f|
          f.gets( nil )
        end
        assert_equal( "line 1 of the access log\n", contents )
      end
    end
    

    Although I suspect MockFS would be a great fit for some projects, I ended up running into issues.

    First of all, it depends on a library (extensions) that can have strange monkey-patching conflicts with other libraries. For example, compare this:

    require 'faker'
    puts [].respond_to?(:shuffle) # true
    

    to this:

    require 'extensions/all'
    require 'faker'
    puts [].respond_to?(:shuffle) # false
    

    Secondly, as you'll notice in the above example, using MockFS requires you to use methods like MockFS.file.exist? instead of just File.exist?. This works fine if you're only testing your own code. However, if your code calls any libraries that use filesystem methods, MockFS won't work.

    (Note: There is a way to mock out the default filesystem methods, but it's experimental. From the MockFS documentation:

    "Reading the testing example above, you may be struck by one thing: Using MockFS requires you to remember to reference it everywhere, making calls such as MockFS.file_utils.mv instead of just FileUtils.mv. As another option, you can use File, FileUtils, and Dir directly, and then in your tests, substitute them by including mockfs/override.rb. I'd recommend using these with caution; substituting these low-level classes can have unpredictable results. ")

    All that said, MockFS is probably your best option if you're only testing your code and you want to mock out files that you can't actually interact with - for instance, if you need to test that a method reads/writes a file in /etc (although for the sake of testability, it's generally good to avoid hardcoding fully-qualified paths in your code).

    FakeFS is another library that uses this approach. I haven't used it personally, but it looks quite nice.

    Creating temp files and directories (with Construct)

    Besides mocking the filesystem, another option is to have tests interact with actual files and directories on disk. The advantages are that the test code can be simpler to write and you don't have to use any special filesystem methods.

    Of course, as always, you want the test itself to contain all the relevant setup and teardown - you don't want your tests to depend upon some set of files that have no explicit connection to the test itself (or create files that aren't cleaned up).

    To make this easy, we created a new library called Construct. Construct makes test setup simple by providing helpers to create temporary files and directories. It takes care of the cleanup by automatically deleting the directories and files that are created within the test. And because it creates regular files and directories, you can use plain old Ruby filesystem methods in your code and tests.

    To install Construct, simply run:

    # gem install devver-construct --source http://gems.github.com
    

    Using Construct, you can write code like this:

    require 'test/unit'
    require 'construct'
    
    class ExampleTest < Test::Unit::TestCase
      include Construct::Helpers
    
      def test_example
        within_construct do |construct|
          construct.directory 'alice/rabbithole' do |dir|
            dir.file 'white_rabbit.txt', "I'm late!"
            assert_equal "I'm late!", File.read('white_rabbit.txt')
          end
        end
      end
    
    end
    

    Let's look at each line in more detail.

        within_construct do |construct|
    

    When you call within_construct, a temporary directory is created. All files and directories are, by default, created within that temporary directory and the temporary directory is always deleted before within_construct completes.

    The block argument (construct) is a Pathname object with some additional methods (#directory and #file, which I'll explain below). You can use this object to get the path to the temporary directory created by Construct and easily create files and directories.

    Note that, by default, the working directory is changed to the temp dir within the block provided to within_construct.
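
    To give a feel for what this behavior amounts to, here is a rough approximation built only from Ruby's standard library. It is a sketch of the idea (create a temp dir, change into it, yield a Pathname, clean up afterwards), not Construct's actual implementation, and the within_temp_dir name is made up for the example.

```ruby
require 'tmpdir'
require 'pathname'

# Hypothetical approximation of the behavior described above, not Construct's code.
def within_temp_dir
  # Dir.mktmpdir with a block removes the directory when the block exits.
  Dir.mktmpdir('construct-sketch') do |path|
    # Dir.chdir with a block restores the previous working directory afterwards.
    Dir.chdir(path) do
      yield Pathname.new(path)
    end
  end
end

created = nil
within_temp_dir do |dir|
  created = dir
  File.write('white_rabbit.txt', "I'm late!")   # relative path lands inside the temp dir
  puts((dir + 'white_rabbit.txt').exist?)       # true
end
puts created.exist?                             # false
```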

          construct.directory 'alice/rabbithole' do |dir|
    

    Here we are using the construct object to create a new directory within the temp directory. As you can see, you can create nested directories like alice/rabbithole in one step. The block argument (dir) is again a Pathname object with the same added functionality noted above.

    Just like before, the working directory is changed to the newly created directory (in this case, alice/rabbithole) within the block.

            dir.file 'white_rabbit.txt', "I'm late!"
    

    Here we use the dir object to create a file. Called with only a name, the file would be empty; here the contents are passed as the optional second parameter. You can provide file contents using either that optional parameter or the return value of a supplied block:

    within_construct do |construct|
      construct.file('foo.txt','Here is some content')
      construct.file('bar.txt') do
      <<-EOS
      The block will return this string, which will be used as the content.
      EOS
      end
    end
    

    As a more real-world example, here's how you could use Construct to start testing the #rewrite_file! method we looked at before:

    require 'test/unit'
    require 'construct'
    require 'shoulda'
    
    class RewriterTest < Test::Unit::TestCase
      include Construct::Helpers
    
      context "#rewrite_file!" do
    
        should "alter each line in file" do
          within_construct do |c|
            c.file('bar/foo.txt',"a\nb\nc\n")
            Rewriter.new.rewrite_file!('bar/foo.txt') do |line|
              line.upcase
            end
            assert_equal "A\nB\nC\n", File.read('bar/foo.txt')
          end
        end
    
        should "not alter file if exception is raised" do
          within_construct do |c|
            c.file('foo.txt', "1\n2\nX\n")
            assert_raises ArgumentError do
              Rewriter.new.rewrite_file!('foo.txt') do |line|
                Integer(line)*2
              end
            end
            assert_equal "1\n2\nX\n", File.read('foo.txt')
          end
        end
    
      end
    
    end
    

    You can learn more at the project page (both the README and the tests have more examples).

    (As an aside, since Construct changes the working directory, it doesn't play nicely with ruby-debug. Specifically, if you place a breakpoint within a block, you'll see the message "No sourcefile available for test/unit/foo_test.rb" and you won't be able to view the source. If anyone knows an easy way to make Dir.chdir work with ruby-debug, I'd very much appreciate some help!)

    Conclusion

    We've been moving our filesystem tests over to using Construct and so far have found it to be very useful. How do you test interactions with the filesystem? Do you use one of the above approaches, or something else? Or do you skip testing the filesystem altogether?


    Posted on August 25th, 2009 by Ben in Hacking, Testing.

  • Devver adds Postgres and SQLite database support

    by Dan

    We are working hard to quickly expand our compatibility with Ruby projects. With that goal driving us, we are happy to announce support for Postgres and SQLite databases. With the addition of these database options, along with our existing support for MySQL, Devver now supports the databases most commonly used with Ruby. These three are the default databases that ActiveRecord is tested against, and we expect they will cover the majority of the Ruby community.

    To begin working with Postgres or SQLite on Devver, all you need is a database.yml with the test environment set to the adapter of your choice. If we don't support your favorite database, you can still request a beta invite and let us know which database you want us to support. If we just added support for your database, perhaps we can speed up your project on Devver, so request a beta invite.
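
    For illustration, the test section of a database.yml might look like the YAML embedded in the sketch below. The adapter names follow the usual ActiveRecord convention (postgresql and sqlite3); the database names and the test_adapter helper are made up for the example.

```ruby
require 'yaml'

# Sample database.yml contents; the database names are made up for the example.
POSTGRES_CONFIG = <<~YAML
  test:
    adapter: postgresql
    database: myapp_test
YAML

SQLITE_CONFIG = <<~YAML
  test:
    adapter: sqlite3
    database: db/test.sqlite3
YAML

# Hypothetical helper: report which adapter the test environment is set to.
def test_adapter(yaml_text)
  YAML.safe_load(yaml_text).fetch('test').fetch('adapter')
end

puts test_adapter(POSTGRES_CONFIG) # postgresql
puts test_adapter(SQLITE_CONFIG)   # sqlite3
```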


    Posted on July 6th, 2009 by Dan in Development, Devver, Ruby, Testing.

  • Tracking down open files with lsof

    by Ben

    The other day I ran into a weird error on Devver. After around twenty test runs on the system, the component that actually runs individual unit tests was crashing with "Too many open files - (Errno::EMFILE)".

    Unfortunately, I didn't know much more than that. Which files were being kept open? I knew that this component loaded quite a few files, and that by default, OS X only allows 256 open file descriptors (ulimit -n will tell you the default on your system). If this was a valid case of needing to load more files, I could just up the limit using ulimit -n <bigger_number>.

    Fortunately, a quick Google or two pointed the way to lsof. Unfortunately, my Unix-fu is never nearly as good as I wish and I didn't know much about this handy utility. But I quickly discovered that it's very useful for tracking down problems like this. I quickly used ps to find the PID of the Devver process and then a quick lsof -p <PID> displayed all the files that the process had open. So easy!

    Sure enough, there were a ton of redundant file handles to the file that we use to store information about the Devver run. Armed with this information, it was easy to find the buggy code where we called File.open but failed to ever close the file.

    Unfortunately, I still don't know how to write a good unit test for this case. I guess I could do something like call system("lsof -p pid | wc -l") before and after calling the code and make sure the number of descriptors stays constant, but that's really ugly. Is there a way to test this within Ruby? I'm open to ideas.
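
    One possible in-process approach, offered as a sketch rather than a battle-tested recommendation: ObjectSpace can enumerate the live File objects in the current process, so a test can compare the number of unclosed ones before and after the code runs. The open_file_count helper and the two reader methods are hypothetical, and this only sees files opened through Ruby's File class in the same process.

```ruby
require 'tempfile'

# Hypothetical helper: number of File objects in this process that are still open.
def open_file_count
  ObjectSpace.each_object(File).count do |file|
    begin
      !file.closed?
    rescue IOError
      false # skip File objects that were never fully initialized
    end
  end
end

def leaky_read(path)
  File.open(path).read            # the handle is never closed explicitly
end

def safe_read(path)
  File.open(path) { |f| f.read }  # the block form closes the handle
end

Tempfile.create('lsof-sketch') do |tmp|
  tmp.write('run info')
  tmp.flush

  GC.disable                      # keep a leaked handle from being finalized mid-measurement
  before = open_file_count
  safe_read(tmp.path)
  after_safe = open_file_count
  leaky_read(tmp.path)
  after_leaky = open_file_count
  GC.enable

  puts "safe_read left open:  #{after_safe - before}"
  puts "leaky_read left open: #{after_leaky - after_safe}"
end
```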

    Still, it's always good to learn more about a powerful Unix tool. I'm constantly amazed by the power and depth of the Unix tool set.


    Posted on October 9th, 2008 by Ben in Development, Hacking, Testing, Tips & Tricks.

  • Ruby Code Quality Tools

    by Dan

    Update: Devver now offers a hosted metrics service for Ruby developers which can give you useful feedback about your code. Check out Caliper, to get started with metrics for your project.

    This is the third post in my series of Ruby tools articles. This time I look at Ruby code quality tools. Rubyists like Ruby because the code can look so nice, simple, and sometimes beautiful. Unfortunately, not all code is so great. In fact, the code I write often doesn't look good. Fortunately, while a great language can help you write great code, great tools can help as well. As code grows, it is easy for code bloat, dead code, or confusing complexity to slip in. The tools I review below can help with all of these problems. I recommend finding the one or two code quality tools you like best and starting to integrate them into your development process.

    Roodi


    Roodi gives you a bunch of interesting warnings about your Ruby code. We are about to release some code, so I took the opportunity to fix up anything Roodi complained about. It helped identify refactoring opportunities, both long methods and overly complex methods. The code and tests became cleaner and more granular after breaking some of the methods down. I even found and fixed one silly performance issue that was easy to see after refactoring, which improved the speed of our code. Spending some time with Roodi looks like it could easily improve the quality and readability of most Ruby projects with very little effort. I didn't solve every problem, because in one case I just didn't think the method could be simplified any further, but the majority of the suggestions were right on. Below is an example session with Roodi:


    dmayer$ sudo gem install roodi
    dmayer$ roodi lib/client/syncer.rb
    lib/client/syncer.rb:136 - Block cyclomatic complexity is 5. It should be 4 or less.
    lib/client/syncer.rb:61 - Method name "excluded" has a cyclomatic complexity is 10. It should be 8 or less.
    lib/client/syncer.rb:101 - Method name "should_be_excluded?" has a cyclomatic complexity is 9. It should be 8 or less.
    lib/client/syncer.rb:132 - Method name "find_changed_files" has a cyclomatic complexity is 10. It should be 8 or less.
    lib/client/syncer.rb:68 - Rescue block should not be empty.
    lib/client/syncer.rb:61 - Method name "excluded" has 25 lines. It should have 20 or less.
    lib/client/syncer.rb:132 - Method name "find_changed_files" has 27 lines. It should have 20 or less.
    Found 7 errors.

    After Refactoring:

    ~/projects/gridtest/trunk dmayer$ roodi lib/client/syncer.rb
    lib/client/syncer.rb:148 - Block cyclomatic complexity is 5. It should be 4 or less.
    lib/client/syncer.rb:82 - Rescue block should not be empty.
    Found 2 errors.

    I did have one problem with Roodi - the errors about rescue blocks just seemed to be incorrect. For code like the little example below it kept throwing the error even though I obviously am doing some work in the rescue code.

    Roodi output: lib/client/syncer.rb:68 - Rescue block should not be empty.
    begin
      socket = TCPSocket.new(server_ip,server_port)
      socket.close
      return true
    rescue Errno::ECONNREFUSED
      return false
    end

    Dust


    Dust detects unused code such as unused variables, branches, and blocks. I look forward to seeing how the project progresses. Right now there doesn't seem to be much out there on the web, and the README is pretty bare bones. Once you can pass it some files to scan, I think this will be something really useful. For now, I didn't think there was much I could actually do besides check it out. Kevin, who also helped create the very cool Heckle, does claim that code scanning is coming soon, so I look forward to doing a more detailed write-up eventually.

    Flog


    Flog gives feedback about the quality of your code by scoring code using the ABC metric. Using Flog to help guide refactoring, code cleanup, and testing efforts can be highly effective. It is a little easier to understand the reports after reading how Flog scores your code, and what is a good Flog score. Once you get used to working with Flog, you will likely want to run it often against your whole project after making any significant changes. There are two easy ways to do this: a handy Flog Rake task, or MetricFu, which works with both Flog and Saikuro.

    Running Flog against any subset of a project is easy. Here I am running it against our client libraries:
    find ./lib/client/ -name \*.rb | xargs flog -n -m > flog.log

    Here is some example Flog output when run against our client code.

    Total score = 1364.52395469781
    
    Client#send_tests: (64.3)
        14.3: assignment
        13.9: puts
        10.7: branch
        10.5: send
         4.7: send_quit
         3.4: message
         3.4: now
         2.0: create_queue_test_msg
         1.9: create_run_msg
         1.9: test_files
         1.8: dump
         1.7: each
         1.7: report_start
         1.7: length
         1.7: get_tests
         1.7: -
         1.7: open
         1.7: load_file
         1.6: empty?
         1.6: nil?
         1.6: use_cache
         1.6: exists?
    ModClient#send_file: (32.0)
        12.4: branch
         5.4: +
         4.3: assignment
         3.9: send
         3.1: puts
         2.9: ==
         2.9: exists?
         2.9: directory?
         1.9: strftime
         1.8: to_s
         1.5: read
         1.5: create_file_msg
         1.4: info
    Syncer#sync: (30.8)
        13.2: assignment
         8.6: branch
         3.6: inspect
         3.2: info
         3.0: puts
         2.8: +
         2.6: empty?
         1.7: map
         1.5: now
         1.5: length
         1.4: send_files
         1.3: max
         1.3: >
         1.3: find_changed_files
         1.3: write_sync_time
    Syncer#find_changed_files: (26.2)
        15.6: assignment
         8.7: branch
         3.5: <<
         1.8: to_s
         1.7: get_relative_path
         1.7: >
         1.7: mtime
         1.6: exists?
         1.6: ==
         1.5: prune
         1.4: should_be_excluded?
         1.3: get_removed_files
         1.3: find
    ... and so on ...

    Saikuro


    Saikuro is another code complexity tool. It seems to give a little less information than some of the others. It does generate nice HTML reports. Like other code complexity tools it can be helpful to discover the most complex parts of your projects for refactoring and to help focus your testing. I liked the way Flog broke things down for me into a bit more detail, but either is a useful tool and I am sure it is a matter of preference depending on what you are looking for.

    Saikuro Screenshot


    Posted on October 1st, 2008 by Dan in Development, Ruby, Testing.

  • Ruby Test Quality Tools

    by Dan

    Update: Devver now offers a hosted metrics service for Ruby developers which can give you useful feedback about your code. Check out Caliper, to get started with metrics for your project.

    This is the second post in my series of Ruby tools articles. This time I am focused on Ruby test quality tools. Devver is always really interested in testing, and obviously the quality of a project's tests is important. We are always looking at ways to add even more value to the investment teams put in with testing. Simply knowing that you are writing higher quality tests helps increase the value returned on the time invested in testing. I haven't found many tools to help with test quality, but these tools are a great help to any Ruby tester.

    Heckle


    Heckle is an interesting tool for mutation testing of your tests. Heckle currently supports Test::Unit and RSpec, but it does have a number of issues. I had to run it on a few different files and methods before I got some useful output that helped me improve my testing. The first problem was that it crashed the majority of the time when I passed it entire files. I then began passing it single methods I was curious about, which still occasionally caused Heckle to get into an infinite loop. This is a noted problem in Heckle, and passing -T with a timeout should solve that issue. In my case it was actually not an infinite loop timing error, but an error when attempting to rewrite the code, which led to a continual failure loop that wouldn't time out. When I found a class and method that Heckle could test, I got some good results. I found one badly written test case, and one case that was never tested. Let's run through a simple Heckle example.

    #install heckle
    dmayer$ sudo gem install heckle
    
    #example of the infinite loop Error Heckle run
    heckle Syncer should_be_excluded? --tests test/unit/client/syncer_test.rb -v
    Setting timeout at 5 seconds.
    Initial tests pass. Let's rumble.
    
    **********************************************************************
    ***  Syncer#should_be_excluded? loaded with 13 possible mutations
    **********************************************************************
    ...
    2 mutations remaining...
    Replacing Syncer#should_be_excluded? with:
    
    2 mutations remaining...
    Replacing Syncer#should_be_excluded? with:
    ... loops forever ...
    
    #Heckle run against our Client class and the process method
    dmayer$ heckle Client process --tests test/unit/client/client_test.rb
    Initial tests pass. Let's rumble.
    
    **********************************************************************
    ***  Client#process loaded with 9 possible mutations
    **********************************************************************
    
    9 mutations remaining...
    8 mutations remaining...
    7 mutations remaining...
    6 mutations remaining...
    5 mutations remaining...
    4 mutations remaining...
    3 mutations remaining...
    2 mutations remaining...
    1 mutations remaining...
    
    The following mutations didn't cause test failures:
    
    --- original
    +++ mutation
    
     def process(command)
    
       case command
       when @buffer.Ready then
         process_ready
    -  when @buffer.SetID then
    +  when nil then
         process_set_id(command)
       when @buffer.InitProject then
         process_init_project
       when @buffer.Result then
         process_result(command)
       when @buffer.Goodbye then
         kill_event_loop
       when @buffer.Done then
         process_done
       when @buffer.Error then
         process_error(command)
       else
         @log.error("client ignoring invalid command #{command}") if @log
       end
     end
    
    --- original
    +++ mutation
     def process(command)
       case command
       when @buffer.Ready then
         process_ready
       when @buffer.SetID then
         process_set_id(command)
       when @buffer.InitProject then
         process_init_project
       when @buffer.Result then
         process_result(command)
       when @buffer.Goodbye then
         kill_event_loop
       when @buffer.Done then
         process_done
       when @buffer.Error then
         process_error(command)
       else
    -    @log.error("client ignoring invalid command #{command}") if @log
    +    nil if @log
       end
     end
    
    Heckle Results:
    
    Passed    :   0
    Failed    :   1
    Thick Skin:   0
    
    Improve the tests and try again.
    
    #Tests added / changed to improve Heckle results
    
      def test_process_process_loop__random_result
        Client.any_instance.expects(:start_tls).returns(true)
        client = Client.new({})
        client.stubs(:send_data)
        client.log = stub_everything
        client.log.expects(:error).with("client ignoring invalid command this is random")
        client.process("this is random")
      end
    
      def test_process_process_loop__set_id
        Client.any_instance.expects(:start_tls).returns(true)
        client = Client.new({})
        client.stubs(:send_data)
        client.log = stub_everything
        cmd = DataBuffer.new.create_set_ids_msg("4")
        client.expects(:process_set_id).with(cmd)
        client.process(cmd)
      end
    

    #A final Heckle run, showing successful results
    dmayer$ heckle Client process --tests test/unit/client/client_test.rb
    Initial tests pass. Let's rumble.

    **********************************************************************
    *** Client#process loaded with 9 possible mutations
    **********************************************************************

    9 mutations remaining...
    8 mutations remaining...
    7 mutations remaining...
    6 mutations remaining...
    5 mutations remaining...
    4 mutations remaining...
    3 mutations remaining...
    2 mutations remaining...
    1 mutations remaining...
    No mutants survived. Cool!

    Heckle Results:

    Passed : 1
    Failed : 0
    Thick Skin: 0

    All heckling was thwarted! YAY!!!
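
    To make the idea behind the run above concrete, here is a hand-rolled sketch of what a mutation tester checks. It does not use Heckle at all; the lambdas and both "suites" are made up for illustration. A mutant that negates the condition should make a good suite fail, so a suite that still passes has a gap.

```ruby
# Hypothetical code under test (not from the Devver code base).
ORIGINAL = ->(path) { path.end_with?(".log") }

# A mutant: the same logic with its condition negated, similar in spirit
# to the kind of change a mutation tester makes automatically.
MUTANT = ->(path) { !path.end_with?(".log") }

# A weak suite only checks that the call returns a boolean.
WEAK_SUITE = ->(impl) { [true, false].include?(impl.call("debug.log")) }

# A stronger suite pins down the actual answers.
STRONG_SUITE = ->(impl) do
  impl.call("debug.log") == true && impl.call("main.rb") == false
end

# A mutant "survives" when the suite still passes against the mutated code.
def mutant_survives?(suite, mutant)
  suite.call(mutant)
end

puts "weak suite:   mutant survives? #{mutant_survives?(WEAK_SUITE, MUTANT)}"
puts "strong suite: mutant survives? #{mutant_survives?(STRONG_SUITE, MUTANT)}"
```

    Both suites pass against the original code, but only the stronger one notices the mutation. That is the same signal Heckle gave for the SetID and logging branches above.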

    rcov


    rcov is a code coverage tool for Ruby. If you are doing testing you should probably be monitoring your coverage with a code coverage tool. I don't know of a better tool for code coverage than rcov. It is simple to use and generates beautiful, easy-to-read HTML charts showing the current coverage broken down by file. An easy way to make your project more stable is to occasionally spend some time increasing the coverage you have on your project. I have always found it a great way to get back into a project if you have been off of it for a while. You just need to find some weak coverage points and get to work.
    rcov screenshot
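
    For a feel of the raw data behind a coverage report, here is a minimal sketch. Note the assumption: it is not rcov, it uses Ruby's standard-library Coverage module as a stand-in for the same idea, and the measured file and the `sign_label` method are invented for this example.

```ruby
require "coverage"
require "tempfile"

# A tiny, made-up source file with one branch that is never exercised.
source = <<~RUBY
  def sign_label(n)
    if n > 0
      n.to_s + " is positive"
    else
      n.to_s + " is not positive"
    end
  end
  sign_label(1)
RUBY

file = Tempfile.new(["sign_label", ".rb"])
file.write(source)
file.close

Coverage.start            # begin recording line execution counts
load file.path            # run the code we want to measure
result = Coverage.result  # maps each loaded path to per-line counts

# Look the file up by basename in case the recorded path was normalized.
_, counts = result.find { |path, _| File.basename(path) == File.basename(file.path) }

# nil means "not an executable line"; 0 means executable but never run.
uncovered = counts.each_index.select { |i| counts[i] == 0 }.map { |i| i + 1 }
puts "uncovered lines: #{uncovered.inspect}"
```

    The lines reported as uncovered are the "weak coverage points" worth writing a test for; in this sketch that is the else branch, which only runs for a non-positive argument.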

    Posted on September 30th, 2008 by Dan in Development, Ruby, Testing.

  • One Day of TDD

    by Dan

    I am a big believer in software testing. I have normally created tests after writing my code, mostly to ensure that regressions of functionality don't occur when the code is changed. As I have become more comfortable with testing, and with the changes it requires such as writing testable code, I have found even more benefits of testing. Better automated testing and a better understanding of testing have changed my development practices.

    I haven't practiced TDD, but I do follow Test Driven Corrections (TDC), which I might be coining right now. Following TDC means that when you find a bug, you should try to write a test that fails on that bug, then fix the bug and make the test pass. I have become a fan of fixing bugs this way because bugs often first appear in code that for one reason or another is brittle or has some unseen dependency. If you just fix the bug, even if it was a simple mistake, it is still far more likely that there will be another bug in that piece of code than in other areas. If you know a section of code is error prone, wouldn't you want to catch the error as fast as possible?
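
    A minimal sketch of that workflow, using Minitest (bundled with current Rubies) rather than the Test::Unit setup we use. The `initial` helper and its bug are hypothetical: imagine an earlier version raised NoMethodError when given an empty string.

```ruby
require "minitest/autorun"

# Hypothetical helper, shown after the fix. The imagined buggy version
# called a method on nil whenever the name contained no words.
def initial(name)
  first_word = name.split.first
  first_word ? first_word[0].upcase : ""
end

class InitialTest < Minitest::Test
  # Step 1: written while the bug still existed, this test failed.
  # Step 2: after the fix it passes and stays behind as a regression guard.
  def test_empty_name_returns_empty_string
    assert_equal "", initial("")
  end

  # The behavior that already worked is pinned down as well.
  def test_regular_name_still_works
    assert_equal "D", initial("dan mayer")
  end
end
```

    The point is the order of work: the failing test comes first, so the fix is proven by the test turning green and the brittle spot stays covered afterwards.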

    A new tool in my toolbox is exploratory testing. If I was learning a new library or object in the past I would often write simple programs that would print out the state, manipulate that state, and print the result. I would then continue to learn ways to work with the objects and verify that the printed state matched my expectations. Hmmm that seems error prone, and not necessarily repeatable. Now when I am learning something new I tend to write tests against my expectations of how things should work. A great recent example of this was when I was learning to use the RightScale AWS gem so I could access Amazon's SimpleDB (SDB). I ended up writing tests to create, save, update, and delete objects. After that it was easy to write a real DB layer for some of our objects and write additional tests for that as well.

    The more I have learned about testing, the more useful I have found it in many situations. The issue is that I have still never bought into the idea of TDD for most development. Even though many smart people have written about the benefits of TDD (Adam@Heroku, Jay Fields), it just hasn't ever seemed worth it to me. I decided that I couldn't knock it if I hadn't ever tried it for any significant amount of time. So I decided to spend one full day completely following TDD.

    I was going to be adding some new features to Devver, and thought it would be a good chance to try my TDD challenge. The features I was adding were some small actions on the server that would be triggered remotely by the client. This broke down into a few separate pieces of functionality.

    1. The client would send one of three new requests based on user input and existing client state
    2. The server would receive and parse these new messages
    3. The server would call 3 new handlers with the encoded project data and carry out tasks

    Breaking this into tests was very natural and led to nearly no debugging time as almost the first time the client and server connected the interactions all behaved exactly as expected. I didn't waste any time looking into where a message or a response got lost or wasn't properly acknowledged in the code. The tests had already simulated the message creation, parsing, routing as well as the event handling and completion. It is nice when you put all the pieces together and it just works, and you know it is very solid from the beginning.

    To break down the tasks I wrote the tests in this order.

    1. Tested creating messages to store the expected project information
    2. Tested parsing messages to get the expected project information
    3. Tested client inputs would call my event handlers
    4. Tested that client event handlers would send the proper message
    5. Tested that the server would receive and parse the expected messages
    6. Tested that the server would call my event handlers
    7. Tested that the event handlers would complete the tasks expected of them
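
    As an illustration of the first two steps, here is a sketch of what those tests could look like. Everything in it is an assumption made for the example: the `ProjectMessage` module, the "command|project" wire format, and the use of Minitest are not taken from the Devver code.

```ruby
require "minitest/autorun"

# Hypothetical wire format: "command|project_name". The real Devver
# message format is not shown in this post.
module ProjectMessage
  def self.create(command, project)
    "#{command}|#{project}"
  end

  def self.parse(raw)
    command, project = raw.split("|", 2)
    { command: command, project: project }
  end
end

class ProjectMessageTest < Minitest::Test
  # Step 1: creating a message stores the expected project information.
  def test_create_includes_project_name
    assert_equal "init_project|devver", ProjectMessage.create("init_project", "devver")
  end

  # Step 2: parsing a message recovers the expected project information.
  def test_parse_recovers_command_and_project
    parsed = ProjectMessage.parse("init_project|devver")
    assert_equal({ command: "init_project", project: "devver" }, parsed)
  end
end
```

    With creation and parsing pinned down in isolation, the later steps can stub the network and only check that the right handler receives the parsed message.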

    After spending a day and completing all the pieces of functionality I had expected to complete, I was happy with my TDD experiment. I came away with a new respect for TDD, and while I still don't think it would be well suited to all programming tasks, I can certainly see a place for it and plan on using TDD more in the future. I do think that it took me slightly longer to complete the features than it normally would have. I freely admit that the more often you do TDD the better you would get, and that likely less time would be wasted trying to think up the proper test cases. I felt that the code I wrote under TDD was of a higher quality than the code I normally write. It forced me to refactor and rework my code as I went, as well as break it into small enough pieces that I could write simple tests to verify the next piece of the project. I think that the code I ended up writing will be less brittle and easier to work with in the future. As a developer, I was happier and more sure of the stability of the features I had just added to the system.

    I think TDD is especially well suited to situations with small communications between systems, as each independent system can be completely tested and have its behavior verified while isolated from the other pieces. I was surprised that working in a way that demanded more upfront costs before I wrote anything functional didn't slow me down more. I was expecting that it would make my development process slower by a factor of two, but the truth is that a single nasty bug halting your forward progress can take up more time than you would have needed to spend initially if working with a TDD approach.

    I don't plan on trying to move over to an entirely TDD approach but by challenging myself to work in a way that seemed unintuitive to me, I ended up learning a lot and likely will use the approach in the future for myself.

    Posted on September 4th, 2008 by Dan in Development, Ruby, Testing.

  • Miško Hevery on Writing Testable Code

    by Ben

    Miško Hevery has written up a nice collection of tips on writing testable code on his blog. Some of the tips are a bit hard to understand and apply, but it sounds like he will be going into many of them in more detail in the weeks to come (or you can use this list as a starting point and search for more details on individual tips).

    While I don't agree that there are no tricks to writing good tests, I whole-heartedly agree that a huge part of testing is making your application code testable. This is one of the biggest reasons that unit testing has a low ROI when you start (but that ROI increases as you learn to write testable code).

    Posted on August 7th, 2008 by Ben in Testing.

  • Learning RSpec and Merb

    by Dan

    WARNING: This is basically completely out of date. Merb changed very fast before 1.0. Please see merbivore.com for current information!

    We have been trying to work with some different Ruby technologies lately. We are moving to RSpec from Test::Unit, because we believe it has several advantages. It also seems all the cool projects are moving to RSpec: Rubinius, Typo, Mephisto, and of course Merb.

    In learning these two technologies together, I have found a few resources that I found to be really useful. I thought it would be good to share the information for anyone looking to write specs for their Merb projects.

    If you are first learning Merb and want to create a basic project and learn to test with RSpec as you develop, I can't recommend enough that you follow the Merb Slapp tutorial. This is a great source for Merb basics that is very up to date, and gives good examples of RSpec tests.

    If you are new to Merb, the newest documentation will be your friend. I also recommend checking out the Merb Wiki. For RSpec, specifically check out these wiki pages: Merb Controller Specs, Merb Model Specs, and Merb View Specs.

    There were two things I had to search and stumble around a bit for: session variables and mock objects. I needed to mock the session because a user is expected to be logged in, verified by a session variable, before the action is allowed to continue. I needed a mock object of my ProjectWriter because it normally makes live calls to a web service. Both are easy to do, but both are done differently than in Test::Unit with Rails. I found out about RSpec mocking and Merb session mocking at the links provided.

    Here is some code that demonstrates mocking both sessions and model objects.

    #create a mock object named ProjectWriter
    project_writer = mock("ProjectWriter")
    #mock expects this call
    project_writer.should_receive(:get_all_user_projects).with('ben')
    @controller = dispatch_to(Project, :index) do |controller|
      #mock the session hash
      controller.stub!(:session).and_return({:logged_in => true})
      #return my mocked object
      controller.stub!(:get_project_writer).and_return(project_writer)
      #we aren't testing the view, so don't render this action
      controller.stub!(:render)
    end
    
    @controller.should respond_successfully

    Posted on July 24th, 2008 by Dan in Development, Ruby, Testing, Tips & Tricks.

  • Tips for Unit Testing

    by Ben

    For the past few weeks, I've been doing a series of posts on my thoughts on unit testing. Although I originally published them in little, bite-sized posts, I wanted to collect them all here in one massive post for those of you with bigger reading appetites.

    I also wanted to add one thought to sort of tie all these tips together. Unit testing is all about improving productivity. It's important to realize that the ROI for testing looks something like this:

    this graph is very exact

    A very professional-looking graph. I guess ROI should really be 'benefit', but whatever, you get it.

    If you are just getting started with unit testing, you're at the bottom of the curve, so you're going to sink a lot of time into testing without much benefit. Similarly, once you've done a lot of testing on a project, trying to test that last little bit may require more time than it's worth. The goal of these tips is to help you maximize the benefit-to-time ratio, wherever you may be in this curve.


    We're big on automated testing here at Devver, but I know a lot of companies aren't as into it. There's been plenty written about all the reasons you should be writing tests, but over the next week or so, I'll give you some tips on how to get started (and if you've already got some tests, how to improve and expand your test suite).

    I can't claim to have come up with these best practices, so I'll litter this post with links to those resources that have taught me something.

    A quick word about terminology. When I say "tests" I mean any type of automated tests, which may include unit, functional, integration or any other types of tests. When I say "production code" I simply mean the code that goes into the actual product - i.e. the code being tested.

    Tip 1: You'll probably suck at testing
    Writing tests can be frustrating at first. It is usually a lot harder and more time consuming than you'd expect. Unfortunately, some developers assume that the cost of writing tests is fixed and conclude that the benefits can't possibly justify the time spent - so they quit writing tests.

    Writing test code is an art unto itself. There is a whole new set of tricks and skills to learn and it's difficult to do correctly right away. Stick with it. The better you get, the faster you'll write tests, and the more your tests will pay off.

    Tip 2: Most code is not written to be tested
    Another thing you'll find when you start testing is that your production code is not very testable. This isn't surprising: if there were no tests previously, there was no reason to design for testability. This will make your first tests way harder to write and less valuable (i.e. they are less likely to catch real bugs).

    There are a few tricks to get around this. First, try testing only new code or just test a smaller side project to start to get the hang of it. When you're ready to start testing your legacy application, try the following.

    1. Write a few very high-level tests. These tests will likely exercise almost the whole system and will interact with the application at the highest-level interface.
    2. Refactor out one component of the application so it is more decoupled and testable
    3. Continually run your high-level tests to make sure you haven't broken anything major
    4. Write more focused tests for the component you pulled out in step #2
    5. Go back to step #2

    If you need more help with this, pick up a copy of Working Effectively with Legacy Code. There is also some additional information here.

    Again, stick with it. As you write more tests, your application will be more testable (bonus: it will likely be easier to understand, more loosely coupled, easier to refactor, and more DRY as well!). As it becomes more testable, it'll be easier to write additional tests. This creates a positive loop where things get better and easier as you go.

    Tip 3: Test code isn't production code
    Another common mistake is to treat test code just like production code. For instance, you'd like your production code to be as DRY as possible. But in test code, it's actually more important for tests to be readable and independent than to be DRY. As a result, you'll want your tests to be more "moist" than DRY. Specifically, you'll want to use literals a lot more in test code than you would in production.

    In general, the most important properties of good tests are:

    Independent - No test should affect the outcome of any other test. Put another way, you should be able to run your tests in any order and always have the same outcome. A corollary of this is that setup/teardown methods are evil (both because they increase dependence and because they decrease readability).
    Readable - The intent of each test should be immediately obvious (both by its name and by its code).
    Fast - Each test should run as quickly as possible, so the entire suite is also fast. The faster the suite, the more you'll run the tests, and the greater benefit you'll get (because you'll catch regressions quickly).
    Precise - Each test should focus on testing one thing (and only one thing) well*. Ideally, if a test fails, you should know exactly what part of your production code broke by just glancing at the name of the test. Also, if your tests are precise, it's less likely that a change in your code will require you to change many different tests. In practice, precise tests are short and only have one assertion or expectation per test.

    *Note: this doesn't apply to integration tests, which should make sure all components play nicely together.
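
    To show what "moist" looks like in practice, here is a small sketch comparing the two styles. The `full_name` method and the test names are invented for illustration, and Minitest stands in for whichever test framework you use.

```ruby
require "minitest/autorun"

# Hypothetical production code used only for this illustration.
def full_name(first, last)
  "#{first} #{last}"
end

class FullNameTest < Minitest::Test
  FIRST = "Ada"
  LAST = "Lovelace"

  # DRY style: the reader has to look up FIRST and LAST, and the expected
  # value is built with the same logic as the code under test.
  def test_full_name_dry
    assert_equal "#{FIRST} #{LAST}", full_name(FIRST, LAST)
  end

  # "Moist" style: literals make the input and the expected output obvious,
  # and the test depends on nothing outside its own body.
  def test_full_name_joins_first_and_last_with_a_space
    assert_equal "Ada Lovelace", full_name("Ada", "Lovelace")
  end
end
```

    The second test repeats itself, but it is independent, readable, and precise: if it fails, its name and its single assertion say exactly what broke.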

    Tip 4: Always write one test

    When writing new code, it's easy to avoid testing because it seems so daunting to test all the functionality. Rather than thinking of testing as an all-or-nothing proposition, try to write just one good test for the new functionality.

    You'll find that having just one test is much, much better than having no tests at all. Why? First of all, it'll catch catastrophic errors, even if it doesn't catch bugs in edge cases. Secondly, writing even one test may force you to refactor your production code slightly to make it more testable (which in turn, makes future tests easier to write). Finally, it gives you "test momentum". If you have no tests, you'll be inclined to delay testing, since there is more overhead to get started. But if you already have just one test in place, it'll be much easier to add tests as you think of them (and to write regression tests as you find bugs).

    By the way, don't worry about testing at exactly the right level. Having one functional test is way better than having no tests at all. You can always come back and break the "bigger" test down into more targeted, precise tests.

    Tip 5: Improve your tests over time

    Here's a terrible idea - decide you are going to spend a whole week building a test suite for your project. First of all, you'll likely just get frustrated and burn out on testing. Secondly, you'll probably write bad tests at first, so even if you get a bunch of tests written, you're going to need to go back and rewrite them once you figure out how slow, brittle, or unreadable they are.

    As they say, the best writing is rewriting. You should try out new techniques and rewrite old test code. But it's OK to have patchwork tests.

    You just found out fixtures suck? (they do). Or that those 'setup' methods make your tests less readable? Are you excited about using mocks? Great, apply your new technique to some new tests, rewrite a few old tests, and call it a day. Don't try to rewrite your whole suite, because you'll be kicking yourself when you rewrite your suite again after you decide technique X isn't perfect in all cases.

    Just like in production code, good practices take a while to bake and prove themselves. See how maintainable, easy to understand, and easy to read a new technique is. You can always move more tests over.

    Tip 6: Don't be dogmatic

    There are a lot of best practices for testing that may or may not apply to your situation. Should you have one assertion per test? Should you use mocks and stubs? Should you use Test Driven Development? Or Behavior Driven Development? Should you do interaction or state-based testing? While all of these practices have real benefits, remember that their applicability and value depend largely on your project, schedule, and team.

    Don't be afraid to play, but don't feel like you need to convert everything to the one, true way to test. It's fine to have a suite that mixes and matches these best practices. In other words, context is king.

    Tip 7: Be reasonable

    There are lots of reasons why tests are great, but if your practices aren't ultimately making your code better and you more productive, it's not worth it. You have to always think about the return on your time investment.

    There are domains in which automated testing is very difficult and doesn't provide a lot of value, like GUI testing. I would recommend writing tests for the interface that the GUI calls, but actually testing that things show up correctly is quite tricky and error prone.

    Also, 100% code coverage shouldn't necessarily be your goal. As you get better at writing tests, I think you'll find they provide a lot of value, but at some point, covering that last small percentage of code may require way more effort than it's worth.

    Tip 8: Keep learning!

    Just like learning new programming languages makes you a better developer, learning about new testing approaches, libraries, and tools will make you a better tester. The state of the art of testing is changing very rapidly these days - new frameworks and techniques are released almost every month. Keep looking at example code and trying out new stuff.

    For instance, here are a few tools that you may not be using but are very cool: Heckle and RushCheck.
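
    If random testing of the kind RushCheck offers is new to you, here is a hand-rolled sketch of the idea. It deliberately does not use RushCheck's API; the helper names, the seed, and the trial count are all arbitrary choices for this example.

```ruby
# Hand-rolled random testing: check a property against many generated
# inputs instead of a few hand-picked ones. Names here are illustrative.
def random_int_array(rng, max_size: 20)
  Array.new(rng.rand(0..max_size)) { rng.rand(-100..100) }
end

# Returns the first counterexample found, or nil if the property held
# for every generated input.
def find_counterexample(trials: 200, seed: 42)
  rng = Random.new(seed) # a fixed seed keeps any failure reproducible
  trials.times do
    input = random_int_array(rng)
    return input unless yield(input)
  end
  nil
end

# Property: reversing an array twice gives back the original array.
failure = find_counterexample { |xs| xs.reverse.reverse == xs }
puts failure.nil? ? "property held for all inputs" : "failed for #{failure.inspect}"
```

    Instead of asserting on specific values, you state a rule that should hold for any input and let the generator go looking for the edge case you didn't think of.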

    Finally, if you want to learn more, subscribe to Jay Fields' blog - he has lots of good (if sometimes controversial) thoughts about testing.

    And with that, I'll wrap up this series on testing. If you have your own testing tips, please share them!


    At Devver, we're building some awesome, cloud-based tools for Ruby hackers. If you're interested, sign up for our mailing list.

    Posted on July 7th, 2008 by Ben in Testing.