5.2. An overview of software project telemetry in Hackystat

5.2.1. Hackystat Telemetry Reports

This section introduces Software Project Telemetry through simple examples of its implementation in Hackystat. Let's assume that we are developers on a project and we're interested in assessing our testing process over an extended period of time in order to better understand quality assurance. To start, we could ask a very high level question: is our testing process changing over time, and if so, how?

The Hackystat "Telemetry Report Analysis" provides a pull-down menu of predefined "reports". Each report consists of a packaged set of telemetry "charts" showing metrics and how they change over time. Let's start our inquiry into testing by seeing if any of these predefined telemetry reports provides a useful perspective. Figure 5.3, “Telemetry Report Command (Test Size and Coverage)”, shows the Telemetry Report Analysis command with sample selector values.

Figure 5.3.  Telemetry Report Command (Test Size and Coverage)



As you can see, the Telemetry Report Analysis requires you to select an output format for the analysis results, the Project you wish to analyze, and a time period for the analysis. These are fairly standard selectors for Hackystat commands. The last two selectors provide a drop-down menu of predefined telemetry report names and a text field into which you (optionally) provide one or more parameter values. These last two selectors will be fully explained later in this chapter, but for now note that any user can define telemetry reports (just like they can define Projects), and that a telemetry report may require you to provide one or more parameters that customize the report. In this example, the TestSizeAndCoverage telemetry report does not require any parameters.

Figure 5.4, “Telemetry Report Results (Test Size and Coverage)”, shows the results from this report. The TestSizeAndCoverage report generates a single Telemetry Chart showing three streams: the total number of Java classes in this Project (the red line), the number of Java test classes in this Project (the blue line), and the percentage coverage of the Java classes by the test classes (the green line). Each stream of this telemetry chart has its own associated axis, with the axis color corresponding to the stream line. The essential idea of a telemetry report analysis is to juxtapose multiple measures for the same Project over the same time interval in order to see whether, and how, they co-vary. Telemetry Reports often generate multiple Charts, although in this case only a single Chart is generated.

Figure 5.4.  Telemetry Report Results (Test Size and Coverage)



This TestSizeAndCoverage telemetry report reveals a few interesting trends regarding testing in this project over the 16 months shown. First, the total size of the system more than doubled during this time period, from 600 classes to almost 1450 classes. The total size of test code also increased substantially, from 150 classes to 430 classes. Finally, the coverage associated with the test code improved from 78% to 84%.

The most striking feature of this report, of course, is the co-variance in the total number of classes in the system with the number of test classes in the system. This trend exists even though the development process has no rules regarding the number of test classes to write; there is no process edict in place coercing people to write one test class for every four system classes, for example.
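The arithmetic behind this observation can be worked through directly. The short sketch below uses only the approximate start and end values quoted above (600 and 1450 total classes, 150 and 430 test classes); the variable and function names are illustrative and are not part of Hackystat.

```python
# Approximate class counts quoted in the text for the start and end
# of the 16-month window (read from the chart, so not exact).
system_start, system_end = 600, 1450
tests_start, tests_end = 150, 430


def ratio(test_classes: int, system_classes: int) -> float:
    """Fraction of all classes in the system that are test classes."""
    return test_classes / system_classes


growth_system = system_end / system_start
growth_tests = tests_end / tests_start

print(f"system growth: {growth_system:.2f}x")
print(f"test growth:   {growth_tests:.2f}x")
print(f"test share at start: {ratio(tests_start, system_start):.1%}")
print(f"test share at end:   {ratio(tests_end, system_end):.1%}")
```

At the start of the period the figures work out to one test class for every four classes in the system, and the share of test classes rises only modestly by the end, which is what makes the two lines appear to move together.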

To make matters more interesting, one might hypothesize that if the number of test classes co-varies with the total number of system classes, then this would indicate relatively 'stable' testing, and one might thus predict that the coverage would be relatively constant over this time period. However, the actual coverage is not at all stable: it both drops precipitously and climbs steadily over the time period under analysis.
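One way to make "co-varies" concrete is a correlation coefficient between two streams. The following is a minimal sketch, not something Hackystat itself is known to compute: the monthly values are hypothetical numbers invented for illustration (they are not read from Figure 5.4), and `pearson` is a helper defined here.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly samples, made up for this example only.
system_classes = [600, 650, 700, 680, 800, 950, 1100, 1450]
test_classes = [150, 165, 180, 170, 210, 260, 320, 430]
coverage_pct = [78, 79, 80, 74, 72, 79, 82, 84]

print("size vs. tests:    ", round(pearson(system_classes, test_classes), 3))
print("size vs. coverage: ", round(pearson(system_classes, coverage_pct), 3))
```

A coefficient close to 1 for the first pair and a noticeably weaker one for the second would correspond to the pattern described here: test size tracks system size closely, while coverage follows its own path.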

Finally, notice that the dip in coverage during June and July of 2004 was preceded by a dip in both total system size and test size in May of 2004. Given that these were the only two dips that appear in the chart and they appear close together, could there be a connection between them?

With respect to project management decision-making, one use of this telemetry data is to provide baseline data on testing. First, it indicates that for 16 months, test size and total system size co-varied. A departure from this trend in the future could indicate a change in developer testing behavior. Second, it shows that coverage is trending steadily upward. This trend is intrinsically unsustainable, since coverage cannot exceed 100%. From a project management point of view, it will be useful to see where coverage "plateaus", and/or whether it begins to decrease in future.
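The idea of using the historical trend as a baseline can be sketched as a simple check on the ratio of test classes to total classes. Everything in this example is an assumption made for illustration: the baseline ratios are hypothetical, and the rule of flagging values more than three standard deviations from the baseline mean is just one possible choice, not a Hackystat feature.

```python
from statistics import mean, stdev
from typing import Sequence


def departs_from_baseline(baseline: Sequence[float], new_value: float,
                          k: float = 3.0) -> bool:
    """Return True if new_value lies more than k standard deviations
    from the mean of the baseline observations."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # requires at least two baseline points
    return abs(new_value - mu) > k * sigma


# Hypothetical monthly ratios of test classes to total classes.
baseline_ratios = [0.25, 0.26, 0.27, 0.28, 0.29, 0.30]

print(departs_from_baseline(baseline_ratios, 0.28))  # within the usual range
print(departs_from_baseline(baseline_ratios, 0.10))  # far below the baseline
```

A flagged month would not explain why testing behavior changed; it would only mark a point worth examining with further telemetry.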

The telemetry data also provokes some new questions about the development process. Why did the system size decrease in 2004, and why did coverage dip shortly thereafter? Would the results be different if a different measure of size (such as methods or LOC) were used, or if individual modules were evaluated rather than the system as a whole? Finally, how effective was this testing process at uncovering problems: in other words, were defects actually discovered as a result of writing and running the tests?

5.2.2. Hackystat Telemetry Charts

While all of the above questions are amenable to further analysis in Hackystat, let's focus on the last question and use it as a way of introducing the Telemetry Chart Analysis. Unlike the Telemetry Report Analysis, which can potentially display many charts in a single command, the Telemetry Chart Analysis generates only a single telemetry chart. Using this analysis provides you with access to all of the defined Telemetry Charts, including some which might not be packaged into a Telemetry Report. While Telemetry Reports often take no parameters, Telemetry Charts typically require you to provide parameter values to configure the data they display. The selectors for the Telemetry Chart Analysis command are almost identical to those for the Telemetry Report Analysis command, except that the pull-down list shows the set of defined Charts in one case and the set of defined Reports in the other.

To gain insight into the effectiveness of the unit tests at discovering defects, we can try the Unit Test Success Percentage telemetry chart command, as illustrated in Figure 5.5, “Telemetry Chart Command (Unit Test Success Percentage)”. This Telemetry Chart requires two comma-separated parameter values. The first parameter is a file specification for the unit tests to be used in the analysis. We specify "**" to indicate that all possible Unit Tests in the given Project should be considered. The second parameter is a string containing a boolean to indicate whether we should accumulate the unit test data over the course of the 12 months, or treat each month individually. We specify "false" to indicate that we want to calculate the percentage of unit test success for each month individually.
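The difference between the two settings of the second parameter can be illustrated with a small sketch. This is a model of the idea only, under the assumption that accumulation means summing passes and failures from the first month onward; the function name and the monthly counts are hypothetical, and a Python boolean stands in for the "true"/"false" string that the chart command takes.

```python
from typing import List, Optional, Tuple


def success_percentages(monthly: List[Tuple[int, int]],
                        cumulative: bool) -> List[Optional[float]]:
    """Compute a success percentage per month.

    monthly holds (passed, failed) invocation counts for each month.
    With cumulative=True the counts are summed from the first month
    onward; with cumulative=False each month stands alone.
    A month with no invocations yields None.
    """
    results: List[Optional[float]] = []
    total_passed = total_failed = 0
    for passed, failed in monthly:
        if cumulative:
            total_passed += passed
            total_failed += failed
            p, f = total_passed, total_failed
        else:
            p, f = passed, failed
        invocations = p + f
        results.append(100.0 * p / invocations if invocations else None)
    return results


# Hypothetical (passed, failed) counts for three months.
monthly_counts = [(970, 30), (1980, 20), (2880, 120)]

per_month = success_percentages(monthly_counts, cumulative=False)
running = success_percentages(monthly_counts, cumulative=True)
print(per_month)
print([round(v, 2) for v in running])
```

The per-month series reacts fully to each month's results, while the accumulated series smooths them out, which is why "false" is the better choice when the goal is to see month-to-month variation.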

Figure 5.5.  Telemetry Chart Command (Unit Test Success Percentage)



Although the trend line bounces around, the actual percentages vary between 96% and 99%. This indicates that unit tests "almost always" pass, which could mean that our unit tests are not actually revealing any problems with the system. To understand whether or not tests are actually failing, it would be useful to display the actual numbers of failing test invocations. Unfortunately, this poses a temporary problem: there does not appear to be a pre-defined Telemetry Chart that shows the absolute numbers of failing unit tests.
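A quick calculation shows why a percentage alone cannot settle this question. The sketch below converts a success percentage into an implied number of failures for several invocation volumes; the volumes are hypothetical, chosen only to show the range of possibilities.

```python
def failures_implied(invocations: int, success_pct: float) -> int:
    """Number of failing invocations implied by a success percentage."""
    return round(invocations * (100.0 - success_pct) / 100.0)


# Hypothetical monthly invocation volumes, for illustration only.
for invocations in (100, 1_000, 10_000):
    for pct in (96.0, 99.0):
        print(f"{invocations:>6} invocations at {pct:.0f}% success: "
              f"{failures_implied(invocations, pct)} failures")
```

The same 96% to 99% band is compatible with a handful of failures per month or with several hundred, depending on how often the tests are run, so the absolute counts carry information that the percentage hides.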

5.2.3. Defining new Charts with Hackystat Telemetry Streams

If we cannot find a predefined Telemetry Chart that displays the data we need, then we can define a new Chart. To start, we go to the Preferences page and invoke the "Definition Management" command in the "Telemetry" section of this page. Invocation of the Definition Management command results in a page containing a listing of four types of user-definable Telemetry constructs: Y-Axis, Streams, Charts, and Reports. Figure 5.6, “Definition Management Page”, illustrates the initial portion of the page returned after invoking the Definition Management command.

Figure 5.6.  Definition Management Page



This screen shot shows several definitions of two kinds of Telemetry constructs: Y-Axis and Streams. For each definition, the page displays its name, its definition, the user who defined it (and who will be provided with buttons to edit or delete it), and whether the definition is private to that user or globally available to all users. An "Add" button is available at the beginning of each section to allow definition of new Telemetry constructs of the corresponding type.

Chart definitions on this server include ActiveTime-Chart, ActiveTime-Coverage-Modules1-Chart, and ActiveTime-Member-Chart. A chart definition consists of a Name, a set of Parameters, the definition of the chart, the title to be displayed when the Chart is rendered, the user who defined the Chart, and whether the Chart is shared among members of the Project, globally on the server, or not shared at all. The Telemetry Chart Definition Management page also provides buttons allowing you to define new charts, edit existing chart definitions, and delete a Chart.

To define a new Telemetry Chart, we press the "New Chart" button. This brings up a new page for defining a Telemetry Chart, as illustrated in Figure 5.7, “Telemetry Chart Definition”.

Figure 5.7.  Telemetry Chart Definition



The form defines a new Telemetry Chart named "UnitTestFailures". We provide a definition for the Chart using the Telemetry Definition Language, described in detail later in this chapter. In this example, the definition of the UnitTestFailures chart indicates that there are two parameters, "filePattern" and "cumulative", which will be passed as parameters to the Telemetry Stream called "UnitTest-FailureCount". The "Share in Project" selector allows us to specify whether this Chart definition is restricted to a single Project, restricted to just this user, or provided as a public definition to all users. The Title field allows us to provide a title for this chart.
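The relationship between a chart, its parameters, and the stream it draws on can be modeled in a few lines of Python. This is a conceptual sketch only and is not Telemetry Definition Language syntax, which is covered later in this chapter. The class `ChartDefinition`, the stand-in function `unit_test_failure_count`, the chart title, and the failure counts are all hypothetical.

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import Callable, Dict, List, Tuple


def unit_test_failure_count(file_pattern: str, cumulative: str) -> List[int]:
    """Hypothetical stand-in for the UnitTest-FailureCount stream.

    Returns made-up monthly failure counts. This stand-in ignores
    file_pattern; a real stream would use it to select unit tests.
    """
    monthly_failures = [12, 40, 7]
    if cumulative == "true":
        return list(accumulate(monthly_failures))
    return monthly_failures


@dataclass(frozen=True)
class ChartDefinition:
    """Illustrative model of a chart: a name, named parameters,
    a stream to evaluate, and a display title."""
    name: str
    parameters: Tuple[str, ...]
    stream: Callable[..., List[int]]
    title: str

    def render(self, *args: str) -> Dict[str, object]:
        """Pass the chart's parameter values through to its stream."""
        if len(args) != len(self.parameters):
            raise TypeError(
                f"{self.name} expects {len(self.parameters)} parameter(s)")
        return {"title": self.title, "series": self.stream(*args)}


chart = ChartDefinition(
    name="UnitTestFailures",
    parameters=("filePattern", "cumulative"),
    stream=unit_test_failure_count,
    title="Unit Test Failures",  # assumed title, for illustration
)

print(chart.render("**", "false"))
```

The point of the model is the forwarding step in `render`: the values typed into the parameter field of the chart command ("**" and "false" in the earlier example) are handed unchanged to the underlying stream.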

Once this new Telemetry Chart has been defined, it will appear in the pull-down menu associated with the Telemetry Chart Analysis command, and we can now proceed to learn about the frequency of unit test failures over the course of the year. Figure 5.8, “Telemetry Chart (Unit Test Failures)”, shows the result of this analysis.

Figure 5.8.  Telemetry Chart (Unit Test Failures)



Now that we can see the absolute number of test failures, some interesting characteristics of testing appear that were not apparent when looking at the percentage of passing test cases. First, we can see that substantial numbers of unit tests fail each month: over half of the months had over 300 unit test failures. This indicates that the set of unit tests is effective: the tests are both being invoked regularly and failing (i.e. detecting problems) regularly. Second, a substantial spike in unit test failures appears in March 2005, the same month in which a substantial amount of new code was added to the system.

This chart raises additional questions about testing that could be investigated using software project telemetry. Does the number of unit test failures co-vary with the amount of new code added to the system? Does the total number of test invocations co-vary with the amount of new code? Are these trends consistent across modules in the system? Are they consistent across the various developers in the system?

5.2.4. Summary

This section has given you a taste of Software Project Telemetry as it is supported in Hackystat. We have seen how to visualize trends in multiple types of sensor data over time using the Telemetry Report Analysis. We have seen how to focus on one type of sensor data using the Telemetry Chart Analysis, and that it is possible to define new Telemetry Charts using the Telemetry Definition Language. The following sections of this chapter will fill in the details left out by this overview so that you can use Software Project Telemetry in your own management decision making.