Chapter 11. Software Engineering Experimentation

Table of Contents

11.1. Motivation
11.2. Categories of experimentation support: configuration, monitoring, and analysis
11.3. An example experiment: TFDvTLD
11.4. Configuration: The Register Experiment command
11.5. Configuration: Manage Experiment
11.6. Configuration: Miscellaneous administrative settings
11.7. Configuration: Client-side sensor installation
11.8. Monitoring: The Experiment Telemetry command
11.9. Monitoring: The PingMail command
11.10. Analysis: Experiment Export command

11.1. Motivation

Hackystat includes a special set of features for supporting software engineering experimentation. To motivate this specialized support, it is important to understand why support for software engineering experimentation in Hackystat differs both conceptually and practically from support for production software development.

In a production software development setting, it is vitally important for users (i.e. software developers) to understand the basic "principles" of Hackystat - the way data is represented, collected, and analyzed - so that meaningful conclusions can be drawn. These principles include the concepts of "sensors" and "sensor data types" and how various combinations of sensors and sensor data types can be used to effectively instrument the development environment and collect the necessary data. Users in a production setting also need to understand the Hackystat concepts of "workspaces", "workspace roots", and "projects" so that they can configure their server-side analyses correctly for individual and group work. Finally, users need to understand concepts such as "software project telemetry" in order to use this analysis technique to gain meaningful insight into their development practices and how they might be improved. If users do not understand these basic principles of Hackystat in a production setting, then it is quite possible that not enough data will be collected, that the wrong data will be collected, or that the results of the analyses will not be interpreted in a meaningful and appropriate way.

On the other hand, in an experimental setting, it is important to minimize, or even eliminate, the need for users (i.e. experimental subjects) to understand the basic principles of Hackystat. This is because in an experimental context, Hackystat principles are nothing more than "overhead" for the users, an overhead that does not contribute anything to the experimental question under study. Indeed, anything a user is required to learn about Hackystat in order to participate in an experiment generally detracts from the quality of the experiment.

For example, consider an experiment comparing the "Test First Design" software development method to a more traditional "Test Last Design" software development method. An ideal experimental design would enable users to carry out the procedures without having to learn anything about Hackystat sensors, sensor data types, workspaces, projects, telemetry, and so forth. The only information users would need to "learn" about Hackystat would appear on the experiment consent form, where they would be informed that their development environment has been instrumented and that process and product data is being collected about them and their software. In contrast, an experimental design that forces each subject to register with a Hackystat server, download the HackyInstaller, download and configure sensors, log in to the server to configure workspace roots, define a project, and so forth would constitute a significant distraction from the experimental task itself.

Thus, the basic difference between a production setting and an experimental setting is that in a production setting, obtaining useful insights using Hackystat requires users/developers to understand how data is collected and processed so that they can "tailor" the system and its usage appropriately to their circumstances. In an experimental setting, the users/subjects carry out a protocol designed by the experimenter that constrains the way they work, and thus the "tailoring" of Hackystat can and should be defined in advance by the experimenter.