Key Software Metrics

Complexity Metrics

Prominent in the history of software metrics has been the search for measures of complexity. This search has been motivated primarily by the reasons discussed above (complexity as a necessary component of size) but also by separate QA purposes (the belief that only by measuring complexity can we truly understand and conquer it). Because complexity is a high-level notion made up of many different attributes, there can never be a single measure of software complexity [Fenton 1992]. Yet in the sense described above there have been hundreds of proposed complexity metrics. Most of these are also restricted to code. The best known are Halstead's software science [Halstead 1977] and McCabe's cyclomatic number [McCabe 1976].

Figure 5.1 Halstead’s software science metrics

Halstead defined a range of metrics based on the syntactic elements in a program (the operators and operands) as shown in Figure 5.1. McCabe's metric (Figure 5.2) is derived from the program's control flowgraph, being equal to the number of linearly independent paths; in practice the metric is usually equivalent to one plus the number of decisions in the program. Despite their widespread use, the Halstead and McCabe metrics have been criticised on both empirical and theoretical grounds. Empirically it has been claimed that they are no better indicators of complexity than LOC since they are no better at predicting effort, reliability, or maintainability. Theoretically, it has been argued that the metrics are too simplistic; for example, McCabe's metric is criticised for failing to take account of data-flow complexity or the complexity of unstructured programs. This has led to numerous metrics that try to characterise different views of complexity, such as that proposed in [Oviedo 1980], which involves modelling both control flow and data flow. The approach which is more in keeping with measurement theory is to consider a range of metrics, each of which concentrates on a very specific attribute: for example, static path counts [Hatton & Hopkins 1989], knot count [Woodward et al 1979], and depth of nesting [Fenton 1991].
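Since Figure 5.1 is not reproduced here, the following is a minimal sketch of the commonly quoted Halstead measures, computed from operator and operand counts. The function name `halstead` and the sample counts are illustrative assumptions, and the rules for what counts as an operator or operand differ between tools, so the numbers are only meaningful relative to one counting scheme.

```python
import math


def halstead(n1, n2, N1, N2):
    """Commonly quoted Halstead measures (illustrative sketch).

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total occurrences of operators and operands
    """
    vocabulary = n1 + n2                      # n = n1 + n2
    length = N1 + N2                          # N = N1 + N2
    volume = length * math.log2(vocabulary)   # V = N * log2(n)
    difficulty = (n1 / 2) * (N2 / n2)         # D = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume              # E = D * V
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }


# Hypothetical counts for a small routine (not taken from the text).
measures = halstead(n1=4, n2=4, N1=8, N2=8)
print(measures["volume"])   # 48.0
print(measures["effort"])   # 192.0
```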

Figure 5.2: Computing McCabe’s cyclomatic number
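As a hedged illustration of the two views of McCabe's metric mentioned above, the sketch below computes the cyclomatic number of a flowgraph both as e - n + 2 (edges minus nodes plus two, for a single connected flowgraph) and as one plus the number of decisions. The dictionary representation of the flowgraph and the function names are assumptions made for this example.

```python
def cyclomatic_number(flowgraph, components=1):
    """v(G) = e - n + 2p for a flowgraph given as {node: [successor, ...]}."""
    nodes = set(flowgraph)
    for successors in flowgraph.values():
        nodes.update(successors)
    edges = sum(len(successors) for successors in flowgraph.values())
    return edges - len(nodes) + 2 * components


def cyclomatic_from_decisions(flowgraph):
    """One plus the number of decisions; a node with k > 1 exits counts as k - 1."""
    return 1 + sum(len(s) - 1 for s in flowgraph.values() if len(s) > 1)


# Hypothetical flowgraph: an if/else followed by a while loop.
graph = {
    "entry": ["then", "else"],
    "then": ["loop"],
    "else": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
print(cyclomatic_number(graph))          # 3
print(cyclomatic_from_decisions(graph))  # 3
```

The two functions agree here because the flowgraph is structured, with a single entry and a single exit.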

All of the metrics described in the previous paragraph are defined on individual programs. Numerous complexity metrics which are sensitive to the decomposition of a system into procedures and functions have also been proposed. The best known are those of [Henry and Kafura 1984] which are based on counts of information flow between modules. A benefit of metrics such as these is that they can be derived prior to coding, during the design stage.
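The text does not give the Henry and Kafura formula, so the sketch below assumes the form in which it is most often quoted: module complexity = length * (fan-in * fan-out)^2. Fan-in and fan-out are simplified here to counts of distinct modules sending information to, or receiving it from, the module in question; the function name, the flow list and the default length of 1 (useful at design time, before code exists) are all illustrative assumptions.

```python
from collections import defaultdict


def information_flow_complexity(flows, lengths=None):
    """Henry-Kafura style complexity per module (simplified sketch).

    flows:   iterable of (source_module, destination_module) pairs
    lengths: optional {module: length}; modules not listed default to 1
    """
    fan_in = defaultdict(set)
    fan_out = defaultdict(set)
    for source, destination in flows:
        fan_out[source].add(destination)
        fan_in[destination].add(source)

    lengths = lengths or {}
    modules = set(fan_in) | set(fan_out)
    return {
        m: lengths.get(m, 1) * (len(fan_in[m]) * len(fan_out[m])) ** 2
        for m in modules
    }


# Hypothetical design-stage information flows between four modules.
design = [
    ("ui", "parser"),
    ("ui", "report"),
    ("parser", "store"),
    ("parser", "report"),
    ("store", "report"),
]
for module, score in sorted(information_flow_complexity(design).items()):
    print(module, score)
```

Modules that only send or only receive information score zero under this form, which is one reason variants of the metric exist.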

Resource estimation models

Most resource estimation models assume the form

effort = f(size)

so that size is seen as the key "cost driver". COCOMO (see Figure 5.3) [Boehm 1981] is typical in this respect. In this case size is given in terms of KDSI (Thousands of Delivered Source Instructions). For reasons already discussed this is a very simplistic approach.

Figure 5.3 Simple COCOMO model

The model comes in three forms: simple, intermediate and detailed. The simple model is intended to give only an order of magnitude estimation at an early stage. The intermediate and detailed versions differ from it only in that they have an additional parameter, a multiplicative "cost driver" determined by several system attributes.

To use the model you have to decide what type of system you are building:

  • Organic: refers to stand-alone in-house DP systems
  • Embedded: refers to real-time systems or systems which are constrained in some way so as to complicate their development
  • Semi-detached: refers to systems which are "between organic and embedded"
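Figure 5.3 is not reproduced here, so the following sketch assumes the coefficients usually quoted for Boehm's simple (basic) COCOMO, in which effort in person months is a * KDSI^b with a and b chosen by system type. The table name and function name are illustrative.

```python
# (a, b) pairs usually quoted for basic COCOMO [Boehm 1981];
# assumed here because the figure is not available.
BASIC_COCOMO_EFFORT = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kdsi, mode):
    """Estimated effort in person months for a system of `kdsi` thousand DSI."""
    a, b = BASIC_COCOMO_EFFORT[mode]
    return a * kdsi ** b


# Hypothetical 32 KDSI system estimated under each system type.
for mode in BASIC_COCOMO_EFFORT:
    print(mode, round(basic_cocomo_effort(32, mode), 1))
```

Because the exponent is greater than 1 in every mode, the estimated effort grows faster than size, and more so for embedded systems.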

The intermediate version of COCOMO is intended for use when the major system components have been identified, while the detailed version is for when individual system modules have been defined.

A basic problem with COCOMO is that in order to make a prediction of effort you have to predict the size of the final system. There are many who argue that it is just as hard to predict size as it is to predict effort. Thus we are merely replacing one difficult prediction problem with another. Indeed, in one well-known experiment managers were asked to look at complete specifications of 16 projects and estimate their implemented size in LOC. The result was an average deviation between actual and estimate of 64% of the actual size. Only 25% of estimates were within 25% of the actual.
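The two accuracy figures quoted above (average deviation relative to the actual size, and the proportion of estimates within 25% of the actual) can be computed as in the sketch below. The function names and the sample numbers are hypothetical; they are not the data from the experiment.

```python
def relative_errors(actuals, estimates):
    """Absolute deviation of each estimate as a fraction of the actual value."""
    return [abs(a - e) / a for a, e in zip(actuals, estimates)]


def mean_relative_error(actuals, estimates):
    """Average deviation between actual and estimate, relative to the actual."""
    errors = relative_errors(actuals, estimates)
    return sum(errors) / len(errors)


def proportion_within(actuals, estimates, level=0.25):
    """Fraction of estimates whose relative error is at most `level`."""
    errors = relative_errors(actuals, estimates)
    return sum(1 for r in errors if r <= level) / len(errors)


# Made-up sizes in LOC for four projects (illustration only).
actual_loc = [100, 200, 400, 800]
estimated_loc = [90, 300, 100, 1600]
print(mean_relative_error(actual_loc, estimated_loc))
print(proportion_within(actual_loc, estimated_loc))   # 0.25
```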

Figure 5.4 Simple COCOMO time prediction model

While the main COCOMO model yields a prediction of total effort in person months required for project development, this output does not in itself give you a direct prediction of the project duration. However, the equations in Figure 5.4 may be used to translate your estimate of total effort into an actual schedule.
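Figure 5.4 is not reproduced here. As a sketch, the schedule equations usually quoted for basic COCOMO have the form TDEV = 2.5 * E^c, where E is effort in person months, TDEV is development time in months, and c depends on the system type. The coefficients and names below are assumed on that basis rather than taken from the figure.

```python
# Exponents usually quoted for the basic COCOMO schedule equation
# [Boehm 1981]; assumed here because the figure is not available.
BASIC_COCOMO_SCHEDULE_EXPONENT = {
    "organic": 0.38,
    "semi-detached": 0.35,
    "embedded": 0.32,
}


def basic_cocomo_duration(effort_pm, mode):
    """Estimated development time in months from effort in person months."""
    return 2.5 * effort_pm ** BASIC_COCOMO_SCHEDULE_EXPONENT[mode]


def average_staffing(effort_pm, mode):
    """Average number of people implied by the effort and schedule estimates."""
    return effort_pm / basic_cocomo_duration(effort_pm, mode)


# Hypothetical organic project estimated at 91 person months.
months = basic_cocomo_duration(91, "organic")
print(round(months, 1), "months,", round(average_staffing(91, "organic"), 1), "people")
```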

Figure 5.5 Regression Based Cost Modelling

Regression based cost models (see Figure 5.5) are developed by collecting data from past projects for relationships of interest (such as software size and required effort), deriving a regression equation and then (if required) incorporating additional cost drivers to explain deviations of actual costs from predicted costs. This was essentially the approach of COCOMO in its intermediate and detailed forms.

A commonly used approach is to derive a linear equation in the log-log domain that minimises the residuals between the equation and the data points for actual projects. Transforming the linear equation,

log E = log a + b * log S

from the log-log domain to the real domain gives an exponential relationship of the form E = a * S^b. In Figure 5.5 E is measured in person months while S is measured in KLOC.
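A minimal sketch of this fitting step, assuming ordinary least squares on the logged data: the slope of the fitted line is b and the exponential of its intercept is a, giving E = a * S^b. The function name is illustrative, and the project data are synthetic points generated from a known equation so that the fit can be checked; they are not the data behind Figure 5.5.

```python
import math


def fit_power_law(sizes, efforts):
    """Fit log E = log a + b * log S by least squares; return (a, b)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope in the log-log domain
    a = math.exp(mean_y - b * mean_x)     # intercept, transformed back
    return a, b


# Synthetic "past projects": sizes in KLOC, efforts generated from
# E = 3.0 * S^1.1 (made up for the example).
sizes_kloc = [1, 2, 4, 8, 16]
efforts_pm = [3.0 * s ** 1.1 for s in sizes_kloc]
a, b = fit_power_law(sizes_kloc, efforts_pm)
print(round(a, 3), round(b, 3))
```

With real project data the points will not lie on the fitted line, and the residuals are what the cost drivers discussed next are meant to explain.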

If size were a perfect predictor of effort then every point would lie on the line of the equation, and the residual error would be 0. In reality there will be significant residual error. Therefore the next step (if you wish to go that far) in regression based modelling is to identify the factors that cause variation between predicted and actual effort. For example, you might find when you investigate the data and the projects that 80% of the variation in required effort for similar sized projects is explained by the experience of the programming team. Generally you identify one or more cost drivers and assign weighting factors to model their effects. For example, assuming that medium experience is the norm then you might weight ‘low’ experience as 1.3, medium as 1.0, and high as 0.7. You use these to weight the right hand side of the effort equation. You then end up with a model of the form

Effort = (a * Size^b) * F

where F is the effort adjustment factor (the product of the effort multiplier values). The intermediate and detailed versions of COCOMO contain 15 cost drivers, for which Boehm provides the relevant multiplier weights.
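The adjusted model can be sketched as follows, using the example experience weights from the text (1.3, 1.0, 0.7). The values of a and b, the function name and the sample project are hypothetical; in practice a and b would come from the regression and the multipliers from a calibrated table such as Boehm's.

```python
import math

# Example weights from the text, with medium experience as the norm.
EXPERIENCE_WEIGHT = {"low": 1.3, "medium": 1.0, "high": 0.7}


def adjusted_effort(size, a, b, multipliers):
    """Effort = (a * Size^b) * F, where F is the product of the multipliers."""
    adjustment_factor = math.prod(multipliers)   # F; equals 1 if no drivers apply
    return (a * size ** b) * adjustment_factor


# Hypothetical 10 KLOC project with assumed regression constants.
a, b = 3.0, 1.1
nominal = adjusted_effort(10, a, b, [EXPERIENCE_WEIGHT["medium"]])
inexperienced = adjusted_effort(10, a, b, [EXPERIENCE_WEIGHT["low"]])
print(round(nominal, 1), round(inexperienced, 1))
```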

Metrics of Functionality: Albrecht's Function Points

The COCOMO type approach to resource estimation has two major drawbacks, both concerned with its key size factor KDSI:

  • KDSI is not known at the time when estimates are sought, and so it must also be predicted. This means that we are replacing one difficult prediction problem (resource estimation) with another which may be equally difficult (size estimation);
  • KDSI is a measure of length, not size (it takes no account of functionality or complexity).

Albrecht's Function Points (FPs) [Albrecht 1979] are a popular product size metric (used extensively in the USA and Europe) that attempts to resolve these problems. FPs are supposed to reflect the user's view of a system's functionality. The major benefit of FPs over the length and complexity metrics discussed above is that they are not restricted to code. In fact they are normally computed from a detailed system specification, using the equation

FP = UFC * TCF

where UFC is the Unadjusted (or Raw) Function Count, and TCF is a Technical Complexity Factor which lies between 0.65 and 1.35. The UFC is obtained by summing weighted counts of the number of inputs, outputs, logical master files, interface files and queries visible to the system user, where:

  • an input is a user or control data element entering an application;
  • an output is a user or control data element leaving an application;
  • a logical master file is a logical data store acted on by the application user;
  • an interface file is a file or input/output data that is used by another application;
  • a query is an input-output combination (i.e. an input that results in an immediate output).

The weights applied to simple, average and complex elements depend on the element type. Elements are assessed for complexity according to the number of data items, and master files/record types involved. The TCF is a number determined by rating the importance of 14 factors on the system in question. Organisations such as the International Function Point Users Group have been active in identifying rules for Function Point counting to ensure that counts are comparable across different organisations.
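The calculation can be sketched as below, assuming the usual form FP = UFC * TCF. The text does not list the weights or the TCF formula, so the sketch assumes the commonly published Albrecht weights and TCF = 0.65 + 0.01 * (sum of the 14 factor ratings, each rated 0 to 5), which is consistent with the stated range of 0.65 to 1.35. All names and the sample counts are illustrative.

```python
# Commonly published weights per element type and complexity level
# (an assumption; the text does not list them).
WEIGHTS = {
    "input": {"simple": 3, "average": 4, "complex": 6},
    "output": {"simple": 4, "average": 5, "complex": 7},
    "query": {"simple": 3, "average": 4, "complex": 6},
    "logical_master_file": {"simple": 7, "average": 10, "complex": 15},
    "interface_file": {"simple": 5, "average": 7, "complex": 10},
}


def unadjusted_function_count(counts):
    """Sum of weighted counts; `counts` maps (element type, level) to a count."""
    return sum(WEIGHTS[kind][level] * n for (kind, level), n in counts.items())


def technical_complexity_factor(ratings):
    """TCF from 14 factor ratings, each between 0 and 5."""
    if len(ratings) != 14 or any(not 0 <= r <= 5 for r in ratings):
        raise ValueError("expected 14 ratings, each between 0 and 5")
    return 0.65 + 0.01 * sum(ratings)


def function_points(counts, ratings):
    """FP = UFC * TCF."""
    return unadjusted_function_count(counts) * technical_complexity_factor(ratings)


# Hypothetical counts taken from a specification.
spec_counts = {
    ("input", "average"): 10,
    ("output", "average"): 5,
    ("query", "simple"): 4,
    ("logical_master_file", "average"): 3,
    ("interface_file", "complex"): 1,
}
print(unadjusted_function_count(spec_counts))   # 117
print(round(function_points(spec_counts, [3] * 14), 2))
```

Rating all 14 factors at 0 or at 5 reproduces the two ends of the TCF range, so the adjustment can change the unadjusted count by at most 35% in either direction.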

Function points are used extensively as a size metric in preference to LOC. Thus, for example, they are used to replace LOC in the equations for productivity and defect density. There are some obvious benefits: FPs are language independent and they can be computed early in a project. FPs are also being used increasingly in new software development contracts.

One of the original motivations for FPs was as the size parameter for effort prediction. Using FPs avoids the key problem identified above for COCOMO: we do not have to predict FPs; they are derived directly from the specification which is normally the document on which we wish to base our resource estimates.

The major criticism of FPs is that they are unnecessarily complex. Indeed empirical studies have suggested that the TCF adds very little in practical terms. For example, effort prediction using the unadjusted function count is often no worse than when the TCF is added [Jeffery et al 1993]. FPs are also difficult to compute and contain a large degree of subjectivity. There is also doubt as to whether they actually measure functionality.



Last modified: July 28, 1999.