If you measure the weight of a person, you probably express this in a unit like kilograms, pounds, stones, etc.
But what you are in fact doing is comparing the person’s weight against a standardised measurement unit of weight, called a metric.
A metric is a unique way to define what you want to measure. When weighing a person, you perform an action on the person, an action called ‘measurement’. The action is based on the metric, and the result of the measurement is expressed in that metric.
What remains after the action is the result of the measurement: the measure.
In short, a metric is a unit definition, used in the action of measuring. The result of a measurement is compared to this metric, leading to a unique, uniform and commonly understandable measure.
But that is only one direction, from definition to action to result; there is also the reverse direction.
Sometimes it is necessary to put (counter)measures in place: reactions to a situation, usually when the result of a measurement shows that certain boundaries were (or were not) reached, depending on the situation.
These boundaries are often referred to as (entry and/or exit) criteria.
E.g. the music is too loud and exceeds the allowed boundaries, therefore you have to turn the volume down.
It is this flow of events that leads to the common belief that what you measure, you better understand; what you understand, you can control; and what you can control, you can improve. If you can’t measure it, you can’t understand it, you can’t control it, and therefore you can’t improve it.
“Measurement is the first step that leads to control and eventually to improvement. If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.” – H. James Harrington
But what happens if you encounter a situation where the maths is not that simple? E.g. the performance of a company, an ecological system, traceability in software testing, etc.
In that case it all starts with the definition, the metric, of what you want to measure. Often you will find that you need multiple comparisons, multiple units/metrics, but that it is still possible to break down what you want to measure into a number of measurements, giving you a unique and uniform view on what you are measuring and allowing you an objective way of evaluating the situation.
Again, this is easier to explain with an example. Let’s have a look at traceability in software testing.
We start from the assumption that every requirement needs to be covered by (is traceable to) at least 1 use case, every use case needs to be covered by at least 1 positive test suite, and every positive test suite must be covered by at least 1 test case.
I know that this is probably not the way you are setting up your testing, but bear with me just for the example.
In the above setup I will be measuring:
- what percentage of requirements are covered by a use case
- what percentage of use cases have at least 1 positive test suite
- what percentage of test suites are covered by at least 1 test case
These 3 numbers give me a pretty good view starting from requirements.
On the other hand, we can also go the reverse way.
- What percentage of test cases are used in a test suite?
- What percentage of test suites are related to a use case?
- What percentage of use cases cover a requirement?
The 6 numbers together give us an even better view on the traceability between requirements/use cases/test suites/test cases.
Each metric is a ratio that can be expressed as a percentage, and it allows an objective evaluation of the traceability.
It doesn’t give a complete view: ‘a requirement must be covered by at least 1 use case’ doesn’t tell you whether the requirement is sufficiently covered by that one use case. (But that is perhaps for the next article.)
What I tend to do is to have 1 formula that combines all the numbers and allows me to score e.g. traceability. In the above example you could take the average of all the numbers, or you could use a radar diagram to show the footprint of the traceability, as shown in the image.
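The six ratios and the combined score can be sketched in a few lines of code. This is a minimal illustration, not a real tool: the link data and all requirement/use-case/suite/case names are made up, and the combined score is simply the average mentioned above.

```python
# Minimal sketch of the six traceability ratios; all link data is illustrative.

def coverage(links):
    """Percentage of items that have at least one outgoing traceability link."""
    if not links:
        return 0.0
    covered = sum(1 for targets in links.values() if targets)
    return 100.0 * covered / len(links)

# Forward direction: requirement -> use cases -> test suites -> test cases.
req_to_uc = {"R1": ["UC1"], "R2": ["UC2"], "R3": []}     # R3 is uncovered
uc_to_ts  = {"UC1": ["TS1"], "UC2": []}                  # UC2 has no suite
ts_to_tc  = {"TS1": ["TC1", "TC2"]}

# Reverse direction: test case -> suites, suite -> use cases, use case -> requirements.
tc_to_ts  = {"TC1": ["TS1"], "TC2": ["TS1"], "TC3": []}  # TC3 is unused
ts_to_uc  = {"TS1": ["UC1"]}
uc_to_req = {"UC1": ["R1"], "UC2": ["R2"]}

ratios = [coverage(m) for m in
          (req_to_uc, uc_to_ts, ts_to_tc, tc_to_ts, ts_to_uc, uc_to_req)]
score = sum(ratios) / len(ratios)  # one combined traceability score (plain average)

print([round(r, 1) for r in ratios], round(score, 1))
# → [66.7, 50.0, 100.0, 66.7, 100.0, 100.0] 80.6
```

The same six ratios could just as well feed a radar diagram, one axis per ratio, instead of being collapsed into a single average.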
But what happens if there are unknown elements playing a role, elements you can’t put your finger on? E.g. the stock market, profiling a serial killer, etc.
If you have a system with so many variables, so many parameters playing a role, it becomes harder to find the right definition for the metric. You could even ask yourself whether it makes sense to define a metric at all.
The management of a company loves numbers. Preferably numbers that show how well they are doing. These numbers are often referred to as KPIs (key performance indicators), and the name already explains the difference with a metric.
A metric gives a fixed, uniform, consistent view on what you want to measure. A metric allows an unbiased evaluation.
A KPI is nothing but an indicator and the (subjective) interpretation of the indicator is as important as the KPI itself.
E.g. based on approved budget cost, earned value, percentage completed, etc., it is possible to calculate a cost performance indicator and a schedule performance indicator. Numbers greater than 1 indicate a positive trend, while numbers less than 1 indicate a negative trend.
At first sight you could think that this is a metric, but the formulas used to calculate these indicators are experience-based and don’t tell you why the result of the indicator is what it is.
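For illustration, here is how such indicators are commonly computed, assuming the standard earned-value formulas (cost indicator = earned value / actual cost; schedule indicator = earned value / planned value). All numbers are invented for the example.

```python
# Illustrative earned-value numbers, not from a real project.
budget_at_completion = 100_000.0  # total approved budget
percent_complete     = 0.40       # 40% of the scoped work is done
actual_cost          = 50_000.0   # what was actually spent so far
planned_value        = 45_000.0   # what the plan said should be spent by now

earned_value = budget_at_completion * percent_complete  # 40,000

cpi = earned_value / actual_cost    # cost performance indicator
spi = earned_value / planned_value  # schedule performance indicator

# Both are below 1, a negative trend, but neither says *why*:
# too many people? too expensive people? too expensive equipment?
print(round(cpi, 2), round(spi, 2))  # → 0.8 0.89
```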
When measuring traceability, we could tell exactly why a number was low: requirements were not covered by use cases, so by writing more use cases we know that the number will improve.
There is a 1-on-1 relationship.
In the case of e.g. a cost indicator, we don’t know. Are we using too many people, too expensive people, too expensive equipment, etc.?
That’s why it is an indicator and not a metric. But indicators have their benefits in helping people to understand complex systems and situations. They are built on statistical data and experience and will often lead to the correct evaluation.
The only difference with metrics is that this evaluation is biased and depends on the person making the evaluation.
I couldn’t profile a serial killer, even if I wanted to, but profilers can, because they have the experience to evaluate the statistics captured from a crime scene and come to conclusions about who the perpetrator could be.
We only have to be aware that numbers, statistics and indicators are not sacrosanct and should never replace common sense.
The quote mentioned above, “Measurement is the first step that leads to control and eventually to improvement. If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.”, holds a huge risk.
It implies that if you want to improve something you MUST measure it. It doesn’t say HOW to measure it.
The risk is that people will come up with numbers that in fact give ‘false witness’: numbers that apparently show something, but in fact can mean something completely different.
E.g. in software testing, one of the numbers that management likes is the number of defects found per executed test case. There is even a benchmark that says that on average 1 defect is found for every 2 test cases executed.
People could think that if they find 0.6 defects per test case, they are doing a great job, but who says that the testers are doing great?
Perhaps the developers are screwing up big time and their software contains more bugs to be found.
As indicated before, you must be aware of the difference between metrics and indicators. #defects / #executed test cases is an indicator, and a subjective evaluation will be needed to find the root cause if the number indicates a problem.
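To make that concrete, here is a toy calculation. The defect counts and detection rates are completely hypothetical; the point is only that two very different situations can produce exactly the same indicator value.

```python
# Hypothetical numbers: the indicator cannot see how many defects exist in total.

def defects_per_test_case(defects_found, test_cases_executed):
    return defects_found / test_cases_executed

tests_executed = 100

# Situation A: good code with few bugs, strong testers who find most of them.
found_a = round(100 * 0.6)  # 60 of 100 bugs found
# Situation B: buggy code with many bugs, weak testers who find only a fraction.
found_b = round(200 * 0.3)  # 60 of 200 bugs found

indicator_a = defects_per_test_case(found_a, tests_executed)
indicator_b = defects_per_test_case(found_b, tests_executed)

print(indicator_a, indicator_b)  # → 0.6 0.6
```

The two indicator values are identical, while the underlying causes are completely different; only a root-cause evaluation can tell them apart.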
Metrics and Quality
It is a huge leap to go from metrics to quality (and that should be the subject of another article), but I don’t want to withhold the following rules I found. My apologies, I no longer know where I got them from, but I do know that they help me figure out whether a number is a metric or not.
- Correlation : linear relation Q/M
- Consistency : Improving M = improving Q
- Tracking : Changing M = Changing Q
- Predictability : If M then Q
- Discriminative power : High M,Q ≠ Low M,Q
- Reliability : M valid for x% of use of M
- (M = Metric; Q = Quality)
What these rules in fact say is that a metric is related 1-on-1 to a quality characteristic. If the metric improves, the quality also improves; if the metric changes, the quality also changes, and vice versa. The link between metric and quality characteristic is not occasional, but can be used at all times. If the metric changes, the quality has to change by relatively the same amount. Furthermore, metrics must be valid almost all the time. Only e.g. a division by zero allows the metric not to be valid.
Sense and/or Nonsense?
And this brings us to the title of this article: the sense and nonsense of metrics. I would even dare to take it one step further and talk about the sense or nonsense of numbers (metrics and/or indicators).
NEVER rely solely on numbers. Make sure that numbers are always interpreted by capable people and if you communicate them, make sure the interpretation is part of the communication.
Numbers must mean something, so it is good practice to link them to what you consider a quality characteristic. Often the GQM (Goal – Question – Metric) method of Basili is used to do this. Also, ISO and IEEE standards concerning quality contain references to a bunch of measurements you can make.
NEVER measure because it is easy to do. Sometimes it is easy to define a metric or indicator, but nobody does anything with it. It is nice to know the shoe size of the project manager, but it will not help our project.
Measure Everything That’s Required to Increase Customer Satisfaction.
ALWAYS think of what you want to achieve and for whom you are doing it, although it is not always clear who your customer is. As a test manager, who is your customer? Your management? The project manager? The client that will use the product? If it is not clear, prepare to have multiple customers and multiple sets of metrics.
Personally, I tend to make everything I do measurable. It makes it easier to stop myself. I get so passionate about what I do that I tend to do more than necessary. By defining metrics and spending time on defining them, I trap myself and am able to keep myself under control. I guess this is maturity, no?
Please, share your experiences with metrics …
Measure:
- Ascertain the size, amount, or degree of (something) by using an instrument or device marked in standard units.
- Estimate or assess the extent, quality, value, or effect of (something).
- Judge (someone or something) by comparison with a certain standard.
- Scrutinize (someone) keenly in order to form an assessment of them.
- A plan or course of action taken to achieve a particular purpose.
- Punishment or retribution imposed or inflicted on someone.
Another nice way to represent data and make a score visible is a bubble chart. Not only can you show the importance of the items, you can also show the weight of each item. E.g. a risk with a higher impact and a higher probability has a higher weight.
How to measure a serial killer?
Not so long ago there was a guy who thought he could measure whether someone was a serial killer.
By measuring the size of the eyes, the forehead and a bunch of similar measurements, he claimed that he could predict who could be a serial killer.
I’m pretty sure that he annoyed a lot of people and also falsely accused people, but in the end common sense won.
Now profilers use statistical data to figure out the motives and the profile of serial killers, but they also know that it is not an exact science. The subjective opinion as well as the experience of the profiler play a big role.
Victor R. Basili, born April 13, 1940 in Brooklyn, New York, is an Emeritus Professor at the Department of Computer Science and the Institute for Advanced Computer Studies, both at the University of Maryland. He holds a Ph.D. in Computer Science from the University of Texas at Austin and two honorary degrees. He is a Fellow of both the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE).
From 1982 through 1988 he was Chair of the Department of Computer Science at the University of Maryland. He is currently a Senior Research Fellow at the Fraunhofer Center for Experimental Software Engineering – Maryland and from 1997 to 2004 was its Executive Director.
He is well known for his work on measuring, evaluating and improving the software development process, especially his papers on the Goal/Question/Metric approach, the Quality Improvement Paradigm, and the Experience Factory.
“12 Characteristics of Effective Metrics”
Wayne W. Eckerson on April 19, 2010.
Creating performance metrics is as much art as science. To guide you in your quest, here are 12 characteristics of effective performance metrics.
- Strategic. Start at the end point, with what you want to achieve, and then work backwards.
- Simple. Know what is measured and how it is calculated. If too difficult, re-think.
- Owned. A metric has an owner who is held accountable for its outcome.
- Actionable. If a metric indicates problems, employees should know what actions to take to improve the measures.
- Timely. Metrics must be updated frequently.
- Reference-able. Users must understand the metric’s origins.
- Accurate. Underlying data needs to be scanned for defects, standardised, deduplicated and integrated before displaying.
- Correlated. Metrics must drive desired outcomes.
- Game-proof. Ensure that metrics cannot be circumvented.
- Aligned. Metrics are aligned with corporate objectives.
- Standardised. All agree on the definitions of metrics.
- Relevant. Metrics have a natural life cycle.
The following basic information is needed to define a metric.