In industries other than the software industry, productivity measurement is a normal activity that drives the success of a company. Take, for example, a one-man painting company. For a painter, it would be logical to measure his productivity in effort hours per square meter. He probably wants to differentiate the measurement into categories, like tools used (e.g. roller, brush, spray) and paint object characteristics (e.g. wall, stairs, door). Once the painter has built up a database of productivity figures, he can easily quote for new painting jobs, simply by measuring the paint surface in square meters, multiplying it by the proper productivity rate and multiplying the result by his hourly rate. If there is an (international) database available with the productivity rates of paint jobs performed by other companies in the industry, the painter understands how he performs against the industry average. If he is not yet best-in-class, he understands that to win new paint jobs he has to keep improving his productivity (as lowering the hourly rate is usually not a very good idea).
He can do all this, but only when:
- He uses a standard measurement unit, e.g. square meter. Only by using standards can productivity rates be compared (benchmarked) and used for estimating;
- He uses a standard way to record the effort hours. For example, is the lunch break included or excluded, is the time to talk to the customer to understand his requirements included/excluded?
- He uses meaningful categories that differentiate productivity. For a painter, it may not matter too much if the paint object (let’s say a wall) is in a villa or in a fisherman’s house. The type of house may not be a meaningful category. However, the tool he uses is probably a main productivity driver.
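The painter's quoting arithmetic described above can be sketched in a few lines. This is only an illustration: the rate table, its values and the function name `quote` are assumptions made for the example, not figures from any real database.

```python
# Hypothetical productivity rates in effort hours per square meter,
# keyed by (tool, paint object). The numbers are illustrative only.
PRODUCTIVITY_RATES = {
    ("roller", "wall"): 0.10,
    ("brush", "door"): 0.50,
    ("spray", "wall"): 0.05,
}


def quote(surface_m2, tool, paint_object, hourly_rate):
    """Return (effort_hours, price) for a paint job.

    effort = surface * productivity rate; price = effort * hourly rate.
    """
    rate = PRODUCTIVITY_RATES[(tool, paint_object)]
    effort_hours = surface_m2 * rate
    return effort_hours, effort_hours * hourly_rate


hours, price = quote(surface_m2=80, tool="roller", paint_object="wall", hourly_rate=40)
print(f"{hours:.1f} effort hours, price {price:.2f}")
```

Keying the rates by category is what makes the quote depend on the productivity drivers (here the tool and the paint object) rather than on a single overall average.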
Now, let’s take a look at the software industry.
The Software Industry and Productivity Measurement
Unfortunately, the IT (and software) industry is still quite immature when it comes to using standards and when it comes to productivity (performance) measurement, benchmarking and continuous improvement. The industry got away with that for a long time, because:
- It is difficult to measure output (software is not a physical thing, can’t be touched and measured with conventional measurement instruments).
- Software projects are much more like an R&D project than manufacturing a product. R&D is incredibly hard to measure. It is relatively easy to measure the inputs, but the outputs are hard to measure and unpredictable by nature.
Now, slowly the industry is becoming more and more transparent and customer organizations ask potential suppliers more and more to quantify their performance based on historical data. This way, it becomes possible to select the best supplier for the job. Please note that the best choice is usually not the least expensive choice (often resulting in project failures…or even disasters).
While it may seem easy to implement a Productivity Measurement process in an organization, reality shows that it is more difficult than one may think. In principle, just like the painter, it is sufficient to measure inputs (usually effort hours) and outputs (Units of Measurement, UoM) per software project, while using meaningful categories to differentiate the projects, like technology (Java/.Net/Oracle/etc.), project type (new development/enhancement) and/or implementation (package implementation/modification/custom made software).
To be able to establish meaningful and comparable productivity metrics, it is critical that (international) standards are used. A number of choices have to be made:
- Effort hours in/out scope of measurement, for instance:
  - Technical design, coding, unit testing, system testing, other supplier testing, overhead in scope;
  - Functional design, support of acceptance testing, implementation activities out of scope.
- Overtime in/out scope of measurement;
- Travel time, meeting hours, overhead hours in/out scope;
- In case of packages, Portals/CMS or other configurable software, it may be necessary to have separate effort registration activities for customization, setting parameters and custom made software not part of the package.
To be able to analyze the productivity of a supplier, department or team, the effort registration system should be implemented in a standard way. If the choice is made that functional design hours are out of scope, all projects should register their effort of functional design separately from the other effort hours. It is strongly recommended to draw up a standard ‘Work Breakdown Structure (WBS)’ per project type and implement this WBS in the effort registration system. Everybody who registers effort hours should be aware of the importance of booking their effort correctly in the system.
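As a minimal sketch of why a standard WBS matters: once every booking carries a WBS activity code, the in-scope effort per project can be derived mechanically. The activity codes, the in/out-of-scope split and the sample bookings below are hypothetical and only mirror the example choices listed above.

```python
from collections import defaultdict

# Hypothetical WBS activity codes that count towards productivity measurement.
# The split mirrors the example scope decision above; it is an assumption,
# not a standard.
IN_SCOPE = {"technical_design", "coding", "unit_test", "system_test", "overhead"}

# Hypothetical effort bookings: (project, WBS activity, hours).
bookings = [
    ("P1", "functional_design", 120.0),
    ("P1", "coding", 400.0),
    ("P1", "unit_test", 80.0),
    ("P1", "acceptance_test_support", 40.0),
    ("P2", "coding", 250.0),
    ("P2", "system_test", 60.0),
]


def in_scope_hours(entries, in_scope=IN_SCOPE):
    """Sum only the hours booked on in-scope WBS activities, per project."""
    totals = defaultdict(float)
    for project, activity, hours in entries:
        if activity in in_scope:
            totals[project] += hours
    return dict(totals)


print(in_scope_hours(bookings))
```

If functional design hours were booked on the same code as coding, this separation would no longer be possible, which is why the bookings need to follow the WBS consistently.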
Measuring Outputs, methods that should be avoided
Measuring outputs is somewhat harder than measuring the inputs, due to the intangible nature of software. Many organizations measure the delivered source lines of code (slocs) of the software product delivered after completion and use productivity metrics like effort hours per 1000 slocs. This seems like a good way to go, but in fact there are many reasons why this is not a recommended practice:
- There is no (ISO or other) standard for Source Lines of Code. The result is that different automatic code counting tools produce (very) different results for the same code.
- It is not clear if more code is ‘good’ or ‘bad’. Source lines of code are of no value to the customer organization; functionality is. Customers never say: “Please give me 100000 source lines of code”. No, it is functionality in terms of features that they require. More functionality is better and costs more; more slocs are not necessarily better.
- Different programming languages (and mixes of these) result in very different source lines of code results.
American ‘software metrics guru’ Capers Jones wrote in the paper ‘Software Defect Origins and Removal Methods’ (2013) that sloc measurements are so inaccurate that using slocs in software measurement is in fact ‘professional malpractice’.
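The lack of a counting standard is easy to demonstrate. The sketch below applies three plausible counting conventions to the same small snippet; the conventions and function names are chosen for illustration and do not reproduce the rules of any particular counting tool.

```python
import ast

# A small sample program, held as text so that it can be counted.
SAMPLE = '''\
# compute a total
total = 0

for x in [1, 2, 3]: total += x; print(x)

print(total)
'''


def physical_lines(source):
    """Count every line, including blank lines and comments."""
    return len(source.splitlines())


def non_blank_non_comment_lines(source):
    """Count lines that are neither blank nor a full-line '#' comment."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def logical_statements(source):
    """Count statement nodes in the parsed syntax tree."""
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


print(physical_lines(SAMPLE))               # 6
print(non_blank_non_comment_lines(SAMPLE))  # 3
print(logical_statements(SAMPLE))           # 5
```

Three defensible conventions give three different sizes for the same code, so an ‘effort hours per 1000 slocs’ figure depends as much on the counting rule as on the project.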
Other size measures that are often used in the industry, but are also not recommended to use in productivity measurement:
- Story points (SP) in agile projects: A very subjective measure that only has value within one team. Comparison to other teams, departments and organizations is not possible. Please note that SP are useful to plan sprints and to track velocity for one team, but for productivity measurement SP are close to useless.
- Use Case Points (UCP): Only applicable when the documentation consists of use cases. UCP is also a highly subjective method, especially when it comes to establishing the Technical Complexity Factor and the Environmental Factor. Also, there is no standard way to write use cases, see for instance the five possible levels of granularity as described here.
- Complexity Points: A subjective, non-standardized method to measure the complexity of an application.
- IBRA Points: A non-standardized method to measure the business rules in an application. When applied according to the manual, the result is zero for all applications.
- Fast Function Points (FFPA) (by Gartner): A measurement method deployed by Gartner that cannot be compared to the ISO standardized function point analysis methods. FFPA is perceived to be a commercial method that lacks a theoretical base and is partly subjective. The method has not been proven to be faster or more accurate than the Nesma estimated method. Unfortunately, it is often pushed at higher management level without the support of the specialists who have to work with it.
Measuring Output – strongly recommended methods
It is a highly recommended practice to use an ISO/IEC standard for functional size measurement in Productivity Measurement of software projects. There are five functional size measurement methods that comply with the ISO/IEC standard:
- Nesma function points (ISO/IEC 24570);
- IFPUG function points (ISO/IEC 20926);
- COSMIC function points (ISO/IEC 19761);
- Mark II function points (ISO/IEC 20968);
- FiSMA function points (ISO/IEC 29881).
Advantages of using one of these functional size measurement methods for productivity measurement are:
- Objective, repeatable, verifiable, defensible way to determine the size of the software;
- A clear relation between functional size and the effort needed to realize the application. This has been studied and verified many times;
- The measure is clear to both customer organizations and supplier organizations. More functionality means more value, more effort needed and a higher price;
- Functional size is independent of the technical solution and/or the non-functional requirements. An application of 500 Nesma function points implemented in Java has the same functional size as any other application of 500 FP, whatever technology it is built with. This enables comparison and benchmarking across technical domains and the use of historical project data (when properly classified) in estimating new software projects.
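With functional size as the output measure, productivity becomes effort hours per function point, and classified historical data can feed an estimate. The sketch below is a minimal illustration under stated assumptions: the project records are invented, and using the median rate per technology is one possible choice of summary statistic, not a method prescribed here.

```python
from statistics import median

# Hypothetical completed projects: functional size in function points (FP),
# in-scope effort hours, and one classification category.
history = [
    {"technology": "Java", "size_fp": 500, "effort_hours": 4000},
    {"technology": "Java", "size_fp": 200, "effort_hours": 1800},
    {"technology": "Java", "size_fp": 300, "effort_hours": 2100},
    {"technology": ".Net", "size_fp": 400, "effort_hours": 2800},
]


def hours_per_fp(project):
    """Productivity rate of one project in effort hours per function point."""
    return project["effort_hours"] / project["size_fp"]


def median_rate(projects, technology):
    """Median hours per FP over all projects in the given technology."""
    rates = [hours_per_fp(p) for p in projects if p["technology"] == technology]
    return median(rates)


def estimate_effort(size_fp, projects, technology):
    """Estimate effort for a new project from its functional size."""
    return size_fp * median_rate(projects, technology)


print(median_rate(history, "Java"))           # 8.0 hours per FP
print(estimate_effort(350, history, "Java"))  # 2800.0 hours
```

The same rate can be set against an external benchmark for the same category to see how a team performs relative to the industry, exactly as the painter does with rates per square meter.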
Useful categories for data collection
To be able to compare and benchmark your productivity, it is important to use standard categories to collect data from your projects. Nesma highly recommends using the definitions and categories that the International Software Benchmarking Standards Group (ISBSG) uses in its data collection activities.
ISBSG is a ‘not-for-profit’ organization that collects software industry data and that grows, maintains and exploits two repositories: ‘New developments and enhancements’ and ‘Maintenance & Support’. ISBSG can only do this when data is collected in a standard way. The ‘Data Collection Questionnaires’ can be downloaded from the ISBSG site and already show a lot of definitions and categories. The glossaries that are provided with the repositories are also helpful.
Implementing software productivity measurement
So, just like the painter from the example above, it is possible to measure the productivity of software projects. See this page for a few examples.
To successfully implement software productivity measurement, it is strongly recommended to use the document ‘Basis of Measurements’ that is listed in the list of publications.