Evaluating software quality without knowing the source code is challenging. Describing and testing functional requirements is much easier: As a requirements engineer, you talk with the stakeholders about the functions they demand, and after implementation you simply check whether those functions are there and work correctly. Some categories of software quality [ISO 25010] are visible to the client and the stakeholders, too:
- functional suitability: functional completeness, functional correctness and functional appropriateness
- time behavior
- compatibility: co-existence and interoperability
- usability: appropriateness recognizability, learnability, operability, user error protection, user interface aesthetics and accessibility
- reliability: maturity, availability, fault tolerance and recoverability
- installability (if the client himself installs the application)
Although these kinds of requirements are usually much harder to capture and to test than functional requirements, their implementation is generally visible to the client.
The really nasty software quality dimensions are these:
- performance efficiency: resource utilization and capacity
- security: confidentiality, integrity, non-repudiation, accountability and authenticity
- maintainability: modularity, reusability, analyzability, modifiability and testability
- portability: adaptability, installability (if the supplier installs the application) and replaceability
The client cannot ‘see’ those quality aspects. They are hidden and lie under the bonnet. Sophisticated testing and comprehensive technological and programming knowledge are required to evaluate these hidden software qualities: security experts are needed to perform penetration tests, software architects can assess the software design regarding maintainability, and only operations teams are able to check compatibility and portability.
Normally the client has neither this knowledge nor the resources for such evaluations. But the client is affected by a lack of quality: insufficient maintainability leads to increasing development costs, missing portability might cause a technological dead end, inefficiency causes increasing operating costs, and security holes might have a dramatic business and legal impact. Only then does the client become aware of the lack of quality, and by then it is too late. A well-known example is the case of unintended acceleration of Toyota vehicles: Because some functions were overly complex and untestable, bugs were not found and people died [SRS].
But how can the client demand and check invisible qualities?
The RE Classics
The common way of dealing with hidden software quality is a combination of specification and trust. Next to the functional requirements, the specification contains non-functional requirements such as software quality requirements. These quality requirements are either general and independent of particular functional demands, or they are attached to specific functional requirements (see e.g. [RUPP et al.], p. 247 ff.). There is nothing wrong with defining quality requirements in that way, but the approach has two limits:
First, in practice, many quality requirements specified in that way are vague and unspecific. Criteria for good requirements such as ‘testable’ and ‘estimable’ are often not met, and the effort needed to formulate good quality requirements is quite high. For example, the requirement ‘the software should be designed in such a way that modifications can easily be realized’ does not specify anything. It is just a declaration of intent without any concrete guidance. To make it precise, many questions must be clarified: What is easy, i.e., what effort for a modification is acceptable? What kind of modifications are we talking about? Answering these questions requires a comprehensive and time-consuming up-front effort. Such an effort is usually not appropriate and in many cases not compatible with agile software development. Furthermore, testing the implementation of those requirements takes considerable effort. Even if you are able to define acceptance criteria, you still have to carry out the acceptance test. This is easy for most functional requirements and possible for visible quality requirements, but it can be really hard and costly for invisible ones.
Second, specifying requirements classically and up-front carries a general danger: You get what you specified, not necessarily what you want. One central purpose of iterative development is to avoid this effect: During the development process, requirements that have turned out to be unnecessary or unimportant can be removed and replaced by new ones. Thus, waste caused by implementing unused functions is avoided or at least reduced. But what about quality requirements, for instance the quality requirement ‘modifiability’, which is not linked to any concrete functional requirement? This requirement cannot be defined during implementation but should be formulated up-front. Thus, it can produce waste because the software architecture might be prepared for modifications that are never needed. In this case, it is appropriate to define a quality vision rather than a concrete specification. The vision is necessary to develop an appropriate software architecture, whereas an extensive specification might encourage an overly complex architecture whose flexibility is never used.
Specifying software quality requirements is useful as long as stakeholders and development team are aware of these limits, and as long as the effort for the specification is reasonable, especially at the beginning of the project. But because of these limits, it makes sense to add further, alternative approaches.
Don’t Do Micromanagement
The job of the client (the requirements engineer or the product owner) is to capture, communicate and test business requirements. It is the contractor (the IT supplier or development team) who is responsible for their implementation. Thus, the contractor, not the client, derives technical requirements from business requirements. This includes quality metrics like cyclomatic complexity or clean code guidelines.
Some clients tend to do the contractor’s job. They define technical quality guidelines and metrics, and they demand and control their realization. This behavior shifts responsibility from the contractor to the client: If the technical requirements defined by the client are fulfilled and it later turns out that they do not meet business demands, then the client, not the contractor, is responsible. This is micromanagement: The management (here the client) tells the experts (here the contractor) how to do their job. Micromanagement encourages working according to guidelines and processes, not according to the actual objectives. Micromanagement encourages teams to develop in brain-off mode.
Agile, self-organized development teams in particular must have the freedom to decide how they want to realize business requirements, as stated in the Agile Manifesto: “Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.” [AGILE MANIFESTO] Micromanagement by the client contradicts agile development approaches. Thus, it is a serious impediment when changing traditional organizations into agile organizations.
There can be exceptions: Some industries, such as the medical industry, have to fulfill legal or regulatory requirements which force the development team to adhere to certain code guidelines and metrics. Some authorities and courts rely on micromanagement to ensure product safety, and the client has to forward their requirements to the contractor.
Since specifying invisible software quality requirements has limits and micromanagement should be avoided, the approaches presented below focus on monitoring business quality requirements under laboratory and field conditions. These approaches should supplement or replace classical specifications.
Software quality is made for real life: for real customers and markets and for real operational and technical environments. Thus, monitoring quality under real conditions is the best way to check whether the software meets business criteria. Monitoring under field conditions can be dangerous, since a lack of quality that is detected late might cause real business damage. Therefore, field monitoring is only appropriate if this danger and its business consequences can be restricted to an acceptable level. Nevertheless, field monitoring methods are available for all quality categories, invisible and visible ones.
Except for the dimension ‘time behavior’, the performance efficiency dimensions and their absence are usually not noticed by the users of the software. But operators notice them when system resources such as storage or processors have to be added. The costs of such resources are the actual business impact. These costs can be monitored and benchmarked by the operator or client of the system.
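One way to benchmark these costs is to relate them to the business volume handled in the same period, so that growth in usage is not mistaken for inefficiency. The following is a minimal sketch under stated assumptions: the record fields, the 20 % tolerance and the sample figures are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class MonthlyOperation:
    month: str            # e.g. "2024-01"
    resource_cost: float  # assumed: total cost of storage, processors, etc.
    transactions: int     # assumed: business transactions processed


def cost_per_transaction(record: MonthlyOperation) -> float:
    """Resource cost divided by the business volume of the same month."""
    if record.transactions <= 0:
        raise ValueError(f"no transactions recorded for {record.month}")
    return record.resource_cost / record.transactions


def months_above_baseline(records, tolerance=0.20):
    """Months whose unit cost exceeds the first month's by more than `tolerance`."""
    if not records:
        return []
    limit = cost_per_transaction(records[0]) * (1.0 + tolerance)
    return [r.month for r in records[1:] if cost_per_transaction(r) > limit]


# Hypothetical sample data, not taken from any real system.
history = [
    MonthlyOperation("2024-01", 1000.0, 100_000),
    MonthlyOperation("2024-02", 1100.0, 110_000),
    MonthlyOperation("2024-03", 1500.0, 115_000),
]
flagged = months_above_baseline(history)
print(flagged)
```

A flagged month is not proof of inefficient software, but it is a business-level signal that the client can raise with the contractor without prescribing any technical metric.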
At first glance, the idea of checking security in the field seems quite odd. Finding security holes only during operation can lead to severe reputational damage and legal consequences. Thus, validating security in advance under laboratory conditions is necessary. Nevertheless, even when the software has already been rolled out, external security experts and hackers can provide valuable information on existing security holes. For this purpose, bug bounty platforms like HackerOne offer a communication channel between system providers and the hacker community. Hackers and security researchers may legally report discovered security holes to the provider, and some providers pay for reported holes. In this way, the provider learns about security holes that were previously unknown.
Maintainability consists of the dimensions ‘modularity’, ‘reusability’, ‘analyzability’, ‘modifiability’ and ‘testability’. All these dimensions share the same basic business impact: They enable efficient and thus economical further development of the IT system during the entire product lifecycle. But these dimensions come in different flavors. Modularity enables an efficient exchange of modules and components and allows different development teams to work independently and thus efficiently. Therefore, modularity is crucial for large products developed by several teams. Reusability accelerates software development and thus decreases long-term implementation costs. Analyzability supports bug fixing and maintenance even by developers who are not familiar with the source code. Therefore, it is important for long-term maintenance. Like reusability, modifiability enables an efficient implementation of change requests and any other further development. Testability supports bug finding, stabilizes the entire development process and thus decreases its costs. Although all these dimensions have an impact throughout the product lifecycle, modularity and reusability are more important during the initial development, while analyzability, modifiability and testability become more important for long-term software maintenance. Because the complexity of a system usually increases over time, a lack of maintainability leads to increasing development costs.
Adaptability and replaceability can be measured only to a limited extent for a system in operation. Because events like adapting the software to another environment or replacing software components are singular rather than continuous, only limited data on their costs are available. However, the time needed to install the software can be measured periodically. Increasing installation times indicate a lack of automation or missing documentation.
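Whether installation times are actually increasing can be checked with a simple trend line over the periodic measurements. The sketch below is illustrative only: it fits a least-squares slope to a list of installation durations, and the sample values are hypothetical.

```python
def trend_slope(durations_minutes):
    """Least-squares slope of durations over the measurement index.

    The result is the average change in minutes from one measurement
    to the next; a clearly positive value means installs are getting slower.
    """
    n = len(durations_minutes)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = (n - 1) / 2
    mean_y = sum(durations_minutes) / n
    numerator = sum(
        (x - mean_x) * (y - mean_y) for x, y in enumerate(durations_minutes)
    )
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical installation times in minutes for five consecutive releases.
install_minutes = [30, 32, 35, 41, 48]
slope = trend_slope(install_minutes)
print(f"installation time changes by about {slope:.1f} minutes per release")
```

The slope only shows that something is drifting. Finding out whether the cause is missing automation or missing documentation remains a conversation with the team that performs the installation.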
In the Laboratory
Simulating real operating conditions in the laboratory is an alternative or a supplement to validating software quality under field conditions. It is especially important if a lack of quality would cause too much damage in real operation. Simulation may reveal a lack of quality before rollout. Thus, it is a core approach of quality assurance from a business perspective.
In the following, several quality monitoring methods under laboratory conditions are presented, which might be appropriate depending on the software type, the organization and the product vision.
Load tests are a common method for testing the capacity of IT systems in the laboratory. Typically, there is a special pre-live stage that should have the same technical environment and basically the same data as the production stage. Thus, it simulates real operating conditions and enables realistic load tests.
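The basic mechanics of a load test can be sketched in a few lines: a number of simulated users send requests concurrently, and throughput and latency are recorded. This is a minimal illustration, not a replacement for a dedicated load testing tool. The function `handle_request` is a hypothetical stand-in that merely sleeps; in a real test it would call the pre-live stage.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> None:
    """Stand-in for one call to the pre-live stage (assumption: ~10 ms each)."""
    time.sleep(0.01)


def timed_call(request_id: int) -> float:
    """Return the latency of a single request in seconds."""
    start = time.perf_counter()
    handle_request(request_id)
    return time.perf_counter() - start


def run_load_test(total_requests: int, concurrent_users: int) -> dict:
    """Send requests from a fixed pool of simulated users and summarize the run."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = sorted(pool.map(timed_call, range(total_requests)))
    elapsed = time.perf_counter() - started
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    p95_index = max(0, (95 * len(latencies) + 99) // 100 - 1)
    return {
        "requests": len(latencies),
        "throughput_per_s": len(latencies) / elapsed,
        "p95_latency_s": latencies[p95_index],
    }


result = run_load_test(total_requests=40, concurrent_users=8)
print(result)
```

Repeating such a run with an increasing number of simulated users shows where throughput stops growing or latency starts to climb, which is the capacity limit the client is interested in.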
Like testing performance efficiency, validating security under laboratory conditions is well known and established: Penetration tests are a popular approach in many organizations. While hacker attacks on production systems are usually black-box penetration tests, penetration tests under laboratory conditions should be white-box tests, i.e. the source code is available to the testers. Penetration tests are comprehensive and have to be performed by security experts. Thus, they normally cannot be performed after each sprint of two or four weeks and are usually performed only infrequently. This organizational and procedural separation of continuous development and software maintenance on the one hand, and white-box penetration tests on the other hand, might encourage separation of responsibility and micromanagement: Software security might be outsourced to the security team and considered an add-on instead of an integral part of software development, while security experts might focus on potential source code weaknesses instead of real security holes. In this situation, management tends to define standards and guidelines to organize the interaction between the development team and the security team. But again, micromanagement should be avoided: It is the job of the development team to produce software of high quality, including security aspects. It is the job of the security experts to identify security holes, to help the development team fix these holes and to evaluate whether the software is secure enough to be released. And it is the job of the management to encourage trustful cooperation between development and security teams rather than negotiating guidelines.
While the maintainability aspects modularity, reusability and modifiability are hard to measure under laboratory conditions without practicing micromanagement (i.e., defining code guidelines), there is a nice way to check analyzability. If no software maintenance has been requested for a while and the development team has been replaced in the meantime, analyzability is crucial when further development starts again. Analyzability should ensure that developers can easily understand the source code and its architecture. Proper documentation has the purpose of making the software understandable and analyzable. It can consist of well-written source code, comments in the source code, other documents like specifications or system handbooks, or a combination of all these. Therefore, if analyzability is given, a developer who has not been involved in developing this software so far should be able to fix bugs. This can be tested in advance. Introduce a new developer into the team and ask them to fix known bugs. Give them all available source code and other documents, but do not let them talk with other people. Ask them to think aloud while trying to understand the system. Invite other team members to this test and let them discuss the results and the experiences they have made during the test. In this way, weaknesses can be identified: missing documents, overly extensive or useless documentation, inconsistencies, ambiguous names of methods, variables and classes, and design patterns which can hardly be understood.
In many cases it is not necessary that software runs on every operating system, for every browser version, on every device, etc. But software which is trapped in a special technical environment can be a dead end. Therefore, software should be developed in such a way that it can be adapted to other operational or usage environments. The client can check this by capturing the effort it takes to adapt a certain piece of the software to another environment. If the effort is large, adaptability is not given. In this case the reasons for this should be discussed with the development team and tasks for improving the flexibility of the software design should be defined. It is not necessary to adapt big parts of the software for this test. Even the adaptation of a small piece may indicate weaknesses of the software design.
There are several methods for checking invisible software quality dimensions (i.e., not directly perceptible by the client), but there is no definite tool set or recipe which is appropriate for every project and every kind of software. Instead, the right mixture of these methods has to be found for every digital product and its development phase.
In general, the client should not specify those quality requirements in too much detail or too technically. Instead, he should focus on the business impacts of these quality aspects. He can do this by capturing the costs, times and other data of the software operation and by demanding laboratory tests which simulate real field conditions.
A client who focuses on the real-life business implications of software quality gives the contractor and the development team the chance to take responsibility for implementing the business requirements. And such clients are more likely to get the quality they really need instead of the quality they demanded at the beginning.
- [AGILE MANIFESTO] http://www.agilemanifesto.org/principles.html. 21/10/2012.
- [ISO 25010] https://www.iso.org/obp/ui/#iso:std:iso-iec:25010:ed-1:v1:en. 14/01/2016.
- [RUPP et al.] RUPP, Chris, SOPHIST GROUP: Requirements-Engineering and -Management. 5th edition, Carl Hanser Verlag München Wien, 2009.
- [SRS] SAFETY RESEARCH & STRATEGIES, INC.: http://www.safetyresearch.net/blog/articles/toyota-unintended-acceleration-and-big-bowl-%E2%80%9Cspaghetti%E2%80%9D-code. 07/11/2013.