Probably the best way to avoid incoherence in requirements is to follow requirement modelling principles precisely, most notably those elaborated in the IREB Handbook of Requirements Modelling (Cziharz et al. 2016). Atomic requirements can be compared to a large number of other requirements related to the same task, allowing potential incoherence or misconceptions to be detected. Although this approach works mainly for relatively simple natural language formulations of requirements, it allows engineers to produce and integrate large sets of requirements with a good level of coherence a priori.
Our approach complements these modelling principles. It deals with situations where engineers do follow these principles but nevertheless make errors which are more ‘local’, for example in the specification of values. These values may be, for example, numbers, colours, units, but also linguistic markers with a heavy semantic content such as adverbs, prepositions or modals. These errors are not related to modelling misconceptions but rather to lack of attention or distractions, for example due to work overload, domain evolution (e.g. norms, types of equipment) paired with lack of traceability, or insufficient validation steps when several authors are involved. Incoherent requirements turn out to have very costly consequences at production stages: up to 80% of the initial costs. It is therefore crucial to detect them as early as possible.
Incoherence is not easy to define. It has several facets and it is not only a logical problem. It basically consists of two or more requirements, or groups of requirements, which cannot co-exist in the same context without introducing negative consequences, whatever these may be. These requirements are in general not adjacent in a document, otherwise they could easily be fixed. They generally appear in different sections or chapters of a document or even in different documents. A very simple example is:
TCAS (traffic collision avoidance system) alarm message must appear in red on the FMGC (flight management guidance computer) screen,
TCAS alarm message must appear in purple on the FMGC screen,
where there is a colour specification mismatch. More subtle is the mismatch between the two following requirements where modelling principles should detect the incoherence if they are sufficiently accurate:
The system S must be stopped every 24 hours for maintenance,
and later in the document:
The update of the database via the system S must not be interrupted.
Incoherence between two requirements may be partial: they may include divergences without being completely logically opposed. Moreover, incoherence may be visible linguistically, when the words used induce the incoherence (about 40% of the cases according to our evaluation), or it may instead require domain knowledge and inferences to be detected and characterized. We focus in this article on incoherence which can be detected from a linguistic and general semantic analysis: mining it is simpler and probably re-usable across domains. Finally, we consider here only pairs of individual requirements. Identifying incoherence among two or more groups of requirements is more challenging.
1 What is incoherence in requirements? Definition issues and state of the art
1.1 Some definitions and terminological distinctions
Let us consider four main properties that a set of requirements must meet, from the most external to the most conceptual. We have:
|Cohesion (maximize uniformity of words and constructions)||surface form and readability|
|Completeness (no omissions in a situation)||modelling of all situations|
|Consistency (control data and trace updates)||traceability and update|
|Coherence (no contradictions at any stage)||absence of contradictions at a deep level|
Cohesion deals with the characterization of the uniformity of a text, from its typography to semantic aspects such as the choice of words. A requirement document, including its associated structures (summary, definitions, introduction, etc.) must have a uniform organization and presentation to facilitate its understanding and update. It must show very regular syntactic structures. Lexical variation (e.g. use of synonyms) must be constrained. Its semantic and pragmatic content must be controlled to guarantee good uniformity, in particular pre-requisites and implicit elements must be stable to avoid misconceptions.
Completeness means that a set of requirements must fully cover a situation. A simple, but frequent, situation is the development of cases where each case requires for example a different action. Cases must not overlap to guarantee unique solutions and no case must be forgotten. A typical example is the flap level extension before landing, e.g.:
Speed between 250 and 220 kts: flap 5 degrees
Speed between 220 and 200 kts: flap 15 degrees
Speed between 200 and 160 kts: flap 25 degrees
Speed between 160 and 130 kts: flaps full
This case structure is not ambiguous; it satisfies the completeness criterion since the intervals which are given do not overlap and cover the whole spectrum of relevant speeds. The description is also comprehensive since all flaps positions have been considered. In requirement authoring, there are many forms completeness may take.
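The non-overlap and coverage conditions described above can be checked mechanically. The following Python sketch is only an illustration, not part of the system described in this article: the function name `check_cases` and the data layout are assumptions, and a boundary value shared by two consecutive intervals (e.g. 220 kts) is treated as contiguous rather than as an overlap.

```python
from typing import List, Tuple

# (upper speed in kts, lower speed in kts, flap setting), from the example above
CASES: List[Tuple[int, int, str]] = [
    (250, 220, "flap 5 degrees"),
    (220, 200, "flap 15 degrees"),
    (200, 160, "flap 25 degrees"),
    (160, 130, "flaps full"),
]


def check_cases(cases: List[Tuple[int, int, str]]) -> List[str]:
    """Return warnings for gaps or overlaps between consecutive speed intervals."""
    warnings = []
    # Sort from the highest to the lowest speed interval.
    ordered = sorted(cases, key=lambda case: case[0], reverse=True)
    for (_, low_prev, _), (high_next, _, _) in zip(ordered, ordered[1:]):
        if high_next > low_prev:
            warnings.append(f"overlap between {low_prev} and {high_next} kts")
        elif high_next < low_prev:
            warnings.append(f"gap between {high_next} and {low_prev} kts")
    return warnings


print(check_cases(CASES))  # [] : no gap and no overlap
```

If the third case were written as "between 190 and 160 kts", the same call would report a gap between 190 and 200 kts.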
Inconsistency is a complex notion. It involves requirements which deviate or diverge from a standard, in particular from the semantic or pragmatic points of view. Inconsistency is frequently dynamic; it accounts, for example, for the fact that stakeholders may change their mind when producing requirements. Similarly, equipment may evolve and its update is not fully guaranteed, even with good traceability. Inconsistency is therefore related to human behaviour, whereas coherence is static. Consistency may sometimes be associated with cohesion to characterize how good, and how stable, a text is from the points of view of language and general understanding.
Coherence is a logical notion whose scope extends to a whole set of documents. In epistemology, coherence is part of a theory of truth. A document or a knowledge base is coherent if there is no logical contradiction between all the propositions it contains. A coherent set of requirements does not guarantee that this description is sound, correct and comprehensive. However, we can observe in requirement authoring that coherence is not a Boolean notion: it is possible to observe degrees of incoherence, where lower levels can be allowed.
1.2 Aspects of coherence
Coherence analysis has not been investigated very much in the area of requirement production. Besides modelling principles, authoring norms that limit the linguistic complexity of requirements in terms of structure and lexicon are also a means to avoid incoherence. These considerations are developed in several articles, among which the following, published in the IREB magazine:
- Requirement analysis as problem solving (Loenhoud et al., 2017),
- Finding out the right level of granularity to describe processes (Vazquez, 2017),
- Feature analysis for change requests (Sneed et al., 2017),
- Simulation-based approach to validate large sets of requirements (Tenbergen et al. 2016),
- Using declarative constraint-based approaches for requirement modelling (Jastram, 2016).
Several road-maps on requirement elicitation and writing, e.g. (Wyner, 2010), show the importance of having consistent and complete sets of requirements, and rank it as a priority. On the research side, projects such as those developed by (Kloetzer et al., 2013) aim at finding contradictions between lexico-syntactic patterns in Japanese, including spatial and causal relations. In (de Marneffe et al., 2008), the limits of textual entailment as a method to detect inconsistencies are shown and a corpus is developed from which a typology of contradictions is constructed. The need for very fine granularity in the data is advocated to avoid errors, which is not possible for large sets of requirements.
1.3 Some general purpose research directions to mine incoherent requirements
Incoherence analysis in texts in general and in requirements in particular is at a very early development stage. One of the reasons is that incoherence analysis is not a surface problem: it requires accurate domain knowledge, reasoning and analysis strategies. Artificial Intelligence foundations and techniques can be found in (Marquis et al. 2018), for example.
When dealing with the coherence problem, three main directions can be foreseen:
- The use of satisfiability solvers (SAT solvers) which can efficiently detect incoherence over large sets of propositions. These systems are of much interest in artificial intelligence and a number of them are now freeware. One of the main limitations to their use is that requirements must be translated into propositional logic or very restricted forms of first-order logic. It is clear that this is in general not possible given the linguistic and conceptual complexity of requirements. Furthermore, requirements contain implicit information, known to everyone, which should also be incorporated in some way into these solvers.
- The use of deep learning methods to automatically acquire models of incoherence. This trend is highly fashionable in artificial intelligence, and produces interesting results in some areas. However, it requires seeds to work properly; in our case the annotation of quite a large number of documents which will serve as input to the learning mechanism. Annotating incoherence in requirements is very challenging. Furthermore, several thousands of annotations would be required to ensure that the acquired models are relatively correct. This makes this approach almost intractable at the moment, in spite of a few efforts developed by companies using for example non-supervised methods.
- The use of linguistic methods, based on patterns, which can recognize specific forms of incoherence based on language. Although this method is limited to linguistic aspects and also requires some limited annotation, it turns out at the moment to be the only viable approach. It is described hereafter.
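To make the first direction more concrete, here is a minimal Python sketch of what a propositional encoding of the 'system S' example from the introduction could look like. It is a brute-force truth-table check, not a real SAT solver, and the proposition names and the background rule (stopping S interrupts the update) are assumptions introduced for illustration only.

```python
from itertools import product


def satisfiable(variables, constraints):
    """Brute-force check: does some truth assignment satisfy every constraint?"""
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(constraint(model) for constraint in constraints):
            return True
    return False


VARIABLES = ["s_stopped", "update_interrupted"]

# R1: the system S must be stopped (every 24 hours, for maintenance).
r1 = lambda m: m["s_stopped"]
# R2: the update of the database via S must not be interrupted.
r2 = lambda m: not m["update_interrupted"]
# Assumed implicit knowledge: if S is stopped, the update is interrupted.
background = lambda m: (not m["s_stopped"]) or m["update_interrupted"]

print(satisfiable(VARIABLES, [r1, r2]))              # True
print(satisfiable(VARIABLES, [r1, r2, background]))  # False
```

The two requirements only become contradictory once the implicit rule is added, which illustrates why implicit information has to be supplied to a solver in some way.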
1.4 Working methodology
Our methodology to analyse incoherence is to:
|(1)||observe the problem on corpora that contain incoherent requirements,|
|(2)||organize the errors and categorize them,|
|(3)||develop a model and organize the linguistic resources which are needed,|
|(4)||implement a prototype, and then|
|(5)||evaluate the results and the users' satisfaction, and define future improvements.|
These points are described with some detail in the sections below.
2 Preliminaries: constructing a corpus of incoherent requirements
Our analysis of incoherence is based on a corpus of requirements coming from five companies in five different critical industrial sectors: energy, aeronautics, insurance, transportation regulations and telecommunications. To guarantee a certain generality to our results, the main features considered to validate our corpus are:
|(1)||requirements correspond to various professional activities, and have been authored by technical writers in different industrial sectors, over a relatively long time span (between one and two years),|
|(2)||requirements correspond to different conceptual levels, from abstract rules to technical specifications,|
|(3)||requirements have been validated and are judged to be in a relatively 'final' state,|
|(4)||requirements follow various kinds of authoring norms imposed by companies, including predefined patterns (boilerplates),|
|(5)||the documents from which requirements are extracted are well-structured and of different levels of language and conceptual complexity.|
A total of 7 documents have been analysed, partly manually, using the metrics advocated below, searching for incoherent requirement pairs. Corpora characteristics can be summarized as follows:
|Average length of specification documents||Number of requirements per document||Total number of requirements||Total number of pairs of incoherent requirements|
|150 - 200 pages||3500 to 4000||27500||128|
When searching for incoherent requirements, we noted that:
- they deal with the same precise topic,
- in the same context,
- they may appear in very remote sections,
- they differ on some significant element(s) which leads to the incoherence.
Since requirements should a priori follow strict authoring guidelines (no synonyms, limited syntactic forms), dealing with the same precise topic means that two requirements which are potentially incoherent should:
- have a relatively similar syntactic structure,
- share a large number of similar words,
- differ on a few words which are at the origin of the incoherence.
To automatically detect such pairs, we have defined two metrics (Saint-Dizier, 2018):
- a similarity metric to deal with the first two points above,
- a dissimilarity metric, which covers the last point above.
These metrics are based on linguistic and typographic factors; these are not developed here.
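Since the actual metrics are not developed here, the following Python sketch only illustrates the general idea with stand-ins: a word-overlap (Jaccard) similarity and a count of the words that appear in only one of the two requirements. These are not the metrics of (Saint-Dizier, 2018); the function names and the thresholds `min_similarity` and `max_differences` are arbitrary assumptions.

```python
import re


def tokens(requirement):
    """Lower-case a requirement and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", requirement.lower())


def similarity(r1, r2):
    """Jaccard overlap between the word sets of two requirements."""
    a, b = set(tokens(r1)), set(tokens(r2))
    return len(a & b) / len(a | b) if (a | b) else 0.0


def differing_words(r1, r2):
    """Words found only in r1 and words found only in r2."""
    a, b = set(tokens(r1)), set(tokens(r2))
    return a - b, b - a


def is_candidate_pair(r1, r2, min_similarity=0.7, max_differences=3):
    """Very similar requirements that still differ on a few words."""
    only1, only2 = differing_words(r1, r2)
    n_diff = len(only1) + len(only2)
    return similarity(r1, r2) >= min_similarity and 0 < n_diff <= max_differences


red = "TCAS alarm message must appear in red on the FMGC screen"
purple = "TCAS alarm message must appear in purple on the FMGC screen"
print(differing_words(red, purple))    # ({'red'}, {'purple'})
print(is_candidate_pair(red, purple))  # True
```

A pair that passes such a filter is only a candidate: whether the differing words are actually incompatible still has to be decided by other means.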
It is important to note that our system is aimed at generating incoherence warnings: incoherence is probable but it must be confirmed by a human expert. Then, the revision of these requirements, and possibly others as a consequence, is in the hands of requirement authors. The system cannot resolve them. There are also several severity levels which can help writers to organize their revisions.
3 A typology of incoherence based on linguistic criteria
The categorization presented below is defined empirically from our corpus and is based on simple linguistic considerations. The goal of this categorization is to organize and facilitate the definition of templates and associated lexical resources to automatically mine pairs of incoherent requirements in a large diversity of types of requirements. Templates must be sufficiently expressive and generic, but they should not over-recognize incoherent requirements.
The term incompatible, used below between two elements, means that these elements are not equivalent in the domain in which they operate. They are not synonyms or semantically equivalent. Our dissimilarity metric is based on a measure of the incompatibility between two elements.
3.1 Partial or total incompatibilities between expressions
In this category, eight closely related forms of incoherence are included. Categories are based on linguistic factors:
- Incompatible numerical values, arithmetical expressions, arithmetical operator, Boolean constraint,
- Incompatible named entities (measure units such as kts vs. nautical miles, temperatures, product names, etc.) or valued terms (adjectives, as in the colour example in the introduction),
- Incompatible temporal organization between events: Update data marking when transferring data. vs. Update data marking before transferring data.
- Incompatible modal variation, which may induce variable criticality levels: It is recommended to stop the engine before E. vs. The engine shall be stopped before E.
- Incompatible use of adverbs (manner, temporal), of quantification: The pipe must be carefully vs. quickly opened.
- Incompatible verb complements: Service D must be available only from the screen vs. in any configuration.
- Incompatible prepositions or connectors: Engines must be switched off before arrival at the gate, vs. Engines must be switched off at the gate.
- Incompatibility in an enumeration: A rule includes facts and their criticality, vs. a rule includes facts and a description.
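As a rough illustration of how some of these categories could be told apart once two differing terms have been isolated, the Python sketch below looks the terms up in small hand-made lexicons. The lexicons, the category labels and the function `categorize` are hypothetical and far smaller than the resources a real system would need.

```python
import re

# Tiny illustrative lexicons (assumptions, not the resources used in LELIE).
LEXICONS = {
    "measure unit": {"kts", "nm", "km", "degrees"},
    "valued term": {"red", "purple", "green", "amber"},
    "modal": {"must", "shall", "should", "may", "recommended"},
    "temporal organization": {"before", "after", "when", "while", "during"},
    "adverb": {"carefully", "quickly", "slowly"},
}
NUMBER = re.compile(r"^\d+(\.\d+)?$")


def categorize(term1, term2):
    """Return the category shared by two differing terms, or None."""
    t1, t2 = term1.lower(), term2.lower()
    if t1 == t2:
        return None  # identical terms cannot be incompatible
    if NUMBER.match(t1) and NUMBER.match(t2):
        return "numerical value"
    for category, lexicon in LEXICONS.items():
        if t1 in lexicon and t2 in lexicon:
            return category
    return None


print(categorize("when", "before"))       # temporal organization
print(categorize("carefully", "quickly")) # adverb
print(categorize("185", "200"))           # numerical value
```

Two terms from the same lexicon are treated as potentially incompatible; deciding whether they really are (for instance whether two modals carry different criticality levels) would need finer-grained resources.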
3.2 Incoherent events
In this category fall pairs of requirements that describe incoherent events, the detection of which often requires domain knowledge. In spite of this knowledge limitation, a number of differences appear at a 'surface' level and can be detected for the most part on a linguistic basis. A typical example is given in the introduction of this article. Another type of incoherence frequently found is:
The maximum 20 degree flap extension speed is 185 kts. vs.
Extend flaps to 20 degrees and then slow down to 185 kts.
Note that the second example is more a procedure than a requirement.
3.3 Terminological incoherence
In this category, requirements which largely overlap are considered. Their slight differences may be symptomatic of a partial inconsistency or of terminological variations which must be fixed. These cases are relatively frequent and typical of documents produced either by several authors or over a long time span (where for example equipment names, attributes or properties may have changed). Two typical examples are:
The up-link frequency, from earth to space, must be in the s band. vs.
The down-link frequency, from space to earth, must be in the s band.
Those tests aim at checking the rf compatibility of the ground stations used during the mission and the tm / tc equipment on board. vs.
Those tests aim at checking the rf compatibility of the ground stations used during the mission and the on board equipment.
In the first example, the use of the s band in both situations may sound strange and requires a control check by an expert; in the next example, the expression tm / tc must be compared with on board equipment.
As the reader may note, besides simple terminological variants, this class of incoherence hides deeper modelling problems.
3.4 Incoherence in enumerations
This case covers situations where different actions must be carried out depending on specific criteria, as illustrated in section 1. For example, for a given set of temperature intervals, different actions must be carried out. If these intervals overlap, or if some values are missing between the intervals, it is possible to have a form of incoherence.
The observed frequency of these forms of incoherence can be summarized as follows:
|Incompatible values or expressions||Differences between various types of values, or linguistic expressions||59%|
|Incoherent events||Events which differ in their content or structure||15%|
|Terminological incoherence||Term variations due to updates||14%|
|Incoherent enumerations||Situations such as overlap or gaps||12%|
4 Mining Incoherence in texts: the overall processing strategy
The next challenge is the definition and implementation of templates, which encode the forms of incoherence reported above, to mine incoherent pairs of requirements. The templates are implemented in our TextCoop research platform dedicated to discourse analysis that allows an easy declarative specification of discourse templates. They are then integrated into our freeware LELIE research software, aimed at improving requirement authoring from a language point of view (Saint-Dizier, 2014), (Kang et al., 2015).
The global analysis strategy is based on a comprehensive analysis of a requirement document. The complexity in terms of processing time is quite high since requirements must be compared one after the other with all the other requirements. The processing strategy is organized as a loop as follows:
STEP 1: a new requirement Ri is read from the source text; its discourse structure is tagged via TextCoop, including the kernel requirement portion,
STEP 2: Ri is then compared to all the elements Rj already stored in the requirement database. For that purpose:
|(1)||for all requirements Rj already checked and stored in a database, the similarity and dissimilarity metrics advocated in section 2 are activated to check whether Ri and Rj may potentially be incoherent,|
|(2)||if this is the case, incoherence patterns are activated; if one of them matches, a potential incoherence warning is produced and Ri is not added to the requirement database,|
|(3)||Ri and Rj are then stored in the incoherent pair database for checking by an expert,|
|(4)||if no incoherence has been detected, then Ri is added to the requirement database.|
STEP 3: go back to STEP 1 to process the next requirement.
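The loop can be sketched in Python as follows. This is a simplified illustration, not the TextCoop/LELIE implementation: discourse tagging is omitted, the metrics and patterns are abstracted into a single predicate passed as a parameter, and `toy_check` is a hypothetical stand-in that flags two requirements differing by exactly one word.

```python
def mine_incoherent_pairs(requirements, may_be_incoherent):
    """Compare each new requirement with all requirements already stored."""
    requirement_db = []    # requirements for which no incoherence was detected
    incoherent_pairs = []  # (Ri, Rj) pairs to be checked by an expert
    for ri in requirements:                         # STEP 1 (and STEP 3)
        conflicts = [rj for rj in requirement_db
                     if may_be_incoherent(ri, rj)]  # STEP 2, points (1)-(2)
        if conflicts:
            incoherent_pairs.extend((ri, rj) for rj in conflicts)  # point (3)
        else:
            requirement_db.append(ri)               # point (4)
    return requirement_db, incoherent_pairs


def toy_check(r1, r2):
    """Hypothetical predicate: same length and exactly one differing word."""
    w1, w2 = r1.lower().split(), r2.lower().split()
    return len(w1) == len(w2) and sum(a != b for a, b in zip(w1, w2)) == 1


reqs = [
    "TCAS alarm message must appear in red on the FMGC screen",
    "TCAS alarm message must appear in purple on the FMGC screen",
    "The system S must be stopped every 24 hours for maintenance",
]
db, pairs = mine_incoherent_pairs(reqs, toy_check)
print(len(db), len(pairs))  # 2 1
```

In the worst case every requirement is compared with all those stored before it, so the number of comparisons grows quadratically with the number of requirements, which is consistent with the high processing time mentioned above.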
In this article, we will not go into the implementation details of point (2) above. Briefly, the strategy is to extract from two requirements the set of terms which are different and then to check whether they belong to one of the categories given in section 3. For that purpose, dedicated templates have been implemented. To identify expressions which differ, syntactic analysis is carried out based on local grammars which deal with the expressions of incoherence. In addition, structured, lexical semantics resources are used, in particular antonyms. These are in general domain independent and therefore re-usable in almost any context.
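The extraction of differing terms can be approximated with a word-level alignment. The sketch below uses Python's standard `difflib.SequenceMatcher` instead of the local grammars mentioned above, and a three-entry antonym list that is purely illustrative; the valve requirements are invented for the example.

```python
from difflib import SequenceMatcher

# Illustrative antonym pairs (an assumption; a real resource would be much larger).
ANTONYMS = {
    frozenset({"open", "closed"}),
    frozenset({"before", "after"}),
    frozenset({"on", "off"}),
}


def differing_terms(r1, r2):
    """Align two requirements word by word and return the spans that differ."""
    w1, w2 = r1.lower().split(), r2.lower().split()
    matcher = SequenceMatcher(a=w1, b=w2, autojunk=False)
    return [
        (" ".join(w1[i1:i2]), " ".join(w2[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def is_antonym_pair(term1, term2):
    """True if the two terms are listed as antonyms in the toy lexicon."""
    return frozenset({term1, term2}) in ANTONYMS


print(differing_terms("Update data marking when transferring data.",
                      "Update data marking before transferring data."))
# [('when', 'before')]

diff = differing_terms("The valve must be open during the test.",
                       "The valve must be closed during the test.")
print(diff, is_antonym_pair(*diff[0]))  # [('open', 'closed')] True
```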
5 System Evaluation and Discussion
The research elements briefly described above have been implemented within our freeware LELIE authoring environment (Saint-Dizier, 2014), (Kang et al., 2015). This implementation is still at an experimental stage and needs further improvement. The last step is to evaluate the results (a) from a purely technical and linguistic perspective and (b) from the point of view of the requirement authors. This evaluation is indicative: it provides improvement directions, not final results.
5.1 Evaluation of the technical results and limitations
An evaluation of the accuracy of the system has been carried out on a test corpus (a corpus different from the above one, called the development corpus) where incoherence has been introduced artificially to be able to make tests. A first result is that the system is more powerful than human analysis:
- 54 pairs of incoherent requirements have been manually identified,
- the system identified 109 a priori incoherent pairs (about twice as many as those introduced, since the test corpus itself also contains incoherent requirements).
Then, when inspecting the pairs mined by the system, 74% of the pairs (81 pairs) turn out to be really incoherent and require revisions. It seems that this is a relatively acceptable accuracy for requirement authors.
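The accuracy figure follows directly from the two counts reported above. A short Python check (the function name `precision` is ours; the measure is the usual ratio of confirmed pairs to mined pairs):

```python
def precision(confirmed_pairs, mined_pairs):
    """Share of the mined pairs that were confirmed as really incoherent."""
    return confirmed_pairs / mined_pairs


mined = 109      # pairs reported by the system
confirmed = 81   # pairs confirmed as incoherent after inspection
print(f"precision = {precision(confirmed, mined):.1%}")  # precision = 74.3%
print(f"noisy pairs = {mined - confirmed}")              # noisy pairs = 28
```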
5.2 Improvement directions
The main directions in which we plan to improve incoherence recognition and, at the same time, to limit noise, are characterized by the following situations:
|(1)||Different forms for a similar content do not necessarily entail incoherence: two requirements may deal with the same point using slightly different means of expression. For example, values and units may be different, or intervals or arithmetical constraints may be used instead of a single value, but these expressions remain globally equivalent, even if they do not follow authoring guidelines very strictly. For example, these two requirements are not equivalent according to the similarity metric: the A320 neo optimal cruise altitude must be FL380 in normal operating conditions vs. the A320 neo optimal cruise altitude must be between FL 360 and FL 380, depending on weather conditions. The second requirement is just more precise.|
|(2)||Use of generic terms instead of specific ones does not entail incoherence: it is also frequently the case that two requirements differ only in that a general purpose term or a business term is used, where one is either more generic or at a different level of linguistic abstraction than the other. Detecting this situation requires an accurate domain ontology.|
|(3)||Presence of negative terms may cause problems to the pattern analysis: negation or negatively-oriented terms (verbs, adjectives), though not recommended in technical writing, may appear in one requirement and not in the other, although these requirements are in fact broadly similar. For example: Acid A must not be thrown in any standard garbage vs. Acid A must be thrown in a dedicated garbage.|
|(4)||Influence of the co-text: requirements dealing with a given activity are often grouped under a title and subtitle in an enumeration or in a chart. Two requirements that belong to two different groups may seem to be incoherent, but if the context, given by the title or enumeration introducing the requirement is different, then these two requirements may deal with different cases and are not a priori incoherent.|
In terms of silence, a few requirements have not been detected as incoherent because:
|(5)||Implicit elements prevent the similarity diagnosis: this is the case in the pair: the upper attachment limit must not exceed 25GB. vs. It must be possible to specify a maximum limit for the storage capacity of an attachment. The expression 'for the storage capacity' is left implicit in the first requirement: as a result these two requirements are presumed to be different by the system since they are not close enough according to the similarity metric. Missing information prevents the similarity metric from mining requirements that deal with the same precise topic.|
|(6)||Coherent, but related to opposed external contexts: Similarly to (4) above, but conversely, two requirements may be identical but incoherent if the sections in which they appear have titles which are in some way opposed. The incoherence may then be 'external' to the requirements.|
5.3 User satisfaction analysis
Requirement authors are always curious to see what an automatic system can detect in terms of incoherence. They know that detecting incoherence, even with some noise, is a difficult task for humans and they found the results to be useful. Our system can only detect incoherence visible from a linguistic point of view, which we estimate to represent about 40% of the total occurrences of incoherence. Given the relative simplicity of our patterns, our final estimate is that about 25% to 30% of the total occurrences of incoherence are detected. This is not much, but useful given the cost of errors at production stage.
The main feedback from authors who played the role of testers is given informally below, where we concentrate on three main issues. These testers have two preoccupations:
- the relevance and the accuracy of the system incoherence diagnosis and
- the ease of resolving the incoherence, preferably by revising a limited number of requirements.
The first remark we get is that the incoherence is often not limited to the terms which differ between two requirements, but must be analysed on a larger portion of each requirement. For example, our system underlines:
the maximum 20 degree flap extension speed is 185 kts. vs. Extend flaps to 20 degrees and then slow down to 185 kts.
which are the main differences. However, the linguistic scope of the incoherence is larger than what is underlined. It is difficult to precisely generate an automatic message that explains the misconception in simple terms.
Secondly, testers feel that for revising incoherent requirements they need additional tools so that they can access related requirements: indeed resolving incoherence may entail the rewriting of several related requirements, besides those which are explicitly incoherent.
Finally, some forms of incoherence were felt to be rather limited and no revision was carried out (about 15% of the cases). It would be of much interest to be able to identify the severity of the incoherence so that authors can develop a revision strategy, probably starting with the most crucial ones.
6 Conclusion and Takeaways
The best way to avoid incoherence in requirements is to follow requirement modelling principles whenever, and as much as, possible, such as those elaborated most notably in the IREB Handbook of Requirements Modelling and Elicitation (Cziharz et al., 2016), (Häußer et al., 2019). Developing accurate traceability methods is also crucial. Nevertheless, as shown in this article, requirement authors can make errors which are less conceptual, for example due to distractions, work overload or domain evolutions that were not taken into account. These errors typically concern values such as numbers, colours, units, equipment or product names, but also linguistic markers with a heavy semantic content such as adverbs, prepositions or modals. They are rather difficult to detect in a majority of situations because they may concern requirements which are in quite remote sections or chapters of a document. These errors occur in about 1-2% of the total number of requirements - this is not very frequent but may nevertheless have significant consequences on the development of a product or a process.
Section 3 of this article proposes a categorization of the errors found in our corpora. Section 4 shows how these can be mined with a relatively good accuracy. An evaluation and the current limits of the system are provided in Section 5.
Before using our system, we recommend that requirement authors carefully proofread their text over short text portions: a few related sections or a few pages, concentrating on discrepancies which may arise at the level of value specifications, enumerations and terminology usages. This careful reading may contribute to detecting unexpected errors. The case of incoherent events is more difficult to detect and may reveal misconceptions.
Detecting the categories of errors presented in Section 3 in requirements which are not adjacent but in remote text sections or in different documents could partly be resolved by using our approach, even if it does not cover all the possible errors. However, our approach can be extended or customized to specific errors or document genres. Besides using our system, implementing a few scripts that search for typical errors of a domain over documents can be really helpful and should produce results quite similar to those given above. The challenge remains the identification of pairs of requirements which may potentially share similarities but contain errors, as well as the development of related linguistic resources. A good system must indeed have a low level of noise: it must accurately and essentially point to ‘real’ errors.
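As an example of such a script, the Python sketch below groups requirements that become identical once their numbers are masked and reports the groups in which the numbers differ. It is a deliberately naive illustration: the function name, the `<NUM>` placeholder and the sample requirements (including their values) are assumptions made for the example.

```python
import re
from collections import defaultdict

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def numeric_mismatches(requirements):
    """Group requirements that are identical up to their numbers and
    return the groups in which different numbers are used."""
    groups = defaultdict(set)
    for text in requirements:
        key = NUMBER.sub("<NUM>", text.lower().strip())
        groups[key].add(tuple(NUMBER.findall(text)))
    return {key: sorted(values) for key, values in groups.items()
            if len(values) > 1}


sample = [
    "The upper attachment limit must not exceed 25 GB.",
    "The upper attachment limit must not exceed 20 GB.",
    "The system S must be stopped every 24 hours for maintenance.",
]
print(numeric_mismatches(sample))
# {'the upper attachment limit must not exceed <NUM> gb.': [('20',), ('25',)]}
```

Similar scripts could be written for units or colour terms; each covers only one narrow type of error and produces warnings that still have to be confirmed by an expert.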
I wish to thank the two reviewers, Peter Hruschka and Thorsten Weyer, whose contribution greatly helped to improve this text. I also thank Gareth Rogers, who polished the English of this article.
- Cziharz, T., Hruschka, P., Queins, S., Weyer, T., Handbook of Requirements Modeling IREB Standard: Education and training for IREB Certified Professional for Requirements Engineering Advanced Level, online IREB publications, 2016.
- Häußer, D., Lauenroth, K., van Loenhoud, H., Schwarz, A., Steiger, P., Handbook of Advanced Level Elicitation according to the IREB Standard, online IREB publications, 2019.
- Jastram, M., Kara, A., Smart use of constraints leads to cleaner requirements that are easy to use, IREB Magazine, 2016.
- Kang, J., Saint-Dizier, P., LELIE - An Intelligent Assistant for Improving Requirement Authoring, IREB Magazine, 2015.
- Kloetzer, J., De Saeger, S., Two-stage Method for Large-scale Acquisition of Contradiction Pattern Pairs using Entailment, proc. EMNLP'13, 2013.
- Kuhn, T. A Survey and Classification of Controlled Natural Languages, Computational Linguistics, 40(1) (2014).
- Van Loenhoud H., Lauenroth, K., Steiger, P., The goal is to solve the problem, IREB Magazine, 2017.
- De Marneffe, M.C., Rafferty, A.N., Manning, C.D., Finding Contradictions in Text, ACL-HLT'08, 2008.
- Marquis, P., Papini, O., Prade, H., (eds.) A Guided Tour of Artificial Intelligence and its Applications, 3 volumes, Springer, 2020 (to appear).
- Saint-Dizier, P., Challenges of Discourse processing: the case of technical documents, Cambridge Scholars, 2014.
- Saint-Dizier, P., Mining Incoherent Requirements in Technical Specifications: Analysis and Implementation, Data and Knowledge Engineering, Elsevier, Vol. 116, 2018.
- Schriver, K.A. Evaluating text quality: The continuum from text-focused to reader-focused methods, IEEE Transactions on Professional Communication, 32, 238-255 (1989).
- Vazquez, C.E., What are the levels of granularity of functional requirements and why this is important, IREB Magazine, 2017.
- Sneed, H., Demuth, B., From Requirements to Code, IREB Magazine, 2017.
- Tenbergen, Vogelsang, A., Weyer, T., Froese, A. Wehrstedt, J C, Branstetter, V., Modelling requirements and context as a means for automated requirement validation, IREB Magazine, 2016.
- Unwalla, M. AECMA Simplified English, 2004. techscribe.co.uk/ta/aecma-simplified-english.pdf.
- Weiss, E.H. How to write usable user documentation. The Oryx Press, Westport 1991.
- Wyner, A. et al. (collective roadmap), On Controlled Natural Languages: Properties and Prospects, 2010. wyner.info/research/Papers/CNLP&P.pdf.