Andreas Maier Simon Darting

ReqInspector

An Approach for the Inspection of the Completeness of individual Software Requirements Specifications by Semantic Analysis of the Argument Structures


During the process of requirements engineering, incompletely specified requirements require iterative follow-up discussions with stakeholders to complete and disambiguate specified requirements. We introduce a domain-independent approach to minimize this effort and build a sound basis for the subsequent use of specifications. The approach called ‘*ReqInspector*’ inspects specifications in German natural language and is based on the semantic analysis of the argument structures of the full verbs used in individual requirements. A comparison of requirements to reference sentences serves as a basis for the generation of requirement-specific advice for the authors of the requirements specification to complete the requirements.

1 Introduction

1.1 Motivation

The analysis of specified requirements is time-consuming and, depending on the given quality of the requirements, causes correspondingly high costs. This is particularly true if comprehension issues must still be resolved prior to the actual analysis of the requirements. Comprehension issues may arise, for example, if proprietary terminology is used within a software requirements specification (SRS), if abbreviations appear that are not introduced, if correlations among requirements or the need for individual requirements are described ambiguously, or particularly if knowledge of skills, abilities, correlations, resources, work methods, etc. is assumed as given and thus is no longer mentioned explicitly in the SRS. In such cases, there is no complete SRS, which makes it more difficult to further use the specification during the development process: ambiguity in individual requirements might lead to so many possible interpretations that the requirement loses its meaning.

A practical example: A company develops navigation systems, and one requirement for a new navigation system states that the position should be displayed on the map. The author of this requirement intended the position to be represented schematically, from a bird's-eye view, and monochromatically. The programmer implements the requirement very differently: with a realistic map, a 3D representation, and the full color spectrum.

1.2 Solution Idea

The aim of the solution approach called ‘ReqInspector’ is to enable domain-independent quality inspection without specific models and word lists. To do so, it starts off on the lower language level of semantics and does not work on the interpretative, syntactic level like most other approaches do (see Chapter 2 for a comparison of ReqInspector to related approaches). The ReqInspector approach is based on verb semantics and checks the completeness of the information demanded by a verb in German grammar. This information (argument, complement, or sub-categorization) is independent of a particular domain, a particular organizational terminology, or other technical language, but rather follows strictly German grammar and semantic analysis. The only constraint for this approach is thus the size of the underlying lexicon, and, where required, the underlying data needed to get the information regarding the argument structure of the verbs occurring in the SRS. This underlying data must therefore be as complete as possible in the sense of completeness of an individual sentence and must fulfill both qualitative and stylistic demands with regard to the language of the SRS. The ReqInspector approach illustrates an innovative method for the inspection of requirements based on the semantics of the requirements. Following this approach manually is quite difficult. Hence, the ReqInspector approach is supported with specific NLP (natural language processing) tools. But the approach can also be applied with other tools than the ones mentioned in this article. Please note that the main contribution of this article is the description of the ReqInspector approach to check the completeness of requirements based on a semantic analysis, but not the description and the use of the tools to support the approach.

We decided on developing the first version of ReqInspector for German specifications, because most of our customers write their specifications in German. However, we are aware that English is the common language for writing specifications outside linguistic and economic areas that use German as the official language. Since German is a more complex language than English, we believe in the transferability of ReqInspector from German to English. We will perform this transfer as a future task.

ReqInspector checks the completeness of individual requirements, not the completeness of an SRS as a whole. That is, ReqInspector identifies underspecified requirements within an SRS and aims at supporting authors of the SRS in increasing the quality of individual requirements by providing missing information. This support is provided by the generation of advice for the authors of SRS. This advice is requirement-specific and serves the authors of the SRS directly. Specifically, it helps them complete and disambiguate individual requirements before these are reviewed by experts or analyzed by requirements engineers and packaged for further use.

ReqInspector thus is a means to improve the quality of requirements formulated in natural language. The most complex part is the check for completeness, which is why this quality check is emphasized in this article and will be described in detail. However, ReqInspector is not limited to checking the completeness of requirements: it is also being developed further to check their necessity, unambiguity, and singular occurrence.

1.3 Benefits

The semantic analysis approach followed by ReqInspector offers a series of benefits that mainly save costs and time in the requirements engineering activity of requirements analysis. The cost savings are especially relevant with respect to the later development process, where correcting violations of requirement quality is much more expensive.

If the specification of the requirements is complete, time is not only saved with regard to understanding each requirement, but also regarding iterative follow-up discussions with the originators of the underlying information of individual requirements, to clarify comprehension issues and obtain missing information. This enables requirements engineers to deal directly with their primary task of analyzing the requirements and packaging them for further use in the development process. This results in time and cost savings for the entire development process. At the same time, the advice for completion of individual requirements specifications, generated by information based on the application of the ReqInspector approach, also helps non-experts to write unambiguous and complete requirements. It is expected that by applying the ReqInspector approach repeatedly, a learning effect will occur for the authors of requirements and that the quality of new requirements and thus of whole SRS will steadily improve.

1.4 Document Structure

This article is structured as follows: Chapter 2 first describes when to use the ReqInspector approach for a quality inspection of requirements, followed in Chapter 3 by the proposed solution approach called ReqInspector, which is primarily concerned with the inspection of the completeness of requirements specifications. Chapter 4 provides ‘take-aways’ for professional requirements engineers and business analysts. Chapter 5 concludes the article with a summary and an outlook on future extensions of ReqInspector.

2 When to Use the ReqInspector Approach

The ReqInspector approach is appropriate for use in early project phases. At this time of a project, requirements are often elicited from various sources (e.g., former projects and similar projects), which means that the quality of the resulting requirements is neither consistent nor explicitly reviewed. In order to achieve better requirements quality, requirements should be reviewed regularly while the project is still in the early development stage. Analysts responsible for the quality assurance of requirements have to spend a lot of time and mental effort to iteratively review SRSs. For this task, they usually use tools to support requirements inspection.

Currently available tools for the inspection of requirements mainly target the English language and rely on models and word lists. That is, they analyze only the syntactic structure of requirements, but not their semantics (see, for example, [HOOD], [LA05], [PR09], [RE13], [RS14]). In addition, related approaches are very specific to particular SRS and domains, and requirements inspection with them is only as good as the information contained in the models and word lists. Creating such models and word lists is effort-intensive, and they must be checked regularly to ensure that they are up to date and, if necessary, adapted to new conditions. Reuse in other projects is therefore rarely possible. ReqInspector was developed with the aim of no longer having to depend on domain-specific models and word lists. At the same time, ReqInspector should allow for checking SRS without placing overly strict demands on the formulation of individual requirements, for instance by insisting on the adherence to sentence templates (see, for example, [PR09], [RE13], [RS14]).

Internal discussions about the strengths and weaknesses of related approaches that we have used and studied in the context of our projects, and which we were able to identify as approaches with the same goal via a literature search, were a decisive factor for the development of ReqInspector.

3 The ReqInspector Approach in Detail

In this chapter, ReqInspector will be described in detail. To explain the process for a requirement inspection using the ReqInspector approach, the following example sentence from the fictitious SRS of a navigation system for use in an automobile will be used, which shows different aspects of typically specified requirements:

During navigation, the system must permanently indicate the current position on a map.

Following a description in Section 3.1 of the database used, the requirements inspection process performed with the help of the ReqInspector approach will be explained in Section 3.2, including the embedded tools used for this purpose. In Section 3.3, the generation of advice for the authors of SRS will be described in more detail.

3.1 Verb Sub-Categorization Database

Currently, ReqInspector works with the verb sub-categorization database of the Institute for Natural Language Processing (IMS) in Stuttgart, Germany. This database, which is not publicly accessible but available from IMS for non-commercial purposes, comprises approx. 1,310,000,000 entries. Each of these entries represents a sentence from the German web corpus SdeWaC [FE13], with approximately 880,000,000 words, and a dump of the German Wikipedia as of April 2011, with approximately 430,000,000 words. This huge number of well-formed sentences is the foundation of the check for the completeness of individual requirements. Without such a set of sentences to refer to, it is not possible to judge whether a requirement is complete. With the verb sub-categorization database as its reference, ReqInspector can check the completeness of each individual requirement of an SRS.

The sentences contained in the verb sub-categorization database were extracted automatically from the SdeWaC corpus and the dump of the German Wikipedia with the help of the Subcat Extractor [SS13].

The database stores every sentence in one line, always listing verb information, complement information, and sentence information. Table 2 in Section 3.2 shows the example sentence as a database entry. A very simple representation of the complement structure of the example sentence is shown in Figure 1.

Figure 1: Complement structure of the example sentence

Every extracted verb is described by four parts: First, it is noted whether it is a verb in the root position of the sentence (-) or a verb within a passive construction (PAS); then the type of the verb is provided based on the Stuttgart-Tübingen-Tagset (STTS) [SC99]; next is the position of the verb in the sentence (beginning with 0), and finally the lemma of the verb, that is, its lexical base form.

The sub-categorization information of every verb, i.e., the information about the complements demanded by a verb, is described by the same four types of information (function, type, position, lemma) in the same sequence as a verb. In addition, the sub-categorization information also includes the dependency relations of the verb complements, i.e., their grammatical relations, such as whether they are the subject, the indirect object, or the genitive object of a sentence.

Prepositional phrases (in the example sentence ‘on a map’) also contain the part of speech of the argument of the prepositional phrase, the case, the position of the argument in the sentence, as well as the lemma of the argument.

Finally, the complete sentence in which the verb occurs is listed in the database, including the complements and their dependency relations. The number of occurrences of each verb in the database corresponds to the number of example sentences containing this verb in the underlying texts of the SdeWaC corpus and the dump of the German Wikipedia.
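The entry layout described above can be illustrated with a small parser. This is only a minimal sketch: it assumes the raw colon- and pipe-separated notation shown for the example sentence in Table 2 (without the bracketed English translations added there for readability), and the names `Item`, `Argument`, `parse_item`, and `parse_complements` are hypothetical, not part of the IMS database or the Subcat Extractor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Argument:
    """Argument of a prepositional phrase: part of speech, case, position, lemma."""
    pos: str
    case: str
    position: int
    lemma: str


@dataclass
class Item:
    """A verb or complement: function, STTS type, position (0-based), lemma."""
    function: str
    pos: str
    position: int
    lemma: str
    argument: Optional[Argument] = None


def parse_item(text: str) -> Item:
    # A prepositional phrase carries its argument after a double colon,
    # e.g. "MO:APPR:10:auf::NN:dat:12:Karte" (assumed notation).
    head, _, tail = text.partition("::")
    function, pos, position, lemma = head.split(":", 3)
    argument = None
    if tail:
        arg_pos, case, arg_position, arg_lemma = tail.split(":", 3)
        argument = Argument(arg_pos, case, int(arg_position), arg_lemma)
    return Item(function, pos, int(position), lemma, argument)


def parse_complements(text: str) -> List[Item]:
    # Complements are enclosed in angle brackets and separated by "|".
    inner = text.strip().lstrip("<").rstrip(">")
    return [parse_item(part) for part in inner.split("|") if part]


verb = parse_item("OC:VVINF:13:anzeigen")
complements = parse_complements(
    "<SB:NN:1:System|OA:NN:8:Position|MO:ADJD:9:ständig"
    "|MO:APPR:10:auf::NN:dat:12:Karte|MO:APPR:3:während::NN:dat:5:Navigation>"
)
```

Keeping the prepositional argument as a nested object mirrors the description above, in which prepositional phrases carry additional information about their argument.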

3.2 ReqInspector Process Description

The generic process performed by the ReqInspector approach to check the completeness of individual requirements consists of the following five steps, assuming that a set of well-formed reference sentences and information on the argument structures of these reference sentences already exists (Figure 2):

  1. Identify the grammatical structure of each requirement, e.g., subject, verb, direct object.
  2. Identify the Parts-of-Speech within each requirement.
  3. Identify the phrases that are dependent on the finite or non-finite full verbs in each requirement and mark these phrases as the complements of the respective requirement, maintaining the sequence in which the complements occur in the requirement.
  4. For each requirement, compare the sequence of complements to the argument structures of well-formed reference sentences. In each reference sentence with the same sequence of complements, mark complements of reference sentences that occur in addition to those already contained in a requirement. Additional complements may occur before the sequence of complements in the requirement, between the sequence of complements, and after the sequence of complements.
  5. Based on the additional complements identified in Step 4, create and provide advice to the authors of the requirement, because these complements represent information potentially missing from the requirement. The advice should follow a particular template, such as ‘You could additionally state …’; this way, you do not force the authors of requirements to include the information. For example, when you have identified an additional prepositional phrase in the set of reference sentences, you transform the preposition in the reference sentence into a question word, and then include the phrase in the requirement that corresponds to the phrase that follows the preposition in the reference sentence. Reformulate this phrase to match the case and the syntax demanded by the preposition (see Section 3.3 for examples of advice).
Figure 2: ReqInspector Logical Process Overview
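Step 4 of the generic process can be read as an ordered subsequence comparison. The following is a minimal sketch under that assumption: complements are reduced to simple labels such as `SB:NN`, the function name `additional_complements` is hypothetical, and the greedy left-to-right alignment is a simplification chosen for illustration rather than the matching strategy actually used by ReqInspector.

```python
from typing import List, Optional


def additional_complements(requirement: List[str],
                           reference: List[str]) -> Optional[List[str]]:
    """Return the complements of the reference sentence that do not occur
    in the requirement, or None if the reference sentence does not contain
    the requirement's complements in the same sequence."""
    extras = []
    matched = 0  # number of requirement complements matched so far
    for label in reference:
        if matched < len(requirement) and label == requirement[matched]:
            matched += 1
        else:
            # Extra complement before, between, or after the matched ones.
            extras.append(label)
    return extras if matched == len(requirement) else None


# Hypothetical label sequences for illustration only.
requirement = ["SB:NN", "OA:NN", "MO:APPR"]
reference = ["SB:NN", "OA:NN", "MO:ADJD", "MO:APPR"]
missing = additional_complements(requirement, reference)
```

In this example, `missing` contains the single label `MO:ADJD`, which would be the candidate for advice in Step 5. A reference sentence that lacks one of the requirement's complements yields `None` and is ignored.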

Since the generic process of ReqInspector is difficult without tools that support the steps of the process, the eight-step process in Figure 3 illustrates the ReqInspector process with example tool support:

  1. First, the SRS must be read. For this purpose, a Word file is uploaded via the ReqInspector web service.
  2. In the next step the adaptations to the individual requirements, which are necessary due to the behavior and the constraints of the tools used, are performed automatically. It is necessary, for example, to separate all combinations of prepositions and definite articles occurring in requirements in German (comparable to the issue of ‘it’s’ vs. ‘it is’ in English). Otherwise, mistakes may, for example, be generated in the automatic identification of the parts of speech.
  3. The Dependency Parser of Mate Tools [BO10] is used to identify the grammatical structure of the individual requirements (see Table 1).
    In a parallel work step, the Part-of-Speech Tagger of Mate Tools is used to identify all parts of speech, particularly the verbs.
  4. The information obtained in step 3 must be saved in a special CoNLL format [CONL], which is needed by the Subcat Extractor as input during the next step.
  5. The complement structure of each requirement is analyzed automatically with the Subcat Extractor and is transferred into the same format as that of the sentences contained in the verb sub-categorization database (see Table 2). This allows matching the complements of individual requirements and the database entries.
  6. The complement structures of the verbs found in each requirement are matched to the entries in the verb sub-categorization database. Maintaining the sequence, complement structures are sought that contain further complements in addition to those already contained in a requirement. These complements represent potentially missing information, and the author of the specification is given appropriate advice to provide them. The priority of the need for potentially missing information is determined by checking the frequency of its occurrence in the verb sub-categorization database.
  7. The advice is automatically generated by filling a corresponding template with the parts of the correlating complements, thus indicating which information could be added to the requirement.
  8. The original document with the added advice is once again made available to the user for download. The document can be used without any restrictions since the advice is inserted as Word comments, which can be dealt with as usual. Each piece of advice has an unambiguous ID and is thus taken into account when a file is uploaded again, preventing duplicate generation.
Figure 3: ReqInspector Technical Process Overview
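The adaptation in step 2 of the tool-supported process, separating fused prepositions and definite articles, can be sketched as follows. This is an illustrative assumption about how such a preprocessing step might look, not the ReqInspector implementation: the table `CONTRACTIONS` lists only a few common German contractions, the function name `split_contractions` is hypothetical, and the sketch ignores cases in which a form such as ‘am’ is not a contraction to be split (e.g., in superlatives like ‘am besten’).

```python
import re

# A few common German preposition + article contractions (not exhaustive).
CONTRACTIONS = {
    "im": "in dem",
    "am": "an dem",
    "zum": "zu dem",
    "zur": "zu der",
    "vom": "von dem",
    "beim": "bei dem",
    "ins": "in das",
    "ans": "an das",
    "aufs": "auf das",
}

# Match the contractions only as whole words, regardless of capitalization.
_PATTERN = re.compile(r"\b(" + "|".join(CONTRACTIONS) + r")\b", re.IGNORECASE)


def split_contractions(sentence: str) -> str:
    """Replace each contraction with its preposition and definite article."""
    def expand(match: re.Match) -> str:
        word = match.group(0)
        expanded = CONTRACTIONS[word.lower()]
        # Preserve a leading capital, e.g. at the beginning of a sentence.
        if word[0].isupper():
            expanded = expanded[0].upper() + expanded[1:]
        return expanded

    return _PATTERN.sub(expand, sentence)


example = split_contractions("Das System muss die Position im Display anzeigen.")
```

Here `example` becomes ‘Das System muss die Position in dem Display anzeigen.’, which gives a tagger separate tokens for the preposition and the article.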
ID  Form                    Lemma                 POS    Feats           Head  Deprel
1   Das (the)               Der (the)             ART    nom|sg|neut     2     NK
2   System                  System                NN     nom|sg|neut     3     SB
3   Muss (must)             Muss (must)           VMFIN  sg|3|pres|ind   0     --
4   Während (during)        Während (during)      APPR   --              3     MO
5   Der (the)               Der (the)             ART    dat|sg|fem      6     NK
6   Navigation              navigation            NN     dat|sg|fem      4     NK
7   Die (the)               Der (the)             ART    acc|sg|fem      9     NK
8   Aktuelle (current)      Aktuell (current)     ADJA   acc|sg|fem|pos  9     NK
9   Position                position              NN     acc|sg|fem      14    OA
10  Ständig (permanently)   Ständig (permanent)   ADJD   pos             14    MO
11  Auf (on)                Auf (on)              APPR   -               14    MO
12  Einer (a)               Ein (a)               ART    dat|sg|fem      13    NK
13  Karte (map)             Karte (map)           NN     dat|sg|fem      11    NK
14  Anzeigen (indicate)     Anzeigen (indicate)   VVINF  --              3     OC
15  .                       --                    $.     _               14    --
Table 1: Output of the Dependency Parser of MATE Tools for the example sentence; the translation of each German word is given in brackets after the word, except for words that are the same in German and in English
Verb Information: OC:VVINF:13:anzeigen (indicate)
Complement Information: <SB:NN:1:System|OA:NN:8:Position|MO:ADJD:9:ständig (permanently)|MO:APPR:10:auf (on)::NN:dat:12:Karte (map)|MO:APPR:3:während (during)::NN:dat:5:Navigation>
Sentence Information: 3ia Das (the) System *muss* (must) während (during) der (the) Navigation die (the) aktuelle (current) [Position]OA [ständig (permanently)]MO:OTHER [auf (on)]MO einer (a) Karte (map) [[anzeigen (indicate)]]OC . Das (the) [System]SB [[muss (must)]]-- [während (during)]MO der (the) Navigation die (the) aktuelle (current) Position ständig (permanently) auf (on) einer (a) Karte (map) anzeigen (indicate) .
Table 2: Output of the Subcat Extractor for the example sentence; the translation of each German word is given in brackets after the word, except for words that are the same in German and in English
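To illustrate how the parser output in Table 1 relates to the complement structure in Table 2, the following sketch collects the dependents of the full verb from the head and dependency-relation columns. It is a simplified assumption for illustration and not the algorithm of the Subcat Extractor: the `Token` type and the function `complements_of` are hypothetical, lemmas are written in lower case except for nouns, and the rule that dependents of a governing modal verb are also attributed to the full verb is inferred from the example only.

```python
from collections import namedtuple

Token = namedtuple("Token", "id lemma pos head deprel")

# Rows of Table 1, reduced to the columns needed here.
TOKENS = [
    Token(1, "der", "ART", 2, "NK"),
    Token(2, "System", "NN", 3, "SB"),
    Token(3, "muss", "VMFIN", 0, "--"),
    Token(4, "während", "APPR", 3, "MO"),
    Token(5, "der", "ART", 6, "NK"),
    Token(6, "Navigation", "NN", 4, "NK"),
    Token(7, "der", "ART", 9, "NK"),
    Token(8, "aktuell", "ADJA", 9, "NK"),
    Token(9, "Position", "NN", 14, "OA"),
    Token(10, "ständig", "ADJD", 14, "MO"),
    Token(11, "auf", "APPR", 14, "MO"),
    Token(12, "ein", "ART", 13, "NK"),
    Token(13, "Karte", "NN", 11, "NK"),
    Token(14, "anzeigen", "VVINF", 3, "OC"),
    Token(15, ".", "$.", 14, "--"),
]


def complements_of(tokens, verb_id):
    """Collect complement labels of the full verb in sentence order."""
    by_id = {t.id: t for t in tokens}
    verb = by_id[verb_id]
    heads = {verb_id}
    # Assumption: if the full verb is the clausal object (OC) of another
    # verb, e.g. a modal, that verb's dependents also count as complements.
    if verb.deprel == "OC" and verb.head in by_id:
        heads.add(verb.head)
    labels = []
    for t in tokens:
        if t.head not in heads or t.id == verb_id or t.pos.startswith("$"):
            continue  # skip unrelated tokens, the verb itself, punctuation
        label = f"{t.deprel}:{t.pos}:{t.lemma}"
        if t.pos == "APPR":
            # Attach the noun governed by the preposition, if there is one.
            noun = next((c for c in tokens
                         if c.head == t.id and c.pos == "NN"), None)
            if noun is not None:
                label += f"::NN:{noun.lemma}"
        labels.append(label)
    return labels


labels = complements_of(TOKENS, 14)
```

For the example sentence, `labels` contains the subject, the direct object, the adverbial modifier, and both prepositional phrases, which are the same five complements listed in Table 2, here in sentence order and without positions or case.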

3.3 Generation of Advice for the Authors

The advice for the authors of an SRS, which is created and generated specifically for each requirement, consists of recommendations formulated consistently as indirect questions. Below we show two examples of advice for missing information based on the example sentence.

For the example sentence „Das System muss während der Navigation die aktuelle Position auf einer Karte anzeigen“ (During navigation, the system must indicate the current position on a map), the following reference sentence is found: „Schreiben Sie einen Reflexionsagenten, der permanent von einer URL ausgehend bis zu einer vorgegebenen Link-Tiefe die Verfügbarkeit der Web-Sites anzeigt.“ (Write a reflexion agent that permanently indicates the availability of web-sites starting from a URL to a given link depth). The reference sentence provides six complements of the full verb ‘indicate’:

SB:NN:Reflexionsagent (reflexion agent);
OA:NN:Verfügbarkeit (availability);
MO:ADJD:permanent (permanently);
MO:ADJD:ausgehend (starting);
MO:APPR:von (from);
MO:APPR:zu einer vorgegebenen Link-Tiefe (to a given link depth);

The reference sentence includes each complement of the requirement, but the reference sentence also includes another MO:ADJD, namely ‘permanent’ (permanently). Table 3 shows the missing MO:ADJD by a gap in the third column. So, the following advice can be generated that asks the writer of the requirement to enrich the requirement with another piece of useful information that can close this gap: ‘You could additionally state how the system must permanently indicate the current position on a map.’

             1                2              3                    4          5               6
Requirement  System           Position       complement missing!  ständig    auf Karte       während Navigation
             SB:NN            OA:NN          (gap)                MO:ADJD    MO:APPR:auf:NN  MO:APPR:während:NN
Reference    Reflexionsagent  Verfügbarkeit  permanent            ausgehend  von URL         Linktiefe
             SB:NN            OA:NN          MO:ADJD              MO:ADJD    MO:APPR:von:NN  MO:APPR:zu:NN
Table 3: Example presentation of the complement mapping. The missing complement MO:ADJD is identified in the third column.

A reference sentence for the example ‘Das System muss während der Navigation die aktuelle Position ständig anzeigen’ (During navigation, the system must permanently indicate the current position) is: „Für den Laien sei es aber gerade ein besonderer Vorteil, daß ihn das Gerät nicht mit „unverarbeiteten physikalischen Daten“ behellige, sondern ihm das Gesundheitsrisiko durch Elektrosmog jeder Art ganz unmittelbar auf einer Skala anzeige.“ (For the layman, however, it is precisely a special advantage that the device does not bother him or her with ‘unprocessed physical data’, but rather indicates to him or her the health risk posed by electrosmog of any kind quite directly on a scale).

Again, the reference sentence includes each complement of the requirement, but it also includes the phrase ‘on a scale’, which is used to create the advice ‘You could additionally state where the system must permanently indicate the current position on a map.’

EP:PPER:es (it)
PD:NN:Vorteil (advantage)
MO:ADV:aber (however)
MO:ADV:gerade (precisely)
MO:APPR:für (for)
NN:acc:Laie (layman)
CJ:VVFIN:anzeigen (show)
SB:NN:Gesundheitsrisiko (health risk)
DA:PPER:ihm (him or her)
MO:ADJD:unmittelbar (directly)
MO:APPR:durch (by means of)
NN:acc:Elektrosmog (electrosmog)
MO:APPR:auf (on)
NN:dat:Skala (scale)

The first piece of advice refers to missing information about how the current position is to be indicated (e.g., two-dimensionally, three-dimensionally, schematically), whereas the second refers to missing information regarding the display medium (e.g., center console display, smartphone, smartwatch). Authors of a requirement might implicitly assume such information, but implementation becomes much easier if the requirement leaves less room for interpretation. This also makes the software development project cheaper, because there is less need to ask stakeholders for the missing information in iterative interview sessions.
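The two examples can be reproduced with a simple template-filling sketch. The mapping from complement types to question words in `QUESTION_WORDS` is an assumption derived only from these two examples (‘how’ for an adverbial modifier, ‘where’ for a prepositional phrase with ‘auf’), and the function name `make_advice` is hypothetical; a real mapping would have to cover many more prepositions and complement types.

```python
from typing import Optional

# Assumed mapping from the type of a missing complement to a question word.
QUESTION_WORDS = {
    "MO:ADJD": "how",
    "MO:APPR:auf": "where",
}


def make_advice(missing_label: str, requirement_clause: str) -> Optional[str]:
    """Fill the advice template for one missing complement.

    Returns None if no question word is known for the complement type,
    in which case no advice is generated.
    """
    question_word = QUESTION_WORDS.get(missing_label)
    if question_word is None:
        return None
    return f"You could additionally state {question_word} {requirement_clause}."


clause = "the system must permanently indicate the current position on a map"
first = make_advice("MO:ADJD", clause)
second = make_advice("MO:APPR:auf", clause)
```

Phrasing the result as an indirect question keeps the advice optional for the author, in line with the template described in Section 3.2.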

The author of the SRS will not receive any advice if the pertinent information is already available in the requirement. Figure 4 shows an example of how the advice automatically generated by ReqInspector is represented in Word. Figure 4.1 shows the translation of the sentences and feedback in English. As mentioned before, the proof of concept currently works for the German language only.

Both examples illustrate that the reference sentences do not need to be related to the topic or to the domain of the requirement, but they can focus on very different aspects. The only prerequisite of reference sentences is the inclusion of the full verb that is included in the requirement, and of each of the complements of this full verb that the requirement already includes.

Figure 4: Schematic representation of automatically generated advice in a Word file
Figure 4.1: Translation of the sentences and advice

4 Take-Aways for Practitioners

5 Summary and Outlook

The ReqInspector approach is the result of state-of-the-art research in the area of semantic requirements analysis. The aim of this research is domain- and expert-independent quality inspection of SRS that goes far beyond current approaches, which rely on carefully compiled word lists and ontologies. ReqInspector thus enables quality inspection of any requirement formulated in natural language in German. Improving the precision with which it recognizes missing or insufficient information within individual requirements still requires some comparisons and adjustments. In particular, the database used is still to be extended with high-quality requirements in order to adapt it better to this specific usage purpose.

Furthermore, we will examine the integration of an algorithm that would enable ReqInspector to learn the structures of good SRS on its own and issue the corresponding advice in a prioritized sequence.

Additional quality attributes such as the singular occurrence of requirements and the necessity of requirements are also to be checked in the future in order to mature ReqInspector into an inspection tool for checking the fundamental quality of SRS that goes beyond mere completeness. Finally, the ReqInspector approach and the underlying algorithms of the tool support have to be adapted to English SRSs in order to spread the ReqInspector approach beyond German-speaking areas. However, the approach cannot simply be carried over as is to check the completeness of requirements written in a language other than German.

Currently, using the ReqInspector approach requires gaining a good understanding of the semantics and the semantic tags of natural language. This is especially true for the identification of missing information. But we strongly believe that the advantages requirements engineers gain by using the ReqInspector approach will outweigh the effort spent in gaining this understanding.

References



Andreas Maier

Andreas Maier studied Computational Linguistics, Language Science and Technology, and Philosophy at Saarland University in Saarbrücken, Germany. From November 2008 to April 2018, he worked as a scientific assistant at Fraunhofer IESE, where he focused on the elicitation and the automatic analysis of user requirements provided in natural language. In May 2019, he finished his PhD in computer science, in which he investigated difficulties in the identification and specification of hedonic quality in user requirements and developed possible solutions for the mitigation of these difficulties.

Simon Darting

Simon Darting studied applied computer science with a focus on communication at the University of Applied Sciences in Worms, Germany. After his studies, he started working as a researcher at Fraunhofer IESE in 2012, where he focused on requirements and systems engineering within large-scale projects. In 2017, he completed his Master's degree in "Software Engineering for Embedded Systems" at the Technical University of Kaiserslautern, Germany. Since December 2018, he has been employed at BASF SE in Advanced Business Analytics with a focus on predictive maintenance.