Marie Garnier and Patrick Saint-Dizier

Improving the Use of English in Requirements

Analysis, results, and recommendations


For most international industries, English is the main language of communication in technical documents. Among these documents, requirements are specifically designed to be easy to read and as efficient and unambiguous as possible for their users and readers. They must leave little room for personal interpretation. Non-native speakers of English often find themselves in the position of having to write requirements without extensive training in the use of English for this task, which may result in lexical, grammatical and style errors. Controlled languages are usually used as a way to work around this difficulty, but they fail to address the specific stumbling blocks of non-native speakers, and they have shortcomings of their own. In this article, we present an analysis of the errors found in a corpus of requirements written in English by French speakers, and we attempt to highlight the most efficient ways to help requirements engineers limit the number of language errors in their work. The results are also relevant to requirements written in English by speakers of other languages.

1. Introduction

As a text genre, requirements have to follow sets of rules in form and content. With the recent emphasis on the importance of requirement quality and proper training for requirements engineering, discussions have focused on higher order problems, such as gathering information for requirements, ensuring coherence in long documents or selecting criteria for requirement validation.

The role of linguistics and natural language processing (NLP) in requirements engineering and management has gained importance over the last few years. In particular, the International Requirements Engineering Conference regularly includes articles on that topic, as does the present magazine. The article "Readable requirements are not a matter of course – or are they?" (Rabeler, RE Magazine, issue 2014(4)) develops the complex notion of requirement readability, which has connections with the present article. In addition, the two-part article "How requirements engineers can benefit from applying the NLP communication techniques" (Thomas and Georgieva, RE Magazine issues 2016(1, 2)) develops the notion of neuro-NLP and discourse notions such as reframing or generalizations. Finally, the present article is a continuation of research on using NLP to improve quality in requirements, presented in the article "LELIE – An intelligent assistant for improving requirement authoring" (Saint-Dizier and Kang, RE Magazine issue 2015(2)) [14].

In this article, we propose to tackle requirement quality from another angle, that of the quality of their language and grammar. This aspect of requirements is just as crucial as questions of content in avoiding approximations and misunderstandings. Specifically, we focus on the use of English in requirements written by French native speakers.

The research presented here is the result of a project funded by the IREB Academy Program. The objective of the research was to gather data on the language errors produced by French native speakers writing requirements in English. Errors found in a corpus of requirements for two industrial domains were thoroughly analyzed and categorized in order to identify common error types in this specific type of writing, and find ways to improve language quality and readability in requirements.

Our project stems from the following initial observations:

We have chosen to focus specifically on requirements written by French native speakers. Research in second language acquisition has shown the influence of a speaker's first language on their use of a second language (e.g. Jarvis and Pavlenko, 2007) [7], a phenomenon called transfer, or cross-linguistic influence. Language transfer plays a major role in error production, and gives valuable indications as to the requirements engineer's intended meaning and possible remediation.

2. Research methodology

Research and analysis approach

Our research relies on the manual analysis of corpora of requirements written in English by French native speakers. We use the methodology of error analysis, a research method initially developed and used in the domain of Second Language Acquisition (Corder, 1981) [2]. The main steps of error analysis include the identification, classification, and interpretation of errors.

The analysis is conducted by a single trained linguist specialized in English grammar with a background in research on linguistics-based automatic grammar checking for English. Due to the complexity of the task, the different steps are performed manually. Requirements are read and screened for errors, which are then tagged.

Overview of corpora

We analyzed 772 requirements extracted from three different sources in two technical domains. They are hereafter referred to as Corpus 1, Corpus 2 (both in aeronautics, but coming from two different companies) and Corpus 3 (telecommunications).

Our corpora are composed only of requirements: technical documents usually contain textual parts that are of no interest for this research (e.g. introductions, summaries, definitions, contexts, diagrams, etc.), and which were therefore not included in the corpora. As a result, we present the numerical data in terms of number of requirements rather than number of words. The number of words in each requirement varies greatly, from under 10 words to over 100 words.

We checked that these texts were indeed produced by French native speakers. It is extremely difficult to assess the actual level of English proficiency of the authors, as the corpora were compiled a posteriori. We thus don't have access to detailed information about the authors and have no opportunity to test them. However, we estimate that the authors have a B2, or at least B1, level (respectively upper and lower intermediate levels in the Common European Framework of Reference for Languages issued by the Council of Europe) [3], partly because the positions the authors hold usually require a B2 level. The names of the companies that provided these documents are kept anonymous at their request.

Diagram 1: Overview of Corpora

3. Requirement quality with respect to controlled language norms

Most companies ask that a given controlled natural language (or CNL) be used in their technical documentation. Resorting to a controlled language may be seen as a way to bypass the question of language quality in requirements, since it constrains and simplifies the grammar and lexicon that can be used in such documents. As we explain below, it is unfortunately not a completely effective solution.

Following our initial observation that errors are present in requirements that are supposed to follow the norms of a controlled language, we ran a preliminary investigation into how well the requirements in our corpus conform to standard guidelines and norms (i.e. INCOSE, IREB, general simplified natural language (Kuhn, 2014) [8]). In parallel to our main investigation of errors in requirements, we have also tried to identify the main ways in which the requirements in our corpus fail to follow the recommendations of the controlled language used in the industry.

Here are some examples of the main deviations from controlled language norms found in our corpus (for a more detailed overview, see (Saint-Dizier and Kang, 2015)) [14]:

4. Definition and classification of errors

Identifying errors found in requirements

At its most basic, an error is defined as "an unsuccessful bit of language" (James, 1998) [6]. The most common criterion for declaring a segment of language "unsuccessful" is grammaticality, that is to say whether or not the segment follows the rules of grammar. Acceptability is another criterion that focuses on whether or not the segment might be produced by a native speaker in an appropriate context (Lyons, 1968) [10].

The concepts of "competence" and "performance" are also important in the definition of errors. Competence errors are attributable to a lack of knowledge in the language, while performance errors are due to external factors, such as lack of attention, stress or fatigue (Corder, 1981) [2]. However, researchers have highlighted the fact that even though this distinction is theoretically relevant, it is practically impossible to distinguish competence errors from performance errors (Thouësny, 2011) [15].

We have adapted these commonly used criteria to the objectives of our project as well as to the nature of the documents in our corpus to define the types of segments we identify as errors in requirements:

Classifying errors in our project

Error categories usually rely on a set of criteria, sometimes used in combinations of two or three but very often used independently:

Consider the following example:

The system shall includes a locking device.

In this sentence, the verb include should not have the ending –s since it follows a modal auxiliary (modals require the use of an uninflected verb form, or verb base, after them). Here is how this error would be described for each of the above criteria:

We designed our classification system in order to obtain precise and comparable data about errors found in requirements. Previous research has posited that the type of classification needed to yield comparable data includes the rank of the linguistic unit to be taken into account for the error to become apparent (i.e. the error domain, Lennon, 1991) [9], usually Noun Phrase, Verb Phrase, Clause, etc., as well as a number of sublevels of classification giving more detailed information, such as "Preposition selection", "Placement of modifiers", etc. (Garnier, 2014) [4]. We also document spelling errors.

For the error given above (The system shall includes a locking device), the verb form only appears wrong if we take into account the presence of the modal auxiliary before it, as modals require an uninflected verb form. As a result, the error domain is Verb Phrase. The second level of classification is Modality, since the error is linked to the use of a modal.
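To make this concrete, the sketch below shows how this particular error type could be caught automatically with a simple pattern. It is purely illustrative and is not the classification tool used in this study: the list of modals and the regular expression are our own simplifications, and a real checker would need part-of-speech information to avoid false alarms.

import re

# Purely illustrative sketch (not the classification tool used in this study):
# flag a modal auxiliary followed by a verb that still carries the third-person
# -s ending, as in "The system shall includes a locking device".
MODALS = r"(?:shall|should|must|may|might|can|could|will|would)"
# Naive pattern: a modal followed by a word ending in -s. A real checker would
# need part-of-speech tagging to avoid false alarms (e.g. "shall pass").
PATTERN = re.compile(r"\b" + MODALS + r"\s+(\w+s)\b", re.IGNORECASE)

def flag_modal_inflection(requirement):
    """Return (error domain, sublevel, offending form) for each suspect span."""
    return [("Verb Phrase", "Modality", m.group(1))
            for m in PATTERN.finditer(requirement)]

print(flag_modal_inflection("The system shall includes a locking device."))
# [('Verb Phrase', 'Modality', 'includes')]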

In addition, we introduce a distinction between what we call "central" and "marginal" categories. Central categories fit the description we gave above, while marginal categories allow for the researcher's own margin of error: since we are not specialists in the technical domains represented in the corpus, and don't have access to a list of expressions allowed in the companies the requirements come from, we use marginal categories to document segments that appear to us as errors, but may not actually be perceived as such by the requirements engineer and the readership.

Marginal categories include:

Our system thus includes 2 marginal categories and 27 central categories. Errors in the central categories are distributed across 5 error domains and 1 Other category. Most domains are divided into sublevel categories. As explained above, some sublevel categories require an extra level of detail, especially when the errors represent different surface phenomena (e.g. Preposition selection vs. Missing preposition).

5. Results and discussion

Presentation of results

Overall, we found 279 errors in 188 requirements, out of a corpus of 772 requirements. In other words, roughly 1 out of 4 requirements (188 of 772, or about 24%) contained at least one error. Nearly half of these 188 requirements contained more than one error, and a small proportion contained up to 4 errors.

Surprisingly, Corpus 1 and Corpus 3 show a similar proportion of errors, with 18% of the requirements in these corpora containing at least one error, while Corpus 2 has nearly double that proportion, with 33% of its requirements containing at least one error. In all cases, the presence of errors is far from negligible.

Diagram 2 shows the proportion of errors coming from each sub-corpus in our corpus of errors.

Diagram 2: Proportion of errors from 3 corpora

Errors are most often found in the Noun Phrase domain, which accounts for about 46% of errors, and in the marginal categories, which account for about 26% of errors. In each of these, one sublevel category holds the majority of errors. Most errors in the Noun Phrase category are linked to modification (and mostly to modifier stacking, as we will see below), representing 25% of all errors; most errors in the marginal categories are linked to the use of non-standard expressions, representing about 15% of all errors (which, as stated above, may not be considered errors in the relevant technical domain).

Out of the 29 final categories, marginal ones included, only 6 account for more than 5% of errors each, while 20 of them account for less than 3% each. However, taken together these 20 categories account for 37% of all errors. This illustrates the wide diversity of errors found in requirements and highlights the fact that more than a third of errors are difficult to prevent, since each category stems from a different grammatical or lexical problem.

Diagram 3: Main error categories

The sublevel category we have identified as "modifier stacking" comprises segments in which a noun phrase is composed of a head noun and a string of modifiers to its left, sometimes with their own embedded modifiers. Here are a few examples of this error type:

The use of several noun modifiers in an NP is becoming increasingly common in English, especially in technical and journalistic English (Pastor-Gomez, 2011) [11]. They are favored in these two registers because they eliminate the need for prepositions and some determiners. Preposition selection and determiner selection are two of the main difficulties non-native users face when writing in English, so we hypothesize that noun modifiers appeal to these writers as a way of limiting the number of errors they produce. However, by eliminating prepositions and sometimes plural markings, such structures rely on implicit information that needs to be reconstructed by the reader. As a result, they may lead to longer reading times and difficulties in interpreting the requirement (Biber et al., 1999) [1]. Moreover, these errors fall within the scope of controlled language recommendations on the use of noun complements and heavy noun phrases.

The category of "non-standard expressions", found in the marginal error categories, includes segments such as the following ones:

The fact that we included such segments in the marginal categories means that even though their form may seem ungrammatical or unacceptable in general English, we recognize that they might be standard in the language of requirements. Having no way to ascertain the acceptability of their use, we chose to document them. The first requirement is an example of a structure found only in Corpus 2, and the second requirement represents a structure found only in Corpus 3. We didn't find any non-standard expressions in Corpus 1. This imbalance indicates that the use of non-standard expressions is not necessarily expected and/or accepted in all companies, and may be domain- or even company-dependent.

The second requirement is actually one example of a type of structure that takes several forms in Corpus 3. All instances of this structure contain a form of ellipsis; here are a few additional examples:

From the point of view of surface syntax, these segments take different forms (e.g. a past participle followed by a participial adjective, a past participle followed by a preposition, or a past participle followed by a single adjective or by an adjective phrase composed of a head adjective and a modifying adverb or a PP complement). From a semantic point of view, however, they are built on the same model, which is close to that of verbal expressions such as to turn on or to switch off: the second term or phrase indicates the position or situation of a "mobile" element, such as a switch (e.g. OFF, ON, open, closed, failed), while the first either specifies the action leading to that situation or the observation of that situation (e.g. selected, detected). It is conceivable that this type of phrasing is accepted and even expected in the companies for which the requirements were written. Such segments would therefore be examples of ungrammatical but acceptable phrasings, and they can be seen as an efficient way to avoid the use of some prepositions.

This observation initially prompted us to exclude them from the error corpus. However, we found an alternative phrasing (selected to OFF), suggesting that the practice may not be as stable or as widely used as we initially thought. Furthermore, the ellipsis of prepositions and other words, which may be seen as increasing concision and simplicity, also creates a gap that must be filled by the reader, and may lead to ambiguities if the use of such structures is not the same in all instances.

Specificity of the genre of requirements: comparison with other error corpora

In order to find out whether the errors found in requirements are similar to those found in other non-native output, we compared our error corpus from requirements with the results of research on errors in English learner productions and in research papers written in English by French native speakers (Garnier, 2014) [4].

First, we looked at the distribution of errors according to the main categories, and more specifically those of Noun Phrase, Verb Phrase and Sentence and Clause. Since NPs and VPs are the minimal elements of sentences, they usually account for the highest number of errors. We did not count the proportions of NPs, VPs, clauses and sentences in each corpus, but we made the conservative assumption that they are similar. Diagram 4 shows the proportions of errors found in the comparable corpora for these three main categories. For this comparison, we only took central categories into account, since the marginal categories we identified are specific to requirements.

Diagram 4: Distribution of errors according to main category in comparable corpora

We notice a great difference between the distribution of errors in our corpus of requirements and that in the other corpora. Error rates are similar across the three categories in learner productions and research papers, while requirements show significantly fewer errors linked to the VP or at the level of the clause and sentence, and significantly more in the NP.

The smaller proportion of errors in the VP in requirements could be attributed to the high level of proficiency of the writers and the fact that the requirements are proofread, which may eliminate most agreement errors. However, we make the hypothesis that the low frequency of errors in the VP is mostly due to controlled natural languages prohibiting the use of complex verb groups in requirements. This could also explain the low rate of errors at the level of the clause or sentence, as the constrained use of complex syntax helps limit errors. Conversely, requirements use more complex vocabulary and expressions, which may lead to a higher proportion of errors in the Noun Phrase.

In addition, we compared the proportion of the two most common error types in our requirements corpus with those found in our two comparison corpora. Diagram 5 shows the frequency for these error types.

Diagram 5: Frequency of two error categories in comparable corpora

In the case of missing articles (e.g. All components shall meet the requirements of [ ] table presented below), we notice that the results in the two comparison corpora are similar, and are much lower than in the requirements corpus. However, determination errors in general account for 24% (student essays) and 16% (scientific papers) of all errors in the two comparison corpora, indicating that authors produce a more varied range of determination errors in these types of writing than in requirements, where determination errors other than missing articles are non-existent.

In the case of modifier stacking, there is a progression in the number of errors found across the three corpora, with such errors being marginal in the corpora of student essays. This is consistent with other studies on such structures (e.g. Pastor-Gomez, 2011) [11], which identify them as a feature of technical, scientific or journalistic English. We should note that, since we are looking at "absolute" rather than "relative" error numbers (i.e. only the total number of errors, not the number of errors relative to the total number of uses of the structure), it is not surprising to see lower error rates for this structure in types of writing that typically don't make use of it. Errors linked to modifier stacking are also more varied and complex in the corpus of requirements, with up to 5 modifiers to the left of a head noun (see examples above).

Overall, the most frequent error categories correspond to "simplification" strategies, with the omission of function words or punctuation that may be perceived by the author as superfluous or expendable. Three out of the six categories have to do with "missing" words or parts of words, while the use of noun modifiers eliminates the need for prepositions. The two types of non-standard expressions we reviewed also seem to be indicative of simplification strategies. In the example from Corpus 2 (and case yes, the tenderer to confirm if…), a more grammatically acceptable version of the requirement would include more words and more complex syntax (e.g. and if it is the case, the tenderer must confirm that…). The same is true of the example from Corpus 3: the segment in case one analog acquisition is detected failed can be corrected as, e.g., in case one analog acquisition is detected as having failed / as being in a state of failure.

We conclude from this comparison that the distribution of errors reflects the specificities of the genre of technical documents, which warrants collecting and using data from this genre and treating it as a specific type of L2 production.

6. Summary and recommendations for training

In this article, we presented the results of a research project devoted to the analysis of language errors in requirements written in English by native speakers of French. We analyzed 772 requirements from 3 corpora in 2 different technical domains, and detected 279 language errors. These errors were categorized using a tailored classification system which includes marginal (e.g. non-standard expressions) and central (e.g. agreement errors) categories.

We found a large variety of errors, with only 6 of the total 29 categories accounting for more than 5% of the errors each. However, some general tendencies were identified. A majority of errors (62%) occur in the Noun Phrase. This result is at odds with results from error analysis in non-native speaker productions from other genres, indicating that requirements form a specialized text genre that reflects specific writing behaviors. In addition, a significant proportion of errors is linked to missing punctuation and misspelled words, and can be remedied through the use of spellcheckers and grammar checkers.

A high number of segments were found to be unacceptable, or even ungrammatical from the point of view of standard English, but might be deemed acceptable in the context of requirements. Finally, we found numerous errors linked to the use of multiple adjectives or nouns in front of the head noun of the phrase. This type of structure can lead to interpretation errors and decrease the readability of the requirement.

The 6 most frequent error types, which together account for 62% of errors, are not equal in terms of impact on readability, and most importantly in terms of ease of correction and prevention. For example, errors in the use of articles are notoriously frequent in the productions of non-native speakers writing in English, but they are also very difficult to address in automatic grammar checking or even in in-person teaching. Training providers should therefore focus on errors for which remediation is relatively simple (e.g. spelling errors), or on errors that are truly detrimental to the readability of the requirement (e.g. the use of several nouns and adjectives in an NP; see our discussion of nocuous ambiguity below). The error types that we recommend trainers address are not directly linked to the native language of the requirement authors; our recommendations can therefore be used for authors with native languages other than French, with any adjustments that training providers deem necessary. Specifically, these recommendations could be used as part of the syllabus of the CPRE Foundation Level offered by IREB, more precisely in the "Language effects" sub-unit (see link to the syllabus in the list of references) [17].

The importance of using a spellchecker

Despite the ubiquitous presence of spellcheckers, 7% of all errors found in requirements are spelling errors.

In addition to the fact that the errors themselves may decrease the clarity of requirements, the main problem is that they directly affect the credibility and image of the company or of the requirements engineer. In the eyes of a client or contractor, the presence of easily avoidable errors in technical documentation may be a telltale sign of more significant errors in other areas.

Fortunately, the distribution of these errors shows that they are not the result of a lack of spelling skills, but rather of momentary lapses (e.g. a common word spelled correctly most of the time and incorrectly a few times) and insufficient editing, which can be fixed relatively easily. Writing requirements is a difficult task; requirements writers should therefore be strongly encouraged to rely on spellcheckers during the editing process. This can be achieved by making sure authors are familiar with the use of spellcheckers and pay attention to the corrections proposed.
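As a rough illustration of how such lapses can be caught automatically, here is a minimal sketch using the third-party pyspellchecker library. The library and the example misspelling are our own choices for the illustration; this is not the tool referred to in this study, and domain-specific terms would of course need to be added to the checker's word list to avoid false alarms.

# Minimal sketch with the third-party pyspellchecker package
# (pip install pyspellchecker); not the tool referred to in this article.
from spellchecker import SpellChecker

spell = SpellChecker()  # English dictionary by default
# Domain vocabulary can be whitelisted to avoid false alarms:
spell.word_frequency.load_words(["tenderer"])

def flag_misspellings(requirement):
    """Return each unknown word together with the checker's best suggestion."""
    words = requirement.lower().split()
    return {word: spell.correction(word) for word in spell.unknown(words)}

# A momentary lapse of the kind described above: "maintenence" for "maintenance".
print(flag_misspellings("The system shall be locked during maintenence"))
# e.g. {'maintenence': 'maintenance'}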

Policing the use of non-standard expressions

Roughly 15% of the errors found in requirements came from the use of non-standard expressions. However, it is perfectly acceptable for requirements authors to use non-standard expressions in their writing, since requirements writing is a technical task carried out in specialized English. Moreover, the constraints of the form of controlled English in use might call for dedicated expressions.

Nevertheless, resorting to a non-standard expression should be a conscious choice. In order for their use not to decrease the readability of requirements, alternative syntax and expressions should only be used if the following criteria are met:

Improving the readability of Noun Phrases

NPs that include several modifiers, especially in the form of other nouns, are very common in technical writing because they reduce the number of words by eliminating the need for prepositions and some determiners, and because they give the impression of a compact delivery of information. However, when several modifiers are stacked in front of the head noun, the readability of the NP decreases, and interpretation errors may occur as readers of requirements are left to reconstruct the intended meaning.

In particular, ambiguity arises from the fact that the modifiers used in these NPs often contain their own modifiers, making it difficult to identify the exact scope of each element and to determine whether the NP demonstrates stacked (e.g. [thermal [system breakdown]]) or embedded (e.g. [[thermal system] breakdown]) modification. The use of noun modifiers further complicates the issue, since nouns can function as heads as well as modifiers, both in NPs and in embedded nominal modifiers.

When discussing ambiguous structures, we must address the question of nocuous ambiguity. Ambiguity is said to be innocuous when a theoretically ambiguous text is interpreted in the same way by different readers regardless of its ambiguity; it is said to be nocuous when the ambiguity in the structure actually yields different interpretations (Willis et al., 2008) [16]. According to this study, nearly half of the cases of syntactic ambiguity were attributable to the use of nominal modifiers.

Research on the nocuous status of modifier stacking including nominal modifiers would be very useful in helping to identify the structures that should be corrected. However, as is visible from the examples from our corpus given in section 5, most NPs with modifier stacking include embedded and stacked modification, and usually more than 2 modifiers. We posit that these two factors are enough to create nocuous ambiguity. In addition, the lack of prepositions and determiners clarifying the relationships between the elements of the NP may lengthen reading times and mobilize cognitive resources to the detriment of other elements of the requirement.

As a consequence, in order to minimize the risk of nocuous ambiguity in NPs and to maintain reading fluidity, we recommend that the number of modifiers placed before the head noun of an NP be limited to two adjectives or two nouns, or one of each, thereby ensuring that no more than 3 elements appear in succession in an NP without a preposition or conjunction. There is an overlap between our recommendation and other constraints on requirements writing, since the use of heavy NPs is often discouraged in controlled English.
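As an illustration of how this recommendation could be checked automatically, here is a minimal sketch based on the spaCy library. The choice of library, the threshold of two premodifiers and the example requirement are ours; this is not part of the LELIE platform.

# Minimal sketch with the spaCy library (pip install spacy, then
# python -m spacy download en_core_web_sm); not part of the LELIE platform.
import spacy

nlp = spacy.load("en_core_web_sm")

def flag_heavy_noun_phrases(requirement, max_premodifiers=2):
    """Flag noun chunks whose head noun is preceded by too many nouns/adjectives."""
    flagged = []
    for chunk in nlp(requirement).noun_chunks:
        premodifiers = [token for token in chunk
                        if token.i < chunk.root.i
                        and token.pos_ in {"NOUN", "PROPN", "ADJ"}]
        if len(premodifiers) > max_premodifiers:
            flagged.append((chunk.text, len(premodifiers)))
    return flagged

# Invented example with four premodifiers stacked before the head noun "alarm".
print(flag_heavy_noun_phrases(
    "The thermal control system breakdown alarm shall be recorded."))
# e.g. [('The thermal control system breakdown alarm', 4)]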

Training providers should make sure that requirements authors are familiar with the structure and pay attention to the readability issues that may arise when it is used. In addition, trainers should help authors choose adequate ways to rewrite heavy NPs.

Further work: Implementation in the LELIE Research Platform

This article has mainly dealt with the analysis of errors made by non-native authors writing in English. The main errors that were identified can be expressed as patterns and implemented in the LELIE authoring platform (Garnier, 2014) [4], (Saint-Dizier, 2015) [13], (Saint-Dizier and Kang, 2015) [14]. In this platform, errors can be signaled in the text either by means of dedicated tags or via specific comments when texts are in Word or Excel. When a correction is automatically induced by the system, it is suggested in the comment.

Such an implementation would allow us to test our diagnoses and the possibility of providing authors with automatic corrections, which are often welcome but require some control from the authors. Such corrections can also be learned automatically when they turn out to be recurrent.
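To give a concrete, if simplified, picture of this kind of output, the sketch below shows how a detected error could be signaled with a dedicated tag carrying a suggested correction. The tag format and the Diagnostic record are invented for the purpose of illustration and do not reflect the actual LELIE format.

# Simplified illustration of tag-style output with a suggested correction;
# the tag format and Diagnostic record are invented for this example and do
# not reflect the actual LELIE format.
from typing import NamedTuple, Optional, List

class Diagnostic(NamedTuple):
    start: int                 # character offsets of the suspect span
    end: int
    category: str              # e.g. "Verb Phrase / Modality"
    suggestion: Optional[str]  # filled in when a correction can be induced

def annotate(text: str, diagnostics: List[Diagnostic]) -> str:
    """Wrap each flagged span in a tag, working right to left so offsets stay valid."""
    for d in sorted(diagnostics, key=lambda diag: diag.start, reverse=True):
        attributes = 'cat="%s"' % d.category
        if d.suggestion is not None:
            attributes += ' suggestion="%s"' % d.suggestion
        text = (text[:d.start] + "<error " + attributes + ">"
                + text[d.start:d.end] + "</error>" + text[d.end:])
    return text

requirement = "The system shall includes a locking device."
print(annotate(requirement, [Diagnostic(17, 25, "Verb Phrase / Modality", "include")]))
# The system shall <error cat="Verb Phrase / Modality" suggestion="include">includes</error> a locking device.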

The LELIE technical text authoring platform is a university prototype (Saint-Dizier, 2014) [12]. It has been plugged into Word and Excel to allow authors to call LELIE from their document and to make corrections directly in their document. The LELIE platform is freely available from the authors under a Creative Commons license.

References and Literature



Marie Garnier

Marie Garnier is an associate professor in the English Department at the Université Toulouse 2 – Jean Jaurès (Toulouse, France). She holds a PhD in English linguistics on the topic of automatic grammar checking. Her research interests include the interface between syntax and lexical semantics in English, the definition and processing of errors produced by English learners, and linguistics-driven NLP. She can be reached at mgarnier@univ-tlse2.fr.

Patrick Saint-Dizier

Patrick Saint-Dizier, PhD, is a senior researcher in Computational Linguistics and Artificial Intelligence at CNRS, IRIT, Toulouse, France. He specializes in discourse and semantic analysis. He has developed several national and European projects dedicated to logic programming, argumentation and technical text analysis. He is the author of several conference and journal articles and of 11 books. Besides foundational research, he has extensive experience of research and development activities. Contact: stdizier@irit.fr