Eduard C. Groen, Matthias Koch

How Requirements Engineering can benefit from crowds

Driving innovation with crowd-based techniques


Requirements are usually elicited, analyzed, and validated using techniques that are strongly based on the co-presence of the persons involved. These techniques can quickly become too costly and labor-intensive when used with larger populations. However, automation now enables us to acquire and interpret data from very large and heterogeneous groups of stakeholders, so-called “crowds”.

The combined set of techniques for analyzing crowd data through text and usage mining, stimulating the further generation of data through motivational techniques, and validating requirements through crowdsourcing has been coined “Crowd-based Requirements Engineering” (CrowdRE). Three scenarios show how the crowd’s knowledge is a valuable source for helping to define requirements.

1. Introduction

Requirements Engineering (RE) plays a pivotal role in the successive stages of a system’s development process, but should not be performed simply for the sake of producing many requirements. This creates an ongoing tension between attaining a sufficient level of depth and remaining at a sufficient level of abstraction. Requirements must be no less than complete but no more than necessary; detailed enough to be verifiable and realizable, but free from premature design decisions. Within this tension, the optimum for handling the trade-off efficiently lies somewhere between the “breadth” and “depth” of requirements (see Figure 1).

Figure 1. Illustration of the optimal balance between breadth and depth.

Strangely enough, we do not see this tension reflected in the way requirements elicitation and validation are performed. Traditionally, requirements are elicited through techniques that include interviews, workshops, and ethnography, and validated with stakeholders through inspections and walkthroughs. These techniques can explore the depth very successfully, but they become too costly or time-consuming when employed among larger numbers of stakeholders. As a result, in order to cover the pool of stakeholders over its entire breadth while remaining cost-efficient, small numbers of participants are carefully selected to represent certain stakeholder groups. In doing so, the breadth is covered only implicitly. This restrictive approach is understandable from a historical viewpoint. But now that automation allows us to elicit and validate requirements with a far broader stakeholder sample, the question arises why the RE community has so far not followed fields such as market research [1] in adopting these techniques to complement its traditional ones.

Techniques that can be of use for RE in this context include text and usage mining, as well as crowdsourcing and motivational instruments, including gamification. These primarily differ from more traditional RE techniques in that they do not require the co-presence of the persons involved, but can be performed remotely [2] and by many (potential) stakeholders at the same time. Using this sort of automation goes beyond the benefits RE already gets from tool support, such as qualitative analysis of interviews or requirements management. The additional set of technology-based techniques and activities makes it possible to access and exploit additional information sources, and prevents users’ knowledge and data on how they work with a system from remaining unused. This makes RE more scalable to settings with a large number of stakeholders (i.e., breadth), with the trade-off that these techniques are less suited for exploring requirements in detail (i.e., depth).

In the following section, we will describe the concept behind this set of techniques. Section 3 covers three scenarios in which the techniques are applied. In section 4, we take a more general look at the benefits we expect in practice when the proposed concept is employed. The article concludes with a brief summary of the state of the art and future research activities.

2. Concept

The potential of text and usage mining techniques and motivational instruments for RE has been taken up by the research community under names such as Data-driven RE [3], Feedback-based RE [4], Crowd-centric RE [5], and Crowd-based RE [6] [7], or CrowdRE. In essence, these approaches are all concerned with the same issue: obtaining and analyzing user feedback about a product from a so-called “crowd” through automation in order to derive validated user requirements (Figure 2). The main difference between the approaches is which of the challenges related to analyzing large groups of people they are primarily addressing.

Figure 2. Abstract overview of the CrowdRE process.

A crowd is, generally speaking, a heterogeneous group of (potential) stakeholders, large enough in size for group effects to occur when they interact [6]. What these people have in common is their interest in a particular product. On the one hand, they discuss this product and thereby influence one another’s opinions and decision-making. By doing so, they generate (natural language) text data, including reviews, reports, transcripts (of chat discussions or phone calls), emails, manually documented protocols, and documents. On the other hand, by using the product, they produce log data, including mouse clicks, duration counts, system log outputs (i.e., automatically generated protocols), and sensor data, which are collected automatically. These two types of data can be analyzed through text mining and usage mining [8], respectively. The outcomes from these mining activities allow deriving (not yet validated) requirements. Through crowdsourcing, where the crowd provides input upon request [9], the crowd can be involved in validation, prioritization, and planning activities [10]. Repeating the process after implementing the requirements in a product or a prototype (e.g., for A/B testing) allows for a loop, as the crowd responds to the changes verbally (through text) and behaviorally (by usage).

CrowdRE faces four main challenges (see Figure 2). The crowd needs to be mobilized to generate data beyond what is already available. This requires the use of motivational instruments such as incentives (e.g., through gamification), as well as community management. In addition, in order to guarantee their continuous involvement, the crowd should be informed about what was done with their input and should remain involved in the process through crowdsourcing. Especially with respect to natural language data, the feedback must be understood. When interpreting the data, the implications of culture, user feedback types (categorized by users’ willingness to share their knowledge and/or allow their behavior to be tracked), user feedback structure [11], and data protection and privacy laws need to be taken into account. Finally, the significance of the data must be established. In addition to determining the quantity, validity, and reliability of the data, this requires empirical methods and other data sources (e.g., comparing text and usage mining results). This becomes critical, for instance, when choices between two or more conflicting opinions need to be made.

The two main techniques for analyzing the crowd are text mining and usage mining (i.e., analyzing what they say and what they do, respectively). Written language is a central medium for storing, sharing, and communicating content, which can be analyzed with the help of text mining techniques to find predominantly conscious requirements (i.e., requirements that people can put into words), along with unconscious requirements derived from the patterns that these findings reveal. Text mining assumes the existence of text-based data, which are either imported directly or collected by a crawler that converts texts from other (online) sources and stores them in a database. The relevant portions of text are then identified and classified on a per-sentence level using language patterns, which specify a certain syntax or wording that a sentence must match. Grouped statements can provide information about “pseudo-requirements”, which need to be validated before they can be considered requirements. By aggregating the data, reports can be generated. Text mining is neither subject to the interpersonal effects of interviews nor limited to answers to the questions asked, as in questionnaires.
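
To illustrate the idea of per-sentence language patterns, the following minimal sketch (in Python, with made-up patterns and review text) labels sentences that look like feature requests, bug reports, or praise; a real CrowdRE tool would rely on a much richer, curated pattern catalogue and linguistic preprocessing.

```python
import re

# Illustrative language patterns; these are simplified examples, not a real catalogue.
PATTERNS = {
    "feature_request": re.compile(r"\b(please add|i wish|it would be (nice|great) if|should have)\b", re.I),
    "bug_report":      re.compile(r"\b(crash(es|ed)?|freezes?|does(n't| not) work|error)\b", re.I),
    "praise":          re.compile(r"\b(love|great|awesome|works (really )?well)\b", re.I),
}

def classify_sentences(review):
    """Split a review into sentences and label each one that matches a pattern."""
    sentences = re.split(r"(?<=[.!?])\s+", review.strip())
    labelled = []
    for sentence in sentences:
        for label, pattern in PATTERNS.items():
            if pattern.search(sentence):
                labelled.append((label, sentence))
                break  # first matching pattern wins in this simple sketch
    return labelled

review = ("Love the new map view. It would be great if I could export my routes. "
          "The app crashes whenever I rotate the phone.")
for label, sentence in classify_sentences(review):
    print(f"{label:16s} {sentence}")
```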

With a crowd, the quantity of text-based data can grow beyond what can be manually read and processed. For example, popular apps receive thousands of reviews in each app store every day, and these reviews contain valuable information that can help further the development of a product. Text mining includes selecting, categorizing, and combining these data using aspects such as sentiment, valence (positive or negative), content (e.g., feature request, bug report), and indications about quality aspects (i.e., non-functional requirements). Further analysis also allows identifying trends over time or determining changes in comments (e.g., to verify whether ideas suggested by users were successfully implemented and have led to higher satisfaction). Moreover, comments about competing products can be analyzed in order to understand what the crowd likes about them or whether their users provide innovative ideas. These analyses can be repeated to guide development prioritization and strategic decisions.
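
As a simple illustration of such aggregation, the sketch below (with fictitious, already-classified review data) counts feedback categories per month so that trends, such as a decline in bug reports after a fix has been shipped, become visible.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical, already-classified review snippets: (date, category, valence).
classified_reviews = [
    (date(2017, 3, 2),  "bug_report",      "negative"),
    (date(2017, 3, 15), "feature_request", "neutral"),
    (date(2017, 4, 1),  "bug_report",      "negative"),
    (date(2017, 4, 20), "praise",          "positive"),
    (date(2017, 4, 22), "praise",          "positive"),
]

# Count categories per month to make trends over time visible.
per_month = defaultdict(Counter)
for day, category, valence in classified_reviews:
    per_month[day.strftime("%Y-%m")][category] += 1

for month in sorted(per_month):
    print(month, dict(per_month[month]))
```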

In addition to text mining, usage mining techniques help measure and analyze user behavior to uncover various types of requirements, similar to ethnographic techniques but with many users simultaneously. Usage mining comprises the recording of behavioral and log data from users who have given their approval for these data to be collected. Descriptive statistics, correlations, and visualizations are used to uncover behavioral patterns, which must be interpreted with caution. Because the interrelationships between the data are not always clear, and because usage data are often unstructured, analyzing the data in non-relational databases (e.g., Big Data analysis) may be required to identify problems, future trends, and innovations in large quantities of data within an acceptable amount of time and thus stay ahead of the competition.
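
As an illustration of this kind of descriptive analysis, the sketch below computes feature usage frequencies and step durations from a fictitious event log; the event names and numbers are invented for the example.

```python
from collections import Counter
from statistics import mean, median

# Fictitious usage log entries: (user_id, feature, duration_in_seconds).
events = [
    ("u1", "search",   4.2), ("u1", "checkout", 95.0),
    ("u2", "search",   3.8), ("u2", "wishlist",  7.5),
    ("u3", "search",   5.1), ("u3", "checkout", 140.0),
]

# How often is each feature used?
usage_counts = Counter(feature for _, feature, _ in events)
print("Feature usage:", usage_counts.most_common())

# How long does a particular step take on average?
durations = [d for _, feature, d in events if feature == "checkout"]
print(f"Checkout duration: mean={mean(durations):.1f}s median={median(durations):.1f}s")
```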

Usage mining can reveal subconscious requirements (i.e., requirements that are so self-evident to users that they fail to specifically identify them). This way, the data of many users can be compared (breadth), though at the cost of not being able to ask users to provide a verbal report of their actions (depth). A thorough analysis of the patterns may also reveal unconscious requirements (i.e., requirements that people do not know they have), including innovative ideas that would otherwise have to be elicited through creativity techniques. Patterns that deviate from the intended use of the product reveal whether users have found a workaround, take an unusually long time for a particular step, or prematurely end an activity; i.e., patterns that point to new uses for the product, opportunities for optimization, or problems that require addressing. Finally, the analysis can help verify whether the requirements expressed in text mining fit the way the product is actually being used.
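
The following sketch illustrates, on fictitious session data, how such deviating patterns (unusually long step durations and prematurely ended activities) might be flagged; the session IDs, steps, and thresholds are assumptions made for the example.

```python
from statistics import mean, stdev

# Fictitious per-session durations (seconds) for a "configure product" step.
step_durations = {
    "s01": 35, "s02": 42, "s03": 38, "s04": 310, "s05": 40, "s06": 36,
}
# Fictitious event sequences per session; "confirm" marks a completed activity.
sessions = {
    "s01": ["open", "configure", "confirm"],
    "s02": ["open", "configure", "confirm"],
    "s04": ["open", "configure"],            # ended before confirming
}

# Flag sessions whose step duration is far above the norm (simple outlier rule).
threshold = mean(step_durations.values()) + 2 * stdev(step_durations.values())
slow = [s for s, d in step_durations.items() if d > threshold]
# Flag sessions that ended before the expected final step.
abandoned = [s for s, seq in sessions.items() if seq[-1] != "confirm"]

print("Unusually long 'configure' step:", slow)   # candidate usability problem
print("Prematurely ended sessions:", abandoned)   # candidate workaround or problem
```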

To attain an optimal balance between breadth and depth, the techniques that make use of the added benefits of a crowd should be used in parallel with established RE techniques. People who “stand out from the crowd”, e.g., by bringing in many creative new ideas, could be interviewed. Conversely, the outcomes of workshops or focus groups could be verified by comparing these results with information from the crowd. We will now illustrate this with the help of three scenarios: an automobile manufacturer, a product development company, and a subcontracting arrangement between two organizations.

3. Scenarios

Scenario 1: Automobile Manufacturer

The research project PRO-OPT [12] aims to optimize the vehicle production and maintenance process. This includes detecting systematic errors and their sources in the presales phase more quickly, as well as tapping the possible contributions of anyone operating, owning, or servicing a vehicle (i.e., the “crowd”) in the aftersales phase with regard to improving existing and future vehicles. Car manufacturers release similar models globally, but feedback about these vehicles often does not reach them. For example, repair shops collect vehicle log data through their diagnostic process, but these data remain in the shop. More complicated matters are discussed with the importer of the country in which the repair shop is located. Only in exceptionally difficult cases will the importer pass the matter on to the manufacturer. As a result, manufacturers often miss out on vital information on how to improve their vehicles, and structural issues are often identified by chance rather than through systematic analysis. For this reason, CrowdRE aims to combine such data, leading to improvements based on both log data and natural language data.

Systematic problems can be identified and repaired more effectively through CrowdRE. Finding the root cause based on the diagnostic log data no longer relies merely on the knowledge and experience of the mechanic. The correlations between intermittent problems and the parameters that reveal the special circumstances under which they occur are identified systematically by bringing together diagnostic data from repair shops, which is currently not yet being done in practice. Furthermore, a gamified diagnosis management system allows manufacturers and mechanics to exchange knowledge. Garage mechanics are able to provide possible solutions to specific Diagnostic Trouble Codes (DTCs) along with a description of solution attempts they tried out, but which failed. Engineers from the manufacturer can in turn provide specific maintenance advice for co-occurrences of DTCs with other parameters. This could also include instructions to watch out for certain symptoms and to proactively correct the problem in all affected vehicles.

For example, a mechanic describes how the fuel injector of a turbodiesel engine keeps getting clogged after a few months, even after the injector and the particulate filter have been completely replaced. He gets points for adding a unique case to the diagnosis management system, and other mechanics can benefit from this knowledge, so they won’t have to repeat all the diagnostic steps. However, more and more mechanics appear to encounter the same problem. Through usage mining of these cases, the manufacturer finds out that this is a complex issue that correlates with the vehicle’s use in terms of driving range and geographic location. The engine is unsuitable for countries where fuel blends with higher amounts of hard particulates are more common. Moreover, this problem systematically occurs first in vehicles that are driven over long distances, but eventually affects all vehicles equipped with this engine. This means that CrowdRE helped the manufacturer detect early on that a modification of all engines of this type is required.
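
The kind of systematic correlation described here can be sketched as a simple cross-tabulation of fictitious diagnosis records; the trouble code, regions, and trip lengths below are purely illustrative, and a real analysis would run over far larger data sets and apply proper statistical tests.

```python
from collections import Counter

# Fictitious diagnosis records: (trouble_code, region, average_trip_length_km).
records = [
    ("P0201", "South America", 420), ("P0201", "South America", 510),
    ("P0201", "Southern Europe", 380), ("P0300", "Northern Europe", 35),
    ("P0201", "South America", 460), ("P0300", "Southern Europe", 22),
]

# "P0201" stands in for a hypothetical injector-related trouble code.
clogged = [r for r in records if r[0] == "P0201"]
by_region = Counter(region for _, region, _ in clogged)
long_trip_share = sum(1 for _, _, km in clogged if km > 300) / len(clogged)

print("P0201 occurrences by region:", by_region.most_common())
print(f"Share of P0201 cases with long average trips: {long_trip_share:.0%}")
```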

User feedback can provide useful insights into the problems, wishes, and needs of car owners and operators within a short amount of time. This way, CrowdRE can satisfy the manufacturer’s interest in what people write about its vehicles and about comparable vehicle types from competitors (e.g., on car forums and in social media). These findings provide further support for the findings from the log data, but also show the degree to which user expectations were not met, were met, or were exceeded, and which innovative ideas some crowd members have for future iterations.

Continuing with the above example, as the manufacturer is going to modify the turbodiesel anyway, they also want to know what people say about this particular engine type. They find that it garners praise for its outstanding performance and reliability, but is criticized for its high fuel consumption and loudness. As a result, the planned modification of the engines currently in use will also include a fuel-saving mode that can be disabled when more power is required, and some simple measures are taken to reduce the noise. For a future iteration of the engine, CrowdRE will be used to specify requirements for making the engine more environmentally friendly and quieter.

Scenario 2: Product Development Company

When adding a new product to the portfolio of a company in a new domain, the initial requirements specification can be made more time- and cost-efficient by analyzing existing products. For example, a company that primarily produces navigational hardware devices and navigation apps would like to enter the international camera market with a compact camera that integrates navigation features. As they have no experience in this field, they need to perform market research to design a product that immediately meets the expectations of many potential buyers in their target audience. CrowdRE can help identify useful sources where the crowd of potential users of such a product shares what they find important and what they would like to have, e.g., product reviews of existing cameras, discussion forums, and social media.

Based on the outcomes of these analyses, reports are composed that support the early requirements specification. These requirements can then be validated through crowdsourcing with the previously identified stakeholders. After introducing the camera to the market, the company has an interest in knowing how the product is perceived and how it can be improved continuously in order to increase its market share. For this purpose, the feedback from crowd members must be continuously monitored. Though analyzing reviews and other text-based data remains important, CrowdRE now also includes usage mining to identify which functions are used often or rarely. In combination with review analyses, correlations can be established in order to detect the reasons for this behavior and focus future developments.
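
As an illustration of combining the two data sources, the sketch below (with fictitious per-feature numbers) relates usage frequency from usage mining to negative review mentions from text mining, so as to flag functions that are rarely used but often criticized; the feature names and threshold are assumptions made for the example.

```python
# Fictitious per-feature data: how often the feature is used (usage mining)
# and how often it is mentioned negatively in reviews (text mining).
usage_per_week = {"geotagging": 1200, "panorama": 90, "route_replay": 15}
negative_mentions = {"geotagging": 4, "panorama": 2, "route_replay": 35}

for feature in usage_per_week:
    # Ratio of complaints to actual use; a high ratio hints at a problem area.
    rate = negative_mentions.get(feature, 0) / usage_per_week[feature]
    flag = "  <- rarely used, often criticized" if rate > 0.1 else ""
    print(f"{feature:13s} usage={usage_per_week[feature]:5d} "
          f"complaints={negative_mentions.get(feature, 0):3d}{flag}")
```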

Scenario 3: Subcontracting

Subcontractors who provide design and manufacturing services for complex hardware components (e.g., jet engines) often have to provide an accurate offer at short notice. Such requests for proposals typically arrive together with extensive requirements documentation. Due to the complexity of the system and the required compliance with standards and legal requirements, these requirements documents usually encompass several thousand pages. Based on this documentation, the company needs to determine whether, at what price, and how fast it can construct this component. The estimates provided in the offer depend on many factors, including the design effort, the raw materials required, the programming of the machines, manual tasks, and the expected person-hours for each task, along with a particular margin for profit and unforeseen circumstances. As the request for proposals was likely also sent to competitors, the company needs to present an attractive but still realistic offer.

Identifying the relevant parts and extracting the important information from the documentation is a tedious task that is often performed manually and takes a group of people several days to complete. In this process, relevant requirements are selected and assigned to a person who can take decisions or verify the contents, and the outcomes are finally integrated. As these documents often contain requirements for which services have already been provided in the past or that address the same standards, review work is often unnecessarily repeated. Here, CrowdRE techniques are applied to identify the relevant requirements within the requirements documentation, compare them with a requirements backlog, and use reference words to assign the responsibility for making decisions to a specific person. This way, efforts and costs can automatically be derived based on data from past projects and validated by the appropriate responsible roles. To improve the analysis, the existing language rules are first fine-tuned and adapted to the specific characteristics of the documentation. For example, the potential customer may be using a specific type of sentence pattern that is not detected by the current language rules, or an aeronautics-specific ontology of keywords may be necessary. The text miner can then crawl the documentation and efficiently filter out irrelevant statements, focusing only on the most important information and vastly reducing the manual labor involved in preparing the tender.
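
To illustrate how such language rules and reference words might work together, the sketch below (with invented routing rules, trouble-free example sentences, and hypothetical responsible roles) filters requirement-like statements and assigns them to a role for assessment.

```python
import re

# Hypothetical routing rules: reference words that assign a requirement
# statement to the role responsible for assessing it.
ROUTING = {
    "materials engineer": re.compile(r"\b(titanium|alloy|coating|material)\b", re.I),
    "certification lead": re.compile(r"\b(EASA|FAA|certif\w+|standard)\b", re.I),
    "manufacturing lead": re.compile(r"\b(toleranc\w+|milling|machin\w+)\b", re.I),
}
# Simple cue for requirement-like sentences.
REQUIREMENT_CUE = re.compile(r"\b(shall|must|is required to)\b", re.I)

document_sentences = [
    "The turbine blade shall be machined from a titanium alloy.",
    "This chapter summarizes the history of the programme.",
    "The supplier must demonstrate compliance with the applicable EASA standard.",
]

for sentence in document_sentences:
    if not REQUIREMENT_CUE.search(sentence):
        continue  # filter out statements that are not requirements
    roles = [role for role, pattern in ROUTING.items() if pattern.search(sentence)]
    print(f"{', '.join(roles) or 'unassigned':40s} | {sentence}")
```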

4. In Practice

The above scenarios describe situations in which CrowdRE is already being used or will be used in the future. They show that CrowdRE can be employed very well in highly diverse situations. In situations not covered by these scenarios (e.g., large software systems [13]), CrowdRE is also expected to add value. At its core is always the intention to obtain, understand, or validate the needs and wishes of stakeholders: not from an in-depth perspective, but by obtaining a broad sample. Combined with traditional RE methods, an optimal balance can be achieved in an efficient and logical way.

The techniques that fall under the umbrella term of CrowdRE – including text mining, usage mining, crowdsourcing, and motivational mechanisms – provide an additional set of tools in the hands of a skilled requirements engineer. The research community is actively working on recommendations on how and when these tools should be used. However, to prove its full potential and value in real contexts, beyond being an idea that merely sounds sensible, CrowdRE needs to be employed in many more concrete business situations. The above scenarios describe applications of CrowdRE in the automotive, mobile software, and hardware development sectors, but we also see great opportunity in the (embedded) consumer electronics and lifestyle markets, as well as in the healthcare domain. Each of these domains will likely result in a different set of recommendations on how to optimally employ the CrowdRE techniques. In the future, these techniques will be just as indispensable for successful requirements elicitation and validation as the more traditional RE techniques are today.

CrowdRE opens the way to enormous pools of stakeholders, many of whom have already provided valuable information that is currently not being picked up by anyone. The first organization to show its stakeholders, both the majority and the minorities, that their voices are being heard has a lead over its competitors. The first organization to involve a person who shared their experience with the product (or that of a competitor) has not only won another customer, but also an evangelist among that person’s friends and family. The first organization to aggregate and cluster the vital information that can already be found everywhere online into ranked requirements will occupy the pole position, as it will know what most of the stakeholders want and may even identify a new business opportunity that was everyone’s blind spot until then. In that sense, CrowdRE places RE amidst market research, business intelligence, community management, and even marketing.

Broad audiences provide useful data as a crowd interacts, generates ideas, and discusses them. These data can help us understand how to provide the right services and products to the right people. However, simply aggregating the results and averaging them does not provide a useful or even desired answer, nor are absolute numbers sufficient for finding the requirements needed for a great product. The data require further analyses, e.g., by clustering, by correlating, and by taking demographic covariates into account. Moreover, CrowdRE can identify individuals among the stakeholders with unique ideas. These are candidates for traditional RE techniques such as interviews, if necessary in return for an incentive.

5. Conclusion

This article aimed to raise awareness of CrowdRE as a new and supplementary approach to traditional RE. Its benefit lies not in achieving depth (i.e., understanding the individual’s motives), as provided by traditional RE techniques, but in offering techniques for covering breadth (i.e., covering the total pool of stakeholders to a much higher degree) more explicitly than is done today, which will contribute to better and more efficient product development. This article showed how CrowdRE is capable of providing added value in three very different contexts, though the approach is still an important research topic in light of its current limitations and relative crudeness. Yet it won’t be long before these limitations have been overcome and the methodology has been sufficiently refined. What will be more challenging is how the results from text and usage mining can be integrated in a meaningful way, and whether the links to related fields, such as business intelligence, can be intensified to establish the significance of RE for such previously unrelated domains. In any case, the crowd is waiting for its feedback to be heard, and CrowdRE is the long overdue means for accomplishing this.

References



Eduard C. Groen

Eduard C. Groen is an engineering psychologist. His fascination with the rapid rise of man-machine interfaces and other changes that affect society inspires him to contribute technologies that make optimal use of the potential these developments bring. As Fraunhofer IESE operates at the intersection of science and industry, he is involved in a variety of projects, while leading the development of the “Crowd-based Requirements Engineering” approach.

Matthias Koch

Matthias Koch studied computer science with a focus on software engineering. Since 2012, he has been employed as an engineer at Fraunhofer IESE, where he mainly addresses the topics of requirements engineering and business analysis. In this context, he has been involved in research projects on the pre-project and requirements definition phases, as well as in case studies and product evaluations. In research as well as industry projects, he regularly acts as the person responsible for requirements, conducts workshops with customers, and provides consulting.