Case Studies

This page presents the three case studies which we used to evaluate our tool. The case studies utilise parametric discrete-time Markov chain models for:

Tele-Assistance service-based system

Online shopping web application

HTTP request handling

Foreign exchange system

Descriptions of each case study are provided below. Our analysis of the research papers published in leading software engineering venues between 2016-2020 indicates that the models used in these case studies are representative of typical software engineering projects.

Case Study 1: Tele-Assistance system

Description

The Tele-Assistance System (TAS), first introduced by Baresi et al. (2007), is an application from the healthcare domain, and aims to support a patient suffering from a chronic condition in their own home by using:

a set of vital-sign monitoring sensors mounted on a medical device worn by the patient;
remote assistance services offered by emergency, medical and pharmacy service providers.

Periodically, the patient's vital signs are measured by the wearable device, and a third-party medical analysis service is invoked to analyse them in conjunction with the patient's medical record. Depending on the results of this analysis, TAS may confirm that the patient is fine, may invoke a pharmacy service to request the delivery of different medication to the patient's home, or may invoke an alarm service. The invocation of the alarm service is also triggered when the patient presses a panic button on the wearable device, and results in a medical team being dispatched to provide emergency assistance to the patient.

We suppose that a team of software engineers wants to verify whether the third-party services they consider for the implementation of the TAS system satisfy - at confidence level ∝ - the nonfunctional requirements from the first two columns of Table 1, where bound₁, bound₂ and bound₃ are intentionally left unspecified.

**Table 1: Nonfunctional requirements for the TAS**
ID	Requirement	PCTL expression
R1	The probability that an alarm failure ever occurs during the lifetime of the TAS system shall be below bound₁.	P<bound₁[F alarmFail]
R2	The probability that the handling of a request by the TAS workflow ends with a service failure shall be below bound₂.	P<bound₃[¬done U serviceFail]
R3	The probability that an invocation of the medical analysis service is followed by an alarm failure shall be below bound₃.	P<bound₃[¬done U alarmFail{analysis}]

The probabilities of successful invocation for the three services (p_ma, p_ph and p_al) are unknown. Therefore, we need to test the system to collect observations about these parameters. However, testing could take a long time, e.g. instatiating each component and obtaining observations may be a non-trivial process, and engineers want to verify the requirements with the least amount of time and effort possible. VERACITY assists the engineer in deciding how many observations are required to complete the verification.

TAS models, properties, requirements, and experimental results files

**Table 2: Data files for the TAS**
File(s)	Description
DTMC model	DTMC model for TAS that can be analysed in PRISM
PCTL Properties	A PRISM properties file with the three properties for the nonfunctional requirements listed in Table 1
VERACITY model	DTMC model for TAS that can be analysed using VERACITY
TAS requirements	A file with the three nonfunctional requirements listed in Table 1 (note that the placeholders bound₁, bound₂ and bound₃ need to be replaced with concrete numerical values)
Files from the folder 'CaseStudies/results/CaseStudy1/' in the VERACITY archive	Raw experimental results from the experiments reported in the VERACITY research paper.

Analysis

A detailed analysis and discussion of the experimental results is available from our companion paper.

Case Study 2: Online shopping web application

Description

The system from the second case study is an online shopping web application adapted from Filieri et al. (2012). The shopping process implemented by this application is modelled by using a parametric Markov chain that comprises a combination of known and unknown transition probabilities. The known transition probabilities correspond to application components that have been in use for a long time, and for which the values of these probabilities can be determined from application logs. In contrast, the unknown transition probabilities correspond to new versions of several components that the online shopping company's developers have re-implemented and want to evaluate through A/B testing.

A/B testing is a method for testing a new online-application feature, or a new implementation of an existing feature. Frequently used by leading companies like Amazon, Facebook, Google and Microsoft, the method involves splitting the users of a web application into two sets, such that one set of users is given access to a version of the application that includes the new feature (or the new implementation of a feature), while the other set continues to use the standard version of the application. In this way, A/B testing allows companies to evaluate new features and components, and to decide whether they should be included in their online applications or not.

For this case study, we assume that the online shopping company wants to verify whether the nonfunctional requirements from Table 3 would be satisfied if several application components were to be replaced with new variants of those components. Furthermore, we assume that in order to limit the disruption of the user experience that may occur if these requirements are in fact violated, the company wants to perform this verification with as little A/B testing of each of four new component implementations as possible.

**Table 3: Nonfunctional requirements for the online shopping application**
ID	Requirement	PCTL expression
R1	The probability that customers complete the shopping process successfully shall be above bound₁	P>bound₁ [F success]
R2	The probability that the authentication component fails shall be below bound₂	P<bound₂ [F authFail]
R3	The average number of successful uses of new components per shopping session (counted by the Markov chain reward structure shown in the square annotations associated with states s₂, s₁₁, s₁₄ and s₁₆) shall exceed bound₃	R>bound₃ [F done]

Web application models, properties, requirements, and experimental results files

**Table 4: Data files for the online shopping application**
File(s)	Description
DTMC model	DTMC model for the online shopping application that can be analysed in PRISM
PCTL Properties	A PRISM properties file with the three properties for the nonfunctional requirements listed in Table(4)
VERACITY model	DTMC model for the online shopping application that can be analysed using VERACITY
Online shopping application requirements	A file with the three the nonfunctional requirements listed in Table(4)
Files from the folder 'CaseStudies/results/CaseStudy2/' in the VERACITY archive	Raw experimental results from the experiments reported in the VERACITY research paper.

Analysis

A detailed analysis and discussion of the experimental results is available from our companion paper.

Case Study 3: HTTP request handling

Description

The third system is a web application consisting of three servers for HTTP proxy, web and application adapted from Calinescu et al. (2015). To perform a client request, the application accesses structured data stored in a database, and static content held on a file server. The accessed data and content can be cached by cache servers. The process of handling an HTTP request for this application is modelled using parametric DTMC. The model consists of ten states where each one indicates a stage of the process. The initial state s₁ indicates that the request is being handled. The shaded states are those that, once entered, could not be exited. These states are: s₇ indicates that the proccess is failed due to unavailability of the server, s₈ indicates that the process completed successfully, and s₉ indicates that the overloading of the server.

Assuming that some of the servers are replaced with new ones. Engineers of the system want to verify if the nonfunctional requirements from Table 5 will be satisfied with the least amount of effort and time possible. Since the probabilities of invocating these servers are unknown, they need to test the system to collect the required observations. However, testing could take a significant amount of time and efforts. VERACITY facilitates the engineers in determining the number of observations required to complete the verification.

**Table 5: Nonfunctional requirements for the HTTP request handling**
ID	Requirement	PCTL expression
R1	The probability that handling a client request without reaching to file server or database is above bound₁	P>bound₁ [ F !(s=4 \| s=6) U s=8 ]
R2	The probability that the request is handling successfully shall be above bound₂	P>bound₂ [ F s=8 ]
R3	The average cost of handling a request shall below bound₃	R{"cost"}<bound₃ [ F (s=7 \| s=8 \| s=9) ]
R4	The average time of handling a request shall below bound₄	R{"time"}<bound₄ [ F (s=7 \| s=8 \| s=9) ]

Web application models, properties, requirements, and experimental results files

**Table 6: Data files for the HTTP request handling**
File(s)	Description
DTMC model	DTMC model for the HTTP request handling that can be analysed in PRISM
PCTL Properties	A PRISM properties file with the three properties for the nonfunctional requirements listed in Table(5)
VERACITY model	DTMC model for the HTTP request handling that can be analysed using VERACITY
HTTP request handling requirements	A file with the three the nonfunctional requirements listed in Table(5)
Files from the folder 'CaseStudies/results/CaseStudy3/' in the VERACITY archive	Raw experimental results from the experiments reported below.

Analysis

The box plots below show the additional testing budget required to complete the verification of the nonfunctional requirements for the HTTP request handling when the round budget is partitioned using the uniform method instead of the VERACITY method, and assuming a scenario in with the same component testing cost for each component (left) and a scenario with different component testing costs (right). A set of 30 experiments were performed for both scenarios at 90%, 95% and 99% confidence level.

Case Study 4: Foreign-exchange (FX) trading system

Description

The system from the fourth case study is a foreign exchange system (FX) taken from Calinescu et al. (2021). This system assists the customers of a foreign exchange company by carrying out currency exchange transactions. FX has two operation modes: expert and normal. In the expert mode, the system analyses the market activities, defines the patterns that meet the customer's goals, and executes the trade automatically. As shown in the figure on the right, the expert mode monitors the market using a "Market watch" operation to obtain the exchange rate of the chosen currency pair in real-time. The outcome of this operation is used by a "Technical analysis" operation to assess the market's current condition, anticipate the price movement, and decide whether the customer's objectives are satisfied or not. If the objectives are satisfied, FX performs an "Order" operation to carry out the trade. Otherwise (i.e., if the objectives are unsatisfied), the system either returns to the "Market watch" or triggers an "Alarm" to notify the customer that the objectives are not achieved. On the other hand, the normal mode utilises a "Fundamental analysis" operation to assess the market and decide whether (i) to perform the "Order" operation to complete the trade and then inform the customer by executing a "Notification" operation or (ii) to end the session.

We considered an instance of the FX system in which each of the six operations is performed by invoking a set of three functionally equivalent third-party services organised according to the 'SEQ-R1' pattern from Calinescu et al. (2021). According to this pattern, the execution of the i-th operation involves invoking the services svc_i1, svc_i2 and svc_i3 that can provide this operation as follows. First, svc_i1 is invoked and its invocation succeeds with unknown probability pi1; if the invocation fails, then the invocation of svc_i1 is retried with a predetermined probability ri1, or the invocation of svc_i2 is attempted with probability 1-ri1. Similarly, the invocations of svc_i2 succeed with unknown probability pi2, and fail with probability 1-pi2; in the latter case, svc_i2 is retried with predetermined probability ri2, or abandoned in favour of invoking svc_i3, which executes successfully with probability pi3. The parametric discrete-time Markov chain model of the FX system has 18 unknown parameters (i.e., p11, p12, p13, ..., p61, p62 and p63) and 41 states, and its structure can be examined here.

Suppose that the engineering team of the FX system use VERACITY to verify whether the third-party services meet the nonfunctional requirements from Table 6 (NB: this case study was carried out using a VERACITY version - available here - in which we replaced PRISM with the parametric model checker ePMC, because PRISM runs out of memory when analysing the parametric FX model).

**Table 6: Nonfunctional requirements for the FX system**
ID	Requirement	PCTL expression
R1	The probability that the transaction executed successfully is above bound₁	P>bound₁ [F(state=WF_SUCC)]
R2	The expected response time for execution shall be below bound₂	R{"time"}<bound₂ [F((state=WF_SUCC)\|(state=WF_FAIL))]
R3	The expected cost of the third party service used by the execution shall be below bound₃	R{"cost"}<bound₃ [F((state=WF_SUCC)\|(state=WF_FAIL))]

FX system models, properties, requirements, and experimental results files

**Table 7: Data files for the FX system**
File(s)	Description
DTMC model	DTMC model of the FX system that can be analysed in PRISM
PCTL Properties	A PRISM properties file with the three properties for the nonfunctional requirements listed in Table 6
VERACITY model	DTMC model for the FX system that can be analysed using VERACITY
FX system requirements	A file with the three the nonfunctional requirements listed in Table 6
Results	Raw experimental results from the experiments reported below.

Analysis

The box plots below show the additional testing budget required to complete the verification of the nonfunctional requirements for the FX system when the round budget is partitioned using the uniform method instead of the VERACITY method, and assuming two scenarios: 1) a scenario with the same component testing cost for each component (left) and 2) a scenario with different component testing costs (right). A set of 60 experiments were carried out - 10 for each scenario and each confidence level in {90%, 95%, 99%}.

Quantitative Verification with Adaptive Uncertainty Reduction

Case Studies

Case Study 1: Tele-Assistance system

Description

TAS models, properties, requirements, and experimental results files

Analysis

Case Study 2: Online shopping web application

Description

Web application models, properties, requirements, and experimental results files

Analysis

Case Study 3: HTTP request handling

Description

Web application models, properties, requirements, and experimental results files

Analysis

Case Study 4: Foreign-exchange (FX) trading system

Description

FX system models, properties, requirements, and experimental results files

Analysis