In this section we present the results of the user evaluation and the benchmark tests carried out.
Usability and User Satisfaction Evaluation
We evaluated the mobile application through interviews and user studies to understand how a patient uses it to monitor biometric parameters such as blood pressure. We also checked whether the visualization mechanisms and user interfaces are suitable for the end user. Twenty-three users (fourteen men, nine women) participated in the experiment over a period of four days. The experiments were incorporated into their daily activities to simulate realistic situations. Each user tested our prototypes for 40 minutes per day on average. The population included ten retired users and thirteen active users with different professional profiles, all between the ages of 35 and 72. This evaluation (see Figure 5) focuses on user experience. The evaluated items are a subset of the MoBiS-Q questionnaire, and we applied a Likert-type scale to rate each item, where 5 is the highest rating, meaning “fully satisfactory”, and 1 is the lowest rating, meaning “not satisfactory at all”.
Figure 5. Summarized results on the use of the applications for monitoring patients with CVD. Regarding ease of use, patients feel that the application is easy to use because the interaction is very simple and short and the mobile application responds mostly automatically.
First, the patient measures his/her blood pressure. Once the blood pressure is read, the pressure meter displays the patient's levels and sends these values to the mobile device. Then, the patient can graphically visualize the trend of the measurements over the last few days. The user can also enter and visualize information about their daily activities, such as diet, medicines and physical activities. The goals of this evaluation are to find out whether the patient accepts this solution and to evaluate the functionality of the mobile monitoring application. The evaluation was applied to patients with hypertension, where the mobile device is used to monitor their blood pressure levels in conjunction with annotations about their diet and activities. In general, the users gave high ratings to most of the items; specifically, the mode was 4 out of 5. Since the ratings come from a Likert-type scale, which is ordinal, we report mode values instead of average values. The mode values obtained indicate a high satisfaction level, especially for the first four items. Users gave lower ratings to issues related to the way information is entered, so this aspect could be improved. We plan to adapt our previous contributions on physical activity recognition through accelerometers  and touch interaction based on Near Field Communication  to improve and automate those aspects.
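As an illustration of why the mode is reported for Likert-type items, the following minimal sketch computes the mode of one questionnaire item. The ratings and the helper name `likert_modes` are hypothetical and are not the data collected in this study.

```python
from statistics import mean, multimode

# Hypothetical ratings for a single item (1 = not satisfactory at all,
# 5 = fully satisfactory). These values are NOT the study's data.
ratings = [4, 5, 4, 3, 4, 5, 2, 4, 4, 5, 3, 4]


def likert_modes(values):
    """Return every most frequent rating, sorted (ties give several modes)."""
    return sorted(multimode(values))


modes = likert_modes(ratings)
print("mode(s):", modes)
# The mean is printed only for contrast; it treats ordinal labels as numbers.
print("mean:", round(mean(ratings), 2))
```

The mode always corresponds to a rating that users actually selected, whereas the mean can fall between two labels of the scale.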
In general, this statistical analysis helped us to estimate the suitability of the application for the user. Nevertheless, a more thorough evaluation is needed, with more participants and more rigorous statistical techniques. We will explain these issues in more detail in the conclusions section.
In summary and as Figure 5 shows, the aspects evaluated were the following:
Usability of the application. Usability was evaluated with regard to the visual and functional aspects of the graphical user interface (see Figure 6). The visual aspect concerns how well patients can use the application through the interfaces developed, that is, how friendly the interfaces are for patients.
Assessment of the application in comparison with handwritten records of blood pressure levels and annotations of daily activities. We checked whether the application integrates the same functionalities as handwritten annotations and whether the mobile application could improve these tasks. Sixty-two percent of users indicated that, through the application, they had a better experience than with handwritten annotations.
Response time of the application. This aspect evaluates the time required to provide answers for medical surveillance after the patient's biometric levels are obtained. For this aspect, eighteen users gave a rating of good or very good (4–5), while the rest of the users rated it as moderate (3).
Measuring the Performance of CVD Risk Reasoning Services
An application should be intuitive and easy to use, and it should also offer good performance and a satisfactory response time. Poor performance degrades user engagement and the experience of an application. To test the behaviour of our system, we employed the JMeter tool and measured its performance under heavy load. This tool offers a straightforward method to simulate the load on a server by defining the number of requests, the ramp-up period and how many times the experiment must be repeated. The ramp-up period is the time JMeter takes to start all the simulated users, so it determines how long JMeter waits before starting each new user’s request. Due to the intrinsic nature of this system, we propose 100 physicians as the maximum number of concurrent users, since it is very uncommon to have more than 100 physician requests concurrently. Therefore, we tested the system with 10, 50 and 100 simulated requests. According to  the average face-to-face patient care time is 10.7 minutes, which corresponds to about 5.6 consultations per hour. This supports the expectation that fewer than 7 requests per physician need to be handled per hour.
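The workload estimate above can be reproduced with a short calculation. This is a minimal sketch that assumes one risk-assessment request per consultation, which is our reading of the text and not a measured figure; the function name `requests_per_hour` is illustrative.

```python
CONSULTATION_MINUTES = 10.7  # average face-to-face patient care time (from the text)
MAX_PHYSICIANS = 100         # proposed maximum number of concurrent users (from the text)


def requests_per_hour(consultation_minutes):
    """Requests one physician issues per hour.

    Assumption of this sketch: exactly one request per consultation.
    """
    return 60.0 / consultation_minutes


per_physician = requests_per_hour(CONSULTATION_MINUTES)
# Aggregate rate if all physicians consult back to back (an upper bound).
total_per_minute = per_physician * MAX_PHYSICIANS / 60.0

print(f"per physician: {per_physician:.2f} requests/hour")
print(f"{MAX_PHYSICIANS} physicians: {total_per_minute:.2f} requests/minute")
```

Under this assumption, each physician stays below 7 requests per hour, and even 100 physicians working simultaneously generate a sustained load of only a few requests per minute.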
Our test plan is composed of nine experiments, which can be divided into three groups depending on their ramp-up period:
0 seconds ramp-up period.
5 seconds ramp-up period.
10 seconds ramp-up period.
In each group we measure the server load with 10, 50 and 100 simulated users’ requests and repeat each experiment 50 times. We generated one thousand random requests with different users’ data, which were also randomly chosen from the JMeter test plan. The CVD risk web services were deployed on an Apache Tomcat 7.0.29 server running on an Acer Aspire 4830TG laptop with an Intel Core i5-2410M and 8 GB of RAM. The operating system is the 64-bit version of Ubuntu 11.10, a GNU/Linux distribution, running Java 1.6.0_26. The JMeter data collection was executed on a different laptop, and both computers were connected to the same local network. The mean response time is below 520 milliseconds for all the tests. The best response time for 50 and 100 users was obtained in the “5 seconds ramp-up period” group, and the best response time for 10 users was obtained in the “0 seconds ramp-up period” group. Figure 8 shows that the server can handle more than 200 requests per minute, reaching almost 300 requests per minute.
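The structure of the nine experiments can be sketched as follows. The code only enumerates the combinations described above; it does not drive JMeter, and the dictionary keys and the function name `build_test_plan` are illustrative. The start interval relies on JMeter spreading user starts evenly over the ramp-up period.

```python
from itertools import product

RAMP_UP_SECONDS = [0, 5, 10]  # the three groups of the test plan
USER_COUNTS = [10, 50, 100]   # simulated users' requests in each group
REPETITIONS = 50              # each experiment is repeated 50 times


def build_test_plan():
    """Enumerate the nine experiments (3 ramp-up periods x 3 user counts)."""
    plan = []
    for ramp_up, users in product(RAMP_UP_SECONDS, USER_COUNTS):
        plan.append({
            "ramp_up_s": ramp_up,
            "users": users,
            "repetitions": REPETITIONS,
            # Delay between the starts of two consecutive simulated users.
            "start_interval_s": ramp_up / users,
        })
    return plan


plan = build_test_plan()
for experiment in plan:
    print(experiment)
```

For example, with a 5 second ramp-up and 10 users, a new user starts every 0.5 seconds, whereas a 0 second ramp-up starts all users at once.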
To summarize, we have shown that the total request and execution time is not very high, so the system could be suitably deployed in nursing homes, community health centres and hospitals. Moreover, using real server machines instead of a commodity laptop could improve the results and reduce the response time.