Usability Testing

Contents

  1. Definition and Motivation
  2. First Usability Test
  3. Second Usability Test

References

Definition and Motivation

Definition

The term usability has a well-known definition from the International Organization for Standardization (ISO 9241-11), namely “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.” [1] Many other papers define usability in various ways, but most definitions center on a product’s ease of learning, usefulness, and effectiveness, and on the satisfaction of using it. [2]

A usability test is a method that evaluates the features of a product by giving users tasks related to those features, observing their behavior, and collecting their feedback. It is especially popular in the technology industry, where a user’s interactions with products such as mobile and web applications, devices, and interfaces are extremely important. A usability test is crucial for identifying and prioritizing the needs of the user during the product development and design phases.

The facilitator (also called a conductor or moderator), the participants, and the tasks make up the three main components of a usability test. The facilitator supervises the tasks, conducts the test, and observes the users’ behavior. The participants perform the tasks and give the facilitator feedback. Tasks are presented as scenarios, each of which depicts a sequence of events that users react to with actions when they engage with a system. A scenario should communicate the task to the participants accurately and without ambiguity, and the described events should be realistic. The scenario should give no direct clues or steps as to how to use the function being tested. Examples and more information on the tasks we used in our own tests can be found in the First Usability Test section. [2]

Motivation

There is a limit to how much members of the SmartUni team can evaluate SmartUni themselves. Team members know how to use the functions and understand the purpose and use of each component. Having been involved in the development of these functions, they cannot objectively assess their intuitiveness.

An objective review from users of the target group, i.e. students from the University of Osnabrück without prior exposure to SmartUni, was therefore needed to properly evaluate our site and applications. Generally, actively maintained web applications require regular testing. We conducted one usability test per semester, which, given that the project has been in development for over two semesters (about a year), resulted in a total of two tests.

First Usability Test

Objective

The first usability test took place after the initial deployment of SmartUni in February 2022. We were interested in how users interacted with a basic version of our application, whether the first components we designed following the style guide and our design philosophy were intuitive to use, and where users struggled to find their way around. Essentially, since we were still in the very early development phase and had not received any feedback on our web application from people outside the study project, we were interested in gathering as much feedback as possible from potential users of SmartUni.

Methods

Creating Scenarios. Before we created tasks for the participants to work on using SmartUni, we defined which functionalities should be tested in the first testing round. Each team leader made a list of functionalities that would be implemented before the first deployment of SmartUni. This information was used by the test conductors to create a list of to-be-tested features with respect to the specific teams:

  • Core: User registration and login, profile settings, and password change
  • SmartPlanner: Single- and multiday event creation
  • StudyBuddyMatch: Filling out a small questionnaire

During a usability test, tasks are presented to the participants in the form of scenarios, which should provide context and lead the participant to mimic realistic use cases of the to-be-tested functionalities [3]. We tried to motivate participants to interact with SmartUni by phrasing the scenarios so that they addressed the participant directly. Following this guideline, our scenarios contained wording such as

“You would like to…”

“You want to…”

We also followed the guideline to avoid giveaway wording. For example, to get users to click on the register menu item or the sign-up button, instead of

“register”, or “sign up”

we chose the wording:

“create a new account.”

Furthermore, we made sure that all scenarios could be fulfilled without prior experience with SmartUni.

We ended up with six scenarios to test the functionalities listed above. One example of a finalized scenario that was created to test the process of navigating to the SmartPlanner and creating a single-day event was:

“You have a 90-minute English exam on the 4th of May at 10am in the university building 01/E01. You have to bring a pen and a blank sheet of paper. You would like to note this exam in your SmartUni calendar.”

Choice of Testing Type. Due to the COVID-19 pandemic, we opted for a remote testing format in which experimenters and participants were spatially separated. This allowed participation from home and thus carried a smaller risk of infection compared to an on-site format, in which experimenter and participant would have been located in the same room.

We decided to include both synchronous and asynchronous remote testing for our first usability test because of their individual advantages: In synchronous testing, the conductor and participant communicate in real time, e.g. through an online meeting tool, so the participant has a direct contact person during the test in case of questions. Asynchronous testing, on the other hand, gives participants more temporal flexibility, since they can take part at a time of their choosing before a given deadline and provide the experimenters with data, e.g. a screen recording, afterwards.

Acquired Data. Since the goal of the usability test was to gather as much user feedback as possible, we chose the think-aloud technique for data collection. As the name suggests, the think-aloud technique asks the participant to vocalize their thoughts, allowing the test conductors to follow the participant’s train of thought and thereby understand why certain functionalities of a product are unintuitive to use. At the end of each test session, we asked the participants for general feedback on SmartUni on a functional as well as a design level.

Participants. In a usability test, participants should mimic real users of a product [3]. We therefore defined several requirements for our participants: Firstly, our participants had to be students at Osnabrück University, since this is the target group of SmartUni. Furthermore, we selected participants who had not interacted with SmartUni before. This ensured that all participants had the same level of expertise in using SmartUni and helped us receive feedback on the user-friendliness of the first interaction with the application.

Other requirements for our participants were defined with regard to the chosen remote testing type: Participants needed to be able to ensure a stable internet connection, have the option to record their screen and voice, ensure that they were able to carry out the test uninterrupted, and for the synchronous setting, also have a working webcam.

Since four to five participants reveal most usability problems in an application [3], we aimed for at least five participants in total. We ended up with seven participants, with four participants in the asynchronous and three in the synchronous setting.

Procedure. The participants were guided through the test via a Google Form. Participants first had to consent to the data collection and then received the test instructions. Afterwards, they were presented with the six scenarios, which they had to work on one after another. Lastly, participants were asked to leave feedback on SmartUni and the test procedure.

Using a Google Form ensured that the testing procedure for both the synchronous and asynchronous tests was the same, except for the presence of a moderator and observer in the synchronous setting via a video conferencing tool. The role of the moderator was to greet the participant, explain the test setup to them, and be the contact person for the participant during the test in case of questions. The role of the observer was to take notes on usability problems during the test, while remaining in the background and refraining from commenting during the test.

Prior to their test, all participants received an email with the link to the Google Form. For the synchronous participants, we suggested meeting slots in the following week, from which they were required to choose one. For the asynchronous participants, we requested that they participate within one week and upload their screen and voice recordings to a protected folder.

Pilot Study. Before conducting the usability test with the actual participants, we ran a pilot study with one test person in the synchronous setting. Firstly, the pilot study showed that the scenarios were well-defined, meaning that the participant was able to understand and work with them without asking clarification questions. Secondly, the pilot study helped the two test conductors to get acquainted with the roles of the moderator and observer during the usability test. We realized that the moderator sometimes had to remind the participant to think aloud, which is why we added this reminder as a written note beneath each scenario for the main study (Fig. 1).

Fig. 1: Reminders to think aloud were added to all scenarios in the Google Form.

Result Communication

The results from the first usability test were presented to the project members in a document containing screenshots with notes and explanatory sections. The detected usability flaws were extracted from the transcribed usability tests and the observer protocols from the synchronous tests. They were clustered into six categories based on the six tested views and added as short, informative notes on the screenshots, linked to the affected elements, to allow for an easy understanding of the detected usability flaws. Longer explanations of detected problems and general feedback on SmartUni and its design were provided in descriptions. We also decided to communicate not only usability flaws but also which design components were intuitive to use, to encourage the incorporation of those or similar elements in subsequent designs.

A selection of screenshots from the result documentation can be found in Figs. 2 - 4.

Fig. 2: Example screenshot from the result documentation. Multiple usability issues were found in the event creation pop-up. The notes that explain the respective usability problem were directly connected to the affected elements. To highlight positive user experiences, intuitive components were mentioned as well.

Fig. 3: The most severe usability problem we detected: Nearly all participants failed to enter their date of birth completely; one managed to do so only after multiple tries.

Fig. 4: Incorporation of general feedback: Where appropriate, feedback on the design of SmartUni (in this case on the color of the sidebar) was added to the screenshots as well.

Each team was advised to go over the result documentation together and find solutions to the detected usability problems relevant to their team. Finding solutions included redesigning and reimplementing the affected functionality (Figs. 5 - 6).

Fig. 5: Example of a SmartPlanner redesign initiated by a usability problem: Users did not expect their input to be deleted upon the closing and reopening of the event creation pop-up. The SmartPlanner team decided to include a confirmation pop-up, which opens up upon clicking the close button and informs the user that their changes will be discarded if they continue. They also have the option to remain on the event creation pop-up and finalize the event creation.
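The confirmation behavior described in Fig. 5 can be sketched roughly as follows. This is a minimal illustration only: the type and function names (EventDraft, onCloseButtonClicked, etc.) are made up for this example and are not taken from the actual SmartPlanner code.

    // Sketch of a confirm-before-discard guard for an event creation pop-up.
    // All names here are illustrative, not the SmartPlanner implementation.
    interface EventDraft {
      title: string;
      date: string;   // e.g. "2022-05-04"
      start: string;  // e.g. "10:00"
      end: string;    // e.g. "11:30"
    }

    function hasUnsavedChanges(draft: EventDraft): boolean {
      // Any non-empty field counts as unsaved input.
      return Object.values(draft).some((value) => String(value).trim() !== "");
    }

    function onCloseButtonClicked(draft: EventDraft, closeDialog: () => void): void {
      if (!hasUnsavedChanges(draft)) {
        closeDialog(); // nothing to lose, close immediately
        return;
      }
      // Warn the user before discarding their input.
      const discard = window.confirm(
        "Your changes will be discarded if you continue. Close anyway?"
      );
      if (discard) {
        closeDialog();
      }
      // Otherwise stay on the event creation pop-up so the event can be finalized.
    }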

Fig. 6: Example for a settings redesign after the usability test: Users are now provided with information on the correct formatting of their date of birth and also have the option to select a date via a calendar datepicker that opens up upon clicking on the calendar icon.
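Along the same lines, the date-of-birth fix in Fig. 6 boils down to exposing the expected format and offering a calendar picker. Below is a minimal sketch assuming a plain DOM setup; the IDs and label text are invented for illustration and do not reflect the actual Core implementation.

    // Sketch: a date-of-birth field with an explicit format hint and a native
    // calendar picker. IDs and label text are illustrative only.
    function buildDateOfBirthField(): HTMLDivElement {
      const wrapper = document.createElement("div");

      const label = document.createElement("label");
      label.htmlFor = "date-of-birth";
      label.textContent = "Date of birth (DD.MM.YYYY)"; // format hint shown to the user

      const input = document.createElement("input");
      input.id = "date-of-birth";
      input.type = "date"; // most browsers render a calendar picker for this input type
      input.required = true;

      wrapper.append(label, input);
      return wrapper;
    }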

Second Usability Test

Objective

Since the first usability test in February, considerable progress had been made on all modules. The second usability test was designed to deliver even more specific and practical directions to the development team. It was prepared in June 2022 and took place over one week in July, before all modules reached completion on the 15th of August.

Methods

The second usability test employed the general structure of the first. The theoretical and methodological basis was the same for both tests, but the content of the second test deviated slightly. See the section “First Usability Test - Methods” for a detailed description of the methods used in the initial test.

Creating Scenarios. Since the test took place in the final stage of development, each team chose two to four features based on how relevant they were for the completion of development. We followed the same scenario guidelines as in the first test. The scripts for the scenarios of each module were written by members of the respective module teams (Core, SmartPlanner, and StudyBuddyMatch), as the original creators naturally knew the most about their own functions and how to put them to the test. To obtain better results, the scenarios were revised several times.

The tested items were oriented around the following core questions:

  • whether the functions are intuitive to use
  • whether the user can navigate the functions without explicit directions
  • whether the user can complete the use of the functions successfully
  • how the user interacts with the functions

We selected nine scenarios to test the features listed below:

  • Core: User profile page, Feedback page

  • SmartPlanner: Creating events, Category filtering, Drag and drop function, Sidebar tasks

  • StudyBuddyMatch: Personal academic interest page, Sidebars, Partner preference page

Choice of Testing Type. As in the first test, we decided to include both synchronous and asynchronous methods. For the synchronous test, we considered that on-site testing was possible again, because most of campus life had returned to normal as COVID-19 cases decreased. However, the test was conducted at the end of the semester, so many students were not able to meet in person. Taking this into consideration, we decided to conduct all tests remotely, as before, to remain flexible. Remote testing also had the advantage that the participants’ words and behavior could be recorded, so that other team members could observe and analyze the results together.

Participants. Participants were chosen based on the specifications of the initial usability test. First, participants were required to be students of the University of Osnabrück. Second, they were required not to have previous experience with SmartUni. Requirements for remote testing, such as a stable internet connection and the ability to record, were checked as the individual sessions took place.

We recruited seven participants in total. One participant was chosen for a pilot run to evaluate the scenarios and improve the testing conditions. Three synchronous and three asynchronous participants resulted in a final sample of N = 6 for the second usability test.

Procedure. The general flow of the test procedure remained the same as for the first test: Participants received an e-mail with information guiding the test session, together with the Google Form containing the respective scenarios.

The nine scenarios participants had to complete were more complicated than those of the first test. We therefore decided to prepare seven virtual users with their own user IDs and passwords for the participants to use, since creating a new account takes time and some tasks required functions that need an already set-up user account. This was done to prevent extending the testing time unnecessarily. All virtual user accounts were equipped with identical profile information and an identical schedule in the calendar.
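To illustrate the kind of preparation this involved, the sketch below seeds such test accounts against a hypothetical registration endpoint. The URL, payload fields, and credentials are assumptions made for the example and are not part of the actual SmartUni backend.

    // Sketch: seeding virtual test users before a usability test.
    // The endpoint, payload shape, and credentials are hypothetical.
    const BASE_URL = "https://smartuni.example/api";

    async function seedTestUsers(count: number): Promise<void> {
      for (let i = 1; i <= count; i++) {
        const user = {
          username: `testuser${i}`,
          password: `usability-test-${i}`,
          // identical profile information for every virtual user
          firstName: "Test",
          lastName: `User ${i}`,
          dateOfBirth: "1998-01-01",
        };
        const response = await fetch(`${BASE_URL}/register/`, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(user),
        });
        if (!response.ok) {
          throw new Error(`Could not create ${user.username}: ${response.status}`);
        }
      }
    }

    seedTestUsers(7).catch(console.error);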

Pilot Study. The pilot study was conducted as one synchronous test, so we were able to get immediate feedback and improve the scenarios and Google Forms. From the pilot study, we found and were able to correct logical problems in the scenarios and errors in the server version of our web application:

First, we realized we needed to remind testers before the test to use the given user credentials, as they tended to forget about them and registered themselves again unnecessarily. We also added reminders to the Google Forms for the participants to “think aloud” and to “move along” even if problems arose during a task, to avoid prolonging the testing time.

Secondly, a server error occurred when completing Scenario 1 (modifying and saving the settings page). To keep to the schedule of the second usability test, we postponed fixing this bug and kept the scenario as it was, notifying participants that errors could occur on our side and that feedback about them would also be helpful.

Thirdly, in Scenarios 4 and 5, the schedule given in the scenario and the schedule set in each virtual user’s SmartPlanner calendar did not match. We changed this part of the scenarios instead of editing every virtual user’s settings.

Finally, it was discovered for Scenario 8 that the particular course intended to test the course submission function could not be found in the application database. Another course that could be submitted was used for this scenario instead, and the StudyBuddyMatch team was notified about the error with the original course.

Result Communication

Results were obtained by analyzing the participants’ videos and their Google Form answers from the testing sessions. By thoroughly watching the participants while they worked on the scenarios, it became evident whether users struggled with certain tasks or accomplished them easily. A document with all the comments about a specific participant’s test session was created in this process. In addition to noting down whether the scenario goal was met or the participant gave up beforehand, special attention was given to how intuitively the participants performed their tasks. From this, conclusions were drawn about the overall usability of the SmartUni website and merged into a result document combining all test users’ reactions.

The result document was made accessible to all team members of SmartUni. It is structured into separate sections corresponding to the modules, and each usability flaw is sorted into its corresponding module section. This way, every module team was able to easily get an overview of the flaws that concerned them and could discuss in their group how best to handle them. In the final remarks at the end, general advice concerning all teams was given. Apart from the general advice section, there was no significant difference in the process of obtaining and communicating results compared to the first usability test.

Core Results: For the Core team, the most concerning usability flaw was a server error that always occurred when trying to save the user profile. Besides that, the process of finding and editing Core-related pages was assessed. Most functions passed this test successfully, although the profile picture upload was still not intuitive enough and there was an issue with dropdowns repeating their selection options. Furthermore, the Safari browser heavily distorted the proportions of the SmartUni website elements.

Fig. 7: The Safari browser distorted website element proportions.

SmartPlanner Results: For SmartPlanner, it was discovered that the title section of the event creation pop-up needed to be redesigned so that all users recognize it as an editable field; otherwise, event creation fails, since events without a name cannot be saved. It also became evident that event times for multi-day events should be editable and displayed individually for each day. Users also frequently wished for task deadlines to be visually presented in the planner. By far the most complaints and difficulties resulted from users not understanding the different concepts of events versus tasks. It was determined that the different behavior of events and tasks and their editing process might need to be explained to first-time users through a help text describing the planner functions. This help text might also need to mention that drag and drop is a convenient way to handle tasks, since many users seemed unaware of this possibility.

Fig. 8: SmartPlanner's drag and drop functionality.

StudyBuddyMatch Results: For StudyBuddyMatch, we determined that it would be beneficial to explain how and on what basis a study buddy is chosen, so that users are not confused about the different questionnaires and questions they are asked before starting the matching process. On the other hand, adding courses and grades as well as deleting them was intuitive to users. The same applied to using the sidebar for navigation. We also determined that adding a field for finding a partner for courses that have not finished yet would be a nice addition to what is already provided.

Fig. 9: Image from report section recommending the addition of the ability to find study buddies for ongoing courses.

General Results: Overall, the results showed that every module would have profited from marking mandatory form fields visually (with a star or a similar symbol), since missing a mandatory question when filling out a form results in the deletion of any previous input when trying to save, which the now frustrated user then has to enter again. In addition, it would have been greatly appreciated by the users if a help text showed up when hovering over a symbol unknown to them on the site. Also, rethinking and redesigning the saving process would have prevented a lot of lost input: by the end of the usability test, it was clear that it was not quite intuitive how to save and whether something had been saved, since there were different buttons for saving and for advancing to the next page or section.
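As a rough illustration of the first point, the sketch below marks mandatory fields with a star and additionally blocks a save attempt while required fields are missing, so that already-entered input is not silently discarded. The selectors and the marker are illustrative assumptions, not the actual SmartUni code.

    // Sketch: marking mandatory form fields visually and guarding against an
    // incomplete save that would otherwise wipe the user's input.
    function markRequiredFields(form: HTMLFormElement): void {
      form.querySelectorAll<HTMLInputElement>("input[required]").forEach((input) => {
        const label = form.querySelector<HTMLLabelElement>(`label[for="${input.id}"]`);
        const text = label?.textContent ?? "";
        if (label && !text.endsWith("*")) {
          label.textContent = text + " *"; // visual marker for mandatory fields
        }
      });
    }

    function guardIncompleteSave(form: HTMLFormElement): void {
      form.addEventListener("submit", (event) => {
        if (!form.checkValidity()) {
          event.preventDefault();  // keep the entered data instead of discarding it
          form.reportValidity();   // point the user to the missing mandatory fields
        }
      });
    }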

After the results were obtained, the findings of the usability test were presented to the full SmartUni team at the next full-team meeting. Hints and ideas on how to solve the problems were briefly discussed, and the previously mentioned result document was made available to everyone. Which improvements could still be implemented by the deadline, and how best to proceed, was discussed and handled individually in each module team, just as with the first usability test.

References

[1] Y.-H. Chen, A. Rorissa, and C. A. Germain, “Usability Definitions in a Dynamically Changing Information Environment,” portal: Libraries and the Academy, vol. 15, no. 4, pp. 601–621, 2015.

[2] N. Ghasemifard, M. Shamsi, A. R. Rasouli Kenari, and V. Ahmadi, “A New View at Usability Test Methods of Interfaces for Human Computer Interaction,” Global Journal of Computer Science and Technology, 2015.

[3] J. Rubin and D. Chisnell, Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests, 2nd ed. Indianapolis, IN, USA: Wiley Publishing, 2008.