
19 December 2016

Measuring UX using your own tools

Hassan Montero, Carmel

Abstract: This article describes a method to quantify the User Experience by collecting data with custom tools. It shows which metrics can be used as part of a usability testing process and how the generated information can be collected and analysed using the capabilities of a specialised data analytics platform.

User Experience (UX) might be the most difficult phenomenon to define in the IT sector of the last twenty years. Nowadays we can even find literature that formulates ways to measure it. Although data doesn't have all the answers, UX metrics are probably the most objective way to know whether your product actually works for the people who are meant to use it.

The international standard ISO 9241-11 defines usability as the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.

Although there are no specific guidelines on how to measure effectiveness (being able to complete a task), efficiency (the amount of effort required to complete the task), and satisfaction (the degree to which the user was happy with his or her experience), it’s not very difficult to find some handy ways of collecting that data so we can have more than a guess. That’s why most user-testing tools contain some combination of completion rates, errors, task times, task-level satisfaction, test-level satisfaction, help access, and lists of usability problems (typically including frequency and severity).

Usability testing is not about testing users. We don't evaluate how good or bad they are at using software, but rather the quality of the product in terms of a satisfactory user experience. Usability testing also means UX research: as an iterative process, its outcomes help us build a better understanding of our users, who they are, and how they really interact with the product, regardless of our prior beliefs.

Based on the assumption that good or bad design can be measured, we should take into consideration that:

  1. There are many different techniques which provide qualitative and quantitative data.
  2. There are also different types of usability testing: with and without user involvement.

On the other hand, performing usability testing may pursue different objectives depending on the stage and needs of the product.

How can we quantify the User Experience? Metrics and methods

In order to have a wide perspective of the usability status of a digital product, it is possible to obtain information in three different ways:

  1. Based on what users do
  2. Based on what users say
  3. Based on what we (the testers) observe

These three viewpoints are equally important, and although not all of them may be strictly quantifiable, the aim is to analyse them as a combined indicator of the User Experience. The quantitative and qualitative data collected are used together to understand both the what and the why.

Measuring User Performance (what users do)

Performance metrics are calculated from specific user behaviours, scenarios and tasks, and are the best way to evaluate the effectiveness and efficiency of a product. We can also estimate the magnitude of usability issues with a confidence interval.

Performance metrics are among the most important indicators of overall usability, but it is important to understand their limitations: they tell us what is going wrong, but not why.

Metrics

Typical performance metrics include task success (completion rate), time on task, number of errors, and efficiency (the effort required per task).
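
As a sketch of how the first of these can be computed, the following JavaScript estimates a completion rate together with an adjusted-Wald (Agresti-Coull) confidence interval, a common recommendation for the small samples of usability tests; the function name is illustrative:

    // Completion rate with an adjusted-Wald (Agresti-Coull) confidence interval,
    // suited to the small sample sizes typical of usability tests.
    function completionRate(successes, attempts, z = 1.96) { // z = 1.96 for 95%
      const nAdj = attempts + z * z;                 // adjusted sample size
      const pAdj = (successes + (z * z) / 2) / nAdj; // adjusted proportion
      const margin = z * Math.sqrt(pAdj * (1 - pAdj) / nAdj);
      return {
        rate: successes / attempts,
        low: Math.max(0, pAdj - margin),
        high: Math.min(1, pAdj + margin)
      };
    }

    // e.g. 7 of 9 participants completed the task:
    console.log(completionRate(7, 9)); // rate ~0.78, 95% CI roughly [0.44, 0.95]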

Measuring Evaluated Issues (what we observe)

Usability issues identified during an expert review may seem purely qualitative. They typically include the identification and description of a problem the user experienced and, possibly, an assessment of its underlying cause.

Usability issues can also be identified during a user testing session and recorded as part of the task information. Observing users during an in-person lab study is particularly useful for this.

Another way of collecting usability issues is to map each one to a category and a severity level. As in expert reviews, once a usability issue is identified it can be organised according to a taxonomy.

The number of issues found, together with their level of criticality, is a good basis for an overall indicator of usability based on what we observe.
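
A minimal sketch of such an indicator; the severity weights below are an illustrative choice, not a standard:

    // Aggregate observed issues into a single indicator.
    // The severity weights are an illustrative choice, not a standard.
    const SEVERITY_WEIGHTS = { low: 1, medium: 3, high: 5, critical: 8 };

    function issueScore(issues) {
      // issues: [{ category: 'navigation', severity: 'high' }, ...]
      const byCategory = {};
      let total = 0;
      for (const issue of issues) {
        const weight = SEVERITY_WEIGHTS[issue.severity] || 0;
        total += weight;
        byCategory[issue.category] = (byCategory[issue.category] || 0) + weight;
      }
      return { total, byCategory, count: issues.length };
    }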

Measuring User Perceived Usability (what the user says)

Self-reported data will give us the most important information about users’ perception of the product and their interaction with it.

We can collect self-reported data during the user testing session as part of a think-aloud protocol while users are interacting with the product, or immediately after each task and at the end of the user testing session.

When data is collected after a task, users can share what they found most difficult. If we collect the same information before and after task execution, we can compare 'expectation' vs. 'experience'.
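
A tiny sketch of that comparison, assuming both questions are asked on the same 1-5 difficulty scale (the sample ratings are made up):

    // Compare expected vs. experienced difficulty per task (same rating scale).
    // A positive gap means the task was harder than participants expected.
    function expectationGap(before, after) {
      const avg = xs => xs.reduce((a, b) => a + b, 0) / xs.length;
      return avg(after) - avg(before);
    }

    // e.g. ratings where 1 = very easy, 5 = very difficult:
    console.log(expectationGap([2, 2, 3], [4, 3, 4])); // ~+1.33: harder than expected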

Finally, users can be asked to complete the System Usability Scale (SUS), a standard questionnaire that gives us a score for how they perceived the overall usability of the solution.
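
Scoring SUS is mechanical and easy to automate. A minimal sketch of the standard rule: odd items score the answer minus 1, even items score 5 minus the answer, and the sum is multiplied by 2.5 to give a 0-100 score.

    // Standard SUS scoring: 10 items answered on a 1-5 scale.
    function susScore(answers) {
      if (answers.length !== 10) throw new Error('SUS requires 10 answers');
      const sum = answers.reduce((acc, answer, i) =>
        acc + (i % 2 === 0 ? answer - 1 : 5 - answer), 0); // i even = odd item
      return sum * 2.5;
    }

    console.log(susScore([4, 2, 4, 1, 5, 2, 4, 2, 4, 2])); // 80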

Measuring perceived usability is a good strategy for understanding users' expectations and the potential satisfaction of our customers.

Collecting data customising existing tools

Knowing what we need to know is the first step to understanding user behaviour. The next is to make it real by actually obtaining the data.

In a previous article I encouraged designers to build their own tools, with “experimental hacking” being a good way of appreciating the value a designer or UX specialist with coding skills can bring.

Based on that approach, we can orchestrate different tools to obtain an aggregated collection of information that is easy to analyse. At a minimum, we need:

  1. A tool to manage test sessions
  2. A tool to store and analyse the data
  3. A tool to collect user interactions from the prototypes

Optional, but strongly recommended, is a tool to interact with the data in a user-friendly manner, transforming data into information and actionable knowledge.

In the following sections I’ll describe how to use Wekan, SurveyJS and Valo. Using them in tandem produces results that would traditionally require specific user testing tools.

Managing test sessions with Wekan & SurveyJS

Wekan is an open-source Trello-like Kanban tool that helps us manage user test cases. SurveyJS, likewise, is an open-source Javascript library that helps us build surveys, polls and questionnaires.

Using Wekan is pretty straightforward: it has boards, lists and cards, and each card can have members, labels and comments attached, with more to come in the future. What may appear to be a simple Kanban tool is customisable to your needs.

In order to transform Wekan into a usability testing tool, a convention is needed:

  1. A Board will be a Test Case (including the prototype version in the description).
  2. A List will be a Scenario (or a combination of Scenario+Prototype+Participant).
  3. A Card will be a Task to perform for that Scenario.
  4. A Member is either a Tester (board admin user) or a Participant (normal user).
  5. A Label is used both as:
    1. A flag: to start/stop the execution of a task.
    2. A tag: to mark the result of the execution (success, wrong, give up, moderator call, too long) and to record the level of simplicity perceived by the user (from 1 to 5).
  6. A Comment on a card is simply a tester annotation, and testers can follow their own convention.
Figure 1: Example of a scenario and a task detail.

Obviously, setting up the test plan in Wekan or creating a survey is not enough to collect the data: integration with an analytics tool like Valo is required.

From Wekan to Valo

Once a participant is assigned to a task, new labels and comments on a card result in a message sent to Valo.

The message contains the information associated with the Board, the List, the Card, and the Member, as well as data based on the Label or Comment convention. As a consequence, the payload sent to Valo is a meaningful set of data contextualised in a specific period of time.
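
A minimal sketch of this bridge, assuming a Wekan outgoing webhook pointed at a small Node.js (18+) service; the Valo ingestion URL, the stream path and the webhook field names are illustrative, not documented APIs:

    // Minimal Node.js bridge: receive a Wekan outgoing webhook and forward
    // the event to a Valo stream. URL, stream path and field names are
    // illustrative; check them against your Wekan and Valo versions.
    const express = require('express');
    const app = express();
    app.use(express.json());

    app.post('/wekan-hook', async (req, res) => {
      const e = req.body; // Wekan webhook payload
      const payload = {
        timestamp: new Date().toISOString(),
        board: e.boardId,       // Test Case
        list: e.listId,         // Scenario
        card: e.cardId,         // Task (used later as the 'watermark')
        contributor: e.userId,  // Participant or Tester
        action: e.description   // label/comment following our convention
      };
      await fetch('http://localhost:8888/streams/demo/usability/testcase', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload)
      });
      res.sendStatus(200);
    });

    app.listen(3000);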

Now, every time a testing session is executed, the data is visible in real time and recorded for post-session analysis.

From SurveyJS to Valo

The reason we need a survey tool is to ask participants a few questions about themselves before the testing session. This way, the analysis of the data can be segmented by user characteristics.

The model we use to collect this information is called a contributor in Valo, which is simply a source of information. A contributor has a metadata document that is used to create filters and groups (a.k.a. domains) based on the contributor information from the survey.
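
A sketch of how the pre-session survey can feed that model through SurveyJS's onComplete event; the questions and the registration endpoint are illustrative:

    // Pre-session survey with SurveyJS; on completion, register the participant
    // as a Valo contributor. Endpoint and question names are illustrative.
    const surveyJSON = {
      questions: [
        { type: 'text', name: 'alias', title: 'Choose an alias' },
        { type: 'dropdown', name: 'age', title: 'Age group',
          choices: ['18-25', '26-35', '36-50', '50+'] },
        { type: 'dropdown', name: 'expertise', title: 'Computer expertise',
          choices: ['low', 'medium', 'high'] }
      ]
    };

    const survey = new Survey.Model(surveyJSON);
    survey.onComplete.add(sender => {
      // sender.data holds the answers, e.g. { alias: 'p01', age: '26-35', ... }
      fetch('http://localhost:8888/contributors/usability-participant', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(sender.data)
      });
    });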

Even if you don’t chose Valo as the platform to perform the analytics, it is advisable to choose an engine that let you model users as the source of information.

Collecting user interactions

Collecting data from the execution of a test plan gives us the time on task, task success rate and error rate for a user during a test session. However, there are two further valuable metrics relating to the effort a user needs to perform a certain task: efficiency and learnability.

These metrics require a definition of effort. In the most basic form, effort can be measured by the number of interactions a user has to make. However, we should also consider cognitive load, which cannot be measured directly by looking at the interactions.
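
Under the interaction-count definition of effort, both metrics reduce to simple ratios. A sketch with hypothetical helpers, where optimalSteps is the minimum number of interactions we define per task when designing the test:

    // Effort as interaction count: efficiency and learnability sketches.
    function efficiency(userInteractions, optimalSteps) {
      return optimalSteps / userInteractions; // 1.0 = no wasted effort
    }

    // Learnability: relative reduction in effort between first and last trial.
    function learnability(interactionsPerTrial) {
      const first = interactionsPerTrial[0];
      const last = interactionsPerTrial[interactionsPerTrial.length - 1];
      return (first - last) / first; // 0.5 = effort halved with practice
    }

    console.log(efficiency(12, 6));             // 0.5
    console.log(learnability([20, 14, 11, 9])); // 0.55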

Either way, user effort can be collected from a product or a prototype through direct manipulation. For HTML-built software, it is certainly easy to include a Javascript library with event listeners that collect information about each interaction (e.g. clicks, key presses, and scrolling).

Once an event is triggered, the payload is built and sent to Valo immediately, so we can perform real-time analysis as well as historical queries.
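
A minimal sketch of such a library, assuming illustrative endpoint, participant and task identifiers:

    // Lightweight instrumentation for an HTML prototype: capture interactions
    // and send them to the 'User Interaction' stream. URL and IDs are illustrative.
    (function instrument(participantId, watermark) {
      function send(type, event) {
        const tag = event.target.tagName || 'document';
        const payload = {
          timestamp: new Date().toISOString(),
          contributor: participantId, // from the pre-session survey
          watermark,                  // current task (card) identifier
          type,                       // 'click', 'keydown', 'scroll'
          target: tag + (event.target.id ? '#' + event.target.id : '')
        };
        // sendBeacon does not block the UI and survives page unloads
        navigator.sendBeacon(
          'http://localhost:8888/streams/demo/usability/interaction',
          JSON.stringify(payload)
        );
      }
      ['click', 'keydown', 'scroll'].forEach(type =>
        document.addEventListener(type, e => send(type, e), true)
      );
    })('p01', 'card-42');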

Storing and analysing data with Valo

Valo is a platform for storing and analysing data coming from different sources (e.g. Wekan, SurveyJS, a third-party API) that allows us to perform real-time analytics (during test sessions) and historical analytics (for summative and formative assessment) in exactly the same way.

We use a data model in Valo based on two main streams:

  1. A 'Test Case' stream, fed from Wekan with the task-level events (labels and comments) described above.
  2. A 'User Interaction' stream, fed from the instrumented product or prototype.

To relate the two, the 'User Interaction' stream carries a 'watermark' field, also included in the 'Test Case' stream and defined by the task identifier (the card ID). This way we can map any interaction performed during the execution of a specific task to all the events - comments and annotations - related to that task.

Both streams also include a 'contributor' field containing the ID of the test participant, thus allowing us to associate the contributor information with each payload.
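
What the watermark buys us is a trivial join. A plain JavaScript sketch of that mapping, applied to events queried back from both streams:

    // Join the two streams by 'watermark' (the task/card ID) so every
    // interaction is grouped with the task events it belongs to.
    function joinByWatermark(testCaseEvents, interactionEvents) {
      const tasks = {};
      for (const e of testCaseEvents) {
        (tasks[e.watermark] = tasks[e.watermark] || { events: [], interactions: [] })
          .events.push(e);
      }
      for (const i of interactionEvents) {
        if (tasks[i.watermark]) tasks[i.watermark].interactions.push(i);
      }
      return tasks; // { 'card-42': { events: [...], interactions: [...] }, ... }
    }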

Conclusions

One of the main reasons for using a specialised tool like Valo for data analytics is to avoid oversimplifying the analysis that user behaviour requires. We need to move beyond counting items and plotting bar charts, and take advantage of machine learning algorithms and smart analytics to extract valuable insights from user-generated data.

Although it requires some effort from User Experience practitioners, building your own tools to perform user testing will not just save you some money; it will also give you another perspective on how to model and design user interactions and user needs. The way you analyse the data shapes the way you ask questions about your users and envision the experience of your product.

Bibliography

Albert, W.; Tullis, T. (2013). Measuring the User Experience, 2nd edition. Morgan Kaufmann.

Sauro, J.; Lewis, J.R. (2012). Quantifying the User Experience. Morgan Kaufmann.

Special thanks to the authors of Wekan, SurveyJS and Valo for making their software available and free to use for the creation of this article.


Carmel Hassan Montero holds a degree in Computer Engineering from the University of Granada. She started as a web developer and went on to specialise in Usability and Interaction Design. After five years in the healthcare IT industry, she now works as a Senior UX Specialist at ITRS, leading User Experience design for real-time information visualisation and data analysis products built on Valo.

More information: carmel.es

Recommended citation:

Hassan Montero, Carmel (2016). Measuring UX using your own tools. In: No Solo Usabilidad, nº 15, 2016. <nosolousabilidad.com>. ISSN 1886-8592
