Evaluation Methodology for a Clinically-Oriented Rehabilitation Support System

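This section describes three usability evaluation methods: the System Usability Scale (SUS), the NASA Task Load Index (NASA-TLX), and the Cognitive Walkthrough. For each method it gives in-text citations, an example evaluation design, and the scoring procedure, followed by a bibliography.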



🧪 Usability Evaluation Methods

1. System Usability Scale (SUS)

The System Usability Scale is a standardized, ten-item questionnaire developed to evaluate the overall usability of a system (Brooke, 1996). Participants respond on a 5-point Likert scale from "Strongly Disagree" (1) to "Strongly Agree" (5). SUS is widely used because it is quick to administer and condenses the ten responses into a single usability score from 0 to 100.

Example SUS Items (partial list):

  1. I think that I would like to use this system frequently.
  2. I found the system unnecessarily complex.
  3. I thought the system was easy to use.

(Items alternate between positively and negatively worded statements to reduce response bias.)

👉 Example Evaluation Design:

  • Participants: 10 clinicians and 5 novice users
  • Task: Use a dashboard to review a rehabilitation session
  • After task: Complete the SUS questionnaire

🎯 SUS Scoring:

  • For odd-numbered items (positive statements): score = scale position - 1
  • For even-numbered items (negative statements): score = 5 - scale position
  • Sum all 10 adjusted scores, multiply by 2.5 → gives a score out of 100

Example:

Item | Response (1–5) | Adjusted Score
1 | 4 | 3
2 | 2 | 3
3 | 4 | 3
4 | 2 | 3
... | ... | ...

Total raw = 30 → SUS = 30 × 2.5 = 75

Interpretation: 75 = "Good Usability" (scores above 68 are considered above average).
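The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name sus_score is hypothetical, and the ten responses in the example are an assumed set chosen so that the raw total is 30, matching the worked example.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten responses (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    adjusted = [
        (r - 1) if item % 2 == 1 else (5 - r)  # odd items: r - 1, even items: 5 - r
        for item, r in enumerate(responses, start=1)
    ]
    return sum(adjusted) * 2.5


# Hypothetical responses (assumed for illustration): every adjusted score is 3.
example_responses = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example_responses))  # raw total 30 -> 75.0
```

Scoring each participant separately and then averaging keeps the per-participant scores available for comparing groups such as clinicians and novice users.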


2. NASA Task Load Index (NASA-TLX)

NASA-TLX measures the perceived workload on six subscales: Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, and Frustration (Hart & Staveland, 1988).

👉 Example Evaluation Design:

  • Task: Complete a 10-minute exercise tracking task using the system
  • After task: Participants rate each dimension on a scale of 0–100
  • Optional: Weighting each subscale via pairwise comparisons

🎯 NASA-TLX Scoring:

  • Raw TLX: Average of the six ratings
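  • Weighted TLX (optional): Multiply each rating by its weight (the number of times that subscale was chosen across the 15 pairwise comparisons, from 0 to 5), sum the products, and divide by 15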

Example Raw TLX Ratings:

Dimension | Rating (0–100)
Mental Demand | 50
Physical Demand | 20
Temporal Demand | 40
Performance | 70
Effort | 60
Frustration | 30

Raw TLX = (50 + 20 + 40 + 70 + 60 + 30) / 6 = 45

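Interpretation: Lower scores indicate lower perceived workload. Note that the Performance subscale runs from good to poor, so a higher Performance rating means the participant felt less successful.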
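Both TLX variants can be sketched in Python as follows. The ratings are the ones from the example above; the weights are hypothetical and only illustrate the weighted variant, under the assumption that each weight is the tally of how often a subscale was chosen in the 15 pairwise comparisons (so the weights sum to 15). The function names are likewise illustrative.

```python
SUBSCALES = [
    "Mental Demand", "Physical Demand", "Temporal Demand",
    "Performance", "Effort", "Frustration",
]


def raw_tlx(ratings):
    """Unweighted mean of the six subscale ratings (each 0-100)."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings, weights):
    """Weighted TLX: sum of rating * weight, divided by 15.

    Assumes weights are pairwise-comparison tallies (0-5 each, summing to 15).
    """
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("weights must sum to 15 (one point per pairwise comparison)")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Ratings from the example table above.
ratings = {
    "Mental Demand": 50, "Physical Demand": 20, "Temporal Demand": 40,
    "Performance": 70, "Effort": 60, "Frustration": 30,
}
# Hypothetical weights, assumed for illustration only.
weights = {
    "Mental Demand": 4, "Physical Demand": 0, "Temporal Demand": 2,
    "Performance": 3, "Effort": 5, "Frustration": 1,
}

print(raw_tlx(ratings))                 # 45.0
print(weighted_tlx(ratings, weights))   # 820 / 15, roughly 54.7
```

With these assumed weights the weighted score is higher than the raw score because the heavily weighted subscales (Effort, Mental Demand) also received high ratings.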


3. Cognitive Walkthrough (CW)

The Cognitive Walkthrough is a usability inspection method focusing on learnability, especially for new users (Wharton et al., 1994). Experts simulate a new user performing tasks and ask: “Will the user know what to do next?”

👉 Example Evaluation Design:

  • Define user goal: “View patient’s last session performance.”
  • Task steps: Navigate to dashboard → Click on 'History' → Select patient → View report
  • Analysts ask at each step:
    1. Will the user try to achieve the right effect?
    2. Will the user notice that the correct action is available?
    3. Will the user associate the correct action with the effect?
    4. If the correct action is performed, will the user see that progress is being made toward the goal?

🎯 CW Output:

  • Qualitative feedback per step
  • Log errors or confusing labels
  • Suggest UI improvements

Example Insight: “Analysts judged that novice users would likely misread the label ‘Review’ as ‘Edit’.”
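One way to keep walkthrough findings consistent across analysts is to record a yes/no answer to the four questions for every task step and flag any step with at least one "no". The sketch below assumes a simple in-memory record; the StepReview class, the problem_steps helper, and the answers and note in the example are all hypothetical.

```python
from dataclasses import dataclass

QUESTIONS = (
    "Will the user try to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the effect?",
    "If the correct action is performed, will the user see progress?",
)


@dataclass
class StepReview:
    action: str
    answers: tuple   # four booleans, one per question; True means "yes"
    notes: str = ""

    def failed_questions(self):
        """Return the questions the analysts answered with 'no'."""
        return [q for q, ok in zip(QUESTIONS, self.answers) if not ok]


def problem_steps(reviews):
    """Return the steps where at least one question was answered 'no'."""
    return [r for r in reviews if r.failed_questions()]


# Task steps from the example design; the answers and note are assumed.
reviews = [
    StepReview("Navigate to dashboard", (True, True, True, True)),
    StepReview("Click on 'History'", (True, True, False, True),
               notes="Label may not suggest past session performance."),
    StepReview("Select patient", (True, True, True, True)),
    StepReview("View report", (True, True, True, True)),
]

for review in problem_steps(reviews):
    print(review.action, "->", review.failed_questions(), "|", review.notes)
```

The flagged steps, together with their notes, map directly onto the per-step qualitative feedback and suggested UI improvements listed under CW Output.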


🧾 Bibliography

  • Brooke, J. (1996). SUS: A "quick and dirty" usability scale. In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & I. L. McClelland (Eds.), Usability Evaluation in Industry (pp. 189–194). London: Taylor & Francis.
  • Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human Mental Workload (Advances in Psychology, Vol. 52, pp. 139–183). Amsterdam: North-Holland.
  • Wharton, C., Rieman, J., Lewis, C., & Polson, P. (1994). The cognitive walkthrough method: A practitioner's guide. In J. Nielsen & R. L. Mack (Eds.), Usability Inspection Methods (pp. 105–140). New York: John Wiley & Sons.
