Evaluation Methodology for a Clinically-Oriented Rehabilitation Support System
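This post covers three usability evaluation methods: the System Usability Scale (SUS), the NASA Task Load Index (NASA-TLX), and the Cognitive Walkthrough. Each section includes in-text citations, an example evaluation design, and an explanation of how scoring works, followed by a bibliography.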
🧪 Usability Evaluation Methods
1. System Usability Scale (SUS)
The System Usability Scale is a standardized,
ten-item questionnaire developed to evaluate the overall usability of a system
(Brooke, 1996). Participants respond using a 5-point Likert scale from
"Strongly Disagree" (1) to "Strongly Agree" (5). SUS is
popular for its simplicity and effectiveness in providing a reliable usability
score.
Example SUS Items (partial list):
- I think that I would like to use this system frequently.
- I found the system unnecessarily complex.
- I thought the system was easy to use.
(Items alternate between positive and negative to reduce bias.)
👉 Example Evaluation Design:
- Participants: 10 clinicians and 5 novice users
- Task: Use a dashboard to review a rehabilitation session
- After task: Fill in the SUS questionnaire
🎯 SUS Scoring:
- For odd-numbered items (positive statements): score = scale position - 1
- For even-numbered items (negative statements): score = 5 - scale position
- Each adjusted score therefore ranges from 0 to 4
- Sum all 10 adjusted scores and multiply by 2.5 to get a score out of 100
Example:
| Item | Response (1–5) | Adjusted Score |
| 1 | 4 | 3 |
| 2 | 2 | 3 |
| 3 | 4 | 3 |
| 4 | 2 | 3 |
| ... | ... | ... |
Total raw score (sum of all 10 adjusted scores) = 30 → SUS = 30 × 2.5 = 75
Interpretation: 75 = "Good Usability" (scores above 68 are considered above average).
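The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not a validated scoring tool: the function name `sus_score` is an assumption, and the example response set is hypothetical, chosen only because it reproduces the raw total of 30 from the worked example.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten responses (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    raw_total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered item (positive statement): scale position - 1
            raw_total += response - 1
        else:
            # Even-numbered item (negative statement): 5 - scale position
            raw_total += 5 - response
    return raw_total * 2.5


# Hypothetical response set (not taken from the table above, which is partial).
example_responses = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]
print(sus_score(example_responses))  # 75.0
```

To report a study-level result, one option is to compute the score per participant and then average the scores across participants.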
2. NASA Task Load Index (NASA-TLX)
NASA-TLX measures the perceived workload on six
subscales: Mental Demand, Physical Demand, Temporal Demand,
Performance, Effort, and Frustration (Hart &
Staveland, 1988).
👉 Example Evaluation Design:
- Task: Complete a 10-minute exercise tracking task using the system
- After task: Participants rate each dimension on a scale of 0–100
- Optional: Weight each subscale via pairwise comparisons
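🎯 NASA-TLX Scoring:
- Raw TLX: Average of the six ratings
- Weighted TLX (optional): Each subscale's weight is the number of times it was chosen across the 15 pairwise comparisons, so the weights sum to 15. Multiply each rating by its weight, sum the products, and divide by 15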
Example Raw TLX Ratings:
| Dimension | Rating (0–100) |
| Mental Demand | 50 |
| Physical Demand | 20 |
| Temporal Demand | 40 |
| Performance | 70 |
| Effort | 60 |
| Frustration | 30 |
Raw TLX = (50 + 20 + 40 + 70 + 60 + 30) / 6 = 45
Interpretation: Lower scores indicate lower perceived
workload.
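Both scoring variants can be sketched in Python as follows. The ratings are the ones from the example table. The weights are hypothetical tallies invented for illustration (they are not from the example above); the only constraint assumed is that they sum to 15, the number of pairwise comparisons among six subscales. The function names are also assumptions.

```python
SUBSCALES = (
    "Mental Demand",
    "Physical Demand",
    "Temporal Demand",
    "Performance",
    "Effort",
    "Frustration",
)


def raw_tlx(ratings):
    """Raw TLX: unweighted mean of the six subscale ratings (0-100 each)."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings, weights):
    """Weighted TLX: sum of rating * weight, divided by 15 (total comparisons)."""
    total_weight = sum(weights[s] for s in SUBSCALES)
    if total_weight != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / total_weight


# Ratings from the example table.
ratings = {
    "Mental Demand": 50,
    "Physical Demand": 20,
    "Temporal Demand": 40,
    "Performance": 70,
    "Effort": 60,
    "Frustration": 30,
}

# Hypothetical weights: how often each subscale "won" a pairwise comparison.
weights = {
    "Mental Demand": 4,
    "Physical Demand": 1,
    "Temporal Demand": 2,
    "Performance": 3,
    "Effort": 4,
    "Frustration": 1,
}

print(raw_tlx(ratings))                # 45.0
print(weighted_tlx(ratings, weights))  # 52.0
```

With these illustrative weights the weighted score is higher than the raw score because the participant is assumed to consider the highly rated subscales (Effort, Performance, Mental Demand) the most important contributors to workload.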
3. Cognitive Walkthrough (CW)
The Cognitive Walkthrough is a usability inspection
method focusing on learnability, especially for new users (Wharton et
al., 1994). Experts simulate a new user performing tasks and ask: “Will the
user know what to do next?”
👉 Example Evaluation Design:
- Define user goal: “View patient’s last session performance.”
- Task steps: Navigate to dashboard → Click on 'History' → Select patient → View report
- Analysts ask at each step:
  - Will the user try to achieve the right effect?
  - Will the user notice that the correct action is available?
  - Will the user associate the correct action with the effect?
  - If the correct action is performed, will the user see progress?
🎯 CW Output:
- Qualitative feedback per step
- Log errors or confusing labels
- Suggest UI improvements
Example Insight: “Label ‘Review’ was misunderstood as ‘Edit’
by all novice users.”
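One way to keep walkthrough notes consistent across analysts is to record the four questions per task step in a small structure. The sketch below is only an illustration: the class `StepRecord`, the function `problem_steps`, and all yes/no answers and notes are hypothetical placeholders, not findings from a real walkthrough. The task steps are the ones from the example design.

```python
from dataclasses import dataclass

QUESTIONS = (
    "Will the user try to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the effect?",
    "If the correct action is performed, will the user see progress?",
)


@dataclass
class StepRecord:
    action: str
    answers: tuple  # four booleans, one per entry in QUESTIONS
    notes: str = ""

    def failed_questions(self):
        """Return the questions the analysts answered 'no' for this step."""
        return [q for q, ok in zip(QUESTIONS, self.answers) if not ok]


def problem_steps(records):
    """Return (action, failed questions, notes) for every step with a 'no'."""
    return [
        (r.action, r.failed_questions(), r.notes)
        for r in records
        if not all(r.answers)
    ]


# Placeholder answers for the example task; replace with the analysts' own.
records = [
    StepRecord("Navigate to dashboard", (True, True, True, True)),
    StepRecord("Click on 'History'", (True, True, False, True),
               notes="placeholder note: label may not suggest past sessions"),
    StepRecord("Select patient", (True, True, True, True)),
    StepRecord("View report", (True, True, True, True)),
]

for action, failed, notes in problem_steps(records):
    print(action, failed, notes)
```

Steps returned by `problem_steps` are the natural starting point for the per-step qualitative feedback and UI suggestions listed under CW Output.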
🧾 Bibliography
- Brooke, J. (1996). SUS: A "quick and dirty" usability scale. In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & I. L. McClelland (Eds.), Usability Evaluation in Industry (pp. 189–194). London: Taylor & Francis.
- Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in Psychology (Vol. 52, pp. 139–183). North-Holland.
- Wharton, C., Rieman, J., Lewis, C., & Polson, P. (1994). The cognitive walkthrough method: A practitioner's guide. In J. Nielsen & R. L. Mack (Eds.), Usability Inspection Methods (pp. 105–140). New York: John Wiley & Sons.

