A complete set of Cursor Rules for planning, automating, and analysing usability tests for web-based design systems using TypeScript, Playwright, and modern UX analytics tooling.
You've built a beautiful design system. Your components are pixel-perfect, your tokens are consistent, and your documentation is comprehensive. But here's the hard truth: none of that matters if users can't actually use your components effectively.
Most teams discover usability issues when it's too late—after frustrated developers implement broken patterns, after users abandon tasks, after support tickets pile up. You're essentially shipping blindfolded, hoping your design decisions translate to real-world success.
Your design system faces three critical challenges that traditional testing approaches can't solve:
Component-Level Blindness: Unit tests verify components render correctly, but they can't tell you if users understand how to interact with your DatePicker or if your Button hierarchy creates confusion.
Integration Chaos: Components work perfectly in isolation but become unusable when combined. Your Modal + Form + Validation pattern might be technically sound but cognitively overwhelming.
Scale vs. Insight Trade-off: Manual usability testing gives you deep insights but only covers a fraction of your component library. Automated testing scales but misses the nuanced human factors that make or break user experience.
These Cursor Rules transform your design system development with automated usability testing that runs alongside your existing CI/CD pipeline. Instead of hoping your components are usable, you'll know with quantifiable metrics and actionable insights.
Automated Task-Based Testing: Convert real user workflows into Playwright scripts that measure task completion rates, time-on-task, and error patterns across your entire component library.
Embedded UX Analytics: Inject session recording, heatmaps, and accessibility scanning directly into your development workflow—not as an afterthought, but as a core quality gate.
Design System-Specific Metrics: Track component-level SUS scores, interaction success rates, and cognitive load patterns that generic testing tools completely miss.
```tsx
// Your current reality: technically correct but unusable
const SearchComponent = () => {
  return (
    <div className="search-container">
      <Input placeholder="Enter search terms" />
      <Button variant="primary">Search</Button>
      <Button variant="secondary">Advanced</Button>
    </div>
  );
};

// ✅ Unit tests pass
// ❓ No idea if users can actually search effectively
```
```ts
// Your new reality: provably usable components
test('Search interaction - primary user flow', async ({ page }) => {
  await measureTask('complete-search', async () => {
    await page.goto('/components/search');
    await page.fill('[data-qa=search-input]', 'design tokens');
    await page.click('[data-qa=search-button]');

    // Automated usability validation
    await expect(page).toHaveURL(/results/);
    await expectTaskSuccess();             // > 80% completion rate required
    await expectTimeOnTask({ max: 8000 }); // < 8 seconds typical
  });
});

// ✅ Unit tests pass
// ✅ 94% task completion rate
// ✅ 5.2s average time-on-task
// ✅ Zero accessibility violations
```
Catch Usability Regressions Before Deployment: Your CI pipeline now fails builds when task success rates drop below 80% or SUS scores fall under 68—preventing unusable components from reaching production.
Component-Level Usage Analytics: Know exactly which components cause user friction with metrics like per-component task success rates, time-on-task, error rates, and SUS scores.
Automated Accessibility Integration: Every component interaction automatically triggers axe-core scanning, ensuring WCAG compliance isn't an afterthought but a built-in quality gate.
Before: Build component → Write unit tests → Ship → Hope it works → Discover usability issues weeks later through support tickets.
After: Build component → Write unit tests → Run usability automation → Get task completion metrics → Fix issues before merge → Ship with confidence.
```ts
// Automated form usability validation
test('Registration form - complete signup flow', async ({ page }) => {
  await page.goto('/forms/registration');

  await measureTask('complete-signup', async () => {
    // Test realistic user interactions
    await page.fill('[data-qa=email]', 'user@example.com');
    await page.fill('[data-qa=password]', 'SecurePass123!');
    await page.click('[data-qa=submit]');

    // Measure what matters
    await expectNoFormErrors();
    await expectTaskCompletion();
  });

  // Results: 91% completion rate, 12s average time-on-task
});
```
Before: Update component library → Deploy to staging → Manually test a few scenarios → Cross fingers and deploy to production.
After: Update component library → Automated usability regression testing → Compare metrics against baseline → Deploy only if improvements or no degradation.
```ts
// Automated regression detection
test('Navigation menu - post-redesign validation', async ({ page }) => {
  const baseline = await getUsabilityBaseline('navigation-menu');

  await measureTask('find-account-settings', async () => {
    await page.click('[data-qa=user-menu]');
    await page.click('[data-qa=account-settings]');
  });

  const results = await getTaskMetrics();

  // Fail build if regression detected
  expect(results.completionRate).toBeGreaterThan(baseline.completionRate * 0.8);
  expect(results.timeOnTask).toBeLessThan(baseline.timeOnTask * 1.2);
});
```
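`getUsabilityBaseline` and `getTaskMetrics` are not Playwright built-ins. A minimal sketch of the baseline half, assuming a `baselines.json` file checked into the repo (the path and shape are illustrative; `getTaskMetrics` would read from the same store that `measureTask` writes to):

```ts
// Sketch: baseline lookup for regression gates. The file path and record
// shape are illustrative assumptions, not a published API.
import { readFile } from 'node:fs/promises';

export type UsabilityMetrics = { completionRate: number; timeOnTask: number };

export async function getUsabilityBaseline(taskId: string): Promise<UsabilityMetrics> {
  const raw = await readFile('tests/usability/baselines.json', 'utf8');
  const baselines = JSON.parse(raw) as Record<string, UsabilityMetrics>;
  const baseline = baselines[taskId];
  if (!baseline) throw new Error(`No usability baseline recorded for "${taskId}"`);
  return baseline;
}
```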
```bash
npm install -D @playwright/test axe-core @axe-core/playwright
mkdir -p tests/usability/{components,flows,_helpers}
```
Add to your playwright.config.ts:
```ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    { name: 'usability-desktop', use: { ...devices['Desktop Chrome'] } },
    { name: 'usability-mobile', use: { ...devices['iPhone 13'] } },
  ],
  reporter: [['html'], ['json', { outputFile: 'usability-results.json' }]],
});
```
```ts
// tests/usability/components/button.spec.ts
import { test, expect } from '@playwright/test';
import { measureTask, expectTaskSuccess } from '../_helpers';

test('Primary button - call-to-action flow', async ({ page }) => {
  await page.goto('/components/button');

  await measureTask('click-primary-action', async () => {
    await page.click('[data-qa=primary-button]');
    await expect(page).toHaveURL(/success/);
  });

  await expectTaskSuccess();
});
```
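`measureTask`, `expectTaskSuccess`, and `expectTimeOnTask` are project helpers rather than Playwright APIs. A minimal sketch of `tests/usability/_helpers.ts`, assuming a simple in-memory result store (the store and threshold semantics are illustrative, not a published API):

```ts
// tests/usability/_helpers.ts — minimal sketch; the in-memory store is an
// illustrative assumption.
import { test, expect } from '@playwright/test';

type TaskResult = { name: string; durationMs: number; succeeded: boolean };
let lastTask: TaskResult | undefined;

export async function measureTask(name: string, cb: () => Promise<void>) {
  const start = Date.now();
  let succeeded = false;
  try {
    await cb();
    succeeded = true;
  } finally {
    lastTask = { name, durationMs: Date.now() - start, succeeded };
    // Surface the measurement in the HTML/JSON reports.
    test.info().annotations.push({ type: 'task', description: JSON.stringify(lastTask) });
  }
}

export async function expectTaskSuccess() {
  expect(lastTask?.succeeded, `task "${lastTask?.name}" did not complete`).toBe(true);
}

export async function expectTimeOnTask({ max }: { max: number }) {
  expect(lastTask?.durationMs ?? Number.POSITIVE_INFINITY).toBeLessThanOrEqual(max);
}
```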
Update your existing components with usability tracking:
```tsx
// Your existing Button component, with a stable test hook added
import type { ButtonHTMLAttributes } from 'react';

type ButtonProps = ButtonHTMLAttributes<HTMLButtonElement> & {
  variant?: 'primary' | 'secondary';
};

export const Button = ({ children, onClick, variant = 'primary', ...props }: ButtonProps) => {
  return (
    <button
      data-qa={`${variant}-button`} // Add this line
      className={`btn btn-${variant}`}
      onClick={onClick}
      {...props}
    >
      {children}
    </button>
  );
};
```
```yaml
# .github/workflows/usability.yml
- name: Run Usability Tests
  run: npx playwright test tests/usability

- name: Check Usability Gates
  run: |
    # jq -e exits non-zero when the comparison is false (handles decimals too)
    if ! jq -e '.taskSuccess >= 80' usability-results.json > /dev/null; then
      echo "Task success rate below threshold"
      exit 1
    fi
```
Week 1: Establish baseline metrics across your component library
Month 1: Ship measurable improvements
Quarter 1: Transform your design system culture
Your design system will stop being a collection of pretty components and become a proven user experience platform with quantifiable usability built into every interaction.
The question isn't whether you need better usability testing—it's whether you're ready to ship components you know users can actually use successfully. These rules make that transformation automatic, measurable, and continuous.
You are an expert in Web UX research, TypeScript, Node.js, Playwright, React Testing Library, and modern analytics tooling (Maze, Hotjar, Lookback).
Key Principles
- Treat usability as a first-class quality attribute; embed tests in every PR.
- Combine moderated insight (manual) with unmoderated scale (automation + analytics).
- Tasks must be atomic, jargon-free, and mapped to a single UX goal.
- Recruit 5–7 representative participants per major persona; compensate fairly.
- Always record: (1) screen, (2) audio, (3) interaction events (DOM, clicks, scroll).
- Quantify with SUS (scoring sketch after this list), Time-on-Task, Error Rate, and Task-Success; qualify with verbatim quotes.
- Prototype early, test continuously; ship measurable improvements each sprint.
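Because the CI gate below fails builds on SUS < 68, scoring must be computed consistently. Here is the standard SUS calculation (Brooke, 1996) as a small TypeScript helper; the function name is ours:

```ts
// Standard SUS scoring: ten 1–5 Likert responses in questionnaire order.
// Odd-numbered items are positively worded, even-numbered items negatively
// worded; the result is 0–100, where >= 68 is considered average.
export function susScore(responses: number[]): number {
  if (responses.length !== 10) throw new Error('SUS requires exactly 10 responses');
  const adjusted = responses.map((r, i) => (i % 2 === 0 ? r - 1 : 5 - r));
  return adjusted.reduce((sum, v) => sum + v, 0) * 2.5;
}
```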
TypeScript
- Use strict mode and `"noImplicitAny"`; fully typed code means fewer runtime surprises.
- Prefer async/await over callbacks for Playwright & analytics SDKs.
- Naming: `actionSubject_expectedResult`, e.g. `clickSignup_showsWelcome`.
- Place usability scripts in `tests/usability/<feature-name>/`.
- Export one default function per test file: `export default async function run(page: Page) { … }`.
- Keep helpers in `tests/usability/_helpers.ts`. No circular imports.
Error Handling & Validation
- Use `expect.soft` for secondary checks to gather multi-assert reports; reserve hard `expect` for steps that must fail fast.
- Capture uncaught errors via `page.on('pageerror', handler)` and attach to test artefacts.
- Always time-box tasks (default 3 × expected happy-path duration) and report `timeoutMs`; see the sketch after this list.
- Store raw logs + HAR in `/artifacts/<build-id>/<test-id>/` for replay.
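A minimal sketch of the error-capture and time-boxing patterns; the helper names and artifact filename are illustrative assumptions:

```ts
// Sketch: capture uncaught in-page errors as test artifacts, and time-box a
// task at 3x its expected happy-path duration.
import { test, type Page } from '@playwright/test';

export function capturePageErrors(page: Page) {
  const errors: string[] = [];
  page.on('pageerror', (error) => errors.push(error.stack ?? String(error)));
  // Call the returned function at the end of the test to attach what was seen.
  return async () => {
    if (errors.length > 0) {
      await test.info().attach('page-errors.txt', {
        body: errors.join('\n\n'),
        contentType: 'text/plain',
      });
    }
  };
}

export async function timeBoxed(expectedMs: number, cb: () => Promise<void>) {
  const timeoutMs = expectedMs * 3;
  let timer: NodeJS.Timeout | undefined;
  await Promise.race([
    cb(),
    new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error(`task exceeded timeoutMs=${timeoutMs}`)), timeoutMs);
    }),
  ]).finally(() => clearTimeout(timer));
}
```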
Playwright Rules (Automation Framework)
- Use `@playwright/test` projects: one per viewport (`mobile`, `tablet`, `desktop`).
- Tag scenarios with `@ux-high` / `@ux-low` priority for pipeline gating.
- Wrap flows in `measureTask()` helper:
```ts
import { test } from '@playwright/test';

export async function measureTask(name: string, cb: () => Promise<void>) {
  const t0 = performance.now();
  await cb();
  const duration = performance.now() - t0;
  // Attach the measurement to the report so CI can aggregate per-task timings.
  test.info().annotations.push({ type: 'duration', description: `${name}:${duration}` });
}
```
- Inject an accessibility scan after each major step using `@axe-core/playwright`:
```ts
import AxeBuilder from '@axe-core/playwright';

const results = await new AxeBuilder({ page })
  .withTags(['wcag2a', 'wcag2aa'])
  .analyze(); // returns { violations, passes, ... }
expect(results.violations).toEqual([]);
```
React (Design-System Components)
- All interactive components expose `data-qa` selectors for stable test hooks.
- Provide keyboard operability (Tab, Enter, Space) and validate with Playwright `page.keyboard` actions.
- Use Storybook stories as test entry points to isolate component-level usability tasks; see the sketch after this list.
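A sketch combining these three rules, assuming Storybook is served as the test `baseURL` and that the primary button navigates to `/success` as in the earlier example:

```ts
// Sketch: keyboard operability check against a Storybook-isolated story.
import { test, expect } from '@playwright/test';

test('Primary button is fully keyboard-operable', async ({ page }) => {
  // Storybook's iframe.html renders a single story without the docs chrome.
  await page.goto('/iframe.html?id=components-button--primary');

  await page.keyboard.press('Tab'); // move focus onto the button
  await expect(page.locator('[data-qa=primary-button]')).toBeFocused();

  await page.keyboard.press('Enter'); // activate without a pointer
  await expect(page).toHaveURL(/success/);
});
```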
Analytics / Session Replay
- Lazy-load Hotjar/Lookback snippets behind an “opt-in” flag to maintain GDPR compliance.
- Initialise analytics in `useEffect(() => initAnalytics({ userId, buildId }), [])` only when `process.env.NODE_ENV === 'production'`.
- Use a heatmap sample rate ≤ 10 % to minimize performance impact (bootstrap sketch below).
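A sketch of the gated bootstrap; `initAnalytics` and `hasConsent` are hypothetical wrappers around your Hotjar/Lookback snippets, not real SDK calls:

```tsx
import { useEffect } from 'react';

// Hypothetical wrappers around your analytics snippets; not real SDK APIs.
declare function initAnalytics(opts: { userId: string; buildId: string; sampleRate: number }): void;
declare function hasConsent(): boolean;

export function useUsabilityAnalytics(userId: string, buildId: string) {
  useEffect(() => {
    if (process.env.NODE_ENV !== 'production') return; // never record dev sessions
    if (!hasConsent()) return; // GDPR: explicit opt-in required
    initAnalytics({ userId, buildId, sampleRate: 0.1 }); // <= 10% heatmap sampling
  }, [userId, buildId]);
}
```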
CI/CD Integration
- Pipeline stages: `lint → unit → accessibility → usability (Playwright) → deploy-preview → moderated-studies`.
- Fail the build if: (a) task-success < 80 %, (b) SUS < 68, or (c) regression in Time-on-Task > 20 %.
- Store metrics in InfluxDB (push sketch below); visualise in the Grafana dashboard `UX/Usability-Regression`.
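A sketch of the metrics push using the official `@influxdata/influxdb-client` package; the org, bucket, tag, and field names are placeholders:

```ts
import { InfluxDB, Point } from '@influxdata/influxdb-client';

// Push one task's metrics after a usability run; connection details are
// placeholders for your environment.
const writeApi = new InfluxDB({
  url: process.env.INFLUX_URL!,
  token: process.env.INFLUX_TOKEN!,
}).getWriteApi('ux-org', 'usability');

writeApi.writePoint(
  new Point('usability')
    .tag('task', 'find-account-settings')
    .floatField('completionRate', 0.94)
    .floatField('timeOnTaskMs', 5200),
);
await writeApi.close(); // flushes pending points
```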
Testing Patterns
- Script template:
1. Warm-up question (context).
2. Primary task (measurable goal).
3. Follow-up probing (open ended).
- Keep facilitator talk ≤ 15 % of session; avoid confirmation bias.
- Use `think-aloud` prompting: “Please verbalise what you’re thinking.”
Performance Patterns
- Pre-seed test pages with realistic data fixtures; disable 3rd-party ads for noise-free metrics.
- Record FPS and CLS during tasks via a CDP session or an injected `PerformanceObserver` (`page.metrics()` is a Puppeteer API, not Playwright). Flag CLS > 0.1 as a defect; see the sketch below.
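A sketch of the CLS half, assuming Chromium (the `layout-shift` entry type is not implemented in Firefox or WebKit):

```ts
import { test, expect, type Page } from '@playwright/test';

// Sum layout-shift entries inside the page, ignoring shifts that follow
// recent user input, as the CLS definition requires.
async function measureCls(page: Page): Promise<number> {
  return page.evaluate(
    () =>
      new Promise<number>((resolve) => {
        let cls = 0;
        new PerformanceObserver((list) => {
          for (const entry of list.getEntries()) {
            const shift = entry as unknown as { hadRecentInput: boolean; value: number };
            if (!shift.hadRecentInput) cls += shift.value;
          }
        }).observe({ type: 'layout-shift', buffered: true });
        // Give buffered entries a beat to flush before reporting.
        setTimeout(() => resolve(cls), 500);
      }),
  );
}

test('settings page stays visually stable during load', async ({ page }) => {
  await page.goto('/settings');
  expect(await measureCls(page)).toBeLessThanOrEqual(0.1);
});
```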
Security & Privacy
- Mask sensitive fields in recordings with the CSS class `.private-field` (sketch below).
- Anonymise participant IDs; store consent forms alongside videos.
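A sketch of the masking convention applied to a component; `data-hj-suppress` is Hotjar's own suppression attribute, while `.private-field` is the class your replay tool must be configured to honour:

```tsx
// Sketch: fields carrying `.private-field` are excluded from session
// recordings by your replay tool's suppression config; `data-hj-suppress`
// covers Hotjar specifically.
export const EmailField = () => (
  <input
    type="email"
    className="private-field"
    data-hj-suppress
    data-qa="email"
  />
);
```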
Examples
1. Simple task script (Markdown):
```md
## Task: Locate Primary Button
You have 5 seconds: where would you click to save your settings?
```
2. Playwright automation snippet:
```ts
test('Onboarding – create workspace', async ({ page }) => {
  await measureTask('create-workspace', async () => {
    await page.goto('/signup');
    await page.fill('[data-qa=email]', 'user@example.com');
    await page.click('[data-qa=continue]');
    await expect(page).toHaveURL(/setup/);
  });
});
```
Folder Layout
- `tests/`
  - `usability/`
    - `signup.spec.ts`
    - `settings.spec.ts`
    - `_helpers.ts`
  - `accessibility/`
  - `unit/`
- `artifacts/` (auto-generated)
Common Pitfalls & Guardrails
- DO NOT recruit only internal team members; external, unbiased users are mandatory.
- DO NOT run > 30 min sessions without breaks; cognitive fatigue skews data.
- ALWAYS pilot your script with 1 internal & 1 external user before full rollout.