@cdransf
Member

@cdransf cdransf commented Oct 28, 2025

Description

SWC-1263

Proof of concept implementing automated accessibility testing using Playwright with two complementary approaches:

1. ARIA Snapshot Testing

  • Captures and validates accessibility tree structure
  • Detects unintended changes to component semantics
  • Human-readable YAML format serves as living documentation

2. aXe-core WCAG Validation

  • Automated WCAG 2.0/2.1 Level A/AA compliance checking
  • Validates color contrast, ARIA attributes, keyboard accessibility
  • Focused on standards compliance (excludes best-practice rules)

Test Coverage

25 tests passing across both generations (~30s runtime):

1st Generation (13 tests):

  • Badge: 6 tests (3 ARIA snapshots, 3 aXe validations)
  • Status Light: 7 tests (3 ARIA snapshots, 4 aXe validations)

2nd Generation (12 tests):

  • Badge: 4 tests (2 ARIA snapshots, 2 aXe validations)
  • Status Light: 8 tests (3 ARIA snapshots, 5 aXe validations)

Key Features

  • Dual Storybook Support: Playwright projects handle both generations (1st gen: port 8080, 2nd gen: port 6006)
  • Deterministic Testing: No arbitrary timeouts (waits for custom element definition, visibility, upgrade)
  • Collocated Tests: Tests live with components in test/*.a11y.spec.ts
  • Shared Helpers: Generation-agnostic utilities in 1st-gen/test/a11y-helpers.ts work for both gens
  • Auto-start Storybooks: Both instances launch automatically when running tests
  • Consolidated Documentation: Single source of truth at ACCESSIBILITY_TESTING.md

Usage

yarn test:a11y          # Run all tests (both generations)
yarn test:a11y:1st      # Only 1st gen tests
yarn test:a11y:2nd      # Only 2nd gen tests
yarn test:a11y:ui       # Interactive Playwright UI

Files Added/Modified

Test Files:

  • 1st-gen/packages/badge/test/badge.a11y.spec.ts
  • 1st-gen/packages/status-light/test/status-light.a11y.spec.ts
  • 2nd-gen/packages/swc/components/badge/test/badge.a11y.spec.ts
  • 2nd-gen/packages/swc/components/status-light/test/status-light.a11y.spec.ts

Shared Utilities:

  • 1st-gen/test/a11y-helpers.ts - Deterministic helpers for both generations

Configuration:

  • 1st-gen/playwright.config.ts - Dual-Storybook Playwright setup with projects
  • 1st-gen/package.json - Test scripts (test:a11y, test:a11y:1st, test:a11y:2nd, test:a11y:ui)
  • package.json - Root-level convenience commands

Documentation:

  • ACCESSIBILITY_TESTING.md - Comprehensive guide covering quick start, how-to, and best practices

ARIA Snapshots (auto-generated baselines):

  • 1st-gen/packages/badge/test/badge.a11y.spec.ts-snapshots/*.aria.yml
  • 1st-gen/packages/status-light/test/status-light.a11y.spec.ts-snapshots/*.aria.yml
  • 2nd-gen/packages/swc/components/badge/test/badge.a11y.spec.ts-snapshots/*.aria.yml
  • 2nd-gen/packages/swc/components/status-light/test/status-light.a11y.spec.ts-snapshots/*.aria.yml

Motivation and Context

Problem: Manual accessibility testing is time-consuming and inconsistent. Regressions can slip through unnoticed.

Solution: Automated testing catches accessibility issues early in development and serves as executable documentation of expected behavior.

Approach: This POC demonstrates the pattern on Badge and Status Light components across both 1st and 2nd generation. If approved, the pattern can be applied to all components.

Technical Decisions

1. WCAG-only scanning

  • Excludes best-practice rules (e.g., "page must have h1")
  • Focuses on WCAG 2.0/2.1 Level A/AA compliance
  • Reasoning: We test isolated components in Storybook, not full pages
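
A minimal sketch of what "WCAG-only" means in terms of axe-core's rule tags. The tag names (`wcag2a`, `wcag2aa`, `wcag21a`, `wcag21aa`, `best-practice`) are axe-core's own; the `RuleResult` shape, the `wcagOnly` helper, and the sample data are hypothetical and exist only for illustration. With `@axe-core/playwright`, the same restriction is normally expressed up front by passing these tags to `AxeBuilder`'s `withTags(...)` rather than by filtering afterwards.

```typescript
// axe-core tag names covering WCAG 2.0/2.1 Level A and AA.
const WCAG_TAGS: readonly string[] = ['wcag2a', 'wcag2aa', 'wcag21a', 'wcag21aa'];

// Hypothetical, minimal shape of a rule result; real axe results carry more fields.
interface RuleResult {
  id: string;
  tags: string[];
}

// Keep only results whose rule carries at least one WCAG A/AA tag.
// Rules tagged only "best-practice" are dropped.
function wcagOnly(results: RuleResult[]): RuleResult[] {
  return results.filter((result) =>
    result.tags.some((tag) => WCAG_TAGS.includes(tag)),
  );
}

// Illustrative data: one WCAG rule and one best-practice rule.
const sample: RuleResult[] = [
  { id: 'color-contrast', tags: ['cat.color', 'wcag2aa', 'wcag143'] },
  { id: 'page-has-heading-one', tags: ['cat.semantics', 'best-practice'] },
];

const keptIds = wcagOnly(sample).map((result) => result.id);
console.log(keptIds);
```

Rules such as "page must have h1" assume a whole page, so they would fail for every isolated story regardless of the component under test.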

2. Deterministic waits

  • Custom gotoStory() helper waits for specific conditions
  • No arbitrary timeouts or waitForLoadState('networkidle')
  • Faster, more reliable than waiting for network idle
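
One way the deterministic wait could look, as a sketch rather than the helper that ships in this PR. `PageLike` is a hypothetical structural stand-in for the few Playwright `Page` methods used, so the block has no dependencies; the story ID format and the `storyUrl` helper are likewise assumptions. Only two of the conditions are modeled (custom element definition and visibility); the upgrade check is left out.

```typescript
// Hypothetical stand-in for the parts of Playwright's Page used below.
interface PageLike {
  goto(url: string): Promise<unknown>;
  evaluate<A>(fn: (arg: A) => unknown, arg: A): Promise<unknown>;
  locator(selector: string): {
    first(): { waitFor(options: { state: 'visible' }): Promise<void> };
  };
}

// Storybook serves a single story in isolation from iframe.html.
// A relative URL assumes baseURL is set in the Playwright config.
function storyUrl(storyId: string): string {
  return `/iframe.html?id=${encodeURIComponent(storyId)}&viewMode=story`;
}

async function gotoStory(
  page: PageLike,
  storyId: string,
  tagName: string,
): Promise<void> {
  await page.goto(storyUrl(storyId));

  // 1. Wait until the custom element is registered in the browser.
  //    This callback runs in the page, where customElements is a global.
  await page.evaluate(async (tag: string) => {
    await (globalThis as any).customElements.whenDefined(tag);
  }, tagName);

  // 2. Wait until the first matching element is actually visible.
  await page.locator(tagName).first().waitFor({ state: 'visible' });
}
```

Each step resolves as soon as its condition holds, so the test neither sleeps for a fixed duration nor depends on unrelated network traffic settling.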

3. Playwright projects for dual-generation support

  • Separate configurations for 1st gen (port 8080) and 2nd gen (port 6006)
  • Allows running tests independently or together
  • Auto-starts both Storybook instances as needed
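
A rough sketch of how a dual-Storybook setup can be expressed with Playwright projects. The ports come from this description; the `command` values, the `testMatch` globs, and the `testDir` value are assumptions chosen for illustration and may not match the config in this PR.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Assumption: the config sits one level below the repo root, so tests in
  // both generation folders are reachable from here.
  testDir: '..',
  projects: [
    {
      name: '1st-gen',
      testMatch: '**/1st-gen/**/*.a11y.spec.ts',
      use: { baseURL: 'http://localhost:8080' },
    },
    {
      name: '2nd-gen',
      testMatch: '**/2nd-gen/**/*.a11y.spec.ts',
      use: { baseURL: 'http://localhost:6006' },
    },
  ],
  // One entry per Storybook; Playwright starts each before the run and
  // reuses an instance that is already listening on the URL.
  webServer: [
    {
      command: 'yarn storybook:1st', // hypothetical script name
      url: 'http://localhost:8080',
      reuseExistingServer: true,
    },
    {
      command: 'yarn storybook:2nd', // hypothetical script name
      url: 'http://localhost:6006',
      reuseExistingServer: true,
    },
  ],
});
```

With a layout like this, a single generation can be selected via Playwright's `--project` flag (for example `--project=1st-gen`), which is presumably what the per-generation yarn scripts wrap.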

4. Collocated tests

  • Tests live in each component's test/ directory
  • Follows existing test organization patterns
  • Easier discovery and maintenance

5. Shared, generation-agnostic helpers

  • Single helper library at 1st-gen/test/a11y-helpers.ts
  • Works for both 1st gen (sp-*) and 2nd gen (swc-*) components
  • No code duplication
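
To illustrate what "generation-agnostic" can mean in practice, here is a small sketch that keeps the per-generation differences (tag prefix and Storybook port, both taken from this description) in one lookup table. The `Generation` type and the `tagFor` and `storybookOrigin` helpers are hypothetical names, not the contents of a11y-helpers.ts.

```typescript
type Generation = '1st' | '2nd';

// The only facts that differ between generations, per this description.
const GENERATIONS: Record<Generation, { prefix: string; port: number }> = {
  '1st': { prefix: 'sp', port: 8080 },
  '2nd': { prefix: 'swc', port: 6006 },
};

// Build the custom element tag name, e.g. "sp-badge" or "swc-badge".
function tagFor(generation: Generation, component: string): string {
  return `${GENERATIONS[generation].prefix}-${component}`;
}

// Build the origin of the Storybook instance serving that generation.
function storybookOrigin(generation: Generation): string {
  return `http://localhost:${GENERATIONS[generation].port}`;
}

console.log(tagFor('1st', 'badge'), storybookOrigin('1st'));
console.log(tagFor('2nd', 'status-light'), storybookOrigin('2nd'));
```

Helpers that take the tag name as a parameter never need to know which generation they are testing, which is what keeps the shared library free of duplication.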

Related Issue(s)

  • This is a proof-of-concept RFC for team discussion
  • No existing issue (greenfield implementation)

Author's Checklist

  • I have read the CONTRIBUTING and PULL_REQUESTS documents
  • I have reviewed the Accessibility Practices for this feature
  • I have added automated tests to cover my changes
  • I have included updated documentation

Reviewer's Checklist

Questions for Discussion

  1. Approach: Do ARIA snapshots + aXe validation provide the right balance?
  2. Coverage: Should we add these tests to all components or start with high-priority ones?
  3. CI Integration: How should these run in CI? (All PRs? Scheduled? On-demand?)
  4. Maintenance: Who owns keeping snapshots updated when designs change?
  5. Snapshot Management: ARIA snapshots are committed - is this the right approach?

Manual Testing

Verify tests run successfully:

  1. Clean environment: pkill -f "storybook"
  2. Run all tests: cd 1st-gen && yarn test:a11y
  3. Expected: 25 passing tests in ~30 seconds
  4. Verify both Storybooks auto-start (ports 8080 and 6006)

Review test outputs:

  1. Open HTML report: yarn test:a11y:report
  2. Review ARIA snapshots in **/test/*-snapshots/ directories
  3. Verify human-readable YAML format

Try individual commands:

  • yarn test:a11y:1st - Should run 13 1st gen tests
  • yarn test:a11y:2nd - Should run 12 2nd gen tests
  • yarn test:a11y:ui - Should open Playwright UI
  • yarn test:a11y badge --update-snapshots - Update baselines

Review documentation:

  • Read ACCESSIBILITY_TESTING.md for comprehensive guide
  • Follow "Adding tests to a component" section
  • Verify examples are clear and copy-paste ready

Note: This is a POC for discussion. Not intended for immediate merge - seeking feedback on approach and implementation before broader rollout.

@cdransf cdransf self-assigned this Oct 28, 2025
@cdransf cdransf added the Status: WIP PR is a work in progress or draft label Oct 28, 2025
@changeset-bot

changeset-bot bot commented Oct 28, 2025

⚠️ No Changeset found

Latest commit: 1651e27

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@github-actions
Contributor

github-actions bot commented Oct 28, 2025

📚 Branch Preview

🔍 Visual Regression Test Results

When a visual regression test fails (or has previously failed while working on this branch), its results can be found in the following URLs:

Deployed to Azure Blob Storage: pr-5835

If the changes are expected, update the current_golden_images_cache hash in the circleci config to accept the new images. Instructions are included in that file.
If the changes are unexpected, you can investigate the cause of the differences and update the code accordingly.

@github-actions
Contributor

Tachometer results

Currently, no packages are changed by this PR...

@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch 8 times, most recently from 0c06161 to 1e47ff4 Compare October 28, 2025 20:10
@cdransf cdransf added Status: Ready for review PR ready for review or re-review. Status: WIP PR is a work in progress or draft and removed Status: WIP PR is a work in progress or draft Status: Ready for review PR ready for review or re-review. labels Oct 28, 2025
@cdransf cdransf marked this pull request as ready for review October 28, 2025 20:56
@cdransf cdransf requested a review from a team as a code owner October 28, 2025 20:56
@cdransf cdransf added the Status: Ready for review PR ready for review or re-review. label Oct 28, 2025
@cdransf cdransf changed the title test: add Playwright accessibility testing POC for Badge and Status Light feat(testing): add Playwright accessibility testing POC for Badge and Status Light Oct 28, 2025
@cdransf cdransf marked this pull request as draft October 28, 2025 21:02
@rubencarvalho rubencarvalho force-pushed the barebones branch 2 times, most recently from 3ff495e to fd099fd Compare November 3, 2025 14:45
@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch 4 times, most recently from bd182cc to c86b099 Compare November 3, 2025 21:25
@cdransf cdransf marked this pull request as ready for review November 4, 2025 21:06
Collaborator

@graynorton graynorton left a comment

@cdransf This looks great—I'm so excited to see it happening!

I haven't had time for a thorough review yet, but wanted to leave some high-level feedback based on an initial look:

  • Can we move the shared bits of config and functionality so that they live in 2nd-gen, not 1st-gen? In general, it's fine to add component-specific stuff to 1st-gen during this transition period, but any new, non-component-specific code that we want to carry forward into the future should find a place in 2nd-gen to live.
  • Can you take the content of ACCESSIBILITY_TESTING.md and find a place for it in the new top-level CONTRIBUTOR-DOCS hierarchy?
  • [Minor, but a bit concerning] - Do you understand the significance of the changed files where the changes are simply the addition / removal of the license header? Are these linter-prompted changes?

@cdransf cdransf removed the Status: WIP PR is a work in progress or draft label Nov 4, 2025
@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch 2 times, most recently from 3e59ffd to 019281e Compare November 4, 2025 22:51
@cdransf
Member Author

cdransf commented Nov 4, 2025

* [Minor, but a bit concerning] - Do you understand the significance of the changed files where the changes are simply the addition / removal of the license header? Are these linter-prompted changes?

This was an interesting one and I believe it may be an interaction between our linting and our pre-commit hook. .eslintrc.json has "ignorePatterns": ["1st-gen/packages/icons/src/icons-*.svg.ts"]. lint-staged.config.js has

'*.ts': [
    'eslint --fix --format pretty --cache',  // ← --fix automatically fixes violations!
    'prettier --cache --write',
],

So it matches icons-medium.svg.ts, the notice/notice rule detects the header and --fix restores it before getting automatically re-staged. We'd need to update lint-staged.config.js to prevent that *.ts match from superseding the rule in .eslintrc.json.
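
One possible shape for that change, sketched under assumptions: lint-staged accepts a function per glob that receives the staged file paths and returns the commands to run, so the generated icon files can be filtered out before `eslint --fix` sees them. The `GENERATED_ICONS` pattern and the `tsCommands` name are hypothetical, the regex assumes POSIX-style paths, and the logic is written in TypeScript here even though the real config is a .js file.

```typescript
// Hypothetical pattern mirroring the ignorePatterns entry in .eslintrc.json.
const GENERATED_ICONS =
  /(^|\/)1st-gen\/packages\/icons\/src\/icons-[^/]*\.svg\.ts$/;

// Quote each path so file names with spaces survive the shell.
function quoted(files: string[]): string {
  return files.map((file) => `"${file}"`).join(' ');
}

// Function form of a lint-staged entry: staged paths in, commands out.
function tsCommands(files: string[]): string[] {
  const lintable = files.filter((file) => !GENERATED_ICONS.test(file));
  const commands: string[] = [];
  if (lintable.length > 0) {
    commands.push(`eslint --fix --format pretty --cache ${quoted(lintable)}`);
  }
  // Prettier still runs on every staged .ts file, as in the current config.
  commands.push(`prettier --cache --write ${quoted(files)}`);
  return commands;
}

const staged = [
  '/repo/1st-gen/packages/icons/src/icons-medium.svg.ts',
  '/repo/1st-gen/packages/badge/src/Badge.ts',
];
const commands = tsCommands(staged);
console.log(commands);
```

In the config itself, the function would replace the array currently assigned to the '*.ts' key.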

@cdransf
Member Author

cdransf commented Nov 4, 2025

  • Can you take the content of ACCESSIBILITY_TESTING.md and find a place for it in the new top-level CONTRIBUTOR-DOCS hierarchy?

I moved it below the section on patching dependencies as it's fairly thorough and can stand as its own section in the documentation. It's also linked to from the Testing section of 03_working-in-the-swc-repo.md.

@cdransf
Member Author

cdransf commented Nov 4, 2025

* Can we move the shared bits of config and functionality so that they live in 2nd-gen, not 1st-gen? In general, it's fine to add component-specific stuff to 1st-gen during this transition period, but any new, non-component-specific code that we want to carry forward into the future should find a place in 2nd-gen to live.

Absolutely (and this makes more sense than where I had it). I've gone ahead and moved things over, verified the commands and updated the PR description.

@cdransf cdransf requested a review from graynorton November 4, 2025 23:28
@graynorton
Collaborator

* [Minor, but a bit concerning] - Do you understand the significance of the changed files where the changes are simply the addition / removal of the license header? Are these linter-prompted changes?

This was an interesting one and I believe it may be an interaction between our linting and our pre-commit hook. .eslintrc.json has "ignorePatterns": ["1st-gen/packages/icons/src/icons-*.svg.ts"]. lint-staged.config.js has

'*.ts': [
    'eslint --fix --format pretty --cache',  // ← --fix automatically fixes violations!
    'prettier --cache --write',
],

So it matches icons-medium.svg.ts, the notice/notice rule detects the header and --fix restores it before getting automatically re-staged. We'd need to update lint-staged.config.js to prevent that *.ts match from superseding the rule in .eslintrc.json.

Thanks for looking into it! Based on the latest commits to the barebones branch, it looks like these changes are as intended (i.e., represent actual linting fixes), but the fixed versions of these particular files hadn't been committed yet. It looks like this later commit to barebones takes care of it...

116aa68

...so I think rebasing your branch on the latest barebones would clean up the file diff on this PR.

Collaborator

@graynorton graynorton left a comment

LGTM!

As I mentioned in a reply, I think rebasing on the latest barebones should clean up the diff for this PR (related to those confusing linting changes).

Base automatically changed from barebones to main November 5, 2025 15:37
@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch 4 times, most recently from 507fe7f to 74a79e5 Compare November 5, 2025 19:32
@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch from 94064dc to fdf30c7 Compare November 5, 2025 22:32
…ight

Implements automated accessibility testing using two complementary approaches:

1. ARIA Snapshot Testing
   - Captures and validates accessibility tree structure
   - Detects unintended changes to component semantics
   - Serves as living documentation of expected a11y structure

2. aXe Rule Validation
   - Automated WCAG 2.0/2.1 Level A/AA compliance checking
   - Excludes best-practice rules (focused on component testing)
   - Validates color contrast, ARIA attributes, and more

Test Coverage (14/14 passing, ~6s runtime):
- Badge: default, icons, semantic variants, color contrast
- Status Light: sizes (s/m/l), disabled state, color contrast

Key Implementation Details:
- Integrated with existing Storybook stories (no duplication)
- Element visibility waits (reliable, fast)
- WCAG-only scanning (appropriate for isolated components)
- HTML report generation for debugging

Files Added:
- 1st-gen/playwright.config.ts - Playwright configuration
- 1st-gen/test/playwright-a11y/aria-snapshots.spec.ts - ARIA tests
- 1st-gen/test/playwright-a11y/axe-validation.spec.ts - aXe tests
- 1st-gen/test/playwright-a11y/README.md - Documentation
- RFC_A11Y_TESTING.md - Comprehensive RFC with scaling plan
- README.A11Y.md - Quick start guide

Usage:
  yarn test:a11y       # Run all accessibility tests
  yarn test:a11y:ui    # Open Playwright UI for debugging
  yarn test:a11y:1st   # Run only 1st gen
  yarn test:a11y:2nd   # Run only 2nd gen
@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch from fdf30c7 to 1005da2 Compare November 5, 2025 22:32
@cdransf cdransf force-pushed the cdransf/a11y-testing-spike branch from 43de61f to 1651e27 Compare November 5, 2025 22:43
@cdransf cdransf merged commit 1691e33 into main Nov 5, 2025
22 checks passed
@cdransf cdransf deleted the cdransf/a11y-testing-spike branch November 5, 2025 22:59
rubencarvalho added a commit that referenced this pull request Nov 6, 2025
rubencarvalho added a commit that referenced this pull request Nov 6, 2025