AdaScout
Sole Developer & Architect
Platform for scanning websites for WCAG 2.2 AA compliance using multiple analysis engines. A dedicated scanner worker runs Playwright + axe-core against Browserless Chromium via CDP. A separate Convex action path uses Browserbase Stagehand with Gemini/MiniMax for AI-powered accessibility analysis. PDF documents are analyzed with pdfjs-dist (metadata, tagging, text layer, reading order, tables, images). Reports are exported as PDF (browser print) and Excel (exceljs). The app mounts the BrowserLaunch Convex component for task orchestration.
axe-core, Stagehand AI, custom policy, PDF analysis
Metadata, tagging, text layer, OCR confidence, reading order, tables, images
OpenAI gpt-4o (default), Google Gemini 2.5 Flash, MiniMax M2-Stable
axe, stagehand, policy, pdf — normalized into unified model
The Problem
Businesses face ADA lawsuits when their websites aren't accessible. Manual audits miss issues, existing tools only check one dimension, and PDF accessibility is often ignored entirely. A comprehensive solution needs to scan HTML, analyze PDFs, and provide AI-powered remediation guidance.
The Solution
Built a multi-engine scanning platform: (1) axe-core via Playwright on Browserless Chromium for rule-based HTML checks, (2) Browserbase Stagehand with Gemini for AI-powered WCAG 2.2 AA analysis, (3) a pdfjs-dist pipeline for PDF accessibility (metadata, tagging, OCR, reading order). Custom policy checks (image-missing-alt, image-empty-alt) supplement axe. Results from all four sources (axe, stagehand, policy, pdf) are normalized into a unified findings model.
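A minimal sketch of what a unified findings model could look like, assuming a discriminated union keyed on a `source` field. The `Finding` shape, the `fromAxe` and `countBySource` helpers, and the `"moderate"` fallback severity are illustrative assumptions, not the project's actual schema. The input type mirrors only a simplified subset of the axe-core results object (`id`, `impact`, `help`, `nodes[].target`).

```typescript
// Hypothetical unified finding shape; field names are illustrative.
type FindingSource = "axe" | "stagehand" | "policy" | "pdf";
type Severity = "critical" | "serious" | "moderate" | "minor";

interface Finding {
  source: FindingSource; // discriminator: which engine produced the finding
  ruleId: string;
  severity: Severity;
  message: string;
  location: string; // CSS selector for HTML findings, page reference for PDFs
}

// Simplified subset of an axe-core violation (real targets can also nest for shadow DOM).
interface AxeNode {
  target: string[];
}
interface AxeViolation {
  id: string;
  impact?: Severity | null;
  help: string;
  nodes: AxeNode[];
}

// Emit one finding per affected node so each report row points at a single element.
function fromAxe(violations: AxeViolation[]): Finding[] {
  const findings: Finding[] = [];
  for (const violation of violations) {
    for (const node of violation.nodes) {
      const finding: Finding = {
        source: "axe",
        ruleId: violation.id,
        severity: violation.impact ?? "moderate", // assumed fallback when no impact is reported
        message: violation.help,
        location: node.target.join(" "),
      };
      findings.push(finding);
    }
  }
  return findings;
}

// The discriminator makes per-engine summaries a simple tally.
function countBySource(findings: Finding[]): Record<FindingSource, number> {
  const counts: Record<FindingSource, number> = { axe: 0, stagehand: 0, policy: 0, pdf: 0 };
  for (const finding of findings) {
    counts[finding.source] += 1;
  }
  return counts;
}
```

Mappers for the other three sources would follow the same pattern, each setting its own `source` value so downstream code can filter or group without knowing which engine ran.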
Technical Decisions
Key architecture decisions and their outcomes
Multi-engine over single-tool scanning
No single tool catches all accessibility issues. axe-core is rule-based and misses context. AI catches nuance but can hallucinate.
Combined axe-core for deterministic rules, Stagehand + Gemini for AI interpretation, custom policy checks for gaps, and pdfjs-dist for document accessibility.
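Broader coverage than any single engine provides: deterministic axe-core rules anchor the results, AI analysis adds the context that rules miss, and the policy and PDF checks fill the remaining gaps.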
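To illustrate the custom policy checks mentioned in this decision, here is a small sketch of the two rule IDs named earlier, `image-missing-alt` and `image-empty-alt`. Only the rule IDs come from the project; the `ImageSnapshot` input shape, the `checkImageAlt` function, and the exact semantics (absent attribute vs. empty or whitespace-only value, with empty `alt` flagged for human review) are assumptions.

```typescript
// Hypothetical snapshot of an <img> element collected from the page under test.
interface ImageSnapshot {
  selector: string;
  alt: string | null; // null means the alt attribute is absent entirely
}

interface PolicyFinding {
  source: "policy";
  ruleId: "image-missing-alt" | "image-empty-alt";
  selector: string;
  message: string;
}

// Assumed semantics: a missing attribute is a failure, while an empty alt is
// surfaced for review because it is only correct for decorative images.
function checkImageAlt(images: ImageSnapshot[]): PolicyFinding[] {
  const findings: PolicyFinding[] = [];
  for (const image of images) {
    if (image.alt === null) {
      findings.push({
        source: "policy",
        ruleId: "image-missing-alt",
        selector: image.selector,
        message: "Image has no alt attribute.",
      });
    } else if (image.alt.trim() === "") {
      findings.push({
        source: "policy",
        ruleId: "image-empty-alt",
        selector: image.selector,
        message: "Image has an empty alt attribute; confirm it is decorative.",
      });
    }
  }
  return findings;
}
```

Keeping the check as a pure function over plain data means it can run on whatever the browser session extracts, and can be unit-tested without launching a browser.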
Separate scanner worker vs. Convex actions
Playwright + axe-core needs long-running browser sessions. Convex actions have execution time limits.
Built a dedicated scanner worker (Node.js process) that connects to Browserless via CDP. Convex actions handle the Stagehand/Browserbase path (managed browser sessions).
Heavy scanning runs without timeout constraints. Lighter AI analysis uses managed Browserbase sessions.
Engineering Details
- Scanner worker: connects to BROWSERLESS_CDP_URL (ws://), runs AxeBuilder.analyze(), maps violations to findings
- Stagehand path: Convex action → Browserbase session → stagehand.extract() with WCAG 2.2 AA instruction
- PDF pipeline: pdfjs-dist extraction → rule engine (pdf.metadata.*, pdf.tagging.*, pdf.text-layer.*, pdf.images.*)
- Finding normalization: all sources (axe, stagehand, policy, pdf) mapped to unified schema with source discriminator
- BrowserLaunch integration: enqueueTask on queue 'adascout_scans' with externalRef linking to scan run pages
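The PDF rule engine described above can be sketched as a list of small rules evaluated against facts extracted from the document. Everything below is an assumption made for illustration: the `PdfFacts` shape, the `runPdfRules` function, and the specific rule IDs (only the `pdf.metadata.*`, `pdf.tagging.*`, and `pdf.text-layer.*` namespaces come from the project). The extraction step with pdfjs-dist is omitted; the sketch starts from already-extracted facts.

```typescript
// Hypothetical facts gathered during extraction.
interface PdfPageFacts {
  pageNumber: number;
  textItemCount: number;
  imageCount: number;
}
interface PdfFacts {
  title: string | null;
  isTagged: boolean;
  pages: PdfPageFacts[];
}

interface PdfFinding {
  source: "pdf";
  ruleId: string;
  message: string;
}

// A rule returns zero or more messages; an empty array means the check passed.
interface PdfRule {
  id: string;
  evaluate(facts: PdfFacts): string[];
}

// Rule ID suffixes are hypothetical; only the namespaces appear in the project.
const examplePdfRules: PdfRule[] = [
  {
    id: "pdf.metadata.title-missing",
    evaluate: (facts) =>
      facts.title !== null && facts.title.trim() !== "" ? [] : ["Document has no title in its metadata."],
  },
  {
    id: "pdf.tagging.untagged",
    evaluate: (facts) => (facts.isTagged ? [] : ["Document is not tagged."]),
  },
  {
    id: "pdf.text-layer.missing",
    // A page with images but no text items is likely a scan without a text layer.
    evaluate: (facts) =>
      facts.pages
        .filter((page) => page.textItemCount === 0 && page.imageCount > 0)
        .map((page) => `Page ${page.pageNumber} has images but no text layer.`),
  },
];

function runPdfRules(facts: PdfFacts, rules: PdfRule[] = examplePdfRules): PdfFinding[] {
  const findings: PdfFinding[] = [];
  for (const rule of rules) {
    for (const message of rule.evaluate(facts)) {
      findings.push({ source: "pdf", ruleId: rule.id, message });
    }
  }
  return findings;
}
```

Separating extraction from evaluation keeps each rule a pure function, so adding a check means appending one entry to the rule list.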
Key Highlights
- Multi-engine scanning: axe-core + Stagehand AI + custom policy checks + PDF analysis
- Dedicated scanner worker: Playwright + @axe-core/playwright on Browserless Chromium (CDP)
- AI accessibility analysis: Browserbase Stagehand with Google Gemini 2.5 Flash
- PDF pipeline: pdfjs-dist with 20+ rule checks (metadata, tagging, text layer, OCR, reading order, tables)
- Normalized findings model: unified output from axe, stagehand, policy, and pdf sources
- Report exports: PDF (browser print), Excel (exceljs), CSV
- BrowserLaunch component integration for task orchestration and replay
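As an illustration of the CSV export listed among the highlights, the sketch below serializes findings with quoting in the style of RFC 4180. The `ReportRow` columns and the `toCsv` function are assumptions; the project's actual export columns are not documented here.

```typescript
// Hypothetical flat row for export; the real column set may differ.
interface ReportRow {
  source: string;
  ruleId: string;
  severity: string;
  message: string;
}

// Quote a field only when it contains a comma, quote, or line break,
// doubling any embedded quotes.
function escapeCsv(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsv(rows: ReportRow[]): string {
  const lines: string[] = ["source,ruleId,severity,message"];
  for (const row of rows) {
    lines.push([row.source, row.ruleId, row.severity, row.message].map(escapeCsv).join(","));
  }
  return lines.join("\r\n");
}
```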