Simplifying automated testing to improve application quality

Designing synthetic test orchestration for pre-production environments, so DevOps teams can catch critical issues before users ever see them.
Observability
CI/CD integrations
Developer experience

TL;DR

Problem: Inconsistent pre-production coverage; teams stitched signals together manually across tools.
Solution: A CAT dashboard and configuration flow aligned to pipelines, with failures surfaced first and linked to deployments.
Impact: Reduced setup time and clarified root-cause paths before production.
Desserts sent: 2
Product meetings called: 6
Teams onboarded: 4

Context

Launched in February 2024, Continuous Automated Testing (CAT) entered Limited Preview as a new feature by New Relic to refine the product experience through customer feedback. This effort aimed to bridge the CI/CD monitoring gap, focusing on seamless integration with cloud providers and New Relic's observability tools, setting it apart from competitors. As the Senior Lead UX Designer, I worked with a PM, three engineers, and a content writer from discovery to launch to establish this essential testing capability within the observability framework.
Company
New Relic Inc
My Team
One PM, three engineers, one content writer
Timeline
Project start: October 2023 | LA launch: February 2024 | LP Launch: Anticipated late 2025
Tools used
Figma (for mocks and flows), FigJam (for workshops), Slack, Confluence
My Role
Senior lead UX designer
Impact
Enabled pre-prod visibility for internal teams, reduced test setup time, clarified root cause through UI and metadata

Problem

CI/CD pipelines at New Relic, as at many customer organizations, were surfacing critical issues too late, delaying releases and increasing risk. Test coverage was inconsistent, and tracing results back to their source was difficult. End-to-end testing was especially complex and slowed releases further. Teams needed a tool that integrated smoothly into existing workflows, detected issues early, and offered clear resolution paths.

Solution

We developed a dashboard and configuration flow that integrates into CI/CD workflows, identifying issues early and linking failures to their source. Continuous Automated Testing leverages New Relic's synthetic monitoring for pre-production testing. Users can configure tests with parameter overrides and expand context with metadata. Tests can be initiated in batches via New Relic's NerdGraph API or integrations such as the CLI, GitHub Actions, or Jenkins.
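As a rough illustration of the batch-initiation path described above, the sketch below builds and sends a NerdGraph-style GraphQL request. The `startAutomatedTestBatch` mutation name, its fields, and the override payload shape are placeholders I am assuming for illustration; they are not the real CAT schema, which should be taken from the NerdGraph API explorer.

```python
# Illustrative sketch of kicking off a batch of pre-production synthetic tests
# through a GraphQL endpoint such as New Relic's NerdGraph. The mutation name,
# fields, and override shape below are hypothetical placeholders.
import json
import os
import urllib.request

NERDGRAPH_URL = "https://api.newrelic.com/graphql"


def build_batch_payload(account_id, monitor_guids, overrides=None):
    """Build a GraphQL payload that starts a batch of synthetic tests.

    `startAutomatedTestBatch` is a hypothetical mutation name used for
    illustration; consult the NerdGraph schema for the actual operation.
    """
    query = """
    mutation StartBatch($accountId: Int!, $guids: [String!]!, $overrides: String) {
      startAutomatedTestBatch(accountId: $accountId, guids: $guids, overrides: $overrides) {
        batchId
      }
    }
    """
    variables = {
        "accountId": account_id,
        "guids": monitor_guids,
        # Parameter overrides let a suite point at a staging domain, for example.
        "overrides": json.dumps(overrides or {}),
    }
    return {"query": query, "variables": variables}


def start_batch(api_key, payload):
    """POST the payload with the API-key header NerdGraph expects."""
    req = urllib.request.Request(
        NERDGRAPH_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "API-Key": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_batch_payload(
        account_id=1234567,
        monitor_guids=["MONITOR-GUID-1", "MONITOR-GUID-2"],
        overrides={"domain": "https://staging.example.com"},
    )
    api_key = os.environ.get("NEW_RELIC_API_KEY")
    if api_key:  # only hit the network when a key is actually configured
        print(start_batch(api_key, payload))
```

In a pipeline this script would run as a post-deploy step, with the returned batch id passed to whatever job gates promotion on the results.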

The deep dive

Jobs to be done

In enterprise and technical design, we rarely have fully detailed personas for each feature. Instead, I focus on jobs to be done: the specific tasks and outcomes real users need. They’re here to get something working, solve the problem, and move on with their day.
  • DevOps engineers and SREs: When a CI pipeline triggers on commit or PR, I want to orchestrate the right test suites and environments with configurable controls, so I can enforce quality gates and keep pipelines predictable.
  • Internal product teams: When a test fails in pre-production, I want to trace it to the related config, deployment, and code change, so I can isolate root cause and fix it quickly.
  • QA automation engineers: When staging is ready, I want to run synthetic tests before promotion and block release on regressions, so I can catch issues before production.
  • Release managers: When coordinating a release, I want to schedule required tests, capture pass or fail evidence, and collect sign-offs, so I can ship low-risk deployments with auditability.
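The quality-gate job in the first item above can be sketched as a small polling loop that blocks a pipeline step until tests reach a terminal state. The status strings and the `fetch_status` callable here are hypothetical stand-ins for whatever the real test API returns.

```python
# Sketch of a CI quality gate: poll a (hypothetical) test-batch status source
# and fail the pipeline step when tests regress. Status names and the
# fetch_status callable are illustrative assumptions, not a real API.
import time


def wait_for_batch(fetch_status, batch_id, timeout_s=600, poll_s=10):
    """Poll until the batch reaches a terminal state or the timeout elapses.

    `fetch_status` is any callable returning one of "PASSED", "FAILED",
    or "IN_PROGRESS" for the given batch id.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status(batch_id)
        if status in ("PASSED", "FAILED"):
            return status
        time.sleep(poll_s)
    return "TIMED_OUT"


def gate(fetch_status, batch_id):
    """Return a process exit code: 0 only when the batch passed."""
    status = wait_for_batch(fetch_status, batch_id)
    print(f"batch {batch_id}: {status}")
    return 0 if status == "PASSED" else 1


if __name__ == "__main__":
    # In a real pipeline fetch_status would call the test API; a stub stands
    # in here so the gate logic can be exercised locally.
    code = gate(lambda _id: "PASSED", "example-batch")
    print("exit code:", code)
```

A CI job would call `gate` and propagate the exit code, so a failing or timed-out batch blocks promotion exactly as the quality-gate job describes.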

Constraints and considerations

• Testing workflows varied widely between teams, from highly automated pipelines to manual runs.
• We needed to reach parity with competitors such as Datadog while ensuring the CI/CD testing product fit into New Relic's existing pipeline.
• Test results had to be easy to trace back to their source, even across multiple services.

Key contributions and design approach

My user-centered approach emphasized jobs to be done so the solution addressed real-world tasks and outcomes for DevOps engineers, SREs, internal product teams, QA automation engineers, and release managers. That focus kept the design anchored to the control, customization, traceability, and proactive regression detection these stakeholders need for smooth, low-risk deployments, which matters in an enterprise technical context where specific user needs are paramount.

The design process navigated several constraints: testing workflows varied widely across teams, which demanded a flexible solution; we had to reach parity with competitors while integrating CAT seamlessly into New Relic's existing pipeline; and test results needed to be traceable back to their source, even across multiple services, a requirement that shaped many design decisions.

As the sole designer from discovery through launch, my iterative approach began with mapping current team testing methodologies, from fully automated CI/CD pipelines to manual regression testing. I collaborated extensively with engineering to define CAT's integration points within New Relic’s observability tools and worked through various information architecture models for presenting test results. Crucially, I validated flows with internal product teams to ensure ease of configuration and direct traceability of failures to code or deployment changes.

An image of the design operations document I used to organize all of my information

Impact and outcomes

Through this process, we achieved several key design and product highlights, leading to tangible improvements:

• Enabled pre-production visibility for internal teams.

• Reduced test setup time.

• Clarified root cause through intuitive UI and metadata.

• Designed a robust alerting structure that surfaces the most critical, frequently failing results upfront.

• Linked failed tests directly to specific deployment changes or configuration updates to vastly improve traceability.

• Developed configuration flows that fit naturally into CI/CD pipelines, minimizing the need for tool-switching.

• Successfully onboarded 4 internal teams during the initial phase.

Future vision

The release timeline projected a Public Preview coming soon, leading to General Availability (GA) in 2025. Next steps for CAT focused on fleshing out the onboarding experience for GA. This was a deliberate choice to first prioritize and elevate the core test-response experience, while gathering iterative feedback from the teams we onboarded hands-on to CI/CD testing. New Relic encourages users to "Stay tuned for more details as we continue to develop this capability".

What I learned

This project was my first experience with a feature that was technically net-new, but very tightly coupled with an existing product (Synthetic Monitoring). It presented unique challenges, including the need to scale token logic and component models for flexible assertions, introduce new layouts without breaking consistency, and navigate rapid iteration while preserving accessibility and clarity. The project also taught me the importance of strategic reuse; I laid the foundation for future CAT-Synthetic Monitoring integration by intentionally designing modular pieces that could bridge both products. This was a period of strategic work under pressure, significantly shaping my approach to fast-moving, cross-surface UX systems.
