Building Quality Gate in BrowserStack Test Observability


Introduction

Context

Note: This post was originally written by me on the BrowserStack Blog. I've adapted it here with permission.

I'm a Senior Product Manager at BrowserStack, and this is a story about how we built an exciting feature. For those of you who are unfamiliar with BrowserStack, think of it like the AWS of Testing. BrowserStack has a plethora of products - including a device cloud that provides on-demand access to any popular Android or iOS device for manual or automation testing.

I'm helping build a 0-to-1 product called Test Observability - an all-in-one test reporting, debugging, monitoring, and analytics solution. We're trying to change the way automation tests are leveraged in the modern CI/CD pipeline.

What is the Quality Gate feature? Why did we build it?

All engineering teams want to move faster, but they often lack confidence when promoting code from one environment to another or merging code - confidence that features won't break and that the pipeline won’t be slowed by non-issues.

Once code is committed or a PR is created, every team scrambles to get automated tests passing so they can move the code to production. Over the years, BrowserStack's Automate & App Automate products have helped users drastically reduce build times by enabling test parallelization at scale. However, tests that frequently fail with false negatives increase the time teams spend verifying builds before sending a PR on its way.

I helped conceptualize a feature called Quality Gate, designed to completely eliminate manual verification of builds and enable PRs to be merged or deployed to production automatically. Users can configure a set of quality rules that run once a build completes and automatically block or allow code merges and deployments.

Some of you might know of Quality Gate features in tools like SonarQube - so this isn't a wholly unfamiliar concept. However, I wanted to create a Quality Gate that relies not on unit test coverage or code quality metrics, but on how the rubber meets the road when testing the functionality of an application.

Here's what one of our early users had to say about the feature:

Using BrowserStack Quality Gate, you don't need to rely solely on the 'test case run status' anymore but can target something more fundamental - which is 'errors'. This is a much more robust feedback cycle than before. This, in my humble opinion, is a game changer!

Pramod Yadav, Test Lead @ Mercell

This post covers how Quality Gate enables continuous deployment and faster verification, and the story of how we built it.


How Quality Gate enables continuous deployment and faster verification

Test Observability is uniquely positioned to help users precisely instrument and surface insights for all types of tests - Functional, Unit, API, or others - whether the test runs on BrowserStack infrastructure, a developer's laptop, or an in-house browser grid. Engineers and QAs value every type of test, but they haven't had an easy way to aggregate them, let alone set automated quality standards on top of them. Test Observability supports a wide range of test types and test frameworks (from Playwright to JUnit XML reports) and requires zero code changes to your existing setup.

With an intuitive rule builder and a ready-to-integrate API, you can set up Quality Gate on Test Observability in minutes!

Here's a quick overview of how Quality Gate fits into your pipeline:

[Diagram: where Quality Gate fits into the CI/CD pipeline]

With Quality Gate, teams can configure rules based on their specific priorities to ensure high standards of quality.

Here are some examples of automation you can put in place:

  • Allow a deployment if all your P0/P1 test cases pass but a few P2/P3 test cases fail.
  • Implement an instant roll-back tool if tests are breaking in Production.
  • Enforce organization-wide best practices on flakiness, test performance, and more.
  • Set thresholds for different failure categories, such as Product Bugs, Automation Issues, Network Issues, etc. - all powered by our ML-based auto failure analysis.
  • Ensure a minimum number of tests run (especially your critical tests) before allowing code merges.

Building Quality Gate

Understanding the need for Quality Gate

While building Test Observability and speaking to countless customers, we realized that the tradeoff between confidence and pace is a common dilemma, and one without an apparent solution. Multiple customers each phrased the same issue in different ways, without realizing that a Quality Gate was the solution.

Customers would come to us with a simple question, “Do you happen to have an API?”, prompting us to respond, “Yes, but what would you like to do with it?”

Here are some of the responses we'd get:

  • "I want to give developers visibility in PR checks on where failures came from."
  • "We need to ensure we don’t let too many flaky tests into a merge or deploy."
  • "I have a bunch of always failing tests I don’t care about. How can I stop my pipeline from failing for these?"
  • "There must be multiple ways to gauge a build other than just pass/fail tests. We would like to access a few different metrics to let developers know if a build is good to go or not."

When we asked users what was stopping them from going full CI/CD with this information, we saw the gap we needed to fill: giving users all the information they want, plus the ability to set up automation through a single API or plugin added to their pipeline.

So that's what we did!

The Capabilities of the Quality Gate

Configuring Rules & Profiles

We wanted users to have access to as many fine-grained metrics as possible while building CI/CD automation with Quality Gate. Quality standards aren't uniform; they're specific to each team, and even to each module.

Therefore, we ensured that Test Observability offers the following rules out of the box, with filters available for each rule to target individual tests or classes either statically or dynamically. Users can group multiple rules into Quality Profiles that target builds dynamically (sketched below):

  • Unique Errors - Check for regressions in terms of errors or stack traces being thrown across builds.
  • Test Status - Set absolute or relative thresholds on the various test statuses in your build.
  • Smart Tags - Set absolute or relative thresholds for metrics like Flakiness, New Failures, and more.
  • Failure Category - Set thresholds for ML-analyzed failure categories.
  • Alerts - Check for the presence of pre-configured alerts.
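To make the rule model concrete, here's a rough sketch of what a profile combining a few of these rule types could look like if written out as configuration. The field names, thresholds, and structure below are illustrative assumptions - in practice you assemble this in the rule builder UI, and the real schema may differ.

```python
# A hypothetical Quality Profile combining several rule types.
# Field names and structure are illustrative assumptions, not the
# actual Test Observability schema.
quality_profile = {
    "name": "checkout-service",
    "rules": [
        # Test Status: block the build if any P0/P1 test fails.
        {"type": "test_status", "status": "failed",
         "filter": {"priority": ["P0", "P1"]},
         "max_count": 0},
        # Smart Tags: allow at most 15% of tests to be tagged Flaky.
        {"type": "smart_tag", "tag": "flaky",
         "max_percent": 15},
        # Failure Category: no ML-detected Product Bugs allowed.
        {"type": "failure_category", "category": "product_bug",
         "max_count": 0},
        # Unique Errors: no error regressions versus the last build.
        {"type": "unique_errors", "scope": "new_since_last_build",
         "max_count": 0},
    ],
}
```

Note how each rule carries its own filter and threshold: that's what lets a single profile hold P0 tests to a zero-failure bar while tolerating some noise elsewhere.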

The BrowserStack Default Quality Profile

To help users get started faster and realize the value of the feature sooner, we shipped the BrowserStack Default Quality Profile. It comes with four rules out of the box:

  • No tests are marked with the New Failures Smart Tag: This ensures that no new tests start failing in a build.
  • No New Unique Errors are detected since the last build: This ensures that no error regressions from the last build creep in - even when no new tests fail, a new error can still appear, and this rule catches it.
  • No tests are automatically marked with the Product Bug failure category: This ensures that no tests that are ML-analyzed to be Product Bugs are allowed to creep into a merge or deploy.
  • Fewer than 15% of tests are marked with the Flaky Smart Tag: Flakiness is part and parcel of every build, but it has to be limited. Based on millions of tests run on Test Observability, we've found this to be a good starting threshold.

Based on extensive user research and millions of data points from tests run on Test Observability, we believe these represent a well-rounded set of best practices for users to build into their Quality Profiles. We hope they serve as inspiration for more nuanced Quality Profiles that take automation testing to the next level.
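As a toy illustration of the default profile's semantics (the real evaluation happens server-side inside Test Observability, and the dict shape below is an assumption), the four checks boil down to something like this:

```python
# Toy illustration of the default Quality Profile's semantics; the real
# evaluation happens server-side inside Test Observability.

def default_quality_gate(build):
    """Evaluate the four default rules against a build summary.

    `build` is an assumed dict of counts from a finished build:
    new_failures, new_unique_errors, product_bugs, flaky, total.
    """
    reasons = []
    if build["new_failures"] > 0:
        reasons.append("tests marked with the New Failures Smart Tag")
    if build["new_unique_errors"] > 0:
        reasons.append("new unique errors since the last build")
    if build["product_bugs"] > 0:
        reasons.append("tests auto-marked as Product Bugs")
    if 100 * build["flaky"] / build["total"] >= 15:
        reasons.append("flaky tests at or above the 15% threshold")
    return not reasons, reasons

passed, reasons = default_quality_gate(
    {"new_failures": 0, "new_unique_errors": 0,
     "product_bugs": 0, "flaky": 3, "total": 40}
)
print("PASS" if passed else "FAIL", reasons)  # 7.5% flaky -> PASS
```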

Integrating into your CI/CD workflow

You can integrate the Quality Gate status into your CI/CD or SCM tool with a single API call. You can get started in minutes with sample recipes for your pipeline that automatically fetch the Quality Gate result and pass or fail your CI/CD job, as sketched below.
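Here's a minimal sketch of what such a CI step could look like. The endpoint path, response fields, and environment variable names are assumptions for illustration only - the sample recipes in the Test Observability docs are the source of truth.

```python
# Sketch of a CI step that gates a pipeline on the Quality Gate result.
# The endpoint path, response fields, and env var names are illustrative
# assumptions; see the Test Observability docs for the real recipe.
import os
import sys
import time

import requests

BUILD_ID = os.environ["OBSERVABILITY_BUILD_ID"]  # assumed to be exported by the test step
AUTH = (os.environ["BROWSERSTACK_USERNAME"], os.environ["BROWSERSTACK_ACCESS_KEY"])
URL = f"https://example-observability-api/builds/{BUILD_ID}/quality-gate"  # hypothetical

# Poll until every rule in the profile has been evaluated for this build.
while True:
    result = requests.get(URL, auth=AUTH, timeout=30).json()
    if result.get("status") != "pending":
        break
    time.sleep(15)

print(f"Quality Gate: {result['status']}")
for rule in result.get("failed_rules", []):
    print(f"  failed rule: {rule}")

# A non-zero exit code fails the CI job, blocking the merge or deploy.
sys.exit(0 if result["status"] == "passed" else 1)
```

The same pattern works for SCM status checks: instead of exiting, post the gate result as a commit status so the PR is blocked or unblocked automatically.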


Fin.

We've had fantastic feedback on the feature so far, and it has helped turn Test Observability from an already exciting tool into a highly strategic one for our customers. We want to continue innovating in this area and provide new ways for users to use Quality Gate in their existing tooling and via new rules!

Try out Quality Gate with Test Observability today and share your feedback!