Show HN: Reflect – Create end-to-end tests using AI prompts

2 points by tmcneal almost 2 years ago

1 comment

tmcneal almost 2 years ago
Hi HN,

Three years ago we launched Reflect on HN (https://news.ycombinator.com/item?id=23897626). We're back to show you some new AI-powered features that we believe are a big step forward in the evolution of automated end-to-end testing. Specifically, these features raise the level of abstraction for test creation and maintenance.

One of our new AI-powered features is something we call Prompt Steps. Normally in Reflect you create a test by recording your actions as you use your application, but with Prompt Steps you define what you want tested by describing it in plain text, and Reflect executes those actions on your behalf. We're making this feature publicly available so that you can sign up for a free account and try it for yourself.

Our goal with Reflect is to make end-to-end tests fast to create and easy to maintain. A lot of teams face issues with end-to-end tests being flaky and just generally not providing a lot of value. We faced that ourselves at our last startup, and it was the impetus for us to create this product. Since our launch, we've improved the product by making tests execute much faster, reducing VM startup times, adding support for API testing, cross-browser testing, etc., and doing a lot of things to reduce flakiness, including some novel stuff like automatically detecting and waiting on asynchronous actions like XHRs and fetches.

Although Reflect is used by developers, our primary user is non-technical: someone like a manual tester, or a business analyst at a large company. This means it's important for us to provide ways for these users to express what they want tested without requiring them to write code. We think LLMs can be used to solve some foundational problems these users experience when trying to do automated testing. By letting users express what they want tested in plain English, and having the automation automatically perform those actions, we can provide non-technical users with something very close to the expressivity of code in a workflow that feels very familiar to them.

In the testing world there's something called BDD, which stands for Behavior-Driven Development. It's an existing way to express automated tests in plain English. With BDD, a tester or business analyst typically defines how the system should function using an English-language DSL called "Gherkin", and then that specification is turned into an automated test later using a framework called Cucumber. There are two main issues we've heard about a lot when talking to users practicing BDD:

1. They find the Gherkin syntax to be overly restrictive.

2. Because you have to write a whole bunch of code in the DSL translation layer to get the automation to work, non-technical users who are writing the specs have to rely heavily on the developers writing that layer. In addition, the developers working on the DSL layer would rather just write Selenium or Playwright code directly versus having to use English as a go-between.

We think our approach solves these two main issues. Reflect's Prompt Steps have no predefined DSL. You can write whatever you want, including something that could result in multiple actions (e.g. "Fill out all the form fields with realistic values"). Reflect takes this prompt, analyzes the current state of the DOM, and queries OpenAI to determine what action or set of actions to take to fulfill that instruction. This means that non-technical users who practice BDD can create automated tests without developers having to build any sort of framework under the covers.
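To make that flow concrete, here is a minimal sketch of how a plain-English prompt step plus a DOM snapshot might be turned into browser actions via an LLM. It uses Playwright and the OpenAI SDK for illustration; the function names, JSON schema, and prompt wording are assumptions, not Reflect's actual implementation.

    // Hypothetical sketch of a "prompt step": a plain-English instruction plus
    // the current DOM is sent to an LLM, which returns browser actions to run.
    // Names, prompt format, and JSON schema are illustrative only.
    import OpenAI from "openai";
    import { chromium, Page } from "playwright";

    const openai = new OpenAI();

    type Action =
      | { type: "click"; selector: string }
      | { type: "fill"; selector: string; value: string };

    async function runPromptStep(page: Page, instruction: string): Promise<void> {
      // Snapshot the page so the model can ground its answer in real elements.
      // (A real system would trim or summarize the DOM to fit the context window.)
      const dom = await page.content();

      const completion = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        response_format: { type: "json_object" },
        messages: [
          {
            role: "system",
            content:
              "You translate a tester's instruction into browser actions. " +
              'Respond with JSON: {"actions":[{"type":"click"|"fill","selector":"...","value":"..."}]}',
          },
          { role: "user", content: `Instruction: ${instruction}\n\nDOM:\n${dom}` },
        ],
      });

      const { actions = [] } = JSON.parse(
        completion.choices[0].message.content ?? "{}"
      ) as { actions?: Action[] };

      // One instruction may expand into several actions
      // (e.g. "Fill out all the form fields with realistic values").
      for (const action of actions) {
        if (action.type === "click") await page.click(action.selector);
        else await page.fill(action.selector, action.value);
      }
    }

    (async () => {
      const browser = await chromium.launch();
      const page = await browser.newPage();
      await page.goto("https://example.com/signup");
      await runPromptStep(page, "Fill out all the form fields with realistic values");
      await browser.close();
    })();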
Our other AI feature is something we call the "AI Assistant". This is meant to address shortcomings with the selectors (also called locators) that we generate automatically when you're using the record-and-playback features in Reflect. Selectors use the structure and styling of the page to target an element, and we generate multiple selectors for each action you take in Reflect. This approach works most of the time, but sometimes there's just not enough information on the page to generate good selectors, or the underlying web application has changed significantly at the DOM layer while remaining semantically equivalent to the user. Our AI Assistant works by falling back to querying the AI to determine what action to take when all of the selectors on hand are no longer valid (a rough sketch of this fallback flow appears at the end of this post).

This uses the same approach as Prompt Steps, except that the "prompt" in this case is an auto-generated description of the action that we recorded (e.g. something like "Click on Login button", or "Input x into username field"). We're usually able to generate a good English-language description based on the data in the DOM, like the text associated with the element, but on the occasions that we can't, we'll also query OpenAI to have it generate a test step description for us. This means that selectors effectively become a sort of caching layer for retrieving what element to operate on for a given test step. They'll work most of the time, and element retrieval is fast. We believe this approach will be resilient to even large changes to the page structure and styling, such as a major redesign of an application.

It's still early days for this technology. Right now our AI works by analyzing the state of the DOM, but we eventually want to move to a multi-modal approach so that we can capture visual signals that are not present in the DOM. It also has some limitations - for example, right now it doesn't see inside iframes or the Shadow DOM. We're working on addressing these limitations, but we think our coverage of use cases is wide enough that this is now ready for real-world use.

We're excited to launch this publicly, and would love to hear any feedback. Thanks for reading!
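For anyone curious what the "selectors as a caching layer" fallback described above might look like in code, here is a rough sketch under the same assumptions as before: hypothetical names, Playwright plus the OpenAI SDK, and not Reflect's actual internals.

    // Hypothetical sketch of "selectors as a caching layer": try the recorded
    // selectors first; only if none resolve, fall back to asking the LLM to
    // locate the element from the step description and the current DOM.
    import OpenAI from "openai";
    import { Page } from "playwright";

    const openai = new OpenAI();

    interface RecordedStep {
      description: string; // e.g. "Click on Login button"
      selectors: string[]; // candidate selectors captured at record time
    }

    async function resolveSelector(page: Page, step: RecordedStep): Promise<string> {
      // Fast path: the cached selectors usually still match.
      for (const selector of step.selectors) {
        if ((await page.locator(selector).count()) > 0) return selector;
      }

      // Slow path: the page changed too much, so ask the model to find the element.
      const dom = await page.content();
      const completion = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [
          {
            role: "system",
            content:
              "Given a test step description and the page's HTML, reply with a single " +
              "CSS selector for the element the step refers to. Reply with the selector only.",
          },
          { role: "user", content: `Step: ${step.description}\n\nDOM:\n${dom}` },
        ],
      });
      return completion.choices[0].message.content?.trim() ?? "";
    }

    // Usage: act on whichever selector resolves, cached or AI-recovered.
    async function runClickStep(page: Page, step: RecordedStep): Promise<void> {
      await page.click(await resolveSelector(page, step));
    }

Under this scheme the cached selectors handle the common case quickly, and the model is only consulted when the page has drifted far enough that none of them match.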