Launch HN: Meticulous (YC S21) – Catch JavaScript errors before they hit prod

122 points, by Gabriel_h, about 3 years ago
Hey HN, I'm Gabriel, founder of Meticulous (https://www.meticulous.ai). We're building an API for replay testing. That is, we enable developers to record sessions in their web apps, then replay those sessions against *new* frontend code, in order to catch regressions before the code is released.

I was inspired to start Meticulous from my time at Dropbox, where we had regular 'bug bashes' for our UX. Five or six engineers would go to a meeting room and click through different flows to try to break what we built. These were effective but time consuming; they required us to click through the same set of actions each time prior to a release.

This prompted me to start thinking about replaying sessions to automatically catch regressions. You can't replay against production since you might mutate production data or cause side effects. You could replay against staging, but a lot of companies don't have a staging environment that is representative of production. In addition, you need a mechanism to reset state after each replayed session (imagine replaying a user signing up to your web application).

We designed Meticulous with a focus on regressions, which I think are a particularly painful class of bug. They tend to occur in flows which users are actively using, and the number of regressions generally scales with the size and complexity of a codebase, which tends to always increase.

You can use Meticulous on any website, not just your own. For example, you can start recording a session, then go sign up to (say) amazon.com, then create a simple test which consists of replaying against amazon.com twice and comparing the resulting screenshots. You can also watch recordings and replays on the Meticulous dashboard. Of course, normally you would replay against the base commit and head commit of a PR, as opposed to the production site twice.

Our API is currently quite low-level. The Meticulous CLI allows you to do three things:

1) You can use 'yarn meticulous record' to open a browser which you can then use to record a session on a URL of your choice, like localhost. You can also inject our JS snippet onto staging, local, dev and QA environments if you want to capture a larger pool of sessions. This is intended for testing your own stuff! If you inject our snippet, please ask for the consent of your colleagues before recording their workflows. I would advise against production deployments, because our redaction is currently very basic.

2) You can use 'yarn meticulous replay' to replay a session against a URL of your choice. During replay, we spin up a browser and simulate click events with Puppeteer. A list of exceptions and network logs are written to disk. A screenshot is taken at the end of the replay and written to disk.

3) You can use 'yarn meticulous screenshot-diff' to diff two screenshots.

There are lots of potential use cases here. You could build a system on top of the screenshot diffing to detect major regressions with a UX flow. You could also try to diff exceptions encountered during replay to detect new uncaught JS exceptions. We plan to build a higher-level product which will provide some testing out of the box.

Meticulous captures network traffic at record-time and mocks out network calls at replay-time. This isolates the frontend and avoids causing any side effects. However, this approach does have a few problems. The first is that you can't test backend changes or integration changes, only frontend changes. (We are going to make network-stubbing optional, though, so that you can replay against a staging environment if you wish.) The second problem with our approach is that if your API significantly changes, you will need to record a new set of sessions to test against. A third problem is that we don't yet support web applications which rely heavily upon server-side rendering. However, we felt these trade-offs were worth it to make Meticulous agnostic of the backend environment.

Meticulous is not going to replace all your testing, of course. I would recommend using it in conjunction with existing testing tools and practices, and viewing it as an additional layer of defense.

We have a free plan where you can replay 20 sessions per month. I've temporarily changed our limit to 250 for the HN launch. Our basic plan is $100/month. The CLI itself is open-source under ISC. We're actively discussing open sourcing the record+replay code.

I'd love for you to play around with Meticulous! You can try it out at https://docs.meticulous.ai. It's rough around the edges, but we wanted to get this out to HN as early as possible. Please let us know what you might want us to build on top of this (visual diffs? perf regressions? dead code analysis? preventing regressions?). We would also love to hear from people who have built any sort of replay testing out at their company. Thank you for reading and I look forward to the comments!
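To make step 2 concrete, here is a minimal sketch of a replay loop built on Puppeteer: collect page exceptions, drive the recorded events, and take a screenshot at the end. The `RecordedEvent` shape and `session.json` file are assumptions for illustration; Meticulous's actual session format is not public.

```typescript
import * as fs from "fs";
import puppeteer from "puppeteer";

// Hypothetical shape of a recorded user event (illustration only).
interface RecordedEvent {
  type: "click" | "type";
  selector: string;
  value?: string;
  delayMs: number; // time elapsed since the previous event
}

async function replaySession(url: string, sessionFile: string): Promise<void> {
  const events: RecordedEvent[] = JSON.parse(fs.readFileSync(sessionFile, "utf8"));

  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Collect uncaught exceptions thrown by the page during replay.
  const errors: string[] = [];
  page.on("pageerror", (err) => errors.push(String(err)));

  await page.goto(url, { waitUntil: "networkidle0" });

  for (const event of events) {
    // Preserve the recorded inter-event timing.
    await new Promise<void>((resolve) => setTimeout(resolve, event.delayMs));
    if (event.type === "click") {
      await page.click(event.selector);
    } else if (event.value !== undefined) {
      await page.type(event.selector, event.value);
    }
  }

  // Artifacts match what the post describes: exceptions + final screenshot.
  fs.writeFileSync("exceptions.json", JSON.stringify(errors, null, 2));
  await page.screenshot({ path: "replay.png", fullPage: true });
  await browser.close();
}

replaySession("http://localhost:3000", "session.json").catch(console.error);
```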
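The record-and-mock network behavior can likewise be pictured with Puppeteer's request interception. Again a sketch under an assumed recording format (`NetworkLog` is hypothetical), not the actual Meticulous mechanism:

```typescript
import * as fs from "fs";
import puppeteer from "puppeteer";

// Hypothetical record-time capture: response keyed by request URL.
type NetworkLog = Record<string, { status: number; body: string }>;

async function openWithMockedNetwork(url: string, logFile: string) {
  const log: NetworkLog = JSON.parse(fs.readFileSync(logFile, "utf8"));

  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setRequestInterception(true);

  page.on("request", (request) => {
    const recorded = log[request.url()];
    if (recorded) {
      // Serve the response captured at record time; the real backend
      // is never contacted, so replays cannot cause side effects.
      request.respond({ status: recorded.status, body: recorded.body });
    } else {
      // Nothing recorded for this URL: abort rather than hit a live API.
      request.abort();
    }
  });

  await page.goto(url);
  return { browser, page }; // then drive recorded events as in the sketch above
}
```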
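Step 3, screenshot diffing, is at heart per-pixel comparison. One common open-source way to do it is pixelmatch with pngjs; that pairing is an assumption here, since the post does not say what Meticulous uses internally:

```typescript
import * as fs from "fs";
import { PNG } from "pngjs";
import pixelmatch from "pixelmatch";

// Compare two replay screenshots; returns the number of differing pixels
// and writes a visual diff image highlighting them.
function diffScreenshots(pathA: string, pathB: string, outPath: string): number {
  const a = PNG.sync.read(fs.readFileSync(pathA));
  const b = PNG.sync.read(fs.readFileSync(pathB));
  const diff = new PNG({ width: a.width, height: a.height });

  const changed = pixelmatch(a.data, b.data, diff.data, a.width, a.height, {
    threshold: 0.1, // tolerate minor anti-aliasing noise
  });

  fs.writeFileSync(outPath, PNG.sync.write(diff));
  return changed;
}

// e.g. screenshots from replaying the same session against the base
// and head commits of a pull request:
const changed = diffScreenshots("base.png", "head.png", "diff.png");
if (changed > 0) {
  console.error(`Possible visual regression: ${changed} pixels differ`);
  process.exit(1);
}
```

In a PR pipeline, failing the build on a nonzero pixel count is the simplest policy; real setups usually add a tolerance for fonts and animations.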

19 comments

kall, about 3 years ago

Looks like an impressive tool that makes a previously hard but useful process an order of magnitude more approachable.

Alongside waldo.io and Checkly, it joins the list of QA force multipliers that would make my life, as the sole developer in a bootstrapped startup trying to punch above its weight, much easier. First they give me a taste with a free plan, then hit me with production pricing we still can't justify.

If I have this right, 20 sessions is just a trial and 1000 sessions is very careful use.

If these scenarios are so easy to create, I would imagine you would make something like 50 (?) and run them against every deploy (10 a day?). That's 15,000 replays a month (50 scenarios × 10 deploys × 30 days), right?

So the $100 plan is something like 10 scenarios at 3 deploys a day. That sounds too scaled back to get good use out of it. Or do I have the wrong idea about the intended use case?

I get it though: they all have reasonable pricing for a unique service that provides real, obvious value AND may actually be expensive to run. I'm just a little sad about not getting to use them.
sbuccini, about 3 years ago

Pretty slick! I wish we had this a long time ago. At the time, our testing infrastructure was a bunch of very flaky Selenium tests that we would run through SauceLabs. The tests were super slow, mainly because we tried to reduce flakiness by buffering clicks/interactions with sleep() commands. All around, it was a painful experience which developers hated, which meant engineers did everything they could to avoid adding or modifying tests. It was the worst kind of vicious cycle.

The biggest concern I would have is portability. One benefit of testing suites, when done right, is that they gain more coverage over time, especially against regression bugs. I would be very concerned about building up a large suite of tests for my most critical flows on proprietary tech that could be rendered worthless in an instant if the company goes bust, decides to pivot, etc.
inglor, about 3 years ago

Good luck!

We tried and failed to create a "bug capture" offering in Testim.io. What helped us work with companies like Microsoft and Salesforce, and eventually make an exit and sell to a much larger player (Tricentis), was focusing on rock-solid AI for improving tests. The founder still believes the capture idea (QA captures bugs for devs) has a lot of merit, but I think there are fundamental issues with anything that doesn't reproduce timing perfectly (some do, like Firefox's replay).

I'm not with Testim anymore, but I'm still very excited to see people tackling this problem, and I warmly recommend pinging Oren@testim.io (the founder, an engineer, a GDE and a nice guy) for pointers. He likes giving free advice and investing in new players in the space to cultivate the ecosystem (most companies currently have no e2e tests).
tough, about 3 years ago

Heh, I've been waiting for this Launch HN, as I applied to the company a few months back via workatastartup, and it seemed to me like this would be an awesome product.

Looks very promising; I wish the team the best!

"Catch those errors before hitting prod" sounds like the dream.

PS: As for open sourcing the record+replay code, I'm sure that'd be awesome. For now I only have this on my radar as a FOSS alternative to FullStory/LogRocket: https://github.com/openreplay/openreplay
lbj, about 3 years ago

Wow, that's impressive. Love the network request/response capturing for the replays.
a_c, about 3 years ago

Hi Gabriel, congratulations on the launch!

I think software development is due for a disruption, and your take on testing is spot on.

As part of a dev tool belt we're developing, we are building a tool that translates user interactions into a Selenium script and runs that script on our server. Users get to take the script away, so they don't get vendor lock-in. What is your approach to replaying a user session?

On a broader picture, I think what you do has potential beyond QA, e.g. if run on production, customer support could hop onto the same session as a troubled user.

Just checked your profile. It seems we are both based in London. Maybe we can grab a beer to discuss the potential.
polskibus, about 3 years ago

How does it differ from the ages-old Selenium and its browser plug-in for recording tests? AFAIK the worst bit is the indeterminism introduced by network waits etc., which makes e2e testing complicated and often not worth its upkeep.
gary_chambers, about 3 years ago

Congrats on the launch!

I've been looking forward to this launch for a while; I've spent a lot of time experimenting with session recording and how it can work for regression testing, reproducing bugs, measuring performance, and sharing feedback during development. There is so much potential here.

> We're actively discussing open sourcing the record+replay code.

I think open source is a good call. Supporting an option to self-host would make a lot of sense, since session recording will inevitably slurp up PII or sensitive data, which could put off some users.
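One common record-time mitigation for the PII concern is to mask input values before they ever leave the browser, roughly what rrweb's maskAllInputs option does. A minimal illustrative sketch, with a hypothetical event shape rather than Meticulous's actual redaction:

```typescript
// Hypothetical recorded event shape, for illustration only.
interface TypeEvent {
  type: "type";
  selector: string;
  value: string;
  sensitive: boolean; // e.g. password fields, or opt-in privacy markers
}

// Mask sensitive values before the event is persisted; the selector is
// kept so the event can still be replayed, at some loss of fidelity.
function redact(event: TypeEvent): TypeEvent {
  if (!event.sensitive) return event;
  return { ...event, value: "*".repeat(event.value.length) };
}

// A password keystroke is stored as "********" rather than the real text.
console.log(redact({
  type: "type",
  selector: 'input[type="password"]',
  value: "hunter22",
  sensitive: true,
}));
```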
Klaster_1, about 3 years ago

When talking about "capturing network traffic", does that include SSE and WebSockets? If used for regression testing, how do you go about updating existing recordings?
bluelightning2k, about 3 years ago

Congrats on the launch.

How do you deal with generating resilient selectors automatically? That's a class of problem which plagues this type of tool, in my understanding.
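For context on why this is hard: a selector must survive markup churn between recording and replay. A common heuristic, shown here purely as an illustration and not as Meticulous's approach, is to prefer attributes developers control deliberately and fall back to increasingly brittle structural paths:

```typescript
// Heuristic selector generation: prefer attributes that developers
// control deliberately, then fall back to increasingly brittle options.
function resilientSelector(el: Element): string {
  const testId = el.getAttribute("data-testid");
  if (testId) return `[data-testid="${testId}"]`; // survives refactors

  if (el.id) return `#${el.id}`; // stable unless ids are auto-generated

  const name = el.getAttribute("name");
  if (name) return `${el.tagName.toLowerCase()}[name="${name}"]`;

  // Last resort: a positional path, which breaks on layout changes.
  const parent = el.parentElement;
  if (!parent) return el.tagName.toLowerCase();
  const index = Array.from(parent.children).indexOf(el) + 1;
  return `${resilientSelector(parent)} > :nth-child(${index})`;
}
```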
BasilPH, about 3 years ago

What do you think the pros and cons are compared to playwright.dev? The top-level features of recording, replaying, and diffing seem very close, in my understanding.
password4321, about 3 years ago

Evaluating technologies by analogy, how far off base am I?

jQuery, Angular, React

are analogous to:

Selenium, Puppeteer, Playwright

Is Selenium still worth considering for a brand-new project, though primarily due to ecosystem rather than implementation (Ruby, IDE, etc.)?

To frame the question in the context of the analogy, though now completely off-topic:

What is the React equivalent of datatables.net?
aurbano, about 3 years ago

Best of luck!

Gabriel and the team are really awesome and this product is a genius idea. I'll definitely be using it at my company, as it could save us a ton of work in setting up our testing pipelines.

- Alejandro :)
catalypso, about 3 years ago

Congrats on the launch and good luck!

Reading through this just reminded me of Datadog browser tests. It's not exactly the same, but it might be interesting to check them out.
vsroy, about 3 years ago
Could you explain to me the benefit this tool offers over something like Playwright + docker-compose (which I think also does stuff like this)?
arunaugustine, about 3 years ago

I would like to know how it differs from or compares with Cypress. Is it the mocking of the network calls that makes it different?
dom96, about 3 years ago
Congrats on the launch! Really exciting to see this coming together :)
dominicwhyte, about 3 years ago
Congrats on the launch! meticulous.ai is great and the team is A+
NWMatherson, about 3 years ago
Wow