TechEcho

10 comments

cmehdy, almost 5 years ago

This is the kind of article I'd love to read more of, as in more of each bit! It led me to the very well made docs for contributing to Firefox [0], which feel very welcoming to an enthusiastic non-genius-expert engineer who happens to have some experience with CI, testing automation, and a couple of languages.

I assume the overhead of the project (and subsequent tweaks to the model, re-training and validation) is sufficiently negligible compared to the measured benefits, even if those weren't as clear-cut as 70%. I'm unaware of how much compute is required for the task, but likely less than many compute-years per day :)

One thing I did not notice in the modeling of the problem is any link/tag regarding the platform for which the code changes are made, or the programming languages used. There seems to be some evidence that certain languages could lead to more defect-fixing commits [1], and I don't know if there's evidence that some platforms are more prone to bugs (I'm sure wars of words have been fought over this). But would it make sense to have that sort of information inform the model in some way? I fully understand that I might be out of my depth here.

[0] https://firefox-source-docs.mozilla.org/setup/index.html

[1] https://cacm.acm.org/magazines/2017/10/221326-a-large-scale-study-of-programming-languages-and-code-quality-in-github/fulltext

hohenheim, almost 5 years ago

Fantastic read. My only concern is that there wasn't any discussion of the cost of false positives (selecting a test to run when it is unnecessary) versus false negatives (incorrectly dismissing a relevant test), as those costs are not symmetrical.

The cost of a bug slipping through because a test was skipped will be higher than the cost of running a test that is irrelevant to a commit.
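
To make the asymmetry concrete, here is a minimal sketch of how the run/skip cutoff could be picked once the two error types are priced differently. Everything in it is an assumption for illustration: the `history` pairs, the candidate thresholds, and the cost values (1 for a wasted test run, 20 for a missed failure) are made up, and nothing here is taken from the article.

```python
def expected_cost(threshold, scored_tests, cost_fp=1.0, cost_fn=20.0):
    """Total cost of a run/skip policy on (failure_probability, would_fail) pairs.

    cost_fp and cost_fn are hypothetical prices, not measured values.
    """
    cost = 0.0
    for prob, would_fail in scored_tests:
        run = prob >= threshold
        if run and not would_fail:
            cost += cost_fp  # false positive: ran a test that had nothing to report
        elif not run and would_fail:
            cost += cost_fn  # false negative: skipped a test that would have failed
    return cost


def best_threshold(scored_tests, candidates, **costs):
    """Return the candidate threshold with the lowest total cost (first one on ties)."""
    return min(candidates, key=lambda t: expected_cost(t, scored_tests, **costs))


# Hypothetical labeled history: (model score, did the test actually fail?)
history = [(0.9, True), (0.6, False), (0.4, False), (0.3, True), (0.05, False)]
candidates = [0.1, 0.25, 0.5, 0.75]

symmetric = best_threshold(history, candidates, cost_fp=1.0, cost_fn=1.0)
asymmetric = best_threshold(history, candidates, cost_fp=1.0, cost_fn=20.0)
```

With equal costs the highest cutoff wins on this toy data; once a missed failure is priced at 20x a wasted run, the lowest cutoff wins, i.e. the policy runs far more tests to avoid skipping a relevant one.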

pesenti, almost 5 years ago

Similar work done at Facebook: https://engineering.fb.com/developer-tools/predictive-test-selection/

ackbar03, almost 5 years ago

I always thought software/GUI testing would be a great application for AI, although I've never really sat down to think about how it could be done.

srinivasupadhya, almost 5 years ago

Similar work at Google: https://research.google.com/pubs/archive/45861.pdf

Tarq0n, almost 5 years ago

Interesting. So for training they use features:

> In the past, how often did this test fail when the same files were touched?

> How far in the directory tree are the source files from the test files?

> How often in the VCS history were the source files modified together with the test files?

But for prediction all they input is a tuple (TEST, PATCH), and XGBoost works fine without the additional features?
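
One possible reading, offered as an assumption and not as a description of Mozilla's pipeline, is that features like these are derived from the (TEST, PATCH) pair itself plus stored VCS history, so the same derivation could run at prediction time. A rough sketch of two of the quoted features follows; the function names, the file paths, and the `history` structure (a list of sets of paths, one set per past commit) are all hypothetical.

```python
from pathlib import PurePosixPath


def directory_distance(source_file, test_file):
    """Number of directory hops between the folders containing the two files."""
    src = PurePosixPath(source_file).parent.parts
    tst = PurePosixPath(test_file).parent.parts
    common = 0
    for a, b in zip(src, tst):
        if a != b:
            break
        common += 1
    # Steps up from the source folder to the shared ancestor, then down to the test folder.
    return (len(src) - common) + (len(tst) - common)


def co_modification_count(history, source_file, test_file):
    """How many past commits touched both files."""
    return sum(1 for files in history if source_file in files and test_file in files)


def pair_features(test_file, patch_files, history):
    """Derive a feature dict from a (test, patch) pair plus past commits."""
    return {
        "min_directory_distance": min(
            directory_distance(f, test_file) for f in patch_files
        ),
        "co_modifications": sum(
            co_modification_count(history, f, test_file) for f in patch_files
        ),
    }


# Hypothetical paths and commit history, for illustration only.
history = [
    {"dom/base/Element.cpp", "dom/base/test/test_element.html"},
    {"dom/base/Element.cpp"},
    {"layout/style/Rule.cpp", "dom/base/test/test_element.html"},
]
features = pair_features(
    "dom/base/test/test_element.html", ["dom/base/Element.cpp"], history
)
```

A dict like `features` would then be turned into a numeric row for whatever model is used; that step is left out here.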

sillysaurusx, almost 5 years ago

The most interesting part of this to me was something tangential: they use Redis Queues. Anyone have experience with this? Good or bad impressions?

The documentation is tantalizing, but hilariously short: https://devcenter.heroku.com/articles/python-rq

Very "And then draw the rest of the owl." Oh really, you can just do `from utils import count_words_at_url; q.enqueue(count_words_at_url, 'http://heroku.com')` and presto, your blocking function -- whose source code exists locally -- is run successfully at the other end?

I'll have to set aside some time to try this out. Python does have introspection facilities that could make that possible. I could imagine that since the code is executed on the same box, it's relatively simple to send a request like "here's which module the function was loaded from; here's the order all modules were loaded in; load those modules and call this function." But it leaves so many questions: serialization, performance, scaling, and all the tiny bugs that inevitably come up.

I guess I was hoping someone could give me a quick gut check of positive/negative reactions. The full RQ documentation is slightly better: https://python-rq.org/docs/ but has some worrying signs:

> Make sure that the function call does not depend on its context. In particular, global variables are evil (as always), but also any state that the function depends on (for example a “current” user or “current” web request) is not there when the worker will process it. If you want work done for the “current” user, you should resolve that user to a concrete instance and pass a reference to that user object to the job as an argument.

Yes, sure, global variables are the root of satan, but they're also a fact of life in many scenarios.

Interesting approach... I wonder how much of a nightmare it makes devops...
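
The "here's which module the function was loaded from" idea can be sketched with the standard library alone. This is not RQ's code and makes no claim about its internals; `serialize_job` and `run_job` are hypothetical names, and the sketch assumes the worker can import the same module the caller did.

```python
import importlib
import math
import pickle


def serialize_job(func, *args, **kwargs):
    """Encode a call as (import path, args, kwargs); the function body is never sent."""
    path = f"{func.__module__}.{func.__qualname__}"
    return pickle.dumps((path, args, kwargs))


def run_job(payload):
    """Worker side: import the module by name, look up the function, call it."""
    path, args, kwargs = pickle.loads(payload)
    module_name, _, func_name = path.rpartition(".")
    func = getattr(importlib.import_module(module_name), func_name)
    return func(*args, **kwargs)


# The payload is plain bytes, so it could sit in any queue between two processes.
payload = serialize_job(math.gcd, 12, 18)
result = run_job(payload)
```

The sketch only handles module-level functions (a method's qualified name would not resolve this way), and it shows why the quoted warning about context matters: only the import path and the pickled arguments travel, so any global state the function relied on in the caller's process is simply absent on the worker.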

gorgoiler, almost 5 years ago

Stability and speed from Firefox are always welcome. I’d love to see some performance gains on armhf (raspberry pi 4) in particular. It’s good, and close to being blissfully simple.

data_ders, almost 5 years ago

Really cool project. Nice high-level overview of all the components. However, I still don't understand the impact measurement -- how do you measure the impact of this against the baseline? I didn't get that part in the effectiveness section. Maybe I'm too newb -- but you could A/B test this, right? 50% of PRs are subjected to automated tooling, 50% manual, and then compare compute cost and failures between the two?

Diane09974, almost 5 years ago

yesss

Testing Firefox more efficiently with machine learning
