Testing in Data Science with Katharine Jarmul (podcast)

1 point by variedthoughts over 7 years ago

1 comment

variedthoughts over 7 years ago
A discussion with Katharine Jarmul (kjam) about some of the challenges of data science with respect to testing.

Some of the topics we discuss:

* experimentation vs testing
* testing pipelines and pipeline changes
* automating data validation
* property based testing
* schema validation and detecting schema changes
* using unit test techniques to test data pipeline stages
* testing nodes and transitions in DAGs
* testing expected and unexpected data
* missing data and non-signals
* corrupting a dataset with noise
* fuzz testing for both data pipelines and web APIs
* datafuzz
* hypothesis
* testing internal interfaces
* documenting and sharing domain expertise to build good reasonableness
* intermediary data and stages
* neural networks
* speaking at conferences
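
To make a couple of these topics concrete (schema validation, and unit testing a pipeline stage with expected and unexpected data), here is a rough sketch using only the Python standard library. It is not taken from the episode: the `SCHEMA` fields and the `schema_errors` and `clean_stage` names are made up for illustration, and a real pipeline would likely use a dedicated validation library.

```python
# Hypothetical schema for illustration only: field name -> required type.
SCHEMA = {"user_id": int, "amount": float}


def schema_errors(record):
    """Return a list of schema problems for one record (empty if valid)."""
    errors = []
    for field, expected in SCHEMA.items():
        if field not in record or record[field] is None:
            errors.append(f"missing: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type: {field}")
    # Fields that the schema does not know about may signal a schema change.
    for field in sorted(record.keys() - SCHEMA.keys()):
        errors.append(f"unexpected: {field}")
    return errors


def clean_stage(records):
    """Pipeline stage: keep valid records, collect rejected ones with reasons."""
    valid, rejected = [], []
    for record in records:
        errors = schema_errors(record)
        if errors:
            rejected.append((record, errors))
        else:
            valid.append(record)
    return valid, rejected


# Feed the stage one expected record and two unexpected ones.
valid, rejected = clean_stage([
    {"user_id": 1, "amount": 9.5},                    # expected
    {"user_id": 2},                                   # missing data
    {"user_id": "3", "amount": 1.0, "extra": True},   # wrong type, new field
])
for record, errors in rejected:
    print(record, errors)
```

Because the stage returns the rejected records together with the reasons, a unit test can assert on both the clean output and on why each bad record was dropped.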