Hey HN! I'm Dean, one of the creators of DAGsHub, a place to host data science projects. DAGsHub lets you track experiments, data, models, and code with Git. I want to share something we've been working on: Data Science Pull Requests (DS PRs), which expand Pull Requests (PRs) to include data, models, and experiments. We created DS PRs to automate the data science review process and enable Open Source Data Science (OSDS).<p>If you've worked on a data science project with other people, or tried reviewing someone else's data science work, you know how hard it is to get the information you need to understand their work, or to explain your own, so that the review is meaningful.<p>OSDS has the potential to change the world as OSS did, but let's face it – OSDS doesn't really exist. If you maintain an OSDS project and want to accept pull requests, you have to do it almost entirely manually or resort to accepting only code changes (with no way to accept data bug fixes). On the other side, if you want to improve your ML portfolio by contributing to an OSDS project, you're also stuck. You can either fork the project and not contribute your work back (which means it's never reviewed, so you don't learn as much) or go through a painstaking manual effort.<p>DS PRs let you:<p>- Review, compare, and comment on experiments (metrics, parameters, visualizations) in the context of your PR.<p>- See what data and models have changed (not just code).<p>- Compare and diff notebooks.<p>- After review, merge code, data, and models all at once.<p>We're building DAGsHub to be a community-first data science platform. That means you can easily discover and contribute to data science projects created by others, not only by improving code but also by improving data. This has the potential to make ML work better for everyone.<p>There is a lot of work to do. Your input would be greatly appreciated!<p>https://dagshub.com/docs/collaborating_on_dagshub/data_science_pull_requests/