I was part of an experimental neuroimaging group that tested Pachyderm OSS years ago, and we were really impressed with its versioning capabilities. It made it easy for each researcher to grab and change data as needed for their own development without requiring support from eng.
I have a “data science pipeline” coordinated with a Makefile and run on CI/CD (GitLab), with reports generated as build artifacts. Large files are checked in with Git LFS.<p>Why would I use Pachyderm?
I've also wondered why I should use Pachyderm. I decided to give it a try and wrote the following blog post about it: <a href="https://medium.com/bigdatarepublic/pachyderm-for-data-scientists-d1d1dff3a2fa" rel="nofollow">https://medium.com/bigdatarepublic/pachyderm-for-data-scient...</a>
" Finally, version control for your data "