Oh man, if this could plug into git and be a LFS replacement, that would be awesome. I work in a field where folks run into situations where they think they need LFS, and rarely does it work out well. If someone can figure out an ergonomic and durable LFS-like blob versioning system that can align with git histories, that would be incredible.
How does this compare with other systems, like DVC (<a href="https://dvc.org/" rel="nofollow">https://dvc.org/</a>) for example?
The comparison with DVC is biased <a href="https://github.com/Oxen-AI/oxen-release/blob/main/Performance.md">https://github.com/Oxen-AI/oxen-release/blob/main/Performanc...</a><p>I got nowhere near the same performance with Oxen. The analysis is heavily skewed in Oxen's favor. I wish people had more integrity before trying so hard to push a half-baked product into the market.
Link to the actual project source <a href="https://github.com/Oxen-AI/Oxen">https://github.com/Oxen-AI/Oxen</a>
Great to see more people in this space! We are the authors of XetHub (posted in Dec ‘22, Show HN: <a href="https://news.ycombinator.com/item?id=33969908" rel="nofollow">https://news.ycombinator.com/item?id=33969908</a>) and also think a git-like workflow is perfect for ML dataset management, except that we actually integrate with git (like LFS). (A quick benchmark suggests we are 2x your published performance!)
Being realistic here, a third-party provider for data handling will be a no-go for many firms, for infosec reasons. A hub with no UI, on the other hand, might also be a no-go for convenience reasons. I understand that OxenHub is a way to monetise the project, but is there a self-hosted 'enterprise' version of that anywhere in the plans?
Any plans for adding exclusive locking and an option to delete old versions of a file? These are really important when working with unmergeable, large files.
I see there are already a bunch of questions about how this compares to other tools like DVC, Dolt, pachyderm.io and LFS. I would just like to add one to that list:<p>How does this compare to lakeFS?