TechEcho

Existing OSS C++ projects like ClickHouse and DuckDB support reading from Iceberg tables. Writing requires Spark, PyIceberg, or managed services.

In this PR (https://github.com/timeplus-io/proton/pull/928), we are open-sourcing a C++ implementation of Iceberg integration. It's an MVP, focusing on the REST catalog and S3 read/write (S3 table support coming soon). You can use Timeplus to continuously read data from MSK and stream writes to S3 in the Iceberg format. No JVM. No Python. Just a low-overhead, high-throughput C++ engine. Docker/K8s are optional. Demo video: https://www.youtube.com/watch?v=2m6ehwmzOnc

Help us improve the code and add more integrations and features. We're happy to contribute this to the Iceberg community. Or just roast the code. We'll buy the virtual coffee.

2 comments

tingfirst 2 months ago

Pretty cool to see a C++ read/write Iceberg client with no dependencies, and even better that it's open source. A pipeline is all about processing and routing, ideally to an open, flexible destination with no lock-in and long-term retention. Writing into Apache Iceberg is becoming critical for giving users real control, as opposed to writing into specific data warehouses or lakehouses that are hard to move out of.

ZiliangXK 2 months ago

Looks like other folks are doing similar things. Are there any benchmark results so far compared with the Java implementation?


Show HN: Open-Source C++ Apache Iceberg Client with Write Support
