For the last year I’ve been developing Hyperparam, a collection of small, fast, dependency-free open-source libraries designed for data scientists and ML engineers to actually look at their data.

- Hyparquet: read any Parquet file in the browser or Node.js (see the sketch below)
- Icebird: explore Iceberg tables without needing Spark or Presto
- HighTable: virtual scrolling of millions of rows
- Hyparquet-Writer: export Parquet easily from JS
- Hyllama: read llama.cpp .gguf LLM metadata efficiently

CLI for viewing local files: npx hyperparam dataset.parquet

Example dataset on a Hugging Face Space: https://huggingface.co/spaces/hyperparam/hyperparam?url=https%3A%2F%2Fhuggingface.co%2Fdatasets%2Fglaiveai%2Freasoning-v1-20m%2Fblob%2Frefs%2Fconvert%2Fparquet%2Fdefault%2Ftrain%2F0000.parquet

No cloud uploads. No backend servers. A better way to build frontend data applications.

GitHub: https://github.com/hyparam
Feedback and PRs welcome!
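
For a quick taste of the API, here is a minimal Node.js sketch of reading a local Parquet file with Hyparquet. The parquetRead({ file, onComplete }) shape follows the project README, but treat the exact option names as assumptions and check the repo for the current API.

    // ESM script (Node 18+, "type": "module" in package.json)
    import { readFile } from 'node:fs/promises'
    import { parquetRead } from 'hyparquet'

    // Load the file into an ArrayBuffer. Hyparquet also accepts an
    // AsyncBuffer for ranged reads of large or remote files.
    const buf = await readFile('dataset.parquet')
    const arrayBuffer = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength)

    await parquetRead({
      file: arrayBuffer,
      onComplete: rows => console.log(rows), // each row is an array of column values
    })

The same shape works in the browser: fetch the Parquet file into an ArrayBuffer (or hand Hyparquet an AsyncBuffer backed by HTTP range requests), which is what enables the no-uploads, no-backend setup described above.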
Though these tools might be interesting, I wish they had called this something else. This isn't at all related to the concept of hyperparameters, which people commonly refer to as hyperparams. And in their copy, the only reference to hyperparameters seems to misuse the term.

> This stems from an industry-wide realization that model performance is ultimately bounded by data quality, not just model architecture or hyperparameters.

Generally we think of model architecture + weights (parameters) as making up the model itself, while hyperparam(s|eters) are more relevant to how one arrives at those weights, and for this reason they bear more on the efficacy of training than on the performance of the resultant model.
That's a lot of names for a bunch of tools that each do a single task.

What I would really benefit from is a hypothetical LLM chat app focused on data migration or processing pipelines.