Show HN: Hyperparam: OSS tools for exploring datasets locally in the browser

77 points by platypii 13 days ago
For the last year I've been developing Hyperparam, a collection of small, fast, dependency-free open-source libraries designed for data scientists and ML engineers to actually look at their data.

- Hyparquet: Read any Parquet file in browser/node.js
- Icebird: Explore Iceberg tables without needing Spark/Presto
- HighTable: Virtual scrolling of millions of rows
- Hyparquet-Writer: Export Parquet easily from JS
- Hyllama: Read llama.cpp .gguf LLM metadata efficiently

CLI for viewing local files: npx hyperparam dataset.parquet

Example dataset on Hugging Face Space: https://huggingface.co/spaces/hyperparam/hyperparam?url=https%3A%2F%2Fhuggingface.co%2Fdatasets%2Fglaiveai%2Freasoning-v1-20m%2Fblob%2Frefs%2Fconvert%2Fparquet%2Fdefault%2Ftrain%2F0000.parquet

No cloud uploads. No backend servers. A better way to build frontend data applications.

GitHub: https://github.com/hyparam Feedback and PRs welcome!
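To make the "virtual scrolling of millions of rows" item concrete, here is a minimal sketch of the general technique: render only the rows that intersect the viewport, plus a small buffer. This is not HighTable's actual code or API. The function name `visibleRowWindow`, the `RowWindow` type, the fixed row height, and the `overscan` parameter are all assumptions made for illustration.

```typescript
// Hypothetical helper (not part of HighTable): given the scroll position,
// work out which slice of rows needs to exist in the DOM.
interface RowWindow {
  start: number;     // first row index to render (inclusive)
  end: number;       // last row index to render (exclusive)
  offsetTop: number; // pixel offset at which the first rendered row is placed
}

function visibleRowWindow(
  scrollTop: number,      // pixels scrolled from the top of the table
  viewportHeight: number, // visible height of the scroll container in pixels
  rowHeight: number,      // assumed fixed height of every row in pixels
  totalRows: number,      // total number of rows in the dataset
  overscan = 5,           // extra rows rendered above and below the viewport
): RowWindow {
  if (rowHeight <= 0) throw new RangeError("rowHeight must be positive");
  const top = Math.max(0, scrollTop);
  // Index of the first row that is at least partly visible.
  const firstVisible = Math.floor(top / rowHeight);
  // One past the index of the last row that is at least partly visible.
  const endVisible = Math.ceil((top + viewportHeight) / rowHeight);
  const end = Math.min(totalRows, endVisible + overscan);
  const start = Math.min(end, Math.max(0, firstVisible - overscan));
  return { start, end, offsetTop: start * rowHeight };
}

// A 600px viewport over one million 33px rows, scrolled to row 1000.
const rowWindow = visibleRowWindow(33_000, 600, 33, 1_000_000);
console.log(rowWindow); // { start: 995, end: 1024, offsetTop: 32835 }
```

With these numbers only 29 rows are rendered at any moment, so DOM size depends on the viewport rather than on the dataset. A real table would also need to fetch the row data for that window, which this sketch leaves out.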

11 comments

abeppu 13 days ago
Though these tools might be interesting, I wish they had called this something else. This isn't at all related to the concept of hyperparameters, which people commonly refer to as hyperparams. And in their copy, the only reference to hyperparameters seems to be misusing the term.

> This stems from an industry-wide realization that model performance is ultimately bounded by data quality, not just model architecture or hyperparameters.

Generally we think of model architecture + weights (parameters) as making up the model itself, and hyperparam(s|eters) are more relevant to how one arrives at those weights -- and for this reason are more relevant to the efficacy of training than to the performance of the resultant model.
wbradmoore 13 days ago
Why not WASM? Seems like something like duckdb-wasm or datafusion-wasm can do the same thing?
klntsky 13 days ago
That's a lot of names for a bunch of tools that do a single task each.

What I would really benefit from is a hypothetical LLM chat app that is focused on data migration or processing pipelines.
yujian 13 days ago
It's super interesting to be able to see the data on the web.
dmosites 13 days ago
The Iceberg reader sounds cool, but how does it handle auth? Most Iceberg tables are not publicly accessible.
barabbababoon 13 days ago
Very cool stuff. Is this some kind of lighter-weight duckdb-wasm? Did I get this right?
doppenhe 13 days ago
Very cool, does `npx hyperparam dataset.parquet` phone home?
newusertoday 13 days ago
Very nice. I wanted something like this for Parquet but couldn't find one. This one looks great.
cyrdax 13 days ago
Anyone benchmark this vs. duckdb-wasm?
lorr1 13 days ago
You're right. Python's the worst.
pranshu54 12 days ago
Looks interesting!