TE
TechEcho
Home24h TopNewestBestAskShowJobs
GitHubTwitter
Home

TechEcho

A tech news platform built with Next.js, providing global tech news and discussions.

GitHubTwitter

Home

HomeNewestBestAskShowJobs

Resources

HackerNews APIOriginal HackerNewsNext.js

© 2025 TechEcho. All rights reserved.

Ask HN: Data storage question

6 pointsby matthodanover 15 years ago
How would you store user-submitted tabular data if you didn't know the column data types of the submitted data prior to it being uploaded?<p>Note: The data must be able to be queried.<p>One idea is to convert each cell in the table of data into a row in a database that holds the value, data type, and position of the cell in the uploaded table.<p>Any other thoughts?

6 comments

mbrubeckover 15 years ago
I used to work at blist (now called Socrata) where we had to solve this exact problem.<p>You could use a schema-less database - either a document store like CouchDB, or a schema-less table/column store like Cassandra or Tokyo Tyrant.<p>Or you could use a standard RDBMS and generate new tables on the fly. (This is what blist did when I worked there; I think they may have a different storage model now.) Or implement a column store or tuple store as a layer on top of the RDBMS, like Infobase:<p><a href="http://openlibrary.org/about/tech" rel="nofollow">http://openlibrary.org/about/tech</a>
评论 #858446 未加载
nostrademonsover 15 years ago
Read in the file, then issue the appropriate CREATE TABLE commands through your database connection to dynamically create the appropriate columns. Keep a table with metadata on the tables you've created (you really only need the table names, you can issue a DESCRIBE to get everything else, but there may be a bunch of other metadata you'd like to store, like the original file name &#38; format, date uploaded, etc.) Query as normal.<p>I've been down the row-per-cell route before. It seems to be one of those ideas that everyone comes up with, sounds really clever at first, and is the wrong solution in 99% of cases. Problem is that it's really hard to get efficient querying - almost every query requires a full table scan.
vyrotekover 15 years ago
Windows Azure Table Storage supports schema-less data. You can give it any sort of entity and store them in something similar to a table but without schema restrictions. I use it for one of my projects and love it.<p><a href="http://www.microsoft.com/azure" rel="nofollow">http://www.microsoft.com/azure</a><p>edit - Thought I would also point out you can whatever language you like to use it. You use a REST API to perform all your queries.
johnmover 15 years ago
This sounds perfect for a native XML "database". Check out e.g., MarkLogic Server. That's what we built <a href="http://markmail.org/" rel="nofollow">http://markmail.org/</a> on top of.
matthodanover 15 years ago
Thanks for all of the great comments. I'll need to do a little reading to get up to speed on these ideas.
acangianoover 15 years ago
You could use DB2 Express-C (which is free); you can then store the data as XML and query it at will.