How would you store user-submitted tabular data if you didn't know the column data types of the submitted data prior to it being uploaded?<p>Note: The data must be able to be queried.<p>One idea is to convert each cell in the table of data into a row in a database that holds the value, data type, and position of the cell in the uploaded table.<p>Any other thoughts?
I used to work at blist (now called Socrata) where we had to solve this exact problem.<p>You could use a schema-less database - either a document store like CouchDB, or a schema-less table/column store like Cassandra or Tokyo Tyrant.<p>Or you could use a standard RDBMS and generate new tables on the fly. (This is what blist did when I worked there; I think they may have a different storage model now.) Or implement a column store or tuple store as a layer on top of the RDBMS, like Infobase:<p><a href="http://openlibrary.org/about/tech" rel="nofollow">http://openlibrary.org/about/tech</a>
Read in the file, then issue the appropriate CREATE TABLE commands through your database connection to dynamically create the appropriate columns. Keep a table with metadata on the tables you've created (you really only need the table names, you can issue a DESCRIBE to get everything else, but there may be a bunch of other metadata you'd like to store, like the original file name & format, date uploaded, etc.) Query as normal.<p>I've been down the row-per-cell route before. It seems to be one of those ideas that everyone comes up with, sounds really clever at first, and is the wrong solution in 99% of cases. Problem is that it's really hard to get efficient querying - almost every query requires a full table scan.
Windows Azure Table Storage supports schema-less data. You can give it any sort of entity and store them in something similar to a table but without schema restrictions. I use it for one of my projects and love it.<p><a href="http://www.microsoft.com/azure" rel="nofollow">http://www.microsoft.com/azure</a><p>edit - Thought I would also point out you can whatever language you like to use it. You use a REST API to perform all your queries.
This sounds perfect for a native XML "database". Check out e.g., MarkLogic Server. That's what we built <a href="http://markmail.org/" rel="nofollow">http://markmail.org/</a> on top of.