Searchcode.com’s SQLite database is probably 6 terabytes bigger than yours

242 points, by polyrand, 3 months ago

13 comments

lokimedes, 3 months ago
I’ll bet you some CERN PhD student has a forgotten 100 TB detector calibration database in sqlite somewhere in the dead caverns of collaboration effort.
bborud, 3 months ago
I've been using RWMutexes around SQLite calls as a precaution since I couldn't quite figure out if it was safe for concurrent use. This is perhaps overkill?

Since I do a lot of cross-compiling I have been using https://modernc.org/sqlite for SQLite. Does anyone have some knowledge/experience/observations to share on concurrency and SQLite in Go?
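A minimal sketch of the single-writer/many-readers pattern this thread keeps coming back to, assuming the modernc.org/sqlite driver (which registers itself under the "sqlite" name with database/sql) and WAL mode; the openReadWritePair helper is hypothetical, not something from the article:

```go
package sqlitepair

import (
	"database/sql"
	"runtime"

	_ "modernc.org/sqlite" // assumed driver; registers the "sqlite" driver name
)

// openReadWritePair opens two handles on the same file: a pooled one for reads
// and a single-connection one for writes, so no application-level RWMutex is needed.
func openReadWritePair(path string) (read, write *sql.DB, err error) {
	read, err = sql.Open("sqlite", path)
	if err != nil {
		return nil, nil, err
	}
	read.SetMaxOpenConns(runtime.NumCPU()) // concurrent readers are fine under WAL

	write, err = sql.Open("sqlite", path)
	if err != nil {
		read.Close()
		return nil, nil, err
	}
	write.SetMaxOpenConns(1) // SQLite allows one writer at a time; serialize here

	// WAL mode lets readers proceed while a write is in progress.
	if _, err = write.Exec("PRAGMA journal_mode=WAL;"); err != nil {
		read.Close()
		write.Close()
		return nil, nil, err
	}
	return read, write, nil
}
```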
antithesis-nl, 3 months ago
Yup, they win. My biggest SQLite database is 1.7TB with, as of just now, 2,314,851,188 records (all JSON documents with a few keyword indexes via json_extract).

Works like a charm, as in: the web app consuming the API linked to it returns paginated results for any relevant search term within a second or so, for a handful of concurrent users.
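A keyword index like that can be an ordinary SQLite expression index over json_extract; a minimal sketch using database/sql, with a hypothetical docs(body) table and $.language field standing in for whatever the real schema is:

```go
// ensureLanguageIndex builds an expression index on a JSON field so that
// queries filtering on json_extract(body, '$.language') can use it.
func ensureLanguageIndex(db *sql.DB) error {
	_, err := db.Exec(`CREATE INDEX IF NOT EXISTS idx_docs_language
	    ON docs (json_extract(body, '$.language'))`)
	return err
}
```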
aerioux, 3 months ago
nit: this code snippet has a typo (dbWrite wasn't defined)

```
dbRead, _ := connectSqliteDb("dbname.db")
defer dbRead.Close()
dbRead.SetMaxOpenConns(runtime.NumCPU())

dbRead, _ := connectSqliteDb("dbname.db")
defer dbWrite.Close()
dbWrite.SetMaxOpenConns(1)
```
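Presumably the second handle was meant to be dbWrite; the corrected snippet (assuming connectSqliteDb returns a *sql.DB and an error) would read:

```go
dbRead, _ := connectSqliteDb("dbname.db") // pooled handle for reads
defer dbRead.Close()
dbRead.SetMaxOpenConns(runtime.NumCPU())

dbWrite, _ := connectSqliteDb("dbname.db") // single-connection handle for writes
defer dbWrite.Close()
dbWrite.SetMaxOpenConns(1)
```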
1f60c, 3 months ago
searchcode doesn't seem to work for me. All queries (even the ones recommended by the site) unfortunately return zero results. Maybe it got hugged?

https://searchcode.com/?q=re.compile+lang%3Apython
gedw99, 3 months ago
I use Marmot, which gives me multi-master SQLite.

It uses a simple CRDT table structure and allows me to have many live SQLite instances in all data centers.

A NATS JetStream server is used as the core, with all SQLite DBs connected to it.

It operates off the WAL and is simple to install.

It also replicates all blobs to S3, with the directory structure kept in the SQLite DB.

With a Cloudflare domain, the user's request is automatically sent to the nearest DB.

So it replaces Cloudflare's D1 system for free; a 4-euro Hetzner VPS is enough.

Mine is holding about 200 GB of data.

https://github.com/maxpert/marmot
lutusp, 3 months ago
Wait ...

```
dbRead, _ := connectSqliteDb("dbname.db")
defer dbRead.Close()
dbRead.SetMaxOpenConns(runtime.NumCPU())

dbRead, _ := connectSqliteDb("dbname.db")
defer dbWrite.Close()
dbWrite.SetMaxOpenConns(1)
```

Is dbWrite ever declared? I know it's just an example, but still ...

Now I have to ask myself whether this error resulted from relying on a human, or not relying on one.
leighleighleigh, 3 months ago
I've been looking for a service just like searchcode, to try and track down obscure source code. All the best, hope it can be sustainable for you.
aliasav, 3 months ago
I have been contemplating Postgres hosted by AWS vs. SQLite used locally with my Django app. My app has low volume and low concurrent traffic, though I may have joins in the future since the data is highly relational.

I still chose Postgres managed by AWS, mainly to reduce the operational overhead. I keep wondering whether I should have just gone with sqlite3, though.
feverzsj, 3 months ago
I'd say no relational DB scales reads vertically better than SQLite. For writes, you can batch them or distribute them across attached DBs. Either way, though, you may lose some transaction guarantees.
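A minimal sketch of the write-batching half of that suggestion, assuming database/sql and a hypothetical documents(body) table; grouping rows into one transaction amortizes the per-commit fsync:

```go
// batchInsert writes a slice of JSON documents in a single transaction,
// so the whole batch costs one commit instead of one per row.
func batchInsert(db *sql.DB, docs []string) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	stmt, err := tx.Prepare(`INSERT INTO documents (body) VALUES (?)`)
	if err != nil {
		tx.Rollback()
		return err
	}
	defer stmt.Close()

	for _, d := range docs {
		if _, err := stmt.Exec(d); err != nil {
			tx.Rollback()
			return err
		}
	}
	return tx.Commit()
}
```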
rednafi, 3 months ago
Fascinating read. Those suggesting Mongo are missing the point.

The author clearly mentioned that they want to work in the relational space. Choosing Mongo would require refactoring a significant part of the codebase.

Also, databases like MySQL, Postgres, and SQLite have neat indexing features that Mongo still lacks.

The author wanted to move away from a client-server database.

And finally, while wanting a 6.4TB single file is wild, I presume that's exactly what's happening here. You couldn't do that with Mongo.
Alifatisk, 3 months ago
Is the site like grep.app?
morphle, 3 months ago
Without reading the article, I always quickly estimate what a DRAM-based in-memory database would cost. You build a mesh network of FPGA PCBs with 12 x 12.5 Gbps links interconnecting them. Each $100 FPGA has 25 GB/s DDR3 or DDR2 controllers with up to 256 GB. DDR3 DIMMs have been less than $1 per GB for many years.

6.4 x ($100 + 256 x 4 x $1) = $7,193.60 worst-case pricing

So this database would cost less than $8,000 in DRAM chip hardware and can be searched/indexed in parallel in much less than a second.

Now you can buy old DDR2 chips harvested from e-waste at less than $1 per GB, so this database could cost as little as $1,000 including the labour for harvesting.

You can make a wafer-scale integration with SRAM that holds around 1 terabyte of SRAM for $30K including mask set costs. This database would cost you $210,000. Note that SRAM is much faster than DDR DRAM.

You would need 15000/7 customers of this size of database to cover the manufacturing of the wafers. I'm sure there are more than 2143 customers that would buy such a database.

Please invest in my company; I need 450,000,000 up front to manufacture these 1 terabyte wafers in a single batch. We will sprinkle in the million-core small reconfigurable processors [1] in between the terabyte of SRAM for free.

For the observant: we have around 50 trillion transistors [2] to play with on a 300mm wafer. Our secret sauce is a 2-transistor design making SRAM 5 times denser than Apple Silicon SRAM densities at 3nm on their M4.

Actually we would not need reconfigurable processors; we would intersperse Morphle Logic in between the SRAM. Then you could reprogram the search logic to do video compression/decompression or encryption by reconfiguring the logic pattern [3]. Total search of each byte would take a few hundred nanoseconds.

The IRAM paper from 1997 mentions 180nm. These wafers cost $750 today (including mask set and profit margin), and now you would need less than a million in investment up front to start mass manufacturing. You would just need 3600 times the wafer amount compared to the latest 3nm transistor density on a wafer.

[1] Intelligent RAM (IRAM): chips that remember and compute (1997) https://sci-hub.ru/10.1109/ISSCC.1997.585348

[2] Merik Voswinkel - Smalltalk and Self Hardware https://www.youtube.com/watch?v=vbqKClBwFwI&t=262s

[3] Morphle Logic https://github.com/fiberhood/MorphleLogic/blob/main/README_MORPHLE_LOGIC.md
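Unpacking that one-line estimate, assuming 4 x 256 GB of DRAM per $100 FPGA board (so 1 TB per board and 6.4 boards for the 6.4 TB database):

$$6.4 \times \left(\$100 + 4 \times 256\,\mathrm{GB} \times \$1/\mathrm{GB}\right) = 6.4 \times \$1{,}124 = \$7{,}193.60$$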