TE
科技回声
首页24小时热榜最新最佳问答展示工作
GitHubTwitter
首页

科技回声

基于 Next.js 构建的科技新闻平台,提供全球科技新闻和讨论内容。

GitHubTwitter

首页

首页最新最佳问答展示工作

资源链接

HackerNews API原版 HackerNewsNext.js

© 2025 科技回声. 版权所有。

Gmail to SQLite

334 点作者 tehlike4 天前

24 条评论

unsnap_biceps4 天前
I&#x27;m curious as to why you choose to break out specific headers in the schema.<p>For example, you have recipients, subject, and sender as JSON fields, when you could have just a headers field with all of them, and even add the rest of the headers in the message.<p>If it&#x27;s performance related, you can still have headers as a single json blob and then use generated columns for the specific fields.<p>For example<p><pre><code> CREATE TABLE IF NOT EXISTS &quot;messages&quot; ( &quot;id&quot; INTEGER NOT NULL PRIMARY KEY, -- internal id &quot;message_id&quot; TEXT NOT NULL, -- Gmail message id &quot;thread_id&quot; TEXT NOT NULL, -- Gmail thread id &quot;headers&quot; JSON NOT NULL, -- JSON object of { &quot;header&quot;: value }, &quot;subject&quot; TEXT GENERATED ALWAYS AS (json_extract(&quot;headers&quot;, &#x27;$.Subject&#x27;)) VIRTUAL NOT NULL) ... ); CREATE INDEX subjectidx on messages(subject); </code></pre> I&#x27;ve found this model really powerful, as it allows users to just alter table to add indexed generated columns as they need for their specific queries. For example, if I wanted to query dkim status, it&#x27;s as simple as<p><pre><code> ALTER TABLE messages ADD dkim TEXT GENERATED ALWAYS AS (json_extract(&quot;headers&quot;, &#x27;$.&quot;Dkim-Signature&quot;&#x27;)) VIRTUAL NOT NULL); CREATE INDEX dkimidx on messages(dkim); SELECT dkim, COUNT(0) FROM messages GROUP BY dkim; </code></pre> or whatever you want.
评论 #43944654 未加载
评论 #43946563 未加载
评论 #43948148 未加载
评论 #43943894 未加载
评论 #43943853 未加载
terhechte4 天前
I build something to visualize huge amounts of email (such as from Gmail) some years ago:<p><a href="https:&#x2F;&#x2F;github.com&#x2F;terhechte&#x2F;postsack">https:&#x2F;&#x2F;github.com&#x2F;terhechte&#x2F;postsack</a>
评论 #43943604 未加载
评论 #43944522 未加载
评论 #43945529 未加载
the_mitsuhiko4 天前
I really lament that you cannot sign in even with an application specific password any more and you need to get an oauth client and go through an oauth flow. It’s my email, but Google takes away an open standard even for myself to access it.
评论 #43944555 未加载
评论 #43945533 未加载
评论 #43948840 未加载
renegat0x04 天前
I recently tried to integrate Gmail in my app [0], and I poured too much time on it. I decided it is not worth to support Gmail.<p>Gmail to SQLite describes 6 steps to get credentials working, but it is not true for me. After 6 steps:<p>- that Google said that my app was not published, so I published it<p>- Google said that app cannot be internal, because I am not a workspace user<p>- for external apps<p>- then it said I cannot use the app until it is verified<p>- in verification they wanted to know domain, address, other details<p>- they wanted to have my justification for scopes<p>- they wanted to have video explaining how the app is going to be used<p>- they will take some time to verify the data I provided them<p>It all looks like a maze of settings, where requiring any of users to go above the hoops required by Google is simply too much.<p>Links:<p>[0] <a href="https:&#x2F;&#x2F;github.com&#x2F;rumca-js&#x2F;Django-link-archive">https:&#x2F;&#x2F;github.com&#x2F;rumca-js&#x2F;Django-link-archive</a>
评论 #43948499 未加载
评论 #43948832 未加载
oulipo4 天前
What&#x27;s the best open-source GMail backup software that exists? Someone has setup something like that? (also archiving attachments, etc)
评论 #43945110 未加载
评论 #43944910 未加载
评论 #43945504 未加载
TekMol4 天前
Shouldn&#x27;t this be &quot;imap to sqlite&quot; or something? Why tie it to one specific email provider?
评论 #43943930 未加载
评论 #43945058 未加载
bytter4 天前
Funny... I did the same thing yesterday, just because I wanted to list my recipient emails by domain. Code is awful, but here it is: <a href="https:&#x2F;&#x2F;github.com&#x2F;hugoferreira&#x2F;gmail-sqlite-db">https:&#x2F;&#x2F;github.com&#x2F;hugoferreira&#x2F;gmail-sqlite-db</a>
评论 #43948675 未加载
yread4 天前
Would be nice to enable fulltext search as well
评论 #43943821 未加载
EvanAnderson4 天前
I am reminded a little of the Postgres-backed IMAP server Archiveopteryx: <a href="https:&#x2F;&#x2F;github.com&#x2F;aox&#x2F;aox">https:&#x2F;&#x2F;github.com&#x2F;aox&#x2F;aox</a><p>The schema from AOX always looked really good to me, but I never have gotten to really giving it a try. I wanted to use it, primarily, to get analytics about my mail and for search (not a daily-driver IMAP server).
评论 #43951420 未加载
rantingdemon3 天前
Interesting tool. I&#x27;m trying it out now. I had to jump through some hoops in Google&#x27;s admin panel that probably had me creating some OAuth org for my personal account...<p>It is now syncing my messages, but very slowly. Some Async magic could probably be cool :)
ThinkBeat4 天前
What is the cost for bandwidth here? As someone with a 40GB+ Gmail account, will I get billed for the transfer using this tool?<p>It is easy to fix though, since I can get Google Take Out (is that the name?) which I think is free and then parse file files once downloaded.<p>Still using this tool would be faster from a get it going perspective.
shinryuu3 天前
Would have been nice if you supported google takeout with mbox instead.
hamburglar4 天前
This looks great and simple. I’ll likely try it out. Any chance you’re working on including attachment metadata (and&#x2F;or broken out access to the attachments themselves) in the future?
flas9sd4 天前
having sqlite exporters for platforms is great help for archiving, but also general questions: I used <a href="https:&#x2F;&#x2F;github.com&#x2F;ltdangle&#x2F;mail2db">https:&#x2F;&#x2F;github.com&#x2F;ltdangle&#x2F;mail2db</a> to see how much mail volume I still receive monthly on a mail account that I want to move away from. A top10 of senders directed my un- and resubscribe actions.
vladgur4 天前
Awesome!<p>Feature request: parse email content to extract unsubscribe links and allow me to unsubscribe from most frequent senders easily
alimbada4 天前
I did something similar using Got Your Back and some C# hacked together in LinqPad to help me analyse my emails.
1vuio0pswjnm73 天前
Wasn&#x27;t there a period where one could get an XML feed of their Gmail, many years ago.
gitroom4 天前
man, the whole gmail backup mess reminds me why i avoid locking myself into someone else&#x27;s sandbox. figuring out what actually keeps me sticking with a platform even when i know all the downsides - is it just laziness or something deeper?
noer4 天前
This is just a single table DB though? At that point, why not just export to a csv or dataframe or whatever and leverage analysis packages to analyze whatever you wanted to.<p>I admittedly might just not have or understand the use case nor have I thought about how large a Gmail account actually is so feel free to ignore if I&#x27;m missing something!
评论 #43947886 未加载
jokoon4 天前
I would have preferred a script that parses the mail backup Google sends you.<p>I think it&#x27;s a big eml file.
评论 #43943885 未加载
flashblaze3 天前
Is there anything similar written in TypeScript?
pdimitar4 天前
Would love a comparison to gbackup-rs[0].<p>To me having to install a tool through Python is a show-stopper.<p>[0] <a href="https:&#x2F;&#x2F;github.com&#x2F;djipko&#x2F;gbackup-rs">https:&#x2F;&#x2F;github.com&#x2F;djipko&#x2F;gbackup-rs</a>
评论 #43946574 未加载
pdyc4 天前
this is great if only there was a tool for whatsapp to sqlite it would make my data so much more useful
评论 #43944454 未加载
评论 #43945192 未加载
einpoklum4 天前
Let us stop using GMail:<p>* Google collects vast amounts of personal data, specifically through receiving all of your email and analyizing it.<p>* It builds elaborate user profiles and uses them to target you with ads designed to better influence you.<p>* Its hold on information (from different sources) has made it excessively powerful economically, and thus also politically.<p>* Google&#x2F;Alphabet has long started to affect legislation, including through direct registered lobbying: ~15 Million USD in 2024 (opensecrets.org).<p>* It has been known to pass, and likely still passes, the information it collects - including copies of your email correspondence - on to the US government (Edward Snowden leaks).<p>and finally:<p>* There are multiple email providers, many of them quite good - both for pay and gratis. Naturally most of the gratis ones have their own interests, but nothing like Google.
评论 #43944546 未加载
评论 #43945987 未加载
评论 #43945401 未加载
评论 #43944768 未加载