My reading list strategy:<p>- Send to Feedbin (<a href="https://feedbin.com/blog/2019/08/20/save-webpages-to-read-later/" rel="nofollow">https://feedbin.com/blog/2019/08/20/save-webpages-to-read-la...</a>)<p>- Never look at it again
Great writeup. I too have a long reading list - currently at 133.<p>I use my own little side project (Savory) to track the list. When I come across a page that I don't have the time or energy to finish right now, I save it and add the "reading" tag to it. When I have free time, I can open the reading tag and pick up something new to read.<p>The best part is that I usually add a couple more tags (e.g. "security" or "economics") when I save a link. This way, the reading list lets me filter by topic. It has been an unexpectedly effective hack for attacking the growing list, since I can usually finish multiple articles on the same topic in a single run - there is usually a link between them even when I saved them days or weeks apart.<p>Anyway, I like how OP actually has a reading history. I really need to add something similar in Savory. Right now, when I finish reading something, I just remove the "reading" tag, so I don't get a neat history.
I'm still waiting for a web extension that sends a copy of the webpage I'm looking at (for more than 1 minute) to an endpoint I specify along with some metadata like the URL and user-agent. Obviously, block certain domains like banks or email.<p>I'd like something to build up a searchable index of everything I've read recently so I can easily find content again yet this is NOT something I want a 3rd party to do. I want to self-host something like a tiny Go or Rust server that only uses 10mb of ram to index all the pages into an embed rocks/level/badger/etc. database.
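The capture half of that seems simple enough. A rough userscript-style sketch of what I mean - the endpoint, timeout and blocklist below are placeholders, not an existing extension:<p><pre><code>// send a copy of the current page to a self-hosted indexer after one minute
const ENDPOINT = 'http://localhost:8080/ingest';          // hypothetical local server
const BLOCKED = ['mybank.example', 'mail.example.com'];   // domains never to capture

setTimeout(() => {
  if (BLOCKED.some(d => location.hostname.endsWith(d))) return;
  fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      url: location.href,
      title: document.title,
      userAgent: navigator.userAgent,
      fetchedAt: new Date().toISOString(),
      html: document.documentElement.outerHTML,           // full page copy for indexing
    }),
  }).catch(err => console.error('page-logger', err));
}, 60 * 1000);                                            // only pages open for >1 minute
</code></pre>
In practice the request would probably have to go through GM_xmlhttpRequest or an extension background script rather than a plain fetch, so CORS and mixed-content rules don't block posting from an https page to a local http server.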
This is a neat writeup. It's fun to think about how to potentially automate this kind of tracking.<p>> I wish there was an easy way to filter by "independent websites"<p>This side comment from the post is intriguing. Other than manual curation, I wonder if there is a way to identify commercial vs independent domains? This would make a really good entry point for a specialty search engine for indie sites only.
I Print-to-PDF everything I've ever found interesting enough to spend longer than 2 minutes lingering on, and as a result I've got 20+ years of Internet articles to go through and read offline, any time.<p>It's very interesting to see the change in quality of technical writing over the last two decades. There's a definite, observable increase in click-bait style writing.
I have started using Obsidian (at work).
Into it I copy/paste any content - text or images - that I find useful from the intranet, emails, or Meet.
I try my best to organise things. But for the most part, I use the search engine and the [autogenerated] links.<p>The only requirement when adding content is to figure out whether it should be added to an existing note or a new dedicated note should be created.
[btw, note nesting does exist in Obsidian]<p>With this simple workflow, you completely eliminate the notion of provenance of the knowledge.
The knowledge is there, and its organisation is up to your own habits.<p>After some time doing that, you end up with VERY dense notes (in terms of knowledge per line) and very little useless (distracting) content.<p>For the moment I like that A LOT!
Tampermonkey is great for this, because it can log <i>everything</i> and it brings its own XMLHttpRequest:<p><pre><code> GM_xmlhttpRequest({
  method: 'GET',
  // log the current page (plus a client identifier) to a self-hosted endpoint
  url: 'https://myloggingurl/?client=' + client +
       '&url=' + encodeURIComponent(window.location.href) +
       '&title=' + encodeURIComponent(document.title),
  responseType: 'json',
  onerror: function (e) { console.error('URL-Logger', e); },
});
</code></pre>
I've been logging all my web activity since 2018. On the server side, I filter out ad spam and other extraneous URLs, and then run a cronjob that converts all <i>new</i> HTML documents it sees to PDFs with wkhtmltopdf. It's been a great tool for finding stuff in those moments where I go "hm, I remember seeing something about this months ago..."
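The conversion step doesn't need to be anything fancy. A rough sketch of that kind of cronjob (not my actual script - the log path and output directory here are made up) could look like:<p><pre><code>// convert every newly logged URL to a PDF, skipping ones already converted
// assumes a plain-text file of logged URLs (one per line) and wkhtmltopdf on PATH
const fs = require('fs');
const path = require('path');
const crypto = require('crypto');
const { execFileSync } = require('child_process');

const urls = fs.readFileSync('/var/log/url-logger/urls.txt', 'utf8')
  .split('\n')
  .filter(u => u.startsWith('http'));

const outDir = '/srv/archive/pdf';
for (const url of urls) {
  // name each PDF after a hash of the URL so re-runs only convert new documents
  const out = path.join(outDir, crypto.createHash('sha1').update(url).digest('hex') + '.pdf');
  if (fs.existsSync(out)) continue;
  try {
    execFileSync('wkhtmltopdf', ['--quiet', url, out]);
  } catch (e) {
    console.error('failed to convert', url, e.message);
  }
}
</code></pre>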
The author doesn't appear to have documented the bookmarklet itself. If they are here - or anyone else - can you suggest what it might look like for a bookmarklet to collect the url, page title, meta description and image, and then set window.location.href?
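For concreteness, something along these lines is roughly what I imagine - the save endpoint here is a placeholder, since the author's isn't documented:<p><pre><code>javascript:(function () {
  // read a <meta> tag's content attribute, or '' if it isn't present
  var meta = function (sel) {
    var el = document.querySelector(sel);
    return el ? el.getAttribute('content') : '';
  };
  var params = new URLSearchParams({
    url: location.href,
    title: document.title,
    description: meta('meta[name="description"]') || meta('meta[property="og:description"]'),
    image: meta('meta[property="og:image"]'),
  });
  // hand everything off to the (placeholder) save endpoint
  window.location.href = 'https://example.com/save?' + params.toString();
})();
</code></pre>
Collapsed onto a single line, that can be pasted straight into a bookmark's URL field.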
I used to use a browser extension that would track every page I visited and index everything. This was a few years ago, and I can't remember what it was called or why I stopped using it. I think I was paying something like $5/mo. I'd like to find something like that again, it was really useful. I think it would be even more powerful with an AI agent that could organize all the information into categories, and answer questions like "What was the article I was reading last week about <x>?"<p>Is anyone building something like this? (It would be great if I could run something on my own server.)
I came across this video by Curtis McHale that completely changed the way I keep track of everything:<p><a href="https://youtu.be/xlDfpcipCm4" rel="nofollow">https://youtu.be/xlDfpcipCm4</a><p>I used to try bookmarking things using the built in browser bookmark manager and then later using Raindrop and even copying links into Obsidian but this wasn’t really all that effective. After watching the video I trialled DevonThink and was massively impressed. Now, every article I read that I find interesting I save as either a pdf or web archive so I can search and find it later. I also do the same for useful stack overflow posts so I know I’ll be able to find them if necessary. On top of this I bookmark all kinds of useful sites and categorise them in folders in their respective databases.<p>This allows me to keep Obsidian for just pure notes/writing. If I want to link between the two I can also use Hook to embed links between the two applications.<p>If I want to get proper reference formatting for something, I can open it from DevonThink in the browser and then save it to Zotero. Alternatively some people save everything to zotero instead of DevonThink and then index the folder using DevonThink so it is included in their search. Either approach works.<p>Highly recommend anyone with a Mac trying out the free trial of DevonThink, I think it’s like 100 hours of usage. Would dislike going back to living without it.
I think an interesting angle would be a categorization by the author of what they found was useful/fluff/low quality. Would be a good way to figure out where you're wasting time vs getting value (of course sometimes the point is to waste time...)
But in a tag sense, not a content sense.<p>Automatic resolution followup would be interesting. If you read an article about a new research result, you should get a followup on how it came out, months or years later. If you read about an arrest, you should eventually get the case disposition. If you read about a new product, a followup when there are substantial reviews after customers have experience with it.
Could we feed the author's reading list into an AI and guess his OS, his Amazon history, or his likely topics of conversation at a dinner party? I'm really curious whether you could mirror his decision making and tastes from what was in his reading list over some period of time.
I've always found it surprising that something like this isn't more prioritized by browsers.<p>Why only let the NSA and advertisers retain and analyze your full browsing history?
I've always wondered if you can just export Firefox's history db... I'd then take a periodic dump of it and add it to another dedicated db for searching. God knows I spend way too much time searching for things I had once read...
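As far as I know, Firefox keeps history in places.sqlite (table moz_places), so a periodic dump can be pretty simple. A Node sketch, assuming the better-sqlite3 package and that you work from a copy of the file (Firefox keeps the live one locked while running):<p><pre><code>// dump Firefox browsing history from a copy of places.sqlite as JSON
const Database = require('better-sqlite3');

const db = new Database('/path/to/copy-of-places.sqlite', { readonly: true });
const rows = db.prepare(`
  SELECT url, title, visit_count,
         datetime(last_visit_date / 1000000, 'unixepoch') AS last_visit
  FROM moz_places
  WHERE last_visit_date IS NOT NULL
  ORDER BY last_visit_date DESC
`).all();

console.log(JSON.stringify(rows, null, 2));
</code></pre>
From there it's easy to feed the rows into whatever dedicated search db you like.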
I automate a log of all the HTTP requests the computer makes, which naturally includes all the websites I visit.^1 I am not always using a browser to make HTTP requests, and for recreational web use I use a text-only one exclusively, so a "browser history" is not adequate.<p>In the loopback-bound forward proxy that handles all HTTP traffic from all applications, I add a line to include the request URL in an HTTP response header, called "url:" in this example. As such, it will appear in the log. For example, something like<p><pre><code> http-response add-header url "https://%[capture.req.hdr(1)]%[capture.req.uri]" </code></pre>
This allows me to write simple scripts to copy URLs into a simple HTML page. I then read the simple HTML with a text-only browser (links).<p>For example, something like<p><pre><code> cat > 1.sh
#!/bin/sh
# Wrap each input URL in a <li><a> element: the full URL becomes the href,
# the visible link text is truncated to WIDTH characters.
WIDTH=120;
echo "<!-- $(date '+%Y-%m-%d %T') --><ol><pre>"
x=$(echo x|tr x '\034');   # 0x1C marker character used for truncation
tr -d '\034' \
|sed -e "s/.*/& &/;s/ .\{${WIDTH}\}/&$x/;s/$x.*//" \
|sed -e "/./{s/.* /<li><a href=&>/;s|$|</a></li>|;}" \
-e '#certain urls to exclude' \
-e '/cdx?url=/d' \
-e '/dns-query?dns=/d' \
-e '/ds5q9oxwqwsfj.cloudfront.net/d' \
-e '/index.commoncrawl.org/d'
^D
grep url: 1.log|cut -d' ' -f3-|1.sh > 1.htm
links 1.htm
</code></pre>
What about POST data? That is captured in another HTTP response header, called "post-data:" in this example<p><pre><code> http-request add-header post-data %[req.body] if { method POST }
</code></pre>
To look at the POST data I might do something like<p><pre><code> grep post-data: 1.log|cut -d' ' -f3-|less
</code></pre>
1. I also use a system for searching the www or specific www sites from the command line. The search results URLs for each query are stored in simple HTML format similar to the above. One query per file. What's non-obvious is that each file can contain search results from different sources, sort of like the "meta-search engine" idea but more flexible. The simple HTML format contains the information necessary to continue searches, at any time, thus allowing a more diverse and greater number of search results to be retrieved. (Sadly, www search engines have been effectively limiting the number of search result URLs we can retrieve with Javascript and cookies disabled.) The command line program reads the information to continue a search from the simple HTML comments.
I use a service that pulls selected articles from my Pocket account, then formats and prints them into a nice booklet that is sent to me once a month. I find this makes me more conscious when deciding whether to add an article to Pocket, as I now ask myself if I <i>really</i> want to read it later in the printed booklet (vs. just adding it to "the list" to remove it from the tab bar).
I'm coming up on 3k articles read, probably most of it from HN. Jesus, I have too much! I use my own app, leftwrite.io, to keep track of everything I read and the notes I make. A retrospective might be fun, though it'll make it very clear how much of nothing I do.
Thanks for sharing the `window.location.href` approach. I had attempted something like this in the past and gave up when I hit the HTTP issue you are referring to, specifically on the websites (mostly news sites) where I predominantly spent my time.
Maybe add an RSS version of <a href="https://pages.tdpain.net/readingList/" rel="nofollow">https://pages.tdpain.net/readingList/</a><p>Who knows, perhaps there is a nascent meta-RSS movement developing.
Zotero simplified a lot of my note taking / "information retaining" from stuff I bookmark / read. Much better than Obsidian etc.<p>> Add thing to Zotero, 99% of the time the metadata comes with it and is searchable already.<p>> To mark it as read, make a note for it (usually consists of main idea, good / bad ideas in the article, few sentences).<p>> Zotero entries without a note are assumed unread.<p>For my diary / misc notes I use VS Code with some markdown plugins; Foam has daily-note functionality, which is nice - add a new diary entry, add some tags, ezpz
I don't have any tracking, and my history/cookies are cleaned regularly. But my reading history might be the most boring thing to analyze - everything about Lisp and some darknet activity. The former I keep learning and collecting anyway, and the latter I'd really prefer to be forgotten.
I made a Shortcut a while back that basically saves a webpage to PDF and adds tags. When I've read an article it gets a new tag, "already read", and that's it.<p>Now, the tags are an iOS/iPadOS/macOS thing, sure, but the PDFs I can take with me to any platform.
I have different reading lists on Hacker News, Twitter, Reddit and Medium, and because of this I never get to anything that I don't read right away… To share between them you need some convenient app for your phone and computer.
Does anyone have a similar bookmarklet for adding things to Notion on mobile (iOS, Chrome app)?<p>You can “share to” the Notion app, but you have to type a bunch of info in manually. Would love to make it a one-tap.
Tangent<p>I was briefly trying to summarize pages; there are APIs out there, e.g. the Summarizer API. But your summary/takeaway really depends on whether you actually read it.<p>I save all my tabs before I purge all the windows.