Related idea: Wildcard [0]<p>It would be cool to have a shared community repository of site adapters, in the spirit of adversarial interoperability [1]. Writing adapters is probably the most tedious and boring part of such projects; once it's abstracted away, it would be much more fun to experiment. This could also be useful for projects like Fraidycat [2], RSS feed generators like Politepol [3], alternative UIs like Woob [4], etc.<p>[0] <a href="https://github.com/geoffreylitt/wildcard" rel="nofollow">https://github.com/geoffreylitt/wildcard</a><p>[1] <a href="https://www.eff.org/deeplinks/2019/10/adversarial-interoperability" rel="nofollow">https://www.eff.org/deeplinks/2019/10/adversarial-interopera...</a><p>[2] <a href="https://fraidyc.at" rel="nofollow">https://fraidyc.at</a><p>[3] <a href="https://github.com/taroved/pol" rel="nofollow">https://github.com/taroved/pol</a><p>[4] <a href="https://woob.tech" rel="nofollow">https://woob.tech</a>
Electric Tables looks quite cool and I love the thought process going into it.<p>It seems like it could pair really nicely with the work that Ink&Switch (<a href="https://www.inkandswitch.com/local-first" rel="nofollow">https://www.inkandswitch.com/local-first</a>) is doing around local-first app development and Automerge (<a href="https://github.com/automerge/automerge" rel="nofollow">https://github.com/automerge/automerge</a>) as a good way to keep disparate private copies of work in sync.<p>I have no connection to Ink&Switch, other than appreciating their work.
> <i>Note, because of technical reasons (content security policies) some sites (e.g. Twitter, Airbnb) will add to Electric Tables, but in a new tab instead of using a pop-up and it won’t grab much additional data..</i><p>so so so frustrating. extensions getting whacked into irrelevance by CSP is such a vulgar sick security misfeature. what a repulsive era of oversecuritization we've FUD'ed ourselves into. the only voices at the table are those hungry to lock down & deny power to users; technical authoritarianism without check.<p>the only workaround i can see is abandoning extensions & making devtools the new way we extend user-agency. the browsers, the standards folks are killing regular user-agency. they are forcing us to climb down to a lower security ring.<p>wonderful world changing extensions like Hypothesis are also broken on sites like twitter and airbnb. making the web read only, removing all user agency, is so not ok. projects like Electric Tables show hints of the better web that many long hoped was to come, that has slowly been emerging. but this potential is being cut off, in the most critical areas. something's got to give. we can't flourish, can't survive a corporate controlled web.
I love this idea, especially the bookmarklet aspect of it. I'm interested to see where it goes.<p>I use a couple of bookmarklets, and they're really, really handy:<p>- One automatically takes me to the pkg.go.dev documentation of a Go library if I'm looking at, say, the GitHub page<p>- The other adds the current page I'm reading to my reading list, which is a mostly complete selection of stuff I've read on the internet - it does a similar thing by extracting titles and images and saving them into a CSV on a Git repo.<p>The one issue I have with bookmarklets is that, while they will sync across mobile and desktop versions of Firefox, the implementation on Firefox mobile feels a little clunky and cumbersome, and sometimes straight-up doesn't work.
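For anyone curious what the first bookmarklet boils down to, here is a rough sketch of the URL rewrite, written in Python rather than bookmarklet JavaScript so it is easy to try out. The function name `pkg_go_dev_url` is made up for illustration, and it assumes the Go import path is simply `github.com/<owner>/<repo>`, which will not hold for vanity import paths or modules that live in a subdirectory.

```python
from typing import Optional
from urllib.parse import urlparse


def pkg_go_dev_url(page_url: str) -> Optional[str]:
    """Map a GitHub page URL to a guessed pkg.go.dev documentation URL.

    Assumption: the import path is github.com/<owner>/<repo>.
    Returns None when the URL is not a GitHub repository page.
    """
    parsed = urlparse(page_url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    # Drop empty segments so leading/trailing slashes do not matter.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    return f"https://pkg.go.dev/github.com/{owner}/{repo}"
```

Anything after the repository name (branches, file paths) is ignored, so it works from any page inside the repo.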
This is genius and scratches a bookmarking itch I've had for ages. I hope you'll press on with this and continue its development. It's pretty close to a state where I would pay for it if:<p>1. It integrated with whatboard.app ... either via Zapier or on its own.<p>2. I could manage tables.<p>3. I could share tables.<p>4. Search.<p>5. Themes/Skins.
Nice!<p>It would perfectly fit my use-case if it supported the following flow:<p>1. Person A does some research, then collects and ranks a set of products<p>2. When done, person A sends it to person B<p>3. Person B looks at the result and provides feedback, which might include dropping items, asking questions, or asking to add a column assessing a given feature of the product<p>4. Repeat until converging and then archive<p>Very cool!
There are some interesting overlaps here too with Edge's "Collections", which I mention because it isn't listed in the comparison list but may still be a useful comparison.<p>It does a lot of very similar things: tries to extract titles and "hero images", allows you to add freeform notes, including in between URLs, has an export to Excel or OneNote or Word, and even has an auto-formatter for citations in a number of common citation formats. (Also, it syncs between devices using Edge.)<p><a href="https://support.microsoft.com/en-us/microsoft-edge/organize-your-ideas-with-collections-in-microsoft-edge-60fd7bba-6cfd-00b9-3787-b197231b507e" rel="nofollow">https://support.microsoft.com/en-us/microsoft-edge/organize-...</a>
This is a really cool concept. A lot of my "web foraging" (love that expression) or data foraging really is exactly what he's mentioning here.<p>Also made me think of a related extension <a href="https://braintool.org/" rel="nofollow">https://braintool.org/</a> - whose job is to grab and organize your bookmarks into an org file.<p>I could see these two concepts being combined for a pretty powerful bookmarks / personal knowledge-base without relying on a server and/or without having to go through extra steps to get the data out.<p>The custom selectors on Electric Tables are pretty cool, too - kind of a web-scraping light for ad-hoc scraping.<p>All too often I find myself with a project that's not quite worth writing a scraper for, but still worth building a Google Sheet around. Electric Tables seems like it could help those cases a lot.
There was a project out of MIT CSAIL back in 2006 that did automated extraction of tabular data from web pages, e.g. product lists on a store site. It recognized pagination and looked for a sequence of repeated DOM structures (and what varied in them) to identify the items. You might find it interesting:<p><a href="https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.90.5306" rel="nofollow">https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.90....</a>
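To give a feel for the "repeated DOM structures" idea, here is a toy sketch in Python. It is not the algorithm from the linked paper, just one simple way to approximate the first half of it: fingerprint each element by its tag and the fingerprints of its children, then look for the parent whose children share the same fingerprint most often. The names `Node`, `TreeBuilder`, and `find_repeated_items` are all made up, and it ignores attributes, text, pagination, and the "what varied" part entirely.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

    def signature(self):
        # Structural fingerprint: tag name plus the children's fingerprints.
        return self.tag + "(" + ",".join(c.signature() for c in self.children) + ")"


class TreeBuilder(HTMLParser):
    """Builds a bare element tree (tags only) from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def find_repeated_items(html, min_repeats=3):
    """Return (count, parent_tag, item_signature) for the most repeated
    sibling structure, or None if nothing repeats min_repeats times."""
    builder = TreeBuilder()
    builder.feed(html)
    best = None
    stack = [builder.root]
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        # Only consider children that have structure of their own.
        counts = Counter(c.signature() for c in node.children if c.children)
        for sig, count in counts.items():
            if count >= min_repeats and (best is None or count > best[0]):
                best = (count, node.tag, sig)
    return best
```

On a product list made of three `<li>` items that each contain a link and a price `<span>`, this reports the `<ul>` as the container. Real pages are messier (items rarely match exactly), so a real tool would need fuzzy matching rather than exact signature equality.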
I'm not on my computer so I couldn't check, but the bookmarklet opens an iframe, right? I worked on a similar project and unfortunately some websites don't allow that. What we ended up doing was opening it in another window if it was blocked. I could share the code if you're interested.
This is great and is remarkably similar to a project that I have been working on :-) Instead of using a bookmarklet, I am extracting URLs from the Reading List feature of Safari. I realized that I had collected over a thousand links in my reading list and it was getting difficult to manage this manually. One nice thing about working with the reading list is that it is shared between iPhones, iPads and Macs if they are all signed into the same Apple ID.
""" The server side scraping can also do some more heavy lifting - such as store the entire page contents in the database. This enables full text search, the ability to re-crawl URLs and more. """<p>For this aspect, inter-operation with the ArchiveBox (<a href="https://archivebox.io/" rel="nofollow">https://archivebox.io/</a>) project would be ideal.
Seems neat. I'm not sure whether I like or dislike that it's not just dumping into Excel or Google Sheets.<p>As a high concept, the idea of quick and personalized scraping of web pages as structured data sheets is a powerful one.<p>I suppose `right click > save as data row` could become a web standard, with first-party support and third-party scrape scripts filling in the gaps.
Would really like this to rate restaurants (for my own remembrance) I order from on Uber Eats, Doordash, etc. Alas, the bookmarklet doesn't work with my first port of call, Uber Eats, just yet. Will be keeping an eye on this.
To make it run locally, it will need a small server so that it can be addressed at localhost:<port>
I'm thinking about forking it to my GitHub and adding a simple JS server. Thoughts?
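As a stopgap before anyone writes the JS server, a static file server from the Python standard library would probably do the same job. This is a minimal sketch that swaps Python in for the JS server mentioned above; the function name `make_local_server`, the default port, and the idea that the app is just a directory of static files are all assumptions, not something taken from the project.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_local_server(directory, port=8000):
    """Create (but do not start) a server for static files in `directory`.

    Binds to 127.0.0.1 only, so it is reachable at localhost:<port>
    and not from other machines on the network.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted; the path is a placeholder):
#   make_local_server("path/to/checkout").serve_forever()
```

Passing `port=0` lets the OS pick a free port, which is handy for tests; the chosen port is then available as `server.server_address[1]`.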
Dig it.<p>I built a dumber version of this to pipe this kind of data into Airtable. I wanted to track my reading queue and track extra metadata like who recommended it.