Often, after reading an article online, I want to archive its contents for future reference or to add notes. Obviously I could just save the webpage, but I was wondering if anyone knows of a service or application that can extract the contents of an online article (ideally into a text-based format like Markdown).
To provide an answer to my own question: I have found that a combination of Python's 'readability-lxml' package and 'lynx' works pretty well. For example,

    python -m readability.readability -u file:///foo.html | lynx -dump -stdin

produces a pretty nice text format.
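If you want Markdown rather than plain text, a small Python script along the same lines works too. This is just a sketch: it uses readability-lxml's Document class, but swaps lynx for the 'html2text' package (my choice, not part of the pipeline above), and the URL is a placeholder.

    import sys
    import urllib.request

    import html2text                  # pip install html2text
    from readability import Document  # pip install readability-lxml

    url = sys.argv[1] if len(sys.argv) > 1 else "https://example.com/article"

    # Fetch the raw page and let readability strip it down to the main content.
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")

    doc = Document(html)
    cleaned_html = doc.summary()      # simplified HTML of the article body

    # Convert the simplified HTML to Markdown.
    converter = html2text.HTML2Text()
    converter.body_width = 0          # don't hard-wrap lines
    markdown = converter.handle(cleaned_html)

    print("# " + doc.title() + "\n\n" + markdown)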
Archive.is works pretty well: http://archive.is/

(Or at least, it does in non-Firefox browsers. uBlock and this site seem to be conflicting at the moment.)

You can also do the same thing with the Internet Archive itself: https://archive.org/web/

Just enter the link into the lower-right text box and click 'save page'.

There are others too, as well as tools you can download to locally save articles (or whole websites) for future reference.
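If you'd rather not click through the site each time, the Wayback Machine's 'Save Page Now' feature can also be triggered with a plain HTTP request to web.archive.org/save/. A minimal sketch, assuming that endpoint keeps behaving as it does today (it is rate-limited and not an officially stable API, and the User-Agent string here is just a placeholder):

    import sys
    import urllib.request

    def save_to_wayback(url: str) -> str:
        """Ask the Wayback Machine to capture `url`; return the resulting URL."""
        req = urllib.request.Request(
            "https://web.archive.org/save/" + url,
            headers={"User-Agent": "article-archiver/0.1"},  # placeholder UA; bare requests are sometimes rejected
        )
        with urllib.request.urlopen(req) as resp:
            # If the capture succeeds, the request usually redirects to the
            # archived snapshot, so the final URL is the archive location.
            return resp.geturl()

    if __name__ == "__main__":
        print(save_to_wayback(sys.argv[1] if len(sys.argv) > 1 else "https://example.com"))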
https://zoho.com/notebook

You could very well try Zoho Notebook's browser extensions, available for Chrome, Firefox, and Safari. They give you a clean view of the article and store it in Notebook for future reference.