I do all of my academic writing in pandoc. As compared to LaTeX this means no boilerplate (yet you can still use full LaTeX syntax for equations and the like) and, if the publisher 'needs' a Word file, you are one click away from providing it. All with plain text files that you can put under version control, get meaningful diffs, etc. It's just great.
Occasional pandoc contributor here, AMA :-)<p>Just a few links:<p>- Where everything is documented: <a href="http://pandoc.org/MANUAL.html" rel="nofollow">http://pandoc.org/MANUAL.html</a><p>- If you have questions or suggestions: <a href="https://groups.google.com/forum/#!forum/pandoc-discuss" rel="nofollow">https://groups.google.com/forum/#!forum/pandoc-discuss</a><p>- Contributing to pandoc is also a great way to get your feet wet with Haskell. In my experience, very supportive community. See <a href="http://pandoc.org/CONTRIBUTING.html" rel="nofollow">http://pandoc.org/CONTRIBUTING.html</a> and for good first issues: <a href="https://github.com/jgm/pandoc/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22" rel="nofollow">https://github.com/jgm/pandoc/issues?q=is%3Aopen+is%3Aissue+...</a><p>Finally, a great feature, that hasn't been mentioned here, is pandoc filters. Basically, pandoc provides a way for scripts (in any programming language) to hook into the transformation pipeline and modify the document AST (similar to the HTML DOM) in-between the reading and writing steps. See <a href="http://pandoc.org/filters.html" rel="nofollow">http://pandoc.org/filters.html</a>
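To make the filter idea above a bit more concrete, here is a minimal sketch in Python using only the standard library. It assumes the JSON form of the pandoc AST, where each element is an object with a type tag `"t"` and optional contents `"c"`. The names `walk` and `emph_to_strong` are made up for this illustration and are not part of pandoc.

```python
#!/usr/bin/env python3
"""Toy pandoc JSON filter: turns every emphasized span into a strong one."""
import json
import sys


def walk(node, fn):
    """Rebuild the tree, applying fn to every AST element (a dict with a "t" key)."""
    if isinstance(node, list):
        return [walk(item, fn) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            node = fn(node)
        return {key: walk(value, fn) for key, value in node.items()}
    return node  # strings, numbers, etc. pass through unchanged


def emph_to_strong(elem):
    """Swap the Emph tag for Strong; both carry a list of inlines as contents."""
    if elem.get("t") == "Emph":
        return {"t": "Strong", "c": elem["c"]}
    return elem


def main():
    # pandoc pipes the document to the filter as JSON on stdin
    # and reads the modified JSON back from stdout.
    text = sys.stdin.read()
    if text.strip():
        json.dump(walk(json.loads(text), emph_to_strong), sys.stdout)


if __name__ == "__main__":
    main()
```

Saved as an executable script (say `emph_to_strong.py`, a hypothetical name), it would be run with something like `pandoc --filter ./emph_to_strong.py in.md -o out.html`.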
My favorite pandoc hack is using it to convert Word docs into Markdown, which can then be diffed like source code. Works great for legal redlining.
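A rough sketch of that redlining workflow, assuming pandoc is on the PATH. The function names and the default file labels are invented for the example. `--wrap=none` is used here so each paragraph stays on one line, which tends to keep the diff aligned with paragraphs.

```python
import difflib
import subprocess


def docx_to_markdown(path):
    """Convert a .docx file to Markdown text by shelling out to pandoc."""
    result = subprocess.run(
        ["pandoc", "-f", "docx", "-t", "markdown", "--wrap=none", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def redline(old_md, new_md, old_name="old.docx", new_name="new.docx"):
    """Return a unified diff of two Markdown strings (empty if identical)."""
    return "".join(difflib.unified_diff(
        old_md.splitlines(keepends=True),
        new_md.splitlines(keepends=True),
        fromfile=old_name, tofile=new_name,
    ))
```

Usage would look like `print(redline(docx_to_markdown("v1.docx"), docx_to_markdown("v2.docx")))`, with the two file names standing in for your own drafts.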
I use Pandoc to convert directories of Markdown files into static HTML websites.<p>Here's the build command for responsive.style[1]:<p><pre><code> pandoc $file -f markdown -t html5 -H templates/header-prod.html -B templates/nav.html -A templates/footer-prod.html -o (echo "../$file" | sed '$s/\.md$/.html/') -s --data-dir=./ --highlight-style breezedark --variable=file:(echo "$file" | sed '$s/\.md$/.html/')
</code></pre>
Works beautifully!<p>1: <a href="https://github.com/tomhodgins/responsive.style/blob/master/src/build-prod.sh" rel="nofollow">https://github.com/tomhodgins/responsive.style/blob/master/s...</a>
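For anyone who would rather drive the same kind of build from a script, here is a hedged Python sketch of the per-file loop. It keeps only the core flags from the command above (`-f`, `-t`, `-s`, `-H`, `-B`, `-A`, `-o`) and leaves out `--data-dir`, `--highlight-style` and `--variable`. The function names and directory layout are assumptions for illustration, not what the linked script does.

```python
import subprocess
from pathlib import Path


def build_command(md_file, out_dir, header, nav, footer):
    """Return the pandoc argv that renders one Markdown file to standalone HTML5."""
    md_file = Path(md_file)
    target = Path(out_dir) / (md_file.stem + ".html")
    return [
        "pandoc", str(md_file),
        "-f", "markdown", "-t", "html5", "-s",
        "-H", str(header),   # included in <head>
        "-B", str(nav),      # included at the start of <body>
        "-A", str(footer),   # included at the end of <body>
        "-o", str(target),
    ]


def build_site(src_dir, out_dir, templates):
    """Run pandoc once per *.md file in src_dir (pandoc must be on PATH)."""
    templates = Path(templates)
    for md_file in sorted(Path(src_dir).glob("*.md")):
        cmd = build_command(
            md_file, out_dir,
            templates / "header-prod.html",
            templates / "nav.html",
            templates / "footer-prod.html",
        )
        subprocess.run(cmd, check=True)
```

Keeping the argv construction in its own function makes it easy to print or inspect the commands before running anything.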
Another happy Pandoc user here :)<p>I built a pipeline to convert a Markdown file to publishing-ready files for ebooks, Kindle and paperback for my novel; the whole thing is described here: <a href="http://www.gabrielgambetta.com/tgl_open_source.html" rel="nofollow">http://www.gabrielgambetta.com/tgl_open_source.html</a><p>My website itself is static, generated from a bunch of Markdown files, some HTML templates, and a bit of postprocessing. But most of the work is done by Pandoc.
It appears that Pandoc generates PDF documents via LaTeX. One problem with this is that, as far as I can tell, LaTeX can't generate tagged PDFs. This is an accessibility problem. Granted, for documents that are heavy on math and/or graphics, the point is probably moot. But many technical documents that are distributed as PDFs would benefit from being tagged.<p>Luckily, LibreOffice can produce tagged PDFs. And unoconv is a convenient utility for doing this from the command line. So you can use pandoc to convert to a format that LibreOffice can consume, then issue a command like this:<p><pre><code> unoconv -f pdf -e UseTaggedPDF=true mydoc.odt
</code></pre>
I've tried it, and it works.
Pandoc (or LaTeX) + make + inotifywait work really well together for WYSIWYG-like editing too:<p><pre><code> watch: $(ALL)
	while true; do \
		clear; \
		make $(WATCH); \
		inotifywait -qr -e close_write .; \
	done
</code></pre>
"make watch WATCH=build" will now compile documents on every save. Works well for single documents, collections of documents or entire websites.
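On systems without inotifywait, a similar loop can be approximated by polling file modification times. This is a swapped-in technique, not what the recipe above does, and the function names and the one-second interval are assumptions made for the sketch.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map every file under root to its last-modified time in nanoseconds."""
    mtimes = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                pass  # file vanished between listing and stat
    return mtimes


def watch(root=".", target="build", interval=1.0):
    """Run `make target` once at start, then again whenever a file changes."""
    seen = None
    while True:
        if snapshot(root) != seen:
            subprocess.run(["make", target])
            # Re-snapshot after the build so its own outputs do not retrigger it.
            seen = snapshot(root)
        time.sleep(interval)
```

Polling is cruder than inotify: edits that land while a build is running are missed until the next change.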
Pandoc's creator, John MacFarlane, is also the lead guy on CommonMark[1].<p>There are a small number of corner cases that need to be spec'd out before CommonMark can declare a v1.0 release[2]. If you have the skills for this kind of thing, please weigh in!<p>[1] <a href="https://commonmark.org" rel="nofollow">https://commonmark.org</a><p>[2] <a href="https://talk.commonmark.org/t/issues-we-must-resolve-before-1-0-release-8-remaining/1287?u=vas" rel="nofollow">https://talk.commonmark.org/t/issues-we-must-resolve-before-...</a>
I wrote a little utility that uses Pandoc to read Markdown files like `man` pages in the terminal:<p><a href="https://github.com/ashton314/marked-man" rel="nofollow">https://github.com/ashton314/marked-man</a><p>It's just a one-liner: `pandoc -s -t man "$1" | groff -T utf8 -man | $PAGER`<p>(That was basically stolen from an answer to one of my questions on Stack Overflow—thanks to those who answered! :)
I sometimes use pandoc to clean up my markdown-formatted documents, especially given its abilities to "wrap" text and add indentation-style whitespace that makes plain-text documents look nearly suitable for publishing as-is (almost kinda like RFC docs but without header/footer cruft).<p>There are a few things (in latest version, 2.2.3.2) that don't really survive round-trip from markdown back to markdown:<p>- reference-style links (e.g. `[foo][f]`). They are converted to inline links e.g. `[foo](http://...)`.<p>- setext vs hashmark headers. `foo\n=====` will get converted to `# foo`.<p>- markdown allows for forced-linebreak <br>s to be added with two trailing blank spaces at the end of a line. Pandoc escapes these with a trailing `\` at the end of the line.<p>These are only occasional nuisances, but overall the documents (at least in my experience) are not butchered.<p>I also occasionally go from markdown to docx for the purposes of uploading to google-docs and copy/pasting large sections into other docs. This is the only markdown-to-google-docs workflow I've found that works to preserve formatting. It's never really butchered anything, except a few times the syntax-highlighting for code-blocks gets confused and keywords get the wrong colors.
I "love" how many comments are one person praising pandoc for helping them in some workflow, and then commenters ripping into them for not using some other tool. I wonder if there's a corollary to some internet rule that the more generally useful a tool is, the more detractors will push for other tools to be used? It would help explain why programming language discussions get so contentious.<p>Pandoc is seriously a great tool! I love the way it's designed and have found it useful off and on over the years. Truly marvelous for making information available in any needed format.
Pandoc is great software for converting among file formats, such as text, markdown, HTML, PDF, etc.<p>Example:<p><pre><code> pandoc in.md -o out.html -V pagetitle="My Title" --to=html5 --template="my.html" --css "my.css"
</code></pre>
The example converts a markdown file to HTML, using a given title, a template file, and a stylesheet file.<p>The pipeline is also well implemented with Haskell, which is good for writing your own fast functional transformations.
I love pandoc. I've been using it intermittently for years to turn my Markdown and org-mode documents into other formats. Just wish it would take Asciidoc as an input format.
I used pandoc to format my book [0]. Not everything worked perfectly, but I'm pretty happy with how everything turned out (especially the print version).<p>It was a little work to set up the workflow with scripts etc, but being able to write the book in markdown and still having full control over the design was definitely worth it.<p>[0] sample here: <a href="https://patricklouys.com/professional-php-sample.pdf" rel="nofollow">https://patricklouys.com/professional-php-sample.pdf</a>
You can use the Haskell-based static site generator Hakyll with Pandoc to create the best blogging experience imho.<p>An example of how easy this is and the styles I use for my personal blog:
<a href="https://curious.observer" rel="nofollow">https://curious.observer</a>
<a href="https://github.com/davnn/curiousobserver" rel="nofollow">https://github.com/davnn/curiousobserver</a>
Maybe I used an older version but my attempts to use pandoc usually resulted in the document being butchered because the internal representation was not as expressive as the source or target formats.
If you don't want to install Haskell and other dependencies, several folks have developed Docker images for using pandoc:<p><a href="https://users.soe.ucsc.edu/~ivo/_posts/2015-03-12-repeatable-paper-generation-with-docker-and-pandoc.html" rel="nofollow">https://users.soe.ucsc.edu/~ivo/_posts/2015-03-12-repeatable...</a><p><a href="http://gbraad.nl/blog/document-generation-using-markdown-and-pandoc.html" rel="nofollow">http://gbraad.nl/blog/document-generation-using-markdown-and...</a><p><a href="https://github.com/jagregory/pandoc-docker" rel="nofollow">https://github.com/jagregory/pandoc-docker</a>
Yet another pandoc user here. I built a blog engine using Pandoc as the core. Code available here: <a href="https://github.com/subinsebastien/kyll" rel="nofollow">https://github.com/subinsebastien/kyll</a>. The website built using the blog engine is available here: <a href="http://xtel.in/" rel="nofollow">http://xtel.in/</a>
I tried to use pandoc a while ago to convert the LaTeX sources of arxiv.org documents to EPUB, since those are often much more comfortable to read on small devices than PDFs.<p>The problem I had was that the math was turned into images, and changing the reader's font size did not change the size of the images, so the text was readable but the maths barely so.<p>This is something I would love to see work, though.
I like pandoc. I've been using Typora [1] for all of my writing, and it's decent, but a little slow.<p>What editor do HN folks use? I wonder if there's a leaner editor out there with an equally nice distraction-free editing interface. Thanks in advance!<p>[1] <a href="https://typora.io/" rel="nofollow">https://typora.io/</a>
What don't I use it for?<p>+ Static websites from any input to html<p>+ Markdown & TeX & References to pdf for academia<p>+ Generating manpages for new tools<p>+ Generating ebooks<p>... Let's just say I get a bit lost when it isn't available.
I love pandoc, but I'm very surprised that such an established tool has (at time of writing) 865 points and is #1 on HN.<p>I guess it's not as well-known as I thought.
i have been using catdoc and pdftotext to convert doc and pdf files, respectively. nice to see that there's an alternative that also includes a library, will be checking this out.<p>a couple questions i have, seems firstly that old school .doc files are not supported, docx yes. unfortunately i still get a lot of docs in .doc format which seems to be microsoft's proprietary format (docx seems to be more open).<p>my second question is whether or not there's a filter for golang, most of my development is in golang, so i either need to call your cli as a forked process or best to have a native library. i have never worked with haskell so not sure if i can import a haskell library from golang directly. i imagine there'd need to be a golang wrapper around the cli.
Pandoc is great! I use pandoc for all kinds of formal writing (conversion to PDF via LaTeX). We also run pandoc in production to produce customer-facing PDFs.
I write any document that doesn't need extensive custom typesetting (which is 90% of stuff) in org-mode and then use pandoc to convert it to "normal people" formats at the end. I have made a basic template for MS Word that looks pretty good.