Pandoc

1016 points by otp124 over 6 years ago

41 comments

Schiphol over 6 years ago

I do all of my academic writing in pandoc. As compared to LaTeX this means no boilerplate (yet you can still use full LaTeX syntax for equations and the like) and, if the publisher 'needs' a Word file, you are one click away from providing it. All with plain text files that you can put under version control, get meaningful diffs, etc. It's just great.

mb2100 over 6 years ago

Occasional pandoc contributor here, AMA :-)

Just a few links:

- Where everything is documented: http://pandoc.org/MANUAL.html
- If you have questions or suggestions: https://groups.google.com/forum/#!forum/pandoc-discuss
- Contributing to pandoc is also a great way to get your feet wet with Haskell. In my experience, very supportive community. See http://pandoc.org/CONTRIBUTING.html and for good first issues: https://github.com/jgm/pandoc/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22

Finally, a great feature that hasn't been mentioned here is pandoc filters. Basically, pandoc provides a way for scripts (in any programming language) to hook into the transformation pipeline and modify the document AST (similar to the HTML DOM) in between the reading and writing steps. See http://pandoc.org/filters.html
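
To make the filter idea a bit more concrete, here is a minimal sketch of a JSON filter in plain Python. It assumes pandoc's JSON serialization of the AST, where each element is an object with a `t` (type) key and an optional `c` (contents) key. The names `walk`, `shout` and `transform` are made up for this illustration, and the sketch uppercases every `Str` element, including any that appear in metadata.

```python
#!/usr/bin/env python3
import json
import sys


def walk(node, action):
    """Rebuild the tree, letting `action` replace any {"t": ..., "c": ...} element."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node["t"], node.get("c"))
            if replaced is not None:
                # The action supplied a replacement; use it as-is.
                return replaced
        return {key: walk(value, action) for key, value in node.items()}
    # Strings, numbers, booleans and None are returned unchanged.
    return node


def shout(kind, content):
    """Example action: uppercase the text of every Str element."""
    if kind == "Str":
        return {"t": "Str", "c": content.upper()}
    return None


def transform(doc):
    """Apply the example action to a whole decoded JSON document."""
    return walk(doc, shout)


def main():
    # pandoc writes the AST as JSON to stdin and reads the result from stdout.
    json.dump(transform(json.load(sys.stdin)), sys.stdout)


# pandoc passes the target format as the first argument to a filter,
# so the script only reads stdin when it is called that way.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved as, say, `shout.py` (a hypothetical file name), it could be run with something like `pandoc input.md --filter ./shout.py -o output.html`.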

koolba over 6 years ago

My favorite pandoc hack is using it to convert word docs into markdown which can then be diffed similar to source code. Works great for legal redlining.
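
A rough sketch of that workflow in Python, assuming `pandoc` is on the PATH. The function names are invented for illustration, and `--wrap=none` is my own choice so that each paragraph stays on one line and the diff is not dominated by re-wrapped text.

```python
import difflib
import subprocess


def docx_to_markdown(path):
    """Convert one .docx file to Markdown text by shelling out to pandoc."""
    result = subprocess.run(
        ["pandoc", "-f", "docx", "-t", "markdown", "--wrap=none", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def redline(old_markdown, new_markdown, old_name="old", new_name="new"):
    """Return a unified diff of two Markdown strings (empty if identical)."""
    lines = difflib.unified_diff(
        old_markdown.splitlines(),
        new_markdown.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    )
    return "\n".join(lines)
```

With two drafts on disk, `print(redline(docx_to_markdown("v1.docx"), docx_to_markdown("v2.docx")))` would show the changed clauses (the file names are placeholders).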

err4nt over 6 years ago

I use Pandoc to convert directories of Markdown files into static HTML websites.

Here's the build command for responsive.style[1]:

    pandoc $file -f markdown -t html5 \
      -H templates/header-prod.html \
      -B templates/nav.html \
      -A templates/footer-prod.html \
      -o (echo "../$file" | sed '$s/\.md$/.html/') \
      -s --data-dir=./ --highlight-style breezedark \
      --variable=file:(echo "$file" | sed '$s/\.md$/.html/')

Works beautifully!

[1]: https://github.com/tomhodgins/responsive.style/blob/master/src/build-prod.sh
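
For anyone who would rather drive the same build from a script, here is a hedged Python sketch. It only reuses the flags and template paths from the command above; the function names and the assumption that the Markdown files sit in the current directory are mine, not taken from the linked build script.

```python
import re
import subprocess
from pathlib import Path


def html_name(md_file):
    """Mirror the sed expression: swap a trailing .md for .html."""
    return re.sub(r"\.md$", ".html", md_file)


def pandoc_args(md_file):
    """Build the pandoc argument list for one Markdown file."""
    target = html_name(md_file)
    return [
        "pandoc", md_file,
        "-f", "markdown", "-t", "html5",
        "-H", "templates/header-prod.html",
        "-B", "templates/nav.html",
        "-A", "templates/footer-prod.html",
        "-o", "../" + target,
        "-s", "--data-dir=./",
        "--highlight-style", "breezedark",
        "--variable=file:" + target,
    ]


def build_all(directory="."):
    """Run pandoc once per .md file in `directory` (needs pandoc installed)."""
    for path in sorted(Path(directory).glob("*.md")):
        subprocess.run(pandoc_args(path.name), check=True, cwd=directory)
```

Building the argument list separately from running it makes it easy to print the commands first and check them before anything is written.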

ggambetta over 6 years ago

Another happy Pandoc user here :)

I built a pipeline to convert a Markdown file to publishing-ready files for ebooks, Kindle and paperback for my novel; the whole thing is described here: http://www.gabrielgambetta.com/tgl_open_source.html

My website itself is static, generated from a bunch of Markdown files, some HTML templates, and a bit of postprocessing. But most of the work is done by Pandoc.

myself248 over 6 years ago

The one thing it can't do is give HN posts descriptive titles.

jaggederest over 6 years ago

Also, an interesting point of trivia: the maintainer, John MacFarlane, is a professor of philosophy at UC Berkeley.

tambourine_man over 6 years ago

One nice trick that I use all the time is to convert HTML to Markdown and back again in order to clean it.

Anyway, pandoc is great.

mwcampbell over 6 years ago

It appears that Pandoc generates PDF documents via LaTeX. One problem with this is that, as far as I can tell, LaTeX can't generate tagged PDFs. This is an accessibility problem. Granted, for documents that are heavy on math and/or graphics, the point is probably moot. But many technical documents that are distributed as PDFs would benefit from being tagged.

Luckily, LibreOffice can produce tagged PDFs. And unoconv is a convenient utility for doing this from the command line. So you can use pandoc to convert to a format that LibreOffice can consume, then issue a command like this:

    unoconv -f pdf -e UseTaggedPDF=true mydoc.odt

I've tried it, and it works.
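
The two steps can be chained in a small script. This is only a sketch: the function names are hypothetical, the pandoc step assumes ODT as the intermediate format (as in the example above), and both `pandoc` and `unoconv` have to be installed for `build_tagged_pdf` to run.

```python
import subprocess
from pathlib import Path


def tagged_pdf_commands(source):
    """Return the two commands: source -> .odt via pandoc, .odt -> tagged PDF via unoconv."""
    odt = str(Path(source).with_suffix(".odt"))
    return [
        ["pandoc", source, "-o", odt],
        ["unoconv", "-f", "pdf", "-e", "UseTaggedPDF=true", odt],
    ]


def build_tagged_pdf(source):
    """Run both steps in order, stopping at the first failure."""
    for command in tagged_pdf_commands(source):
        subprocess.run(command, check=True)
```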

flukus over 6 years ago

Pandoc (or LaTeX) + make + inotifywait work really well together for WYSIWYG-like editing too:

    # Rebuild the $(WATCH) target whenever a file under . is saved.
    # Recipe lines must start with a tab.
    watch: $(ALL)
    	while true; do \
    		clear; \
    		make $(WATCH); \
    		inotifywait -qr -e close_write .; \
    	done

"make watch WATCH=build" will now compile documents on every save. Works well for single documents, collections of documents or entire websites.

eevilspock over 6 years ago

Pandoc's creator, John MacFarlane, is also the lead guy on CommonMark[1].

There are a small number of corner cases that need to be spec'd out before CommonMark can declare a v1.0 release[2]. If you have the skills for this kind of thing, please weigh in!

[1] https://commonmark.org

[2] https://talk.commonmark.org/t/issues-we-must-resolve-before-1-0-release-8-remaining/1287?u=vas

ashton314 over 6 years ago

I wrote a little utility that uses Pandoc to read Markdown files like `man` pages in the terminal: https://github.com/ashton314/marked-man

It's just a one-liner: `pandoc -s -t man "$1" | groff -T utf8 -man | $PAGER`

(That was basically stolen from an answer to one of my questions on Stack Overflow. Thanks to those who answered! :)

ryanianian over 6 years ago

I sometimes use pandoc to clean up my markdown-formatted documents, especially given its ability to "wrap" text and add indentation-style whitespace that makes plain-text documents look nearly suitable for publishing as-is (almost kinda like RFC docs but without header/footer cruft).

There are a few things (in the latest version, 2.2.3.2) that don't really survive a round-trip from markdown back to markdown:

- Reference-style links (e.g. `[foo][f]`). They are converted to inline links, e.g. `[foo](http://...)`.
- Setext vs hashmark headers. `foo\n=====` will get converted to `# foo`.
- Markdown allows forced line breaks to be added with two trailing spaces at the end of a line. Pandoc escapes these with a trailing `\` at the end of the line.

These are only occasional nuisances, but overall the documents (at least in my experience) are not butchered.

I also occasionally go from markdown to docx for the purposes of uploading to Google Docs and copy/pasting large sections into other docs. This is the only markdown-to-Google-Docs workflow I've found that preserves formatting. It's never really butchered anything, except that a few times the syntax highlighting for code blocks gets confused and keywords get the wrong colors.

CodexArcanum over 6 years ago

I "love" how many comments are one person praising pandoc for helping them in some workflow, and then commenters ripping into them for not using some other tool. I wonder if there's a corollary to some internet rule that the more generally useful a tool is, the more detractors will push for other tools to be used? It would help explain why programming language discussions get so contentious.

Pandoc is seriously a great tool! I love the way it's designed and have found it useful off and on over the years. Truly marvelous for making information available in any needed format.

jph over 6 years ago

Pandoc is great software for converting among file formats, such as text, markdown, HTML, PDF, etc.

Example:

    pandoc in.md -o out.html -V pagetitle="My Title" --to=html5 --template="my.html" --css "my.css"

The example converts a markdown file to HTML, using a given title, a template file, and a stylesheet file.

The pipeline is also well implemented with Haskell, which is good for writing your own fast functional transformations.

phalangion over 6 years ago

I love pandoc. I've been using it intermittently for years to turn my Markdown and org-mode documents into other formats. Just wish it would take Asciidoc as an input format.

patricklouys over 6 years ago

I used pandoc to format my book [0]. Not everything worked perfectly, but I'm pretty happy with how everything turned out (especially the print version).

It was a little work to set up the workflow with scripts etc., but being able to write the book in markdown and still have full control over the design was definitely worth it.

[0] sample here: https://patricklouys.com/professional-php-sample.pdf

caconym_ over 6 years ago

I write fiction as a hobby. I do it in markdown and use Pandoc to turn it into epub files with custom CSS. It works great. Thanks Pandoc!

davnn over 6 years ago

You can use the Haskell-based static site generator Hakyll with Pandoc to create the best blogging experience imho.

An example of how easy this is and the styles I use for my personal blog: https://curious.observer and https://github.com/davnn/curiousobserver

basementcat over 6 years ago

Maybe I used an older version but my attempts to use pandoc usually resulted in the document being butchered because the internal representation was not as expressive as the source or target formats.

adzm over 6 years ago

Pandoc is also a great educational Haskell project for those looking into how it all works.

scentoni over 6 years ago

If you don't want to install Haskell and other dependencies, several folks have developed Docker images for using pandoc:

- https://users.soe.ucsc.edu/~ivo/_posts/2015-03-12-repeatable-paper-generation-with-docker-and-pandoc.html
- http://gbraad.nl/blog/document-generation-using-markdown-and-pandoc.html
- https://github.com/jagregory/pandoc-docker

subinsebastien over 6 years ago

Yet another pandoc user here. I built a blog engine using Pandoc as the core. Code available here: https://github.com/subinsebastien/kyll

The website built using the blog engine is available here: http://xtel.in/

rotorblade over 6 years ago

I tried to use pandoc a while ago to convert the LaTeX sources of arxiv.org documents to epub, since those are often much more comfortable to read on small devices than PDFs.

The problem I had was that the LaTeX math was turned into images, and changing the font size in the reader did not change the size of the images. That left the text readable, but the maths barely readable.

This is something I would love to see happen though.

disqard over 6 years ago

I like pandoc. I've been using Typora [1] for all of my writing, and it's decent, but a little slow.

What editor do HN folks use? I wonder if there's a leaner editor out there with an equally nice distraction-free editing interface. Thanks in advance!

[1] https://typora.io/

hatmatrix over 6 years ago

Even though org-mode has its own exporters, Pandoc is great for the extra bibtex integration.

voltagex_ over 6 years ago

The only problem I have with pandoc is I have to lug the entire GHC around with it.

shakna over 6 years ago

What don't I use it for?

+ Static websites from any input to html
+ Markdown & TeX & References to pdf for academia
+ Generating manpages for new tools
+ Generating ebooks

... Let's just say I get a bit lost when it isn't available.

bovermyer over 6 years ago

I love pandoc, but I'm very surprised that such an established tool has (at time of writing) 865 points and is #1 on HN.

I guess it's not as well-known as I thought.

epynonymous over 6 years ago

i have been using catdoc and pdftotext to convert doc and pdf files, respectively. nice to see that there's an alternative that also includes a library, will be checking this out.

a couple questions i have. firstly, it seems that old school .doc files are not supported, docx yes. unfortunately i still get a lot of docs in .doc format, which seems to be microsoft's proprietary format (docx seems to be more open).

my second question is whether or not there's a filter for golang. most of my development is in golang, so i either need to call your cli as a forked process or, ideally, have a native library. i have never worked with haskell so not sure if i can import a haskell library from golang directly. i imagine there'd need to be a golang wrapper around the cli.

GlenTheMachine over 6 years ago

As a guy attempting to transition from macOS to Linux:

Pages to anything else, please.

kccqzy over 6 years ago

Pandoc is great! I use pandoc for all kinds of formal writing (conversion to PDF via LaTeX). We also run pandoc in production to produce customer-facing PDFs.

bkyan over 6 years ago

Is there an equivalent of this for spreadsheets?

rllin over 6 years ago

frustratingly slow for word docs. antiword is better for those of you who wish to convert word docs en masse

nambit over 6 years ago

I have used pandoc with uikit to autoconvert my markdown pages to html. Works like a charm.

rydel over 6 years ago

Really one of the best tools! Simple to use and gets things done.

fastier over 6 years ago

Where is .djvu?

boonasty69 over 6 years ago

updated and secure.

another-cuppa over 6 years ago

I write any document that doesn't need extensive custom typesetting (which is 90% of stuff) in org-mode and then use pandoc to convert it to "normal people" formats at the end. I have made a basic template for MS Word that looks pretty good.

Numberwang over 6 years ago

I wish they’d fix the md to adoc table conversion issues. Apart from that I love it.

euske over 6 years ago

I know it's well intended and somewhat successful, but I can't help but think of xkcd.com/927

Sorry, I couldn't resist.