Pandoc – A universal document converter

756 points by johnsonjo over 4 years ago

40 comments

ntnsndr over 4 years ago

I can't express enough my gratitude on a daily basis for what pandoc enables me to do. I made a simple Emacs script that I use to output files, and I use it constantly for LaTeX PDFs, HTML output, RevealJS slides, and odt/docx/etc. All with bibliographies from Zotero in zillions of formats. As a professor and journalist, I need to use a wide range of output formats, but as a human being I like to work in clean, simple text files that will never be obsolete. Pandoc, way more than any tool, gives me the freedom to work in any writing environment I like and keep that distinct from whatever weird formatting preferences a journal, magazine, or publisher might have. I've written two books with Markdown and a huge variety of articles. I am so thankful for the care with which it has been built and maintained. Thank you.

cosmic_quanta over 4 years ago

One thing I love about pandoc that I don't see mentioned here is the ability to apply filters to transform documents mid-conversion.

I'm using Pandoc to write my PhD thesis at the moment, from Markdown source, using certain filters to "augment" what Markdown can do. Examples:

https://github.com/LaurentRDC/pandoc-plot

https://github.com/lierdakil/pandoc-crossref

More info here: https://pandoc.org/filters.html
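
A minimal sketch of what a filter does, assuming pandoc's JSON representation of a document (every element is an object with a type tag `t` and contents `c`). It is not taken from either project linked above; the function names and the sample document are illustrative only, and it uses just the Python standard library.

```python
import json
import sys


def walk(node, action):
    """Rebuild a pandoc JSON AST, applying `action` to every tagged element."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node)
            if replaced is not None:
                node = replaced
        return {key: walk(value, action) for key, value in node.items()}
    return node


def emph_to_strong(element):
    """Turn emphasis into strong emphasis; return None to leave others alone."""
    if element["t"] == "Emph":
        return {"t": "Strong", "c": element["c"]}
    return None


def run_filter(source=sys.stdin, sink=sys.stdout):
    """Read the JSON AST pandoc sends on stdin, write the modified AST to stdout."""
    document = json.load(source)
    json.dump(walk(document, emph_to_strong), sink)


# Hand-written sample AST for demonstration (the version number is illustrative).
sample = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [{"t": "Emph", "c": [{"t": "Str", "c": "hello"}]}]}
    ],
}
converted = walk(sample, emph_to_strong)
print(converted["blocks"][0]["c"][0]["t"])  # prints "Strong"
```

To use something like this for real, save it as a script that ends with a call to `run_filter()` and pass it to pandoc with `--filter`, for example `pandoc thesis.md --filter ./emph_to_strong.py -o thesis.html` (file names hypothetical).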

szhu over 4 years ago

fwiw Pandoc's author, John MacFarlane, is also behind these projects that try to unify the Markdown ecosystem:

- Babelmark, a tool to compare how different Markdown parsers interpret the same Markdown input. https://johnmacfarlane.net/babelmark2/
- CommonMark, the first formalized Markdown standard, and now the de-facto Markdown standard. https://commonmark.org/ (He's the first listed member of the team.)

I feel like John is probably the single largest contributor to what Markdown is today, other than perhaps the creator of Markdown. Thank you for your work!

jaggederest over 4 years ago

I had an interesting conversation with John MacFarlane, the maintainer and author of Pandoc (lovely human being and excellent maintainer), and the subject of day jobs came up. He's a professor of logical philosophy at UC Berkeley which I thought was fascinating. It certainly makes sense given the number of document formats and such that academia deals with.

dmlorenzetti over 4 years ago

Pandoc is great at bridging the gap between science-oriented data control needs, and management-oriented reporting needs.

I was on a modeling project that used scripts to generate hundreds of input parameters, embed them in models, run the models, and produce reports. The inputs and outputs shifted a lot over the course of the project, as we came to understand the domain and implications of the work better. At every update, the changes had to be transferred to a Microsoft Word document that went to the project sponsors.

Pandoc made this easy -- we just added scripts to write out the model inputs as Markdown tables, then embed those tables in a larger writeup, also written in Markdown. Pandoc turned it all into a Word document. Thus, the same toolchain that did the actual work, also drove the final report. I really don't think we could have had confidence all the tabular data was right, had it not been automated through Pandoc.
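
The workflow above could look roughly like the sketch below: a script renders model inputs as a Markdown pipe table, and a separate pandoc call turns the surrounding writeup into a Word file. The parameter names, values, and file names are invented for illustration; the original project's scripts are not shown in the thread.

```python
import subprocess


def markdown_table(headers, rows):
    """Render rows as a Markdown pipe table (cells are assumed to contain no '|')."""
    def line(cells):
        return "| " + " | ".join(str(cell) for cell in cells) + " |"

    separator = "|" + "|".join(" --- " for _ in headers) + "|"
    return "\n".join([line(headers), separator] + [line(row) for row in rows])


def build_docx(markdown_path, docx_path):
    """Convert the Markdown writeup to Word; requires pandoc on the PATH."""
    subprocess.run(["pandoc", markdown_path, "-o", docx_path], check=True)


# Hypothetical model inputs.
table = markdown_table(
    ["Parameter", "Value"],
    [("flow_rate", 0.35), ("wall_thickness_mm", 12)],
)
print(table)
```

Because the table text comes from the same data structures that feed the model, rerunning the script after each change keeps the report and the model inputs in step.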

nathan_f77 over 4 years ago

I would like to start using Pandoc in my commercial software [1] to help convert documents into different formats, but the GPL license makes that difficult (or at least confusing.) I think it's generally fine to call a GPL program from a SaaS application. I believe it's fine as long as it is providing an optional or tangential feature, and your application can continue to perform the core functions when that GPL tool is not present. AGPL licenses go a step further and prevent access to any AGPL commands over the network, so that's when a commercial license is always required.

Am I allowed to distribute GPL programs contained inside a Docker image for on-premise installations? Do I just need to provide proper credit and a link to the source code?

Or is there a commercial license available for Pandoc? (I couldn't find anything.)

[1] https://docspring.com

UPDATE: I've decided to evaluate pandoc and see if it might be useful for supporting Markdown and Word formats, etc. If it is, then I'll reach out to John MacFarlane and ask about a commercial license (or just something in writing), perhaps in exchange for sponsorship on GitHub.

tarleb over 4 years ago

I'm a long time (7 years) contributor to pandoc. Other frequent contributors often drop by here as well. Happy to answer questions, ask us anything.

ImaCake over 4 years ago

Pandoc is a tool used daily by those of us who write code notebooks (rmd or jupyter) or are into using markdown for their notes and occasionally need to print said notes. It is hard to overstate how useful Pandoc is for me.

I would bet many people who use Pandoc have no idea they rely on it. I don't think Jupyter or RStudio make a big fuss about it even though they both use it.

wtroughton over 4 years ago

Probably overkill, but I use Pandoc to generate tailored resumes for roles and jobs I’m interested in.

I keep a list of all my skills, experience and education in a YAML file and have a LaTeX template that I clone when creating a new resume. Then it’s just a matter of replacing the template fields with YAML metadata and running Pandoc.
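
One way such a setup might be wired together, sketched under assumptions: the template below is a made-up miniature (real resume templates are far longer), the field names `name` and `skills` are hypothetical, and the file names are placeholders. It relies on pandoc's `$variable$` and `$for(...)$` template syntax and its `--metadata-file` and `--template` options.

```python
import subprocess

# Hypothetical miniature LaTeX template; pandoc fills $name$ and loops over skills.
RESUME_TEMPLATE = r"""\documentclass{article}
\begin{document}
\section*{$name$}
\begin{itemize}
$for(skills)$
\item $skills$
$endfor$
\end{itemize}
\end{document}
"""


def resume_command(body_md, metadata_yaml, template_tex, output_pdf):
    """Build the pandoc invocation that merges YAML metadata into the template."""
    return [
        "pandoc",
        body_md,
        f"--metadata-file={metadata_yaml}",
        f"--template={template_tex}",
        "-o",
        output_pdf,
    ]


def build_resume(command):
    """Run the command; requires pandoc and a LaTeX engine to be installed."""
    subprocess.run(command, check=True)


command = resume_command("body.md", "resume.yaml", "resume.tex", "resume.pdf")
print(" ".join(command))
```

Tailoring a resume for a new role then comes down to cloning the template, editing the YAML, and rerunning the same command.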

leephillips over 4 years ago

You can write filters in Python and several other languages. These let you perform arbitrary computation triggered by tags in your source document, and let you extend Pandoc’s Markdown to include your own custom tags to do anything you can imagine.

Here is an article where I show how to use Panflute, a library that lets you write filters in Python, and how I wrote a set of filters to automate the tedious parts of writing a complex technical manual:

https://lee-phillips.org/panflute-gnuplot/
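
To illustrate the "custom tag" idea, here is a small sketch that is not from the linked article and does not use Panflute: it works directly on pandoc's JSON AST with the standard library. It assumes a made-up convention in which a fenced code block marked with the class `sum` holds one number per line and should be replaced by a paragraph stating the total.

```python
def expand_sum_blocks(document):
    """Replace top-level code blocks of class 'sum' with a paragraph giving the total."""
    new_blocks = []
    for block in document["blocks"]:
        if block["t"] == "CodeBlock":
            # A CodeBlock's contents are [[identifier, classes, attributes], text].
            (identifier, classes, attributes), text = block["c"]
            if "sum" in classes:
                total = sum(float(line) for line in text.splitlines() if line.strip())
                block = {
                    "t": "Para",
                    "c": [
                        {"t": "Str", "c": "Total:"},
                        {"t": "Space"},
                        {"t": "Str", "c": f"{total:g}"},
                    ],
                }
        new_blocks.append(block)
    return {**document, "blocks": new_blocks}


# Hand-written sample AST (the version number is illustrative).
sample = {
    "pandoc-api-version": [1, 23],
    "meta": {},
    "blocks": [{"t": "CodeBlock", "c": [["", ["sum"], []], "1\n2\n3"]}],
}
result = expand_sum_blocks(sample)
print(result["blocks"][0]["c"][2]["c"])  # prints "6"
```

A complete filter would read the AST with `json.load(sys.stdin)` and write the result with `json.dump(..., sys.stdout)`; a library such as Panflute wraps that plumbing and the traversal of nested elements.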

karlicoss over 4 years ago

Pandoc is awesome! One of my favorite use cases is for Orger [0], which I'm using to automatically convert data from different services into org-mode for easier local-first/offline search, navigation etc. Often an API will give you Markdown (e.g. GitHub), and while I could embed a Markdown source block in org-mode, with Pandoc I can just convert it and display it in native Org syntax.

[0] https://github.com/karlicoss/orger#readme

roryokane over 4 years ago

If you want to do single-file conversions with Pandoc without having to install it, try http://markup.rocks/. It’s a compilation of Pandoc into 2.2MB of JavaScript so you can convert documents (and preview their HTML conversion) in your browser as you type. Its source code: https://github.com/osener/markup.rocks.

I most often use http://markup.rocks/ for converting HTML to Markdown and for testing that my reStructuredText syntax is correct when contributing to docs.

Pandoc also has a demo web page for trying it out (https://pandoc.org/try/). The demo supports all of Pandoc’s formats and doesn’t require a large JS download, but it silently truncates inputs to 3,000 characters.

jjice over 4 years ago

Pandoc is one of the programs that always surprises me with how good it is. Everything I throw at it works perfectly. I write my assignments for class as Markdown or plain text and it seamlessly turns them into a good-looking Word or LaTeX document.

It's also fantastic for converting my class notes from Markdown with LaTeX equations into beautiful PDFs.

amirkdv over 4 years ago

Pandoc is a true work of art. Everything about it embodies the Unix philosophy of "Do One Thing and Do It Well".

I've been using Pandoc (and make) daily for over 6 years for all sorts of document writing (letter, report, thesis, design doc, performance review, you name it) and to solve the occasional "interesting" format conversion problem. It's robust, reliable, fast, and a pleasure to use (and script).

dang over 4 years ago

If curious, see also a large thread from 2018: https://news.ycombinator.com/item?id=17855104

ravi-delia over 4 years ago

Always glad to see pandoc get some attention. This tool is probably in my top 5 overall, I barely make it through a day without it.

nn3 over 4 years ago

pandoc is one of the few packages (along with tetex) i blacklisted on my distribution for automatic updates because it seems to pull in hundreds of other packages which are not used by anything else.

I don't know how they did it, but somehow they put dependency hell on a completely new level.

Yes i'm sure it's a great tool, but there's a limit to how much bloat I can tolerate for a single program.

bigbubba over 4 years ago

Pandoc is great but I think it falls a bit short of being a Swiss army knife; there are a lot of conversions it cannot do, like PDF-to-anything. Thankfully Calibre's 'ebook-convert' tool covers many of pandoc's blindspots.

CornCobs over 4 years ago

Great thing about Pandoc - it has a clear, descriptive and yet unique name that aptly describes what it does.

That aside, I find the markdown + additional features (e.g. latex math, inline code eval), mainly as implemented in RStudio and R Markdown, to be the sweet spot of power and convenience of typing and legibility in plain text form. Thanks pandoc!

johnsonjo over 4 years ago

I've been using pandoc a lot recently for converting DRM-free epubs into plain text and then piping that into Mac's say command (say is a text-to-speech program). Generally I then pipe that to ffmpeg and output the file to mp3 for compression's sake. Obviously I only use the audio output for myself. But I find Mac's Books app useful for the audio because you can set the speed up to 2x the original. (I'm sure the say command also has some similar settings too.) I even set up my own Automator task to do most of the work for me. I am so thankful to those who made pandoc, though; it has come in handy time and time again. I used it for tons of my school papers back when I was in school and now it's my go-to document converter.

EDIT: I've also used this workflow for reading RFCs for OAuth and such. It's basically just a small curl piped to say. Sometimes if I feel like reading an article I'll add a readability-like CLI tool piped between the curl and say commands. Unix is awesome!
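
A rough sketch of the epub-to-audio steps described above. It swaps the pipes for intermediate files, since that is easier to show and debug, and it assumes macOS (for `say`) with pandoc and ffmpeg installed. The file names and the helper names are hypothetical; the Automator setup is not reproduced.

```python
import subprocess
from pathlib import Path


def audiobook_commands(epub_path, rate_wpm=None):
    """Return the three commands: epub -> plain text -> AIFF speech -> MP3."""
    epub = Path(epub_path)
    text = epub.with_suffix(".txt")
    aiff = epub.with_suffix(".aiff")
    mp3 = epub.with_suffix(".mp3")

    say = ["say", "-f", str(text), "-o", str(aiff)]
    if rate_wpm is not None:
        say += ["-r", str(rate_wpm)]  # speaking rate in words per minute

    return [
        ["pandoc", str(epub), "-t", "plain", "-o", str(text)],
        say,
        ["ffmpeg", "-i", str(aiff), str(mp3)],
    ]


def run_all(commands):
    """Run each step in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


for command in audiobook_commands("book.epub"):
    print(" ".join(command))
```

Passing `rate_wpm` speeds up the generated speech itself, as an alternative to raising the playback speed in a player afterwards.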

grecy over 4 years ago

I've self-published a couple of paperback novels that I create using LaTeX, then I run them through pandoc to get a perfectly formatted .epub that I use to sell the e-book versions.

Flawless!

asicsp over 4 years ago

I'm using pandoc for generating pdf/epub ebooks from GitHub style markdown. The default output is good enough and there are various themes that can be selected. But I wanted to customize a lot of things like chapter breaks, background color for inline code, bullet styles, blockquote style, etc. I didn't know LaTeX but was able to find snippets from stackexchange sites to suit my needs. I wrote a blog post on this: https://learnbyexample.github.io/customizing-pandoc/

sabalaba over 4 years ago

I absolutely love Pandoc, I use it in my Makefile based static site generator. Pandoc is probably one of the most valuable pieces of open source tooling next to ffmpeg and imagemagick.

mdeck_ over 4 years ago

Hadn't heard of pandoc before. Momentarily thought it converted from PDF to anything, and my heart leapt. Alas, it only converts to PDF. My hopes dashed...

quickthrower2 over 4 years ago

Poster child for Haskell

eska over 4 years ago

I used pandoc with filters written in Haskell for my blog. I was surprised how far I could stretch it before I had to switch to Rust with pulldown-cmark (just went for Rust for learning although it turned out to be a good decision).

Pandoc filters allowed me to transform the AST in useful ways. For example I turned the image tag into HTML figures with captions, used the video tag if the URL was a video, and called ffmpeg to encode the video in another format for browsers that didn't support the other format.

jmmcd over 4 years ago

I write my lectures and labs in .md and convert to pdf with pandoc. I like the results tex produces but I don't love the language, so pandoc is ideal.

mark_l_watson over 4 years ago

Pandoc is wonderful. I don’t use it often, but I always have it installed and available.

+1 for being written in Haskell. Indeed, way back when I became interested in Haskell, I think it was noticing that this tool I was using was written in a strange programming language that influenced me to eventually adopt it for many side projects and to write a little book on it.

flaweddwarf1231 over 4 years ago

As much as i like pandoc, i hate how many Haskell dependencies it has on archlinux. And the distro is not to blame here. They do it right. In that sense pandoc might be an excellent tool, but for me it's also a reason to think twice whenever you want to use haskell in production. Because apparently, this is a haskell ecosystem issue.

pandatigox over 4 years ago

This is probably a silly question, but the last (and first) time I used pandoc, my conversion of org files to markdown resulted in a lot of whitespace within the document itself. I followed the instructions on the website, but is there a flag that I should have used to get rid of excess whitespace?

raj2569 over 4 years ago

Long term pandoc user here!

Been using it with https://github.com/Wandmalfarbe/pandoc-latex-template to generate my documents.

Please comment if there are other nice templates, either for LaTeX or for Doc.

mekster over 4 years ago

It surprised me that I couldn't find a decent tool for reading Markdown in a shell. I tried about a dozen tools, but pandoc did it best: its output reads sufficiently well when fed into the man command.

Santosh83 over 4 years ago

Does anyone have practical experience maintaining an entire website through pandoc generated HTML? Is it worth it, and what are some pitfalls to be aware of?

mlang23 over 4 years ago

And with hakyll, you get a static site generator powered by all the goodness that is pandoc. Blazingly fast (compared to say, pelican) and easy to extend.

jasonshen over 4 years ago

This is great! Anyone know what the format for Google Docs is and whether Pandoc or another tool is good for importing GDocs into other formats?

laktak over 4 years ago

Pandoc is great though I struggle with latex. Is there an easier way to go from md to pdf with your own template?

arunaugustine over 4 years ago

Can anyone point me to docs/code where the internal pandoc format (AST) is described please?

svikashk over 4 years ago

I’ve used many converters in my life, but Pandoc is the one I always end up using every time

Causality1 over 4 years ago

I rather expected more than just two ebook formats on something described as a universal document converter.

fizixer over 4 years ago

Pandoc's Ubuntu apt installation is horrible.

I have installed the latest texlive in my home directory. When I invoke 'sudo apt install pandoc' it requires me to install a massive texlive setup at the system level as part of it.

This is not specific to pandoc but applies to many other packages. I have anaconda3 installed in my home, but image-magick requires a massive numpy/scipy system-level install (ignoring for the moment my bewilderment at why image-magick would require numpy/scipy).

I refuse to put up with this kind of bloated bs.
