From Word to Markdown to InDesign: Fully Automated Typesetting Using Pandoc

114 points by rhythmvs over 9 years ago

11 comments

Animats over 9 years ago

Well, yes, if you dumb your documents down to the level of "markdown", it's not hard to grind them out as plain text with some styling. You can write HTML in that style, too. People did that 20 years ago.[1] That was the original vision of HTML.

Some people wanted to stop there, and limit HTML to describing the semantics of a document, not its visual appearance. They lost.[2]

Pandoc doesn't really use "markdown". It uses "enhanced markdown":

"Pandoc’s enhanced version of Markdown includes syntax for footnotes, tables, flexible ordered lists, definition lists, fenced code blocks, superscripts and subscripts, strikeout, metadata blocks, automatic tables of contents, embedded LaTeX math, citations, and Markdown inside HTML block elements."

Pandoc's "markdown" now has roughly the feature set of HTML 3.

[1] http://www.animats.com/papers/articulated/articulated.html
[2] http://www.w3.org/People/Raggett/book4/ch02.html

rubidium over 9 years ago

So 30% of my current job is writing technical user guides for large, custom, automated research systems (the rest is design and development of those systems).

Each one is slightly different, and so needs its own guide. We are currently stuck using Word because it's so fast. Yes, it's ugly. Yes, it's a pain to work with sometimes. But I haven't found anything that's faster to produce these docs with.

The reason we haven't switched to HTML, LaTeX, XML, etc. is that I haven't found a typesetting program that is as easy to drag and drop images into and out of. Each guide has 30+ images on average. I _need_ that interface to be drag and drop, otherwise it's too tedious to write the guide. The formatting of the text I'd love to do in some sort of markup language, but the images have to be dead simple.

Short version: does anyone know of a typesetting solution with some markup language, version control, print-to-PDF at the click of a button, and drag-and-drop images?

jolux over 9 years ago

Is Markdown really "relatively new"? According to Wikipedia, it's older than OOXML, aka .docx, .xlsx, .pptx, etc.

https://en.wikipedia.org/wiki/Markdown
https://en.wikipedia.org/wiki/Office_Open_XML

shortformblog over 9 years ago

From the perspective of someone with experience working in the print industry: This is impressive work, and you nail down the biggest issue with Word: if you don't use pure style sheets, you get cruddy code. (Google Docs has this problem, too.)

I think that lots of folks are trying to find different ways to solve this problem in the print world, especially as stories have to be pulled into CMSes as well.

Personally, I'd be curious whether it would be possible to cull copy from an InDesign file and convert it into Markdown, complete with links to the images used in the document, so that it could then be edited.

katabasis over 9 years ago

It's great to see others in the publishing world moving away from proprietary software like Word and towards plain-text-based processes. As the author points out, there are many benefits to this.

Here's something to consider: you've ditched MS Word, so maybe you can ditch InDesign too? In my experience, once content goes into an Adobe program, it's hard to get it out again in a clean way. It's pretty impressive what you can accomplish using CSS3 for print layout[1].

I work at an academic publisher, and I've spent much of the last year preaching the benefits of a similar workflow. Some of the editors are now editing manuscripts in Markdown files directly. Currently we're building a system where a single set of text files gets fed into a program like a static website generator. This produces a web version, but also PDF, EPUB, etc. automatically. We're getting pretty close[2]. I think this is the future for many forms of publishing.

I think one of the big remaining pieces of the puzzle is creating a better Markdown editor, something suited to the needs of scholars and academics, with support for things like footnotes, bibliographies, etc., while remaining a plain-text format.

[1] http://alistapart.com/article/building-books-with-css3
[2] http://egardner.github.io/posts/2015/building-books-with-middleman/
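
The single-source idea described above (one set of text files, several output formats) can be sketched with Pandoc rather than the Middleman-based system the comment links to. This is a minimal illustration under stated assumptions: the file name `book.md`, the `TARGETS` table, and the `build_commands` helper are hypothetical, and the function only builds the command lines instead of running them.

```python
from pathlib import Path

# Hypothetical target table: Pandoc writer name -> output file suffix.
# "html", "epub" and "icml" (InDesign's InCopy format) are Pandoc writers.
TARGETS = {"html": ".html", "epub": ".epub", "icml": ".icml"}


def build_commands(source):
    """Return one pandoc argument list per target for a single Markdown source."""
    src = Path(source)
    commands = []
    for writer, suffix in TARGETS.items():
        out = src.with_suffix(suffix)
        commands.append([
            "pandoc", "--standalone",
            "--from", "markdown",
            "--to", writer,
            "--output", str(out),
            str(src),
        ])
    return commands


if __name__ == "__main__":
    # Print the commands; nothing is executed here.
    for cmd in build_commands("book.md"):
        print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a machine where Pandoc is installed. Keeping the commands as data makes it easy to add or drop a target without touching the source text.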

zyxley over 9 years ago

Unlinked footnotes in a webpage?

With all that unused margin space on a desktop, it seems like they would have made more sense as Tufte-style marginal notes anyway.

akavel over 9 years ago

@rhythmvs Have you considered using SILE [1][2] instead of LaTeX (and maybe even InDesign)? (I'm not affiliated, but I believe it may become a worthy successor to TeX in the future.)

[1]: http://video.fosdem.org/2015/main_track-typesetting/introducing_sile__CAM_ONLY.mp4
[2]: https://archive.fosdem.org/2015/schedule/event/introducing_sile/attachments/slides/772/export/events/attachments/introducing_sile/slides/772/sile.pdf

cossatot over 9 years ago

Will it work with equations? A major peeve of mine is trying to get TeX to deal with figures in a better manner, especially in space-limited situations (e.g. grant proposals), and I used to use InDesign for that before my work got too mathy.

pjstew over 9 years ago

I've been looking into the same problem for a while, and coincidentally came to the exact same solution a few weeks ago. I haven't actually got round to completing all the code for it, but I have tested each section. I was delighted when I spotted your article this morning, but was hoping you would have also shared your code... No git repository? I'm sure I can make the whole system myself, but I'm always happy to use others' work if it exists. If you do have a working version of this process, please do share it!

pessimizer over 9 years ago

I prefer AsciiDoc:

http://powerman.name/doc/asciidoc
http://asciidoctor.org/docs/what-is-asciidoc/

todd8 over 9 years ago

I'm really looking forward to the evolution of Markdown, but it will not be a complete replacement for TeX (for many years). TeX is designed around a powerful (macro-based) programming system. This is an excerpt from a comment that I posted on HN a while back that is apropos of this discussion:

TeX's macro style of programming is too difficult. Nevertheless, people have done amazing things with it.

TeX has somewhere around 325 primitives, and one of the most important is the \def primitive used to define macros. These primitives are used to define additional macros, hundreds of them, available in different so-called formats. A basic format known as Plain TeX includes about 600 macros in addition to the 325 primitives. LaTeX is another format, the most widely used, but there are others, like ConTeXt, that are also very capable. Each of these extends TeX's primitives with its own macros, resulting in a different kind of markup language. TeX's primitives are focused on the low-level aspects of typesetting (font sizes, text positions, alignment, etc.). LaTeX provides a markup language that is focused on the logical description of the document's components: headings, chapters, itemized lists, and so forth. The result is a system that does simple things easily while allowing very complex typesetting to be performed when needed.

In addition to the TeX core primitives and the hundreds of commands (implemented as macros) in a format like LaTeX, there are additional packages, classes, and styles that are used to provide support for any conceivable document. LaTeX has a rich ecosystem of packages. Typesetting chess? There's a LaTeX package for that. Complex diagrams and graphics? There's a LaTeX package for that. Writing a paper in the style of Tufte? Writing a book? Or a musical score? Or building a barcode? There are packages for that. The documentation for the TikZ & PGF graphics package is over 1100 pages long! The documentation for the Memoir package is 570 pages.

The amazing thing is that all of this is built out of macros. Diving into this (and once one needs to customize the look of a document, it's inevitable), you find yourself in a maze of twisty little passages.

Once upon a time, while writing assembly language for large computers, I enjoyed writing fancy assembler macros. I was fascinated with Calvin Mooers's TRAC programming language, based on macros, and Christopher Strachey's General Purpose Macrogenerator. These were early (mid-1960s) explorations into the viability of macro processors as a means for expressing arbitrary computations. Readers interested in trying out macros for programming can try the m4 programming language (by Kernighan and Ritchie) found on Unix and Linux systems. m4 is used in autoconf and sendmail config files.

Yet TeX macros are in a whole other dimension. All of these powerful macro systems have one thing in common: parameterized macros can be expanded into text that is then rescanned, looking for newly formed macro calls (or new macro definitions) to expand, as many times as one wants. This isn't just an occasional leaky abstraction; it is programming by way of leaky abstractions. Working through TeX packages is some of the most difficult programming that I've done. It's unbelievably impressive what people have come up with (e.g. floating point implemented via macro expansion in about 600 lines of TeX), but it's also unbelievably frustrating to program in such an environment.

The LaTeX3 project is an attempt to rewrite LaTeX (still running on top of the TeX core). Started in the early 1990s, it is still not done. I think it's just that they are mired in a swamp of macros. They do have a relatively stable set of macros written, with the catchy name expl3, that are intended for use when writing LaTeX3. Here's a sample:

<pre><code>
\cs_gset_eq:cc
  { \cf@encoding \token_to_str:N #1 } { ? \token_to_str:N #1 }
</code></pre>

This is described in the documentation as being a big improvement over the old macros and "far more readable and more likely to be correct first time". I can't wait.

I think LaTeX is absolutely without peer, but I wish improving its programming method wasn't so daunting. I keep toying with starting a project to do just that, but so many others have tried and failed. It's disheartening.

Links:

[TRAC] https://en.wikipedia.org/wiki/TRAC_(programming_language)
[GPM] http://comjnl.oxfordjournals.org/content/8/3/225.full.pdf
[m4] info pages available on Unix and Linux
[Tikz & PGF] https://www.ctan.org/pkg/pgf?lang=en
[Memoir] https://www.ctan.org/pkg/memoir?lang=en
[expl3] https://www.tug.org/TUGboat/tb30-1/tb94wright-latex3.pdf
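
The expand-then-rescan behaviour described above can be illustrated with a toy macro processor. This is only a sketch under loud assumptions: it is not how TeX, m4, TRAC, or GPM are actually implemented (TeX works on tokens with category codes, not on raw strings), and the `expand` function, the `\name{arg}` call syntax, and the sample macros are all hypothetical. It handles at most one argument per macro and no nested braces.

```python
import re

# Hypothetical call syntax: a backslash followed by letters, e.g. \greet
NAME = re.compile(r"\\([A-Za-z]+)")
# A single brace-delimited argument with no nested braces, e.g. {world}
ARG = re.compile(r"\{([^{}]*)\}")


def expand(text, macros, max_steps=1000):
    """Expand the leftmost defined macro call, then rescan the whole text."""
    for _ in range(max_steps):
        call = next(
            (m for m in NAME.finditer(text) if m.group(1) in macros), None
        )
        if call is None:
            return text  # no defined macro calls remain
        body = macros[call.group(1)]
        end = call.end()
        if "#1" in body:
            # The macro takes one argument: consume the following {...}
            arg = ARG.match(text, end)
            if arg is None:
                raise ValueError("missing argument for \\" + call.group(1))
            body = body.replace("#1", arg.group(1))
            end = arg.end()
        # Splice the body in; the next iteration rescans from the start,
        # so calls that only exist after this expansion are found too.
        text = text[:call.start()] + body + text[end:]
    raise RecursionError("macro expansion did not terminate")


if __name__ == "__main__":
    macros = {"greet": r"Hello, \emph{#1}!", "emph": "*#1*", "who": "world"}
    print(expand(r"\greet{\who}", macros))
```

In the sample, the call to `\emph` does not exist in the input at all; it only appears after `\greet` has been expanded, and the argument `\who` is carried along unexpanded until a later pass reaches it. That deferred, text-level behaviour is the property that makes such systems both powerful and hard to reason about.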
