Don't write just in plain text (longevity vs. authenticity)

138 points by yumiris, about 3 years ago

35 comments

dang, about 3 years ago
Related large thread from yesterday:

*Write plain text files* - https://news.ycombinator.com/item?id=30521545 - March 2022 (345 comments)
nonrandomstring, about 3 years ago
This essay is actually deeper than its surface appearance, about text versus other formats. It's about semantics and richness of content, although I am not sure Miris fully grasps what s/he is wrestling with.

The author invokes the concept of "authenticity", and that's where it gets interesting.

I used to set my students a question about information content in a class on the philosophy of procedural representation.

We had a very high resolution photo of the aviation pioneer Amelia Earhart, and a short grainy video clip of her getting into a plane and smiling and waving.

My question was: Which one of these two media conveys more information about Amelia?

One gave extraordinary detail of her face and eyes, and seemed to many a much better "fidelity" document. Others noticed that although you couldn't see her face in the video, you could feel from her gait, waving, body language and the way she shook hands _much more_ about her than from the static photo.

Both files are the same size in bytes.

So which one has more "information"? Which one is more "authentic"?

Not to attempt to answer here with a deep dive into phenomenology, but each carries a different kind of information, which can be static, dynamic, or meta-dynamic in higher orders relative to a matrix of assumptions that must be carried forward in parallel by the culture that wants to decode the message later.

I like that Miris tries to explore this by questioning the richness of text. But maybe the question doesn't hold up well under those conditions of investigation - because one might say that a great poet using only a few words might capture a landscape better than a painting, but if our culture drifts toward a visual one where poetry is no longer understood we cannot say that the medium itself degraded.
thomascgalvin, about 3 years ago
This argument feels ... not quite like a strawman, but more pedantic than I think it needs to be.

I don't think anyone *really* argues that everything should be plain text, even if that's an easy shorthand. The real argument is "use the simplest, most open format possible."

Nobody is suggesting you go through all of your photos, transcribe your emotional reaction to each picture, and then delete the image. But, if you want to view those same photos when you're fifty years old, or seventy-five, you're better off storing them as a JPEG than a PSD, and you're better off storing them on a hard drive you have access to in addition to whatever cloud they're currently occupying.

"Write plain text" is a shorthand for "use open formats." Because so much of what this audience does is text-based, plain text is the most common format we use, from source code to journaling, but that message applies to pretty much anything: if you lock yourself into a proprietary format, or a proprietary editor, you will almost certainly lose data over the long term.
llarsson, about 3 years ago
That "some" proprietary formats from the 80's and 90's are still readable is already causing real problems: because not *all* are. So text, possibly with Markdown or similar hints regarding emphasis and structure, is still vastly better than any alternative I can think of.
eatmygodetia, about 3 years ago
I feel like a lot of "use plain text" proponents forget that outside of ASCII and now UTF-8, lots of alleged plain text documents with diacritics or non-Latin characters are at least slightly difficult to open because of their somewhat esoteric encodings. Plain text isn't as universal as it is often claimed, although it is immensely simpler than some other formats.

But maybe we should all use monochrome bitmap files for everything? That would be very simple.
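
The encoding problem described in this comment can be made concrete with a small sketch. It is only an illustration: the sample string, the list of candidate encodings, and the helper name `decode_attempts` are all assumptions, not anything from the thread. The same bytes are decoded under several encodings to show that a "plain text" file is only readable once the right encoding is known.

```python
# Hypothetical illustration: the sample text and the candidate encodings
# below are assumptions chosen for the example.
CANDIDATES = ["utf-8", "cp1252", "cp437"]


def decode_attempts(raw: bytes, encodings=CANDIDATES) -> dict:
    """Try each encoding; map its name to the decoded text, or None on failure."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Bytes as an older Windows editor might have saved them (assumed scenario).
legacy = "naïve café".encode("cp1252")
attempts = decode_attempts(legacy)

for name, text in attempts.items():
    print(f"{name:8} -> {text!r}")
```

Only one of the three candidates gives back the original text; one fails outright and one silently produces different characters, which is the harder failure to notice.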
yumiris, about 3 years ago
This was concocted at 5AM -- my apologies for any peculiar sentence structures or odd phrasing.

Will re-re-re-revise it again with fresh eyes after resting 'em!
aasasd, about 3 years ago
I got quite a lot of use out of metadata over the years, such that now I'll probably get a nervous itch and tremors all over my body if I attempt to use *just* plain text. Specifically, the creation and modification times for each addition to my notes are rather valuable, especially with the work-from-home lifestyle, aka ‘day fades into night into day’, with which more people are gonna be familiarized in these years.

Thankfully I'm using Org-mode these days, which is reasonably ‘plain text’ under practical definitions, but I make dozens of new headings every week, and each of them is stamped with the creation time. But boy do I miss having modification times too; should probably finally set up automatic commits to Git. Also need to mess with Orgzly so that it marks notes that are created on the phone.
brians, about 3 years ago
“all the binary formats of the 1990s can be opened today”

Oh, sweet summer child. Scribe/mss. Koalapad. A bunch of Apple 2GS, Apple 3, and Lisa formats. Lotus Improv.

The points about semantics and authenticity are wonderful, but I think the presumption that all formats can be opened is mistaken exactly because those that can’t be opened become effectively invisible and lost.
ggm, about 3 years ago
he said... in Courier, in monospaced paragraph format, morally as close to "plaintext" as you can be, with a couple of diagrams which could have been ASCII art...
briandoll, about 3 years ago
I assume this is a response to Derek Sivers' post: Write Plain Text Files https://sive.rs/plaintext

I've been using computers daily for about 35 years now and I have a _lot_ of plain text files that I regularly use -- notes, lists, outlines, quotes, links, etc. Does anyone who has been around a while have a large multi-decade collection of texts that are _not_ plain text? What formats do you use? How do you maintain access to those files over time?
titzer, about 3 years ago
> What ultimately matters is that information is captured and preserved as thoroughly as possible. Between a picture that expresses a thousand words, and plain text file that sacrifices its detail and authenticity, why wouldn't we choose the former? Indeed, this question applies even the choice may sacrifice the longevity. What's the point of longevity, when the pursuit of it can compromise our ability to capture the information we may be afraid of possibly losing?

I would contend that capturing a picture is absolutely a massive distortion of reality because reality is three dimensional, exists in many spectra beyond visible light, has sounds, smells, taste, and feeling, and exists in a historical context. The selection of framing, distance, focus, all of these are biases of the photographer. A photo is a lie, too. Just because it's higher resolution doesn't mean it has indeed captured the right information.

Text is a lie too, granted. But in our current digitization zeitgeist, we have forgotten that our media (pictures, video, recordings, not just the TV, cable, and internet) lie to us. Our own bias towards slicing apart the world into computer-digestible bits is just us lying more convincingly to ourselves.
orzig, about 3 years ago
Render to ASCII, everyone wins! (e.g. https://ascii-generator.site/)
copperx, about 3 years ago
> but dismissing or abandoning media files is a much more guaranteed potential loss of information – information which plain text cannot capture due to its limitations.

Some examples are sorely needed. How is a Word/InDesign file more authentic than a plain text file? Or is the author talking about media? Is a ProTools session more authentic than WAV files?
jauco, about 3 years ago
Real archivists (as in people that have archivist as a job description and work at places that have “storing data forever” as a mission statement) tend to store the data in multiple formats. The source + a few derivations. They also store a bunch of copies to ward against bitrot. And they periodically compare the copies.<p>Real archivists use a lot of data :)
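
The "periodically compare the copies" step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a description of any archive's actual tooling: the helper names `sha256_of` and `find_divergent_copies` are hypothetical, and the sketch assumes that the majority of copies is still intact.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_divergent_copies(copies) -> list:
    """Return the copies whose digest differs from the most common digest.

    Assumption: most copies are undamaged, so the majority digest is
    treated as the reference value.
    """
    digests = {Path(p): sha256_of(Path(p)) for p in copies}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return [path for path, value in digests.items() if value != majority]
```

Run on a schedule over three or more copies, a non-empty result points at the copy to restore from one of the others. With only two copies a mismatch is detected but not attributed, which is one reason to keep more than two.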
davbryn1, about 3 years ago
"Prioritising the longevity of data can sacrifice the authenticity of what it tries to capture and preserve. When I say authenticity, I refer to how accurate and detailed the data in question preserves a particular state. An original raw image, for example, will capture a landscape much more authentically than written text would. Written text will inevitably comprise of ambiguity and even bias, if not distortion."

Or, you need to become a better writer.
Annatar, about 3 years ago
"This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

http://catb.org/~esr/writings/taoup/html/ch01s06.html
nicbou, about 3 years ago
There doesn't need to be a compromise. You can have both if you keep your data in multiple formats. Storage is cheap and text files are small.

My timeline thing [0] keeps the original archives, stores the timeline entries in a database, and exports them hourly as JSON + files. If the code stops working or the database crashes, the files are still there. The automated backups are there too. No information is lost.

However, the richness is not lost in the process. This timeline has geolocation history, notebook scans and a bunch of other things that don't really translate to plain text.

The most important difference is that I can write to my timeline from my phone. Managing text files across devices is quite troublesome by comparison. If I want plain text out of it, I can write a new Destination that pipes entries to plain text files or to a fax machine.

[0] https://nicolasbouliane.com/projects/timeline
dorfsmay, about 3 years ago
Whenever choosing a markup, image format, or other technologies, keep the Lindy effect in mind. A boring technology that has been around for a long time will survive a lot longer than a brand new shiny one.

https://en.wikipedia.org/wiki/Lindy_effect
writegit, about 3 years ago
Or both?

I have a daemon that watches for binary changes in writing documents.

If changes are identified then it runs:

    $ libreoffice --headless --convert-to txt <CHANGED_FILES>

Then commits the plaintext to a git repo.

Allows for diffs, text search, and "longevity" across "authentic" docs.
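
A rough sketch of how such a watcher could be put together is shown below. It is not the commenter's actual daemon: the function names, the `*.odt` pattern, the use of content hashes to detect changes, and the commit message are all assumptions. The `libreoffice` flags are the ones quoted in the comment, plus `--outdir` to place the text output inside the repository.

```python
import hashlib
import subprocess
from pathlib import Path


def snapshot(doc_dir: Path, pattern: str = "*.odt") -> dict:
    """Map each matching document to a SHA-256 digest of its bytes."""
    return {
        path: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(doc_dir.glob(pattern))
    }


def changed_since(old: dict, new: dict) -> list:
    """Documents that are new, or whose bytes differ from the last snapshot."""
    return [path for path, digest in new.items() if old.get(path) != digest]


def convert_command(doc: Path, repo: Path) -> list:
    """Build the headless LibreOffice call that writes a .txt copy into repo."""
    return ["libreoffice", "--headless", "--convert-to", "txt",
            "--outdir", str(repo), str(doc)]


def sync(changed: list, repo: Path) -> None:
    """Convert changed documents and commit the text output.

    Needs libreoffice and git on PATH; repo must already be a git repository.
    """
    if not changed:
        return
    for doc in changed:
        subprocess.run(convert_command(doc, repo), check=True)
    subprocess.run(["git", "-C", str(repo), "add", "-A"], check=True)
    # git exits non-zero when nothing changed, so this call is not checked.
    subprocess.run(["git", "-C", str(repo), "commit", "-m",
                    "Update plaintext exports"], check=False)
```

A loop that calls `snapshot`, compares it with the previous one through `changed_since`, passes the result to `sync`, and then sleeps would approximate the daemon; a file-system notification library could replace the polling.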
VariableStar, about 3 years ago
IMO the question is more about which standards are used, rather than specifying a specific format. In particular, using open and free standards and formats increases the chance to retrieve and use data after long time storage. Different formats suit different data types.
highspeedbus, about 3 years ago
Obsidian/Markdown file structure is great for this. It can become a standard "Offline Hypertext" format.

Despite text being fully portable, it is limited when it's needed to link an image or other files. People often forget how useful this concept is.

HTML is not a viable option as it is awfully verbose for taking a simple note.

Markdown adds just enough semantics and stays perfectly readable. From a hex editor to Microsoft Word.

We're in a somewhat critical moment, where Markdown can either stay as it is, then dominate and become a godsend format of solid usability for decades, or a harmful feature is added that would slowly drag the whole thing down until the next Just Write Plain Text blog post.
ad404b8a372f2b9, about 3 years ago
I think longevity is not just an issue of the data format but more so of its organization. It so happens that text files organized using the file system are the most easily producible, maintainable and queryable form of data organization. But other media can have the same properties if they're organized using the file system rather than any complex tools. I have graphs and datasheets that have endured decades that I refer to often and are easily findable because they are well-named files in well-named folders, even though the formats are comparatively much more complex.
Beldin, about 3 years ago
It seems the author overlooked the possibility of writing out the full binary string of whatever format he'd like (i.e., "zero one one ..."), prefaced by instructions on how to parse that.

That would give you great "authenticity" (in his definition) and great longevity.

Not practical for reading back, but that was not the point. With the help of a few simple scripts, writing is easy. So, in the end, not really an argument against storing information exclusively in plaintext.
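
The "few simple scripts" could look something like the sketch below. It is an assumed illustration, not the commenter's code: the function names are hypothetical, and it writes the digits `0` and `1` rather than the spelled-out words "zero" and "one". Each byte becomes eight text characters, so the text form is eight times the size of the original.

```python
def to_bit_text(data: bytes) -> str:
    """Write every byte as eight '0'/'1' characters, most significant bit first."""
    return "".join(f"{byte:08b}" for byte in data)


def from_bit_text(text: str) -> bytes:
    """Parse the digits back into bytes, ignoring whitespace and other characters."""
    bits = "".join(ch for ch in text if ch in "01")
    if len(bits) % 8:
        raise ValueError("bit string length is not a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Sample payload (assumed): the first four bytes of the PNG file signature.
payload = bytes([0x89, 0x50, 0x4E, 0x47])
encoded = to_bit_text(payload)
print(encoded)
print(from_bit_text(encoded) == payload)
```

The round trip preserves the bytes, but the result is only as readable as the parsing instructions that accompany it.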
jjice, about 3 years ago
We use Google Docs for pretty much all of our docs since they're easy to create, share, and modify, and it works pretty well. I just (selfishly) want a good integrated plain text editor as part of GSuite. Sharing code via Google Docs isn't great, and sometimes I don't want to think about headers and formatting, I just want to use tabs to separate my pieces. That said, I'm definitely in the minority of users and I'll deal with it, not that big of a deal.
thematrixadmin, about 3 years ago
What about writing data in Markdown format, physically on the HDD? You can use a bunch of different tools, both online and local, which will probably stay supported in the future. There is also no problem with implementing your own Markdown editor (a nice side or pet project as well). I store the files and run a small server on my RPi, accessible through my phone and desktop. If I'd like to show the text to somebody I can easily copy it as plain text or Word format, or export it to HTML or PDF.
happyglands, about 3 years ago
I've struggled with this for quite some time now, and tried almost every tool out there. At the moment, I'm settling with Bear, writing my notes in Markdown. I prefer the ease of using nvAlt but I need the ability to store images and PDFs and I like the fact that it has some very nice export options should I eventually move to another tool, so I don't feel like I'm "locked in".
m348e912, about 3 years ago
This might be off topic but in terms of communication such as email, plain text seems the most authentic format to me. For example, if you are one of those sales guys that bolds and highlights the important parts of an email that you send, it's off-putting. The only exception I would give is if you wanted to add an inline image or an emoji -- everything else, plain text.
amiga1200, about 3 years ago
The Epic of Gilgamesh was written in plain text.
jdvh, about 3 years ago
Plain text is so compelling because it's as simple as it gets, you can bring your own editor, you own your own data, and you can use version control.

Text+ is compelling because you can have images and some kind of formatting. You want to store metadata and have backlinks and tags. Ideally with the possibility of collaborative editing.

There should be a way to fuse these two.
quasarj, about 3 years ago
Wrote a whole article about not using plain text. Used plain text for everything except a useless image. A+++
chaxor, about 3 years ago
I like the idea of making a binary file into a plaintext file - but you could store it as the ASCII characters "0000110100111011110001111100101..."

This would be great for many reasons. At the top of that list, for example, is getting a lot more use out of those hard drives you paid for.
dade_, about 3 years ago
MD for all things text and SVG journals for handwritten notes, diagrams, sketches, screenshots. Works great, but haven’t found a way to integrate them beyond using a common set of folders.
anotherevan, about 3 years ago
Reminds me of the Einstein quote: Make something as simple as possible, but no more so.<p>Paraphrased: Make your information capture format as simple as possible, but no more so.
gandalfff, about 3 years ago
Plain text is fine for some things but lacking for others. I like GUIs for formatting. I wouldn't be surprised if my ODTs could be opened a thousand years from now.
a1445c8b, about 3 years ago
s/comprise of/comprise/g