Why are the Microsoft Office file formats so complicated? (And some workarounds)

60 points by unfoldedorigami, about 17 years ago

11 comments

WenomousVit, about 17 years ago
Joel starts out by telling you that Microsoft's file formats are not the product of a demented Borg mind and are not impossible to read or create correctly.

And then he spends the rest of the article explaining why the file formats are so demented and impossible to read or create correctly.

Which isn't the same thing at all. Just because there is a rich history of it-seemed-like-a-good-idea-at-the-time decisions doesn't mean the end result is any good.

There is a serious, deep, and interesting problem of scaling and complexity management that could be discussed here. But Microsoft's approach seems to have been one of embracing complexity. And Joel's role, today, is just defending that approach.

projectileboy, about 17 years ago
I call bullshit. A lot of his arguments perhaps make sense for early versions of Office, but for '97 and later? It didn't occur to anyone on the Word or Excel teams in 1997 that maybe interoperability would be an issue?

I really enjoy Joel's blog, but sometimes I can't stand the clubbishness of the old-school Microsoft brigade. "We were a bunch of geniuses, doing the best we could..." Yeah, well, the software is usable, but it kinda sucks. Don't be surprised if the people MS victimized for a decade sound critical now that we can all see how the sausage gets made.

codesurgeon, about 17 years ago
What is the motivation for Joel Spolsky's post? Sympathy for MS Office file formats? It reads like a manifesto for never ever changing or cleaning up the file formats because it took a thousand man-years to come up with the current incarnation. He even goes so far as to defend the Office file formats' complexity and the resulting maintainability hell by saying that portability was not an issue fifteen+ years ago. Well, it is now. His suggestions for file format conversion solutions are OK, but doesn't it occur to him that developers don't want to be locked in to shelling out money for MS APIs forever? This ties in well with Bruce Schneier's latest post on vendor lock-in: http://www.schneier.com/blog/archives/2008/02/lockin.html

tlrobinson, about 17 years ago
Normally I agree with Joel's articles, but having dealt with the new garbage that Microsoft calls a standard, Office Open XML, I can tell you that none of this applies. Yet the Open XML spec is over 6000 pages long (compared to these binary specs, which are a measly 100-300 pages each).

Open XML is not designed for performance... it's XML, and today's computers are fast enough. It IS [supposed to be] designed for interoperability (somehow they managed to get ECMA to put their stamp of approval on it), but in reality it feels like a half-assed attempt to wrap all of Office's legacy formats in XML. For example, to import you still need WMF importing, because a lot of the graphics (including all clip art) are WMF.

snorkel, about 17 years ago
Joel suggests using one licensed copy of Office to run as a web service backend, but I'm not so sure Microsoft's EULA allows that.

henning, about 17 years ago
"The idea of things like SGML and HTML (interchangeable, standardized file formats) didn’t really take hold until the Internet made it practical to interchange documents in the first place; this was a decade later than the Office binary formats were first invented."

Put another way, Microsoft didn't care about interop until they were convicted of illegal monopolistic practices and they couldn't get away with the kind of shit reflected in this spec anymore.

brlewis, about 17 years ago
MSFT is releasing specs only after the EU is basically forcing them to, and in the year 2008. Please don't try to tell me their formats are not deliberately obfuscated.

If a format's goal truly was to maintain forward/backward compatibility while remaining friendly toward low-end hardware, then it wouldn't be so hard to reverse engineer.

jdueck, about 17 years ago
You can't blame Microsoft for awful Office file formats. Remember, they wrote this stuff before the web came along, and before they generally started sucking really bad. It's just the accumulation of decades of feature creep, add-ons, re-dos, compatibility hacks, bug fixes, and workarounds.

Eventually, all software needs a rewrite. Not just MS's.

phil, about 17 years ago
One option Joel doesn't mention is to use OpenOffice. It has its own object model (called UNO), which has bindings in many languages.

It's not a perfect implementation of the formats, but it's good enough for most things. And it has the rather large advantage that you can run it under your favorite *nix.

BrandonM, about 17 years ago
It seems to me like Joel has a rather warped view of how software should be built. He claims:

"It means you have to rewrite all of your date display and parsing code to handle both epochs. That would take several days to implement, I think."

That's just ridiculous. In my mind, you need a piece of code that reads the 1904 record and sets a flag in the code. Then, your date display and parsing code should all call one, or maybe two, functions which handle the conversion for you based on this flag. Thus, supporting two different epochs requires at most three components: one to read it and set a flag, one to convert a numerical argument to a date based on that flag, and one to convert a date back to a numerical argument (again, based on the flag). Should this really take "several days to implement"? It seems to me that an hour should be plenty of time.

It's going to be hard for me to continue to take him seriously if this is his view of how software should be built.
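
For illustration, the flag-based design described in the comment above might look roughly like the following Python sketch. The names `serial_to_date`, `date_to_serial`, and `uses_1904` are hypothetical, and the step that reads the 1904 record from the file is assumed to have happened already. The sketch also assumes whole-day serial numbers and, in the 1900 system, serials of 61 or more (dates from 1 March 1900 onward), where counting from 30 December 1899 agrees with Excel; earlier serials are affected by Excel treating 1900 as a leap year and are not handled here.

```python
from datetime import date, timedelta

# Effective day zero for each date system. For the 1900 system this is only
# valid for serials >= 61, because Excel counts a nonexistent 1900-02-29.
EPOCH_1900 = date(1899, 12, 30)
EPOCH_1904 = date(1904, 1, 1)


def _epoch(uses_1904: bool) -> date:
    """Pick the epoch based on the flag set while reading the workbook."""
    return EPOCH_1904 if uses_1904 else EPOCH_1900


def serial_to_date(serial: int, uses_1904: bool) -> date:
    """Convert a whole-day serial number to a calendar date."""
    return _epoch(uses_1904) + timedelta(days=serial)


def date_to_serial(d: date, uses_1904: bool) -> int:
    """Convert a calendar date back to a whole-day serial number."""
    return (d - _epoch(uses_1904)).days


if __name__ == "__main__":
    # The same calendar day has different serials in the two systems.
    print(serial_to_date(35981, uses_1904=False))
    print(serial_to_date(34519, uses_1904=True))
```

Under these assumptions the two systems differ by a constant 1,462 days, so serial 35981 in the 1900 system and serial 34519 in the 1904 system both map to 5 July 1998. Whether a real implementation stays this small depends on the cases the sketch leaves out, such as fractional times and the early-1900 serials.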

edw519, about 17 years ago
"If you simply have to produce tabular data for use in Excel, consider CSV."

That's what most of us have been doing for years, simply to avoid the mess discussed in this article.

After reading the article, I, for one, will keep on doing it.
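
As a rough illustration of the CSV route mentioned above, here is a small Python sketch using the standard library `csv` module. The helper name `rows_to_csv` and the sample rows are made up for the example; the point is only that the writer handles quoting of commas and embedded quotes, which is where hand-rolled CSV output tends to go wrong.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize an iterable of rows (sequences of values) to CSV text."""
    buffer = io.StringIO(newline="")  # leave the writer's line endings untouched
    writer = csv.writer(buffer)       # default dialect: comma-separated, minimal quoting
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        ["Name", "Note"],
        ["Widget", 'He said "hi", twice'],  # contains both a comma and quotes
    ]
    print(rows_to_csv(sample), end="")
```

To write a file instead of a string, the same writer can be pointed at a file opened with `open(path, "w", newline="")`. How a spreadsheet program interprets encodings, dates, and leading zeros in the result is a separate question that this sketch does not address.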