Xee: A Modern XPath and XSLT Engine in Rust

381 points by robin_reala, about 2 months ago

35 comments

therealmarv, about 2 months ago
Great to see that somebody else is creating a true open source XSLT 3 and XPATH 3 implementation!
I worked on projects which refused to use anything more modern than XSLT & XPATH 1.0 because of the lack of support in the non-Java/.NET world (1.0 = tech from 1999). Kudos to Saxon though, it was and is great, but I wished there were more implementations of XSLT 2.0 & XPATH 2.0 and beyond in the open source world... both are so much more fun and easier to use in 2.0+ versions. For that reason I've never touched XSLT 3.0 (because I stuck to Saxon B 9.1 from 2009). I have no doubt it's a great spec, but there should be other ways than only Saxon HE to run it in an open source way.
It's like we have an amazing modern spec but only one browser engine to run it ;)
infogulch, about 2 months ago
There are many humongous XML sources. E.g. the Wikipedia archive is 42GB of uncompressed text. Holding a fully parsed representation of it in memory would take even more, perhaps even >100GB, which immediately puts this size of document out of reach.
The obvious solution is streaming, but streaming appears not to be supported, though it is listed under Challenging Future Ideas: https://github.com/Paligo/xee/blob/main/ideas.md
How hard is it to implement XML/XSLT/XPATH streaming?
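
For readers unfamiliar with what streaming means here, a minimal sketch in Python's standard library may help. It is not Xee's API and says nothing about how Xee might implement streaming; the `page` and `title` element names and the `stream_titles` helper are assumptions made for illustration, and real Wikipedia dumps also use an XML namespace that this toy input omits.

```python
import io
import xml.etree.ElementTree as ET

def stream_titles(stream, tag="page"):
    """Collect <title> text from each <page>, one record at a time.

    Hypothetical helper: element names are assumed for illustration.
    """
    titles = []
    # iterparse yields (event, element) pairs while the input is still being read
    context = iter(ET.iterparse(stream, events=("start", "end")))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            titles.append(elem.findtext("title"))
            # Drop the records processed so far so memory use stays bounded
            root.clear()
    return titles

sample = io.BytesIO(
    b"<wiki><page><title>A</title></page><page><title>B</title></page></wiki>"
)
print(stream_titles(sample))  # ['A', 'B']
```

The hard part the comment asks about is that arbitrary XPath expressions can look backwards or sideways in the tree, which a record-at-a-time loop like this cannot support.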
vessenes, about 2 months ago
This, thirty years later, is the best pitch for XML I’ve read. Essentially, it’s a slow moving, standards-based approach to data interoperability.
I hated it the minute I learned about it, because it missed something I knew I cared about, but didn’t have a word for in the 90s: developer ergonomics. XML sucks shit for someone who wants to think tersely and code by hand. Seriously, I hate it with a fiery passion.
Happily, to my mind, the economics of easier-for-creators -> make web browsers and rendering engines either just DEAL with weird HTML, or else force people to use terse data specs like JSON, won out. And we have a better and more interesting internet because of it.
However, I’m old enough now to appreciate there is a place for very long-standing standards in the data and data transformation space, and if the XML folks want to pick up that banner, I’m for it. I guess another way to say it is that XML has always seemed to be a data standard which is intended to be what *computers* prefer, not *people*. I’m old enough to welcome both, finally.
montroser, about 2 months ago
Fun fact: XSLT still enjoys broad support across all major browsers: https://caniuse.com/?search=xslt
athanagor2, about 2 months ago
The fact that it can be compiled to WASM is a good thing, given that the Chrome team was considering removing libxml and XSLT support a few years back. The reasons cited were mostly about security (and share of users).
It's further proof that working on fundamental tools is a good thing.
egh, about 2 months ago
Very cool! I recently wrote an XSLT 2 transpiler for JS (https://github.com/egh/xjslt) - it's nice to see some options out there! Writing the XPath engine is probably the hard part (I relied on fontoxpath). I'm going to be looking into what you have done for inspiration!
airstrike, about 2 months ago
What problems are {elegantly, neatly, best} solved by using XPath and XSLT today that would make them reasonable choices over alternatives?
riedel, about 2 months ago
Love to see stuff outside the Java space, since I really like doing stuff in XSLT. Question: does this work on a textual XML representation, or can you plug in different XML readers? I have had really great fun in the past using http://www.ananas.org/xi/ for transforming arbitrarily formatted files using XSLT. Also, it is really important today that the XML reader has error correction capabilities, since lots of tools don't write well-formed XML, which in my experience is often a showstopper for employing transforms.
jchw, about 2 months ago
I wonder if this could perhaps some day be used in Wine, for the MSXML implementations. Maybe not, since those implementations need to be bug-compatible where applications depend on said bugs; but the current implementation(s) are also not fantastic. I believe it is still using libxml2.
(Aside: A long time ago, I had written an alternate XPath 1.1 implementation for Wine during GSoC, but rather shamefully, I never actually got it merged. Life became very hectic for me during that time period and I never really looped back to it. Still feel pretty bad about it all these years later.)
samsk, about 2 months ago
Nice! I have a scraper using XPath/XSLT extensively, and 90% of the XPath selectors have worked for years without a change. With CSS selectors I've had more problems...
mattrighetti, about 2 months ago
I will definitely try this out!
I have a service that extracts <meta> tags from webpages, and to do that I'm currently using (and need) three different dependencies: html5ever, markup5ever_rcdom, markup5ever. I don't like those, to be honest; the documentation is quite bad and it was difficult to understand how I should use the libraries to achieve such a simple task.
XPath, on the other hand, makes this extremely easy in comparison. I wonder how this will perform compared to my current solution.
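
To illustrate why an XPath-style query makes this task short, here is a hedged sketch using Python's standard library rather than Xee or html5ever. It assumes the page is well-formed XHTML (real-world HTML usually needs an HTML parser first), and ElementTree only supports a limited subset of XPath. The sample markup and the `extract_meta` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed sample input: well-formed XHTML, which many real pages are not
XHTML = """<html>
  <head>
    <title>Example</title>
    <meta name="description" content="A demo page"/>
    <meta name="author" content="Jane Doe"/>
    <meta charset="utf-8"/>
  </head>
  <body><p>Hello</p></body>
</html>"""

def extract_meta(markup):
    """Return {name: content} for every <meta> element that has a name attribute."""
    root = ET.fromstring(markup)
    # ".//meta[@name]" selects all descendant <meta> elements carrying a name attribute
    return {m.get("name"): m.get("content") for m in root.findall(".//meta[@name]")}

print(extract_meta(XHTML))
# {'description': 'A demo page', 'author': 'Jane Doe'}
```

The `<meta charset="utf-8"/>` element is skipped because the predicate requires a `name` attribute.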
mdaniel, about 2 months ago
I always hate it when license files have "yes, but" language in them, because if the license file differs in some non-obvious way, now I have to pay lawyers to interpret it
https://github.com/Paligo/xee/blob/xee-v0.1.5/COPYRIGHT
And that goes double for when there is a *separate* LICENSE file in the repo https://github.com/Paligo/xee/blob/xee-v0.1.5/LICENSE-MIT
tracnar, about 2 months ago
Nice! I tried using XQuery (superset of XPath 3) for a while through the BaseX implementation. It's pretty nice, but you have to face XML problems like namespaces, document order, attributes vs nodes, not knowing whether you can have 0, 1 or more nodes, etc. Something I wish was more readily available would be to run XPath against JSON, YAML, etc. It's a nicer language than, say, jq, but its ties to XML sometimes make it hard to transfer.
Another pain point with XML is the lack of inline schema, so the languages around it like XPath have to work with arbitrary structures, unlike, say, JSON, where you at least have basic primitives like map/dict, numbers, bool, etc.
trympet, about 2 months ago
I recently had the pleasure of using XSLT after never having seen it before. I used it to transform a huge 130K line XML manifest with MAPI property metadata into C# source code. It was so simple, readable, and intuitive to use.
nickm12, about 2 months ago
This is fantastic to see! I've used XML off and on since it was the red hot tech of the early 2000s. I wouldn't choose it today for a greenfield project, but it's still around in so many places, so we definitely need a high-performance, high-quality library written in Rust for this.
This could become a great foundation for a typed, (mostly) etree-compatible Python library built on top of it. I've used lxml for years and it's still my go-to, but there are lots of places where it could be modernized.
threecheese, about 2 months ago
This is great, I’ve been looking for performant and safe XML processing to replace IBM stuff (WebSphere/DataPower) that we really only keep around for hardware-accelerated payload processing. At our scale, lxml and others + BYO gateway tech has a similar run cost even considering IBM licensing. I hate running their crap, which requires k8s at a version that’s some hair-thin slice above the minimum supported EKS version; it’s almost like they want us to live in 24/7 fear of being OOS.
1shooner, about 2 months ago
I miss the declarative purity of XSLT as an HTML templating layer. I'd love to know if there is a similar system for a more popular/current web stack.
o_pax, about 2 months ago
This is really good news, I am looking forward to trying it out! Is XQuery also planned as an additional frontend? By the way, there is also χrust, a rust project working towards pretty similar goals (XPath 3.1, XQuery 3.1 and XSLT 3.0). At first glance, the architecture also seems quite similar, it is not as far along, though. Have you had any contact with them?
ianand, about 2 months ago
Fun fact: A decade ago the designer of HAML and Sass created a modern alternative to XSLT. https://en.wikipedia.org/wiki/Tritium_(programming_language)
smitty1e, about 2 months ago
> XML is now niche technology, but it's a bigger niche than you might think, and it's not going to go away any time soon.
When you consider that .docx, .pptx, and .xlsx files are zipped XML archives, "niche" seems a misnomer.
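
The "zipped XML" point is easy to check with nothing but a ZIP reader and an XML parser. The sketch below builds a toy in-memory archive containing only a `word/document.xml` part (so it is not a valid .docx that Word would open) and then reads paragraph text back out of it. The `make_toy_docx` and `paragraph_texts` helpers are hypothetical names for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def make_toy_docx():
    """Build an in-memory ZIP holding only word/document.xml (illustration only)."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>XML</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", body)
    buf.seek(0)
    return buf

def paragraph_texts(docx_file):
    """Return the text of each w:p paragraph found in word/document.xml."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    ns = {"w": W_NS}
    return [
        "".join(t.text or "" for t in p.findall(".//w:t", ns))
        for p in root.findall(".//w:p", ns)
    ]

print(paragraph_texts(make_toy_docx()))  # ['Hello', 'XML']
```

Passing the path of a real .docx file to `paragraph_texts` should work the same way, assuming the document stores its body in the usual `word/document.xml` part.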
nashashmi, about 2 months ago
Just want to say that Microsoft has some sort of implementation of an XML application using Microsoft Word. I have struggled to find examples I can use, though; for a long time I have been trying to convert an office repository of corporate resumes to XML.
immibis, about 2 months ago
XSLT is great for nerd cred, when someone selects "view source" on your page and there's not an HTML tag in sight. I did this once.
Maybe it's good for compression, but probably not by a factor much bigger than gzip/brotli/zstd.
xvilka, about 2 months ago
I miss the XHTML and XSL times: a time when the Web would have been better prepared for AI consumption, with less dynamic nonsense and more focus on the actual content. Time has shown that all these Flash and Java gimmicks died off.
chromatin, about 2 months ago
NCBI still emits XML from their most prominent databases (e.g., PubMed). I'm looking forward to adopting this library into some of my production code that interfaces with PubMed!
blacklion, about 2 months ago
Is XSLT still used in new projects? I have the impression that it was not popular even when XML was.
For example, Apache HTTPD never had an official module to serve XML via XSLT transformation.
And XSL-FO looks even more obscure.
mvc, about 2 months ago
Nice work. XPath is a beast. Obvious why Paligo would be interested too. There must be a lot of commercial documentation out there where the best representation they can get looks a bit XMLish.
yxhuvud, about 2 months ago
I hope this will be packaged into shared libraries at some point so that languages that aren't Rust will get access to it.
torginus, about 2 months ago
I yearn for the day when people will stop treating the fact that their software was written in Rust as its main advertising bullet point. Rust 1.0 was released a decade ago, plenty of time for its alleged technical advantages to become apparent.
It's like a handbag whose main claim to being a premium product isn't workmanship or materials, but that it has Gucci on its side.
hcayless, about 2 months ago
This sounds fantastic! Thank you for your work. Now I gotta go learn Rust :-).
shadowtree, about 2 months ago
Throwback shoutout to Steve Muench and his genius method of grouping elements in XSLT 1.0.
So good it has its own Wikipedia page!
https://en.wikipedia.org/wiki/XSLT/Muenchian_grouping
I mean, talk about hacker cred.
stuaxo, about 2 months ago
eXcellent, it's good to see new work on XSLT. Reviled by some, it's actually great tech and useful in all sorts of places.
notfed, about 2 months ago
Does it preserve whitespace? Something that I always found asinine about XSLT is that it wipes out whitespace when transforming. Imagine you have thousands of corporate XML files in source control, and you want to transform them all, performing some simple mutation. XSLT claims to be fit for this job, but in practice your diff is going to be full of unintentional whitespace mangling.
mickeyp, about 2 months ago
It's interesting to see the slow rehabilitation of XML and its tooling now that there's a new generation of developers who have not grown up in the shadow of XML's prime in the late 90s / early 2000s, and who have not heard (or did not buy into) the anti-XML crowd's ranting, even though some of their criticisms were legitimate.
I've always liked XML, and especially XPath, and even though there were a large number of missteps in the heyday of XML, I feel it has always been unfairly maligned. Look at all the people who reinvent XML tooling but for JSON, but not nearly as well. Luckily, people who value XML can still use it, provided the fit is right. But it's nice to see the tides turning.
Most fashions really are cyclical.
infogulch, about 2 months ago
> I was at XML Prague, an XML conference
There's an *XML conference*?!
dev_l1x_be, about 2 months ago
XML is to data formats what OOP is to programming languages. Often overcomplicated, hard to follow, full of footguns.