Greplin (YC W10) open sources 10-15x faster protocol buffers for Python

197 points, posted by rwalker over 14 years ago

11 comments

haberman, over 14 years ago
For a long time (much longer than I expected it would take) I've been working on a protobuf implementation in C that does *not* use Google's C++ implementation at all. I've been through about three rewrites and I finally have the interface right. I'm hoping it will be usable with Python soon (weeks).

https://github.com/haberman/upb/wiki

(if anyone's looking at the code, I'm working on the src-refactoring branch at the moment)

The benefits of my approach are:

* You can avoid depending on a 1MB C++ library. upb is more like 30k compiled.

* You can avoid doing any code generation. Instead you just load the .proto schema at runtime, so you don't have to get a C++ compiler involved.

* Google's protobuf library does have a dynamic/reflection option that also avoids code generation, but it is ~10x slower than generated C++ code. My library, last time I benchmarked it, was 70-90% of the speed of generated C++.
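To make the "load the schema at runtime" idea concrete, here is a minimal pure-Python sketch of schema-driven decoding with no generated code. It is not upb's API or design; `PERSON_SCHEMA`, `read_varint`, and `decode` are hypothetical names, the schema is a plain dict rather than a parsed .proto file, and only the varint and length-delimited wire types are handled.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


# Hypothetical schema, supplied as data at runtime instead of generated classes.
# Maps field number -> (field name, kind).
PERSON_SCHEMA = {1: ("name", "string"), 2: ("id", "int")}


def decode(buf, schema):
    """Decode protobuf wire bytes into a dict, driven by the schema."""
    msg = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, submessages)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        if field_number in schema:  # unknown fields are skipped
            name, kind = schema[field_number]
            msg[name] = value.decode("utf-8") if kind == "string" else value
    return msg
```

For example, `decode(b"\x0a\x07testing\x10\x96\x01", PERSON_SCHEMA)` yields a dict with `name` set to `"testing"` and `id` set to `150`.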
sigil, over 14 years ago
I too have a speedy Protocol Buffer implementation in Python:

https://github.com/acg/lwpb

It clocks in at 11x faster than json, the same speedup reported by fast-pb. Only with lwpb:

* There's no codegen step -- which is a disgusting thing in a dynamic language, if you ask me.

* You're not forced into object oriented programming; with lwpb you can decode and encode dicts.

Most of haberman's remarks apply to lwpb as well, i.e. it's fast, small, and doesn't pull in huge dependencies. The lwpb C code was originally written by Simon Kallweit and is similar in intent to upb.
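As a rough illustration of working with plain dicts instead of generated classes, the sketch below encodes a dict to protobuf wire bytes in pure Python. It does not use or mirror lwpb's actual API; `SCHEMA`, `encode_varint`, and `encode` are hypothetical names, and only non-negative integers and UTF-8 strings are supported.

```python
def encode_varint(value):
    """Encode a non-negative int as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


# Hypothetical schema: field number -> (dict key, kind).
SCHEMA = {1: ("name", "string"), 2: ("id", "int")}


def encode(msg, schema):
    """Serialize a dict to wire bytes, emitting fields in field-number order."""
    out = bytearray()
    for field_number in sorted(schema):
        name, kind = schema[field_number]
        if name not in msg:
            continue  # absent fields are simply omitted
        if kind == "string":
            data = msg[name].encode("utf-8")
            out += encode_varint((field_number << 3) | 2)  # wire type 2
            out += encode_varint(len(data))
            out += data
        elif kind == "int":
            out += encode_varint(field_number << 3)  # wire type 0
            out += encode_varint(msg[name])
        else:
            raise ValueError("kind %r not handled in this sketch" % kind)
    return bytes(out)
```

Here the caller passes an ordinary dict such as `{"name": "testing", "id": 150}` and gets bytes back, with no message class in between.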
atamyrat, over 14 years ago
We (http://connex.io/) use Protocol Buffers quite heavily, and the Python implementation was the performance bottleneck in many places.

I was working on the same thing, CyPB, which is 17 times faster than Google's Python implementation. https://github.com/connexio/cypb

This one seems more complete at the moment, though. I might just mark the ticket in our tracker as closed and switch to fastpb :-/
nostrademons, over 14 years ago
Nifty. I've passed it along to the appropriate folks.

Google uses SWIG-wrapped C++ proto bindings in Python pretty extensively, so I'm not sure how much this gains over that approach. I checked out the source; it's basically using Jinja templates to autogen Python/C API calls. Basically like SWIG, but not using SWIG.
apotheon, over 14 years ago
It doesn't appear to actually be open source:

> # Copyright 2010 Greplin, Inc. All Rights Reserved.

Where's the license?

I think the term you want is "publishes", and not "open sources".
peterlai, over 14 years ago
I hope to see these changes incorporated into Google's official implementation.

As of right now, deserialization of JSON and XML is way faster in Python: http://stackoverflow.com/questions/499593/whats-the-best-serialization-method-for-objects-in-memcached
dirtae, over 14 years ago
This is very welcome, but I hope Google fixes this problem in the official protobuf distribution.

It looks like protobuf 2.4.0 has experimental support for backing Python protocol buffers with C++ via the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable:

http://protobuf.googlecode.com/svn/trunk/CHANGES.txt
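A minimal sketch of how that variable might be used from Python, assuming the value `cpp` selects the C++-backed implementation and that the variable has to be set before `google.protobuf` is first imported. The helper name `active_backend` is hypothetical, and the function returns `None` when protobuf is not installed.

```python
import os

# Must happen before the first import of google.protobuf anywhere in the process.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "cpp"


def active_backend():
    """Report which protobuf backend is in use, or None if protobuf is missing."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


if __name__ == "__main__":
    print("protobuf backend:", active_backend())
```

Checking the reported backend is worthwhile because a request for `cpp` may not take effect if the C++ extension was not built for the installed package.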
traviscline, over 14 years ago
Is this really a better approach than using Cython to wrap a C++ or C implementation?
sigil, over 14 years ago
You should add cPickle to the benchmark as well -- I bet fast-pb still comes out ahead, and that may be an eye opener for many Python devs.
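A sketch of what such a comparison harness could look like, using only the standard library. It times `json` against `pickle` (the Python 3 module that absorbed cPickle) and does not include fast-pb itself; `RECORD`, `CODECS`, and `benchmark` are hypothetical names, and results will vary with message shape and machine.

```python
import json
import pickle
import timeit

# Hypothetical sample record; real results depend heavily on message shape.
RECORD = {"id": 150, "name": "testing", "tags": ["a", "b", "c"], "score": 0.5}

# Each codec is an (encode, decode) pair working on bytes.
CODECS = {
    "json": (
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda data: json.loads(data.decode("utf-8")),
    ),
    "pickle": (
        lambda obj: pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL),
        pickle.loads,
    ),
}


def benchmark(record, number=10000):
    """Time encode and decode for every codec; return a dict of stats."""
    results = {}
    for name, (dumps, loads) in CODECS.items():
        payload = dumps(record)
        assert loads(payload) == record  # sanity check: round trip is lossless
        results[name] = {
            "bytes": len(payload),
            "encode_s": timeit.timeit(lambda: dumps(record), number=number),
            "decode_s": timeit.timeit(lambda: loads(payload), number=number),
        }
    return results


if __name__ == "__main__":
    for name, stats in benchmark(RECORD).items():
        print(name, stats)
```

Another serializer can be added by registering one more (encode, decode) pair in `CODECS`.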
andrewvc, over 14 years ago
MessagePack is up to 4x faster* than protobuf, and easier to work with, IMHO.

http://msgpack.org/

I used it as the native format for DripDrop (https://github.com/andrewvc/dripdrop)

* In some tests
sigil, over 14 years ago
Has anyone managed to run the fast-pb tests in benchmark.py? I'm not sure where this switch is coming from:

    protoc --fastpython_out
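For context, protoc treats an unrecognized `--NAME_out` flag as a request for a plugin executable named `protoc-gen-NAME`, which it looks up on the PATH. The sketch below checks whether such a plugin is reachable; whether fast-pb installs one under this name is an assumption here, and `find_protoc_plugin` is a hypothetical helper.

```python
import shutil


def find_protoc_plugin(flag):
    """Map a protoc flag such as '--fastpython_out' to its plugin executable.

    Returns (plugin_name, path_or_None).
    """
    if not (flag.startswith("--") and flag.endswith("_out")):
        raise ValueError("expected a flag of the form --NAME_out")
    plugin = "protoc-gen-" + flag[2:-len("_out")]
    return plugin, shutil.which(plugin)


if __name__ == "__main__":
    name, path = find_protoc_plugin("--fastpython_out")
    print(name, "->", path or "not found on PATH")
```

If the lookup reports nothing, protoc would not be able to resolve the flag either unless the plugin path is passed explicitly.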