Python extensions should be lazy

115 points by 0x63_Problems 9 months ago

7 comments

raymondh 9 months ago
This is an impressive post showing some nice investigative work that isolates a pain point and produces a performant work-around.

However, the conclusion is debatable. Not everyone has this problem. Not everyone would benefit from the same solution.

Sure, if your data can be loaded, manipulated, and summarized outside of Python land, then lazy object creation is a good way to go. But then you're giving up all of the Python tooling that likely drove you to Python in the first place.

Most of the Python ecosystem, from sets and dicts to the standard library, is focused on manipulating native Python objects. While the syntax supports method calls to data encapsulated elsewhere, it can be costly to constantly "box and unbox" data to move back and forth between the two worlds.
jay-barronville 9 months ago
Evan, just a tip…

When linking to code on GitHub in an article like this, for posterity, it’s a good idea to link based on a specific commit instead of a branch.

It might be a good idea to change your link to the `Py_CompileStringObject()` function in CPython’s `Python/pythonrun.c` [0] to a commit-based link [1].

[0]: https://github.com/python/cpython/blob/main/Python/pythonrun.c#L1425

[1]: https://github.com/python/cpython/blob/967a4f1d180d4cd669d5c6e3ac5ba99af4e72d4e/Python/pythonrun.c#L1425-L1453
formerly_proven 9 months ago
> In the case of ASTs, one could imagine a kind of ‘query language’ API for Python that operates on data that is owned by the extension - analogous to SQL over the highly specialized binary representations that a database would use. This would let the extension own the memory, and would lazily create Python objects when necessary.

You could make the API transparently lazy, i.e. ast.parse creates only one AstNode object or whatever, and when you ask that object for e.g. its children, those are created lazily from the underlying C struct. To preserve identity (which I assume is something users of ast are more likely to rely on than usual) you'd have to add some extra book-keeping to make it not generate new objects for each access, but memoize them.
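
A minimal sketch of the memoized lazy wrapper described in this comment, written in pure Python for illustration. `RawNode`, `LazyNode`, and `parse_stub` are hypothetical names that are not part of the `ast` module; `RawNode` only stands in for data that a real extension would keep in a C struct.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RawNode:
    """Stand-in for a node owned by the extension (hypothetical; a real
    implementation would wrap a C struct, not a Python object)."""
    kind: str
    children: List["RawNode"] = field(default_factory=list)


class LazyNode:
    """Python-facing wrapper that materializes child wrappers on first access."""

    created = 0  # number of wrappers built so far, for demonstration only

    def __init__(self, raw: RawNode) -> None:
        self._raw = raw
        self._children: Optional[List["LazyNode"]] = None  # not built yet
        LazyNode.created += 1

    @property
    def kind(self) -> str:
        return self._raw.kind

    @property
    def children(self) -> List["LazyNode"]:
        # Memoize so repeated access returns the same wrapper objects,
        # which keeps `node.children[0] is node.children[0]` true.
        if self._children is None:
            self._children = [LazyNode(c) for c in self._raw.children]
        return self._children


def parse_stub() -> LazyNode:
    """Hypothetical parse entry point: builds raw data, wraps only the root."""
    raw = RawNode("Module", [RawNode("Expr", [RawNode("Name")]), RawNode("Pass")])
    return LazyNode(raw)
```

Only the root wrapper exists after `parse_stub()` returns. The two direct children are wrapped the first time `root.children` is read, and the nested `Name` node is never wrapped unless something asks for it. The cost of the book-keeping is one extra cached list per visited node.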
pdhborges 9 months ago
How much time did PyAST_mod2obj actually take? The rewrite is 16x faster, but the article doesn't make it clear whether most of the speedup came from switching to the ruff parser (especially since it puts the GC overhead at only 35% of the runtime).
lalaland1125 9 months ago
Optimizing Python extensions is becoming increasingly important as Python is used in more and more compute-intensive environments.

The key to optimizing a Python extension is to minimize the number of times you have to interact with Python.

A couple of other tips in addition to what this article provides:

1. Object pooling is quite useful as it can significantly cut down on the number of allocations.

2. Be very careful about tools like pybind11 that make it easier to write extensions for Python. They come with a significant amount of overhead. For critical hotspots, always use the raw Python C extension API.

3. Use numpy arrays whenever possible when returning large lists to Python. A Python list of Python integers is amazingly inefficient compared to a numpy array of integers.
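
A rough illustration of the memory gap behind tip 3. This sketch swaps in the standard-library `array` module as a stand-in for a numpy integer array, so that it runs without third-party packages; that substitution is an assumption of the example, not something the comment suggests. The sizes come from `sys.getsizeof` and are specific to CPython.

```python
import sys
from array import array

values = list(range(1_000, 2_000))  # 1,000 distinct Python int objects

# A list stores one pointer per element, and each element is also a
# separate int object on the heap, so both costs are counted here.
boxed_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# array("q") stores signed 64-bit integers contiguously, with no
# per-element Python object.
packed = array("q", values)
packed_bytes = sys.getsizeof(packed)

print(f"list of ints: {boxed_bytes} bytes")
print(f"packed array: {packed_bytes} bytes")
```

The packed form should come out several times smaller, and an extension that returns it creates one Python object instead of one per element.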
truth_seeker 9 months ago
Preloading the jemalloc binary can help here. Of course it won't be as efficient as using numpy 2.x, especially when dealing with larger datasets.

jemalloc also gave good results with the Node.js and Ruby projects I did.
hackan 9 months ago
Nice article!

But I couldn't help but notice that when `_PyCompile_AstOptimize` fails (< 0), `arena` is never freed. I think this is a bug :thinking:.