TechEcho
Python utility for tracking third party dependencies within a library

189 points by prashantgupta24 about 3 years ago

8 comments

simonw about 3 years ago
Tried this on one of my projects, it's neat.

    python3 -m import_tracker --name datasette --recursive | jq
    {
      "datasette": [
        "aiofiles",
        "click",
        "markupsafe",
        "mergedeep",
        "pluggy",
        "yaml"
      ],
      "datasette.version": [],
      "datasette.utils.shutil_backport": [
        "click",
        "markupsafe",
        "mergedeep",
        "yaml"
      ],
      "datasette.utils.sqlite": [
        "click",
        "markupsafe",
        "mergedeep",
        "yaml"
      ],
      "datasette.utils": [
        "click",
        "markupsafe",
        "mergedeep",
        "yaml"
      ],
      "datasette.utils.asgi": [
        "aiofiles",
        "click",
        "markupsafe",
        "mergedeep",
        "yaml"
      ],
      "datasette.hookspecs": [
        "aiofiles",
        "click",
        "markupsafe",
        "mergedeep",
        "pluggy",
        "yaml"
      ]
    }

Related tool: pipdeptree - here's the output from that against a project that installs a lot of extra stuff: https://github.com/simonw/latest-datasette-with-all-plugins/blob/main/pipdeptree.txt
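As a hedged illustration (not part of the original comment): the JSON above maps each module to the third-party packages it pulls in. Inverting that mapping answers the opposite question, namely which modules need a given package. The function name `modules_by_dependency` and the trimmed `sample` dictionary are assumptions made for this sketch; only the shape of the data comes from the output shown above.

```python
from collections import defaultdict


def modules_by_dependency(tracked):
    """Invert {module: [dependencies]} into {dependency: [modules]}."""
    inverted = defaultdict(list)
    for module, deps in tracked.items():
        for dep in deps:
            inverted[dep].append(module)
    # Sort keys and values so the result is stable across runs.
    return {dep: sorted(mods) for dep, mods in sorted(inverted.items())}


# A trimmed copy of the output above, used purely as sample data.
sample = {
    "datasette": ["aiofiles", "click", "markupsafe", "mergedeep", "pluggy", "yaml"],
    "datasette.version": [],
    "datasette.utils.asgi": ["aiofiles", "click", "markupsafe", "mergedeep", "yaml"],
    "datasette.hookspecs": ["aiofiles", "click", "markupsafe", "mergedeep", "pluggy", "yaml"],
}

usage = modules_by_dependency(sample)
for dep, mods in usage.items():
    print(f"{dep}: {', '.join(mods)}")
```

In practice the dictionary could be loaded with `json.loads` from the tool's standard output instead of being written inline.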
samwillis about 3 years ago
This looks like a really useful tool for large projects; it is quite possible to lose track of what a specific dependency is used for. I also like the idea of making an import lazy, so in a monolithic app you could have a deployment that excludes functionality and also excludes its dependencies.

When I read the title I was hoping for something else though. What I would love is a tool that logs and potentially blocks unexpected IO operations on a per-library basis. With the increasingly common supply chain attacks we are seeing (there was a PyPI one just the other day), having a way to at least report on unexpected activity, if not help prevent it, would be brilliant. Has anyone ever found a tool like that?

(Obviously the ultimate solution would be an outbound firewall, but it seems to me that although you can easily do this in a VM or on bare metal, I haven't seen any PaaS platforms with that sort of capability.)
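As a hedged illustration (not part of the original comment, and not a feature of import_tracker): CPython 3.8+ exposes runtime audit hooks through `sys.addaudithook`, which can at least report file, network, and subprocess activity. The sketch below logs a few standard audit events together with a best-effort guess at the calling package. The names `WATCHED_EVENTS`, `observed`, `_calling_package`, and `io_audit_hook` are assumptions made for this example.

```python
import sys

# Audit event names raised by CPython for file, network, and process activity.
WATCHED_EVENTS = frozenset({"open", "socket.connect", "subprocess.Popen"})

# (event name, top-level package of the calling frame) pairs seen so far.
observed = []


def _calling_package():
    """Best-effort guess at the top-level package that triggered the event."""
    try:
        frame = sys._getframe(2)  # 0: this helper, 1: the hook, 2: the caller
    except ValueError:
        return "<unknown>"
    return str(frame.f_globals.get("__name__", "<unknown>")).split(".")[0]


def io_audit_hook(event, args):
    # Keep the hook cheap: it runs for every audit event in the process.
    if event in WATCHED_EVENTS:
        observed.append((event, _calling_package()))


# Note: once added, an audit hook cannot be removed for the life of the process.
sys.addaudithook(io_audit_hook)
```

This is only a reporting sketch with real limits: the nearest calling frame is often a standard-library wrapper rather than the third-party package itself, so a fuller tool would need to walk further up the stack, and native extension code can perform IO without raising these events. Raising an exception from the hook is the usual way to refuse an operation, but that should be tested carefully before relying on it.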
hrpnk about 3 years ago
You can use Syft [1], which generates a full software bill of materials, including package names and licenses, for a broad set of tech stacks ranging from the OS level (Alpine, Debian) through Go, Ruby, Python, Java, JavaScript, etc.

[1] https://github.com/anchore/syft
cubes about 3 years ago
This looks really neat. One thing I noticed on reading the source code, it appears to actually import the modules:

Quoting the docstring on the `track_module` function:

    """This function executes the tracking of a single module by launching a
    subprocess to execute this module against the target module. The
    implementation of thie tracking resides in the __main__ in order to
    carefully control the import ecosystem.

Source: https://github.com/IBM/import-tracker/blob/67a1e84e5a609e52e9bab866496af60e906d97c7/import_tracker/import_tracker.py#L33-L36

Here's the actual subprocess call: https://github.com/IBM/import-tracker/blob/67a1e84e5a609e52e9bab866496af60e906d97c7/import_tracker/import_tracker.py#L96-L97

    # Launch the process
    proc = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, env=env)

I think this is clever, and maybe even necessary, but feels risky to do on unaudited third-party Python libraries.

Maybe I'm misunderstanding something?
throwamon about 3 years ago
Related: "Probably the most complete python dependency database" - https://github.com/DavHau/pypi-deps-db

> This data allows deterministic dependency resolution which is required by mach-nix[1] to generate reproducible python environments.

[1]: https://github.com/DavHau/mach-nix/
DishyDev about 3 years ago
This tool reminded me that a few years ago I created a helper web utility that let me search Python libraries and get a tree view of their dependencies, plus some license info. I had to do a lot of manual Python library compliance before we had tools like Blackduck.

It accepts dependencies in requirements.txt format (e.g. Django==3.1 or tensorflow): https://pydepchecker.z33.web.core.windows.net/

It's got a few shortcomings. Dependency resolution in Python is pretty difficult to work out when you've got a lot of libraries with common dependencies. And the license info on PyPI isn't always correct. But it's always been a quick, useful tool for me.
lifeisstillgood about 3 years ago
The couple of times I have done something similar I have found an odd outcome. The first was internal (a large company codebase) and the second was npm imports years ago. Both times one ended up pulling in huge numbers of dependencies (900+ for npm, 600 or so internal).

The point was that pretty much no matter what starting point one used, you pulled in much the same amount. There was a common core, but even so it was like a starfish: if you start at the tip of one limb, you pull in that limb and the core. Start on another limb, same thing.

But all the limbs are about the same size.

It's just anecdata, but it has been at the back of my mind as some kind of rule.
barefeg about 3 years ago
Could I use the lazy import to define a single set of dependencies of a monorepo and then load only the required subset for each project?
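As a hedged illustration of the general idea behind the question (not part of the original comment, and not import_tracker's own API): a lazy import defers loading a module until it is first used, so a project that never touches an optional dependency never needs it installed. The `LazyModule` class below is a hypothetical stand-in written with the standard `importlib.import_module` function.

```python
import importlib


class LazyModule:
    """Proxy that imports the named module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes not found on the proxy itself.
        if self._module is None:
            # A missing package raises ModuleNotFoundError here, at first use,
            # rather than when the proxy is created.
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


json_lazy = LazyModule("json")                    # nothing imported yet
optional = LazyModule("not_installed_pkg_xyz")    # hypothetical missing package

print(json_lazy.dumps([1, 2]))
```

Under this approach each project in a monorepo would only hit an import error for the dependencies its own code paths actually reach. Whether that can drive a per-project install set is a separate packaging question that this sketch does not answer.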