Ask HN: How do you deal with large Python code bases?

8 points by StefanWestfal almost 2 years ago
I am currently working on a Python code base of 10k to 100k LOC. The service does not depend on SciPy tooling or similar; it is a web backend on top of a large DB. I have worked with Python for several years and it is still my go-to language for a lot of tasks. However, I feel that Python's advantages, like dynamic typing, increasingly become disadvantages as the code base grows. How do you deal with Python in large code bases? We use mypy, Black, tox, pytest, etc. Any tips? I feel that, for the next project, moving to a typed language like Kotlin or Go might be a better choice than Python when we do not depend on the SciPy/ML/DL stack.

7 comments

dstromberg almost 2 years ago
Python _is_ typed. Python is Dynamically Typed. The word "Dynamically" does not suddenly erase the word "Typed". Even CPython is Dynamically Typed.

You presumably mean Statically, Manifestly Typed.

Assembly Language and Forth are Untyped. The closest thing they have to types are bytes and cells.

For examples of a language that is Statically but not Manifestly Typed, have a look at Shedskin (which is a dialect of Python) or Haskell (which is a fascinating Functional, Lazy language with a regrettably poor compiler).

I continue to see mypy, or something like it, as the best way of taking a medium-sized Python project into the world of large projects. For smaller projects, ruff or pylint are sufficient, but for large projects, mypy and similar are the way to go.
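A minimal sketch of the distinction drawn above, not taken from the thread; the function names `total_cents` and `demo_runtime_typing` are hypothetical. Python checks types when a line executes, and with annotations a static checker such as mypy can flag the same kind of mistake before the code runs.

```python
def total_cents(prices: list[int]) -> int:
    """Sum a list of prices given in cents."""
    return sum(prices)


def demo_runtime_typing() -> str:
    """Show that Python values carry types: str + int fails at runtime."""
    try:
        "1" + 1  # type: ignore[operator]  # deliberately wrong
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


if __name__ == "__main__":
    # The dynamic check fires only when the offending line executes.
    print(demo_runtime_typing())
    print(total_cents([199, 250]))
    # A static checker such as mypy is expected to reject the call below
    # without running it, because a str is passed where list[int] is declared:
    # total_cents("199")
```

The annotations cost nothing at runtime; their value comes from running the checker over the whole code base in CI.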
znt almost 2 years ago
If databases are in play, instead of trying to understand the business logic, have a go at understanding the database structure first.

What are your tables, how are they related, etc.

Then look at the modules that interact with the data layer directly and move up.

This is a language-agnostic approach, but it has worked out well for me.
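One way to start from the schema, sketched under the assumption of SQLite purely for illustration (the thread does not name the database). The `customers` and `orders` tables and the `describe_schema` helper are hypothetical. It lists each table together with the tables it references through foreign keys.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name to the tables it references via foreign keys."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema: dict[str, list[str]] = {}
    for table in tables:
        # PRAGMA foreign_key_list returns one row per foreign-key column;
        # index 2 holds the referenced (parent) table name.
        fks = conn.execute(f"PRAGMA foreign_key_list('{table}')").fetchall()
        schema[table] = sorted({fk[2] for fk in fks})
    return schema


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id)
        );
        """
    )
    for table, parents in describe_schema(conn).items():
        print(table, "->", parents or "no foreign keys")
```

Other databases expose the same information differently (for example through `information_schema`), so the queries would need adapting.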
MichaelRazum almost 2 years ago
What exactly are your problems? It is hard to give advice without knowing. To be honest, I don't think switching to Go makes things easier, since Python is less verbose.
pg_1234 almost 2 years ago
Regression tests.

Static typing still doesn't help for the more insidious issues - errors of value.
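To illustrate an "error of value", here is a small sketch with a hypothetical `apply_discount` function. The buggy variant has identical type annotations, so a type checker would accept it; only a test pinned to known-good values catches the difference.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, rounded via floor division."""
    return price_cents - (price_cents * percent) // 100


def apply_discount_buggy(price_cents: int, percent: int) -> int:
    """Same signature, wrong value: treats the percentage as an amount in cents."""
    return price_cents - percent


def test_apply_discount_regression() -> None:
    # Values pinned from known-good behaviour; run with pytest or call directly.
    assert apply_discount(1000, 25) == 750
    assert apply_discount(999, 10) == 900
    assert apply_discount(1000, 0) == 1000


if __name__ == "__main__":
    test_apply_discount_regression()
    # Both functions type-check identically, yet the values disagree.
    print(apply_discount(1000, 25), apply_discount_buggy(1000, 25))
```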
jstx1 almost 2 years ago
How are you finding mypy for your project?

I find that I waste a lot of time telling mypy to ignore some library that it doesn't understand, much more time than I save by catching type errors (which never happens). People add it to projects because it seems like a best practice and the proper thing to do, but I'm kind of unconvinced that it's making things better.
lordkrandel almost 2 years ago
Have a look at www.github.com/odoo/odoo. It's a full-blown ERP and more, in Python.
hayst4ck almost 2 years ago
The larger your codebase gets, the more Bazel becomes a requirement. Bazel is really not negotiable for large Python code bases. The longer Bazel is put off, the more pain you will endure before you are eventually forced to use it. You *will* be forced to use Bazel or a system like it, because without it your good devs will eventually stop tolerating your codebase and leave.

https://bazel.build/

Beyond Bazel, you will have to start hacking away at dependency problems.

No inline imports, no circular imports. Imports all sorted at the top of your file. You will have to start enforcing good hygiene with linters.

You will need to create warnings against using the global scope.

You will need to construct the clients for all your dependencies in main().

You will need to discourage calling non-trivial functions in constructors (this largely encourages dependency injection).

There are exceptions to every rule, but if you are going to violate scope or skip dependency injection, those things need to be done very mindfully.

As the structure of your code improves via good scoping and injected dependencies, it will become easier to change *and* easier to test.

You will have to devote some serious consideration to how to quarantine business logic from server code. Generally, your product developers shouldn't be doing much outside of defining their data and altering business logic from within a route. If the place where business logic is executed is commingled with how data stores are manipulated, you're going to have a bad time. Likewise, if the place where business logic is executed is commingled with its presentation to customers, you're going to have a bad time.

Python does not have a culture of dependency injection because it's so easy to import antigravity and fly away. This makes writing tests hard and promotes spaghetti code. Lack of dependency injection (which means violating scoping) is the entropic force that makes codebases miserable as time goes on.

Additionally, you will have to think hard about state. If you can't restart a process trivially, or balance traffic to a different machine trivially, you are going to make your operations people's lives hard. State belongs in state storage. Put it in an RDBMS, put it in Redis, put it in memcached, put it in anything but a Python process's memory (or disk). This means that any two requests should be able to be sent to any two machines. This is a deeply important property for scaling.

Lastly, if you do not have good answers for observability, in terms of time series data, log data, exception data, and event data (for observability only), you will have a bad time. These are generally the things for which it is OK to violate scope.
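A minimal sketch of the dependency-injection advice above, assuming a hypothetical `UserStore` interface, `InMemoryUserStore` fake, and `GreetingService`; none of these names come from the thread. Clients are constructed once in `main()`, constructors only store what they are given, and a test can pass in an in-memory fake instead of a real data store.

```python
from typing import Optional, Protocol


class UserStore(Protocol):
    """Hypothetical interface for whatever holds user data (RDBMS, Redis, ...)."""

    def get_name(self, user_id: int) -> Optional[str]: ...


class InMemoryUserStore:
    """Fake store for tests and this demo; a real one would wrap a database client."""

    def __init__(self, names: dict[int, str]) -> None:
        self._names = names

    def get_name(self, user_id: int) -> Optional[str]:
        return self._names.get(user_id)


class GreetingService:
    """Business logic only: no client construction, no I/O in the constructor."""

    def __init__(self, store: UserStore) -> None:
        self._store = store  # trivial constructor: just keep the dependency

    def greet(self, user_id: int) -> str:
        name = self._store.get_name(user_id)
        return f"Hello, {name}!" if name is not None else "Hello, stranger!"


def main() -> None:
    # Dependencies are built once here and passed down explicitly.
    store = InMemoryUserStore({1: "Ada"})
    service = GreetingService(store)
    print(service.greet(1))
    print(service.greet(2))


if __name__ == "__main__":
    main()
```

Because `GreetingService` never imports or constructs its store, swapping the fake for a database-backed implementation touches only `main()`.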