As someone who owns multiple label makers and organizes my utensils by type when I put them into the dishwasher, I really like Mypy. Cool writeup, good points. I think a lot of people (ML scientists especially) who haven't been forced to use a type checker don't realize the productivity benefits of having one running on your whole codebase - they only see the annoying slow-down of having to do things a particular way and having the type checker bug you about inconspicuous errors.<p>Example: One pattern (which I don't think a lot of people are familiar with?) that I started adopting recently is the use of `Literal` for type-checking strings. (On closer reading I realized this was in the blog post as well, but I suspect some ML people will have seen this specific case before.) For example, instead of something like<p><pre><code> import enum
 from torch import nn

 class ActivationType(enum.Enum):
     sigmoid = "sigmoid"
     tanh = "tanh"

 def get_activation(key: str | ActivationType) -> nn.Module:
     # Only look up by name when given a string; enum members pass through
     if isinstance(key, str):
         key = ActivationType[key]
     if key == ActivationType.sigmoid:
         return nn.Sigmoid()
     if key == ActivationType.tanh:
         return nn.Tanh()
     raise KeyError(key)
</code></pre>
you can do something like this instead:<p><pre><code> from typing import Literal
 from torch import nn

 ActivationType = Literal["sigmoid", "tanh"]

 def get_activation(key: ActivationType) -> nn.Module:
     if key == "sigmoid":
         return nn.Sigmoid()
     if key == "tanh":
         return nn.Tanh()
     raise KeyError(key)
</code></pre>
The advantage is that you can do something like<p><pre><code> act = get_activation("tahn")
</code></pre>
and Mypy will show an error for your typo (instead of having to run your code and eventually hit the `KeyError`). So if you're just trying to quickly implement an idea, you don't have to kill brain cells searching for typos.<p>Of course, doesn't make a difference if your coworkers all use Vim and Emacs with no extensions...
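A torch-free sketch of the same `Literal` pattern, in case it helps. The names `ActivationName`, `describe` and `parse_activation` are made up for illustration. It also shows one possible way (an assumption on my part, not something from the blog post) to validate strings that only arrive at runtime, e.g. from a config file, using `typing.get_args`:

```python
from typing import Literal, cast, get_args

# Hypothetical alias mirroring the ActivationType in the comment above.
ActivationName = Literal["sigmoid", "tanh"]


def describe(name: ActivationName) -> str:
    """Return a formula string for a known activation name."""
    if name == "sigmoid":
        return "1 / (1 + exp(-x))"
    if name == "tanh":
        return "(exp(x) - exp(-x)) / (exp(x) + exp(-x))"
    raise KeyError(name)


def parse_activation(raw: str) -> ActivationName:
    """Check an untyped string at runtime before treating it as a Literal."""
    if raw in get_args(ActivationName):
        # cast() only informs the type checker; the membership test above
        # is the actual runtime check.
        return cast(ActivationName, raw)
    raise ValueError(f"unknown activation: {raw!r}")


print(describe("tanh"))
print(describe(parse_activation("sigmoid")))
# describe("tahn") would be flagged by a type checker before the code runs.
```

Static calls with a literal string are checked by the type checker; strings read from outside the program still need a runtime check like `parse_activation`.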
I can't articulate specifically why but using typing in Python just feels like so much pain compared to other languages that have "opt-in" nominal typing syntax (PHP et al.).<p>The workflow feels frustrating, the ecosystem seems diverse and has no clear "blessed" path, and I'm still confused about what is bundled by Python and what I need to pull in from an external source. I REALLY want to use mypy but by the time I've figured out how to pull it all together I probably could have finished the program I'm working on.<p>The relevant factor here might be the size of Python programs I typically work on, somewhere between a few hundred lines and a few thousand.<p>I'm glad other people are having success because hopefully that'll smooth the pavement for the next time I circle around and try to add some meaningful types to my Python programs.
> I typically show candidates a snippet that uses typing.Protocol as part of a broader technical discussion, and I can’t recall any candidates having seen that specific construct before<p>I think `typing.Protocol` [1] (aka "structural subtyping" or "static duck typing") does not get enough spotlight! It is one of the keys to migrating a very Pythonic codebase to type hints, and it lets you avoid endless type-hint shenanigans all over the place. Of course, Mypy supports this feature natively [2].<p>[1] <a href="https://docs.python.org/3/library/typing.html" rel="nofollow">https://docs.python.org/3/library/typing.html</a>
[2] <a href="https://mypy.readthedocs.io/en/stable/protocols.html" rel="nofollow">https://mypy.readthedocs.io/en/stable/protocols.html</a>
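For anyone who has not seen the construct, here is a minimal sketch of structural subtyping with `typing.Protocol`. The names `SupportsClose`, `Resource` and `close_all` are invented for this example:

```python
from typing import Iterable, Protocol


class SupportsClose(Protocol):
    """Anything with a close() method satisfies this protocol."""

    def close(self) -> None: ...


class Resource:
    # Note: Resource does NOT inherit from SupportsClose.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def close_all(items: Iterable[SupportsClose]) -> int:
    """Close every item and return how many were closed."""
    count = 0
    for item in items:
        item.close()
        count += 1
    return count


resource = Resource()
closed_count = close_all([resource])
```

A type checker accepts `Resource` here because its method signatures match the protocol, which is why existing duck-typed code can often be annotated without touching its class hierarchy.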
> Mypy catches bugs<p>100% yes. It’s much better than the examples would lead you to believe. Mypy catches stuff like:<p><pre><code> def f(arg: Optional[Object]):
     arg.method()  # type error
     if arg is not None:
         arg.method()  # ok
</code></pre>
It’s half the goodness of what you’d get from a well-typed language like Haskell or Rust but with the ability to say “trust me on this” and disable type checks for a line or two. Honestly I wish Go tooling checked for nil pointer use as well as mypy unwraps optional values. Every time I add type hints and use mypy I find bugs. I would never make the case that type hints are better than a strongly typed language (especially one with pattern matching), but it’s a great balance when writing Python.
Pain point with type annotations: not being able to reuse "shapes" of data, e.g. struct-like fields such as TypedDict, NamedTuple, dataclasses.dataclass, and soon **kwargs (PEP 692 [1]) via TypedDict.<p>Right now, there isn't a way to load up a JSON / YAML / TOML file into a dictionary, narrow it via a `TypeGuard`, and pass it into a TypedDict / NamedTuple / dataclass.<p>dataclasses.asdict() and dataclasses.astuple() return naïve / untyped dicts and tuples. Also the factory functions will not work with TypedDict or NamedTuple, respectively, even if you duplicate the fields by hand [2].<p>The standard library doesn't have runtime validation (unlike e.g. pydantic [3]). If I make a typed NamedTuple/TypedDict/dataclass with `apples: int`, nothing is raised at runtime when a string is passed.<p>Other issues you may run into using mypy:<p>- pytest fixtures are hard. It's repetitious needing to re-annotate them in every test.<p>- Django is hard. PEP 681 [4] may not be a saving grace either [5]. Projects like django-stubs don't give you completions; it'd be a dream to see reverse relations in Django models.<p>- Some projects out there have very odd packaging and metaprogramming that make typing and completions impossible: faker, FactoryBoy.<p>[1] <a href="https://peps.python.org/pep-0692/" rel="nofollow">https://peps.python.org/pep-0692/</a>
[2] <a href="https://github.com/python/typeshed/issues/8580" rel="nofollow">https://github.com/python/typeshed/issues/8580</a>
[3] <a href="https://github.com/pydantic/pydantic" rel="nofollow">https://github.com/pydantic/pydantic</a>
[4] <a href="https://peps.python.org/pep-0681/" rel="nofollow">https://peps.python.org/pep-0681/</a>
[5] <a href="https://github.com/microsoft/pyright/blob/8a1932b/specs/dataclass_transforms.md#django" rel="nofollow">https://github.com/microsoft/pyright/blob/8a1932b/specs/data...</a>
For a while I used both mypy and pyright on my team’s codebase. After about half a year I dropped mypy. I think type checking is valuable; it's just that pyright also caught most of the errors mypy detected, and using newer type features often led to mypy false positives. I had trouble justifying both when I could simply require my teammates to install pyright. Advanced type features tend to run into more bugs, and while both tools are well maintained, pyright’s maintenance is magical. I do not know of any other open source project that fixes bugs as fast (most are fixed in under a week). The main thing that eventually forced the decision was a flaky (cache-dependent) mypy crash involving ParamSpecs half a year ago. At the time ParamSpec support was still in progress, and there’s a good chance that specific issue has since been fixed.<p>The main awkwardness of pyright is that it’s a Node package, and most Python devs I work with don’t interact much with Node. But my team has a bash script that installs all our dependencies, including Node as needed (via nvm), which mostly works. One benefit is that you can use pyright as an LSP server, and it works very conveniently in VS Code.<p>Edit: Third-party libraries lacking types is probably the biggest issue. As my codebase is mostly typed itself, I’ve started gradually writing type stubs for the library APIs we use. Writing stubs for even a small percentage of what we use helps, but there’s still a ton to add given that the codebase was started without types.
I am moving all my open source projects to `mypy --strict`. Here's the diff of adding basic / --strict mypy types:<p>libvcs: <a href="https://github.com/vcs-python/libvcs/pull/362/files" rel="nofollow">https://github.com/vcs-python/libvcs/pull/362/files</a>, <a href="https://github.com/vcs-python/libvcs/pull/390/files" rel="nofollow">https://github.com/vcs-python/libvcs/pull/390/files</a><p>libtmux: <a href="https://github.com/tmux-python/libtmux/pull/382/files" rel="nofollow">https://github.com/tmux-python/libtmux/pull/382/files</a>, <a href="https://github.com/tmux-python/libtmux/pull/383/files" rel="nofollow">https://github.com/tmux-python/libtmux/pull/383/files</a><p>unihan-etl: <a href="https://github.com/cihai/unihan-etl/pull/255/files" rel="nofollow">https://github.com/cihai/unihan-etl/pull/255/files</a>, <a href="https://github.com/cihai/unihan-etl/pull/257/files" rel="nofollow">https://github.com/cihai/unihan-etl/pull/257/files</a><p>Perks:<p>- code completions (through annotating)<p>- typings can be used downstream (since the above are all now typed python libraries)<p>- maintainability, bug finding + Easy to wire into CI and run locally<p>Longterm, unsure of the return on investment. I do promise to report back if I find it's not worth the effort.
The more I think about it, the more I think that the controversy around static type annotations in Python boils down to this:<p><pre><code> Improved readability
</code></pre>
This is very subjective, and is particularly sensitive in a language like Python which (rightly) has such a strong historical emphasis on readability above almost anything else.<p>My personal opinion is that static type annotations are extremely <i>detrimental</i> to readability. They add jarring line noise that makes reading Python much less like reading English. Hitting a type annotation when reading Python forces my brain into little backtracking loops which hugely diminishes my ability to form a mental model of the code from a quick read.<p>I wonder if people who come to Python from other languages (that are already statically typed) are accustomed to the poor comprehension introduced by types, and so don't experience this drawback.
I've been using Python since 2008 and the type annotations are the thing that broke me. I don't want to configure another tool on every project just to get (incomplete) type checking. Why do I have to pick a type checker? Just check my types.<p>`typing.Protocol` is a poorly designed `Interface`.
`abc` is a band-aid over missing `abstract` class/method syntax.<p>Things that should be part of the language are left to libraries. Just add interfaces, enums, and abstract classes/methods to the language.<p>I'm helping a new developer learn Python and having to explain all of the hoops I've been jumping through for the past 15 years is embarrassing. Making things "simpler" is making things more complex.<p>My current project is going to be my last Python project. I'm tired of add-ons and hacks, I want a complete language.
Really like Mypy.
I have written many Python microservices with different frameworks, but my minimum core set is:<p><pre><code> - Black (formatting)
- Isort (import order)
- MyPy (typing)
- Pylint (linting)
</code></pre>
Edit: s/unit test/linting
Huge fan of Mypy but you lost me at LOC worship.<p>Why do we do this? Is more LOC somehow supposed to mean more features? That simply isn’t true. More LOC means one of two things. One, what you are trying to do with the language is pushing its limits. Or two, you don’t understand the domain.<p>Most large monorepos I have seen fall into the latter category, while few fall into the former. Game engines, mature enterprise software, and a few others are legitimately large (probably OP’s codebase too), but we seem to revel in the fact that we have so much code to grok. Sometimes conciseness is better than cleverness.<p>Back to mypy. I’d throw in black as well. Combining black, flake8, and mypy in a pre-commit hook has loads of advantages. It’s completely opinionated but I find it helps me write better Python code.
Mypy is very useful on big projects, and does catch bugs regularly in my code.<p>The ergonomics have improved a lot and it's now usable, so the cost/benefit ratio is worth it today.<p>But barely.<p>Even assuming you use the latest Python version (lots of projects can't), you still have to import tons of things you use all the time like Iterable, Self, Callable and so on.<p>Then you have to deal with the poor Protocol solution for duck typing, aggressive defaults, mypy slowness (before 9.13 it's terrible, after it's just bad and dmypy quickly becomes mandatory) and a surprisingly high number of bugs (such frustrating time wasters). Add to that false positives, low support from some popular libs and incompatible type checker implementations, and you get a very much meh experience. Very far from the awesomeness of Python.<p>If you are unlucky and have to use anaconda, mypy dependencies make it extra fun to include.<p>Still, I'm glad it exists. It's still very useful. But thank god hints are optional.
I really really wanted to like mypy but my experience with mypy and Django has been very poor - it is slow, type inference is not good and most of the errors are false positives. Perhaps I’m spoiled by TypeScript or django-stubs is just not quite mature enough.
This article mentions the woes of circular imports.
I thought Mypy let you work around that by putting `if False:` around your imports<p>e.g.
a.py:<p><pre><code> import c

 if False:
     import b

 class X:
     def x(self):
         # type: () -> b.Y
         from b import something_that_returns_y
         return something_that_returns_y(self)
</code></pre>
b.py:<p><pre><code> from a import X

 class Y:
     pass

 def something_that_returns_y(x: X) -> Y:
     return Y()
</code></pre>
per <a href="https://github.com/asottile/flake8-typing-imports" rel="nofollow">https://github.com/asottile/flake8-typing-imports</a>
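A sketch of the same a.py using `typing.TYPE_CHECKING` and a string annotation instead of `if False:` and a type comment. The module `b` is the hypothetical one from the snippet above, so this file only defines the class and does not call `x()`:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated by the type checker only. At runtime TYPE_CHECKING is False,
    # so this import never executes and cannot create an import cycle.
    from b import Y  # "b" is the hypothetical module from the comment above


class X:
    def x(self) -> "Y":
        # Deferred runtime import, as in the original snippet.
        from b import something_that_returns_y

        return something_that_returns_y(self)


# The annotation is stored as a plain string, so defining X needs no import.
return_annotation = X.x.__annotations__["return"]
```

The quoted `"Y"` keeps the annotation from being evaluated when the class is defined, while the type checker still resolves it through the guarded import.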
As the article mentions, the biggest problem by far with using mypy and the Python static typing ecosystem generally is the lack of third party support, even for big or new projects.<p>Python's benefits are as much about the libraries available as they are the language itself, and unfortunately it's kind of lacking right now especially compared to e.g. TypeScript support in npm.<p>Waiting for it to get better only goes so far; there isn't yet a cultural expectation around publishing typesheds for everything.
Would be interesting to see them try pyright on their codebase. IME pyright is faster, catches more potential bugs, and doesn't require its own custom plugin system (which seems to be a major burden on other libraries).
> My unsubstantiated guess is that this is one of the most comprehensively-typed Python codebases out there for its size.<p>Not important, but FAANG companies have several orders of magnitude more strictly-typed Python than this.
Basically, they screwed up by using a monorepo with Python and have decided to try and badly paper over it by using a type checker.<p>For reference, don't do either of those things in Python.