Don't let dicts spoil your code (2020)

134 points by imankulov almost 3 years ago

21 comments

snidane almost 3 years ago

The article is not explaining the point, which I believe is: type your dicts if you want to provide strict guarantees to your downstream about data shape.

If you know precisely what the data is used for, great, go ahead: the type system is your friend.

If you don't know how the data should be used, it's often a different story. Wrapping data in hand-typed classes is a terrible idea in typical data engineering scenarios, where there might be hundreds of these API endpoints, which might also change as the upstream sees fit. A perfect way to piss off your downstream users is to keep telling them "sorry, the data is not available because I overspecified the data type and now it failed on TypeError again". Usually the downstream is the domain expert: they know which fields should be used, but they don't know which ones until they start using the data. Typically the best way is to pass ALL the upstream data down, materialize extra fields, and NOT modify any existing field names, even when you think you're super smart and know better than the domain experts. Too often a "smart" engineer thought he knew better and included only some fields, only for it to be realized later that the data source contained many more gold nuggets, and it was never documented that these had been cleverly dropped.
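
A minimal sketch of the pass-through approach snidane describes, assuming a hypothetical record with `first_name` and `last_name` fields (none of these names come from the thread). Every upstream key is kept as-is, and the derived value is added under a new name:

```python
from typing import Any, Dict


def enrich(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an upstream record with derived fields added.

    Every upstream key is passed through unchanged; derived values go under
    new names so nothing from upstream is renamed or dropped.
    """
    enriched = dict(record)  # shallow copy: keeps ALL upstream fields
    # Hypothetical derived field, added only when its inputs are present.
    if "first_name" in record and "last_name" in record:
        enriched["derived_full_name"] = (
            f"{record['first_name']} {record['last_name']}"
        )
    return enriched


upstream = {"first_name": "Ada", "last_name": "Lovelace", "undocumented_flag": True}
result = enrich(upstream)
```

Fields the engineer never anticipated, such as `undocumented_flag` here, still reach the downstream consumer, and the original record is left unmodified.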

oivey almost 3 years ago

Python's strapped-on type annotations have been designed around traditional OOP, and it feels like a bad fit for the language. Duck typing is a tremendously powerful form of polymorphism, and none of the PEPs for type annotations do a great job of supporting it. Protocols don't work well with dataclasses and not at all with dicts. TypedDicts could have been perfect, but they explicitly disallow extra keys. Why even use a TypedDict instead of a dataclass? Why make yet another traditional OOP abstraction that was already well served by multiple other features of the language? Even more frustratingly, TypedDicts show that it could have been done. They just decided to break it on purpose.

TFA accidentally even brings up the reason why dicts are so powerful: they enable easy interoperability between libraries (like a wire format). Using two libraries together that insist on their own bespoke class hierarchy is an exercise in data conversion pain. Further, if I want a point to be an object containing fields for "x" and "y", I'd much rather just use a dict than construct an object in some incompatible inheritance nightmare.
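
For readers unfamiliar with the behavior oivey is criticizing, here is a small sketch using the "x"/"y" point from the comment. The `Point` and `norm_squared` names are illustrative assumptions. At runtime a `TypedDict` value is an ordinary dict; the restriction on extra keys is applied by static type checkers, not by the interpreter:

```python
from typing import TypedDict


class Point(TypedDict):
    x: int
    y: int


def norm_squared(p: Point) -> int:
    """Works on any dict with integer "x" and "y" keys."""
    return p["x"] ** 2 + p["y"] ** 2


point: Point = {"x": 3, "y": 4}

# At runtime a TypedDict value is a plain dict, so it interoperates
# with any library that accepts dicts.
is_plain_dict = type(point) is dict

# A static type checker is expected to reject the literal below because of
# the extra key "z", even though the function would run fine at runtime.
# norm_squared({"x": 3, "y": 4, "z": 5})
```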

valbaca almost 3 years ago

Interesting how Clojure takes the complete opposite approach by simply making dicts immutable.

https://chasemerick.files.wordpress.com/2011/07/choosingtypeforms2.png

nicbou almost 3 years ago

This is something I enforced in a big rewrite at a previous company.

People would take a full API response, and pass bits of it around with mutations. Understanding what the object looked like 5 functions deep was really hard. If the API changed... Oh boy.

I found many bugs just tracing the code like this. It made me a big proponent of strong typing, or at least strong type hinting.
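
One way to avoid the situation nicbou describes is to convert the response into a typed, immutable value once, at the boundary. This is only a sketch: the `User` shape and the `parse_user` helper are hypothetical, not taken from the article or the comment.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class User:
    id: int
    email: str


def parse_user(payload: Dict[str, Any]) -> User:
    """Convert a raw API response into a typed, immutable value."""
    # A missing key raises KeyError here, at the edge of the application,
    # rather than five functions deep.
    return User(id=int(payload["id"]), email=str(payload["email"]))


user = parse_user({"id": "42", "email": "a@example.com", "extra": "dropped"})

try:
    user.email = "changed@example.com"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Note the trade-off raised elsewhere in the thread: this version silently drops fields it does not know about, such as `extra`.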

djhaskin987 almost 3 years ago

This opinion gets at the heart of the reason to use typed languages or not. After all, what is a dict but an untyped struct?

Untyped languages are excellent for smaller code bases because they are more comfortable to program in, faster to write, and more general. Some kinds of polymorphism available in these languages are simply not possible, or much harder, in typed languages. Also, as others have said, the problem domain may not be as explored yet.

Typed languages really start to shine as a code base gets huge. At that size, well-maintained untyped code bases start collapsing under the weight of their own unit tests, while moderately or poorly maintained untyped code bases become a mess. Mostly this is because the code base is worked on by so many people that it's hard for them all to communicate with each other. In these cases a typed language keeps everyone on the same page to some extent.

Both camps will hate me for saying this, I think, but it's what I've observed over the years.

It may also sound like I prefer typed languages, but in fact my favorite languages to work in are Clojure and Python. My code bases as a DevOps engineer rarely pass the 10,000-line mark and never pass the 100,000-line mark. It's much more comfortable for me in these untyped languages.

Untyped languages also really shine in microservices for the same reason.

madsbuch almost 3 years ago

* Don't let dicts spoil your Python code

Maybe that was implied?

Anyway, a lot of languages take another stance, e.g. Elixir, where using dicts along with pattern matching makes for quite powerful abstractions.

As long as the dicts are kept shallow, and the number of indirections in the code in general is kept low, they are fine to navigate and use.

ampgt almost 3 years ago

Glad to see pydantic get mentioned here. It's a great solution for this exact problem. I was introduced to it by FastAPI and have been using it in all my projects since.

At the end of the day you really can't escape typing. It just makes life easier. We should stop letting languages try to remove it.

asddubs almost 3 years ago

Took me a really long time to learn this lesson. IMO this is a variation of the primitive obsession code smell, although I'd say it's way more harmful. I was really reluctant to add data classes to my code when the good old PHP array could get the job done without holding me up with a bunch of bureaucracy. Of course arrays give no guarantees and enforce no structure, so inevitably you get slight variations depending on what you need, or maybe you happen to have a dict that's a superset of what you're feeding in, and it just becomes really hard to reason about things. And of course, since it's not a named type, tracing things back becomes really hard.

leetrout almost 3 years ago

Yes to the principle. But a typed dict is useful for more than just "the wire".

There are places where you just don't need the overhead of a class. Yes, slotted classes make this much cheaper, but so do named tuples.

If the behavior of a thing is to map values, then it should stay a dict.

If the behavior is a bag of attributes, then yes, pick something better.
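
The three options leetrout mentions can be put side by side. The names below (`PointTuple`, `PointSlots`, `word_counts`) are illustrative assumptions, not code from the article:

```python
from typing import NamedTuple


class PointTuple(NamedTuple):
    """A bag of attributes stored as a plain, immutable tuple."""
    x: float
    y: float


class PointSlots:
    """A bag of attributes with __slots__, so instances carry no __dict__."""
    __slots__ = ("x", "y")

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y


t = PointTuple(1.0, 2.0)
s = PointSlots(1.0, 2.0)

# A real mapping (arbitrary keys to values) stays a dict.
word_counts = {"spam": 3, "eggs": 1}
word_counts["ham"] = word_counts.get("ham", 0) + 1
```

The named tuple still unpacks and compares like a tuple, while the slotted class stays mutable; which one fits depends on whether the value should change after creation.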

molly0 almost 3 years ago

I really liked the structure of this blog post, but it misses the positive aspects of using dictionaries, like when you are the owner of the API you consume and just want the JSON to flow through your "application tier".

icedchai almost 3 years ago

I used Pydantic on a recent project. It worked well.

dmix almost 3 years ago

When JavaScript added hash/object destructuring (both at the argument level and when assigning variables), I noticed code started using dict-like function arguments everywhere. It makes typing them a bit more of a pain in the ass (especially without default arguments).

I haven't decided if I like it better than just breaking up objects into arguments in a simpler functional style.

On one hand it's more predictable, but on the other, most complex apps start passing around objects for everything. TypeScript of course helps with that, as does neatly modularized code (i.e. not passing full typed objects outside of the parent module which owns/generates them, unless the receivers uniquely operate on the full object).

These are the small decisions you end up making a hundred times.

hahajk almost 3 years ago

I don't program professionally, and I struggle with dicts and classes. On one hand, I want to avoid the Java world of needing to learn 8 new classes to use any library. So dicts are lightweight and extensible and feel like the modern way of doing things. On the other hand, all the problems listed in the article are right. You really do need to document the different expected dicts somehow, which is basically structs/classes.

The other thing that always burns me is lists, specifically lists of lists and lists of strings. Since Python allows you to index into strings the same way as lists, for some reason I always lose track of where I am in the unpacking stack. This is when I switch to type hints.
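
The list-versus-string confusion hahajk mentions is easy to reproduce. In this sketch (`first_items` is a hypothetical helper), going one level too deep raises no error, because `str` supports the same indexing as `list`:

```python
from typing import List


def first_items(rows: List[List[str]]) -> List[str]:
    """Return the first element of each row."""
    return [row[0] for row in rows]


rows = [["alpha", "beta"], ["gamma", "delta"]]
correct = first_items(rows)  # the first string of each row

# One level too deep: each "row" is now a string, so row[0] is its first
# character. Python raises no error here. A type checker is expected to
# flag the call, since List[str] is not List[List[str]].
too_deep = first_items(rows[0])  # type: ignore[arg-type]
```

With the hint in place, the mistake shows up in the editor or the type-check step instead of as silently wrong data.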

scotty79 almost 3 years ago

I would really like a language where you can swap simple data collections like dicts or arrays for others that are better defined and use better-suited algorithms, without changing how you access them everywhere in your code.

So if getting a field from a simple structure is mycol[key], it should look exactly the same when mycol is no longer a flexible dict containing ad hoc objects but a complex, strongly typed immutable trie or B-tree-indexed array, because at some point in the evolution of your code it became apparent that this is exactly what you need.

The only language I know of that has a consistent interface between simple and complex (also custom) collections is Scala.
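
For what it's worth, Python's `collections.abc.Mapping` gets part of the way there for read access. The sketch below is an assumption-laden illustration, not a claim that it matches Scala's collections: `SortedRecord` is a hypothetical read-only collection backed by sorted tuples and binary search, and the calling code keeps using `col[key]` unchanged.

```python
import bisect
from collections.abc import Mapping
from typing import Dict, Iterable, Iterator


class SortedRecord(Mapping):
    """Hypothetical read-only collection with the dict-style read interface."""

    def __init__(self, items: Dict[str, int]) -> None:
        self._keys = tuple(sorted(items))
        self._values = tuple(items[k] for k in self._keys)

    def __getitem__(self, key: str) -> int:
        # Binary search over the sorted keys instead of hashing.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        raise KeyError(key)

    def __iter__(self) -> Iterator[str]:
        return iter(self._keys)

    def __len__(self) -> int:
        return len(self._keys)


def total(col: Mapping, keys: Iterable[str]) -> int:
    """Caller code: identical for a plain dict and for SortedRecord."""
    return sum(col[k] for k in keys)


plain = {"b": 2, "a": 1, "c": 3}
swapped = SortedRecord(plain)
```

`Mapping` fills in `get`, `in`, `keys()`, `items()` and equality from the three methods defined above, so most read-only call sites do not need to change.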

BiteCode_dev almost 3 years ago

> Don't let dicts spoil your code (2020) (roman.pt) *

* Conditions apply.

* Applies only when parsing I/O. Do not substitute primitives with classes inside your code base for no good reason. Unless validation is needed, prefer a NamedTuple.

akhmatova almost 3 years ago

> Functions that accept dicts are a nightmare to extend and modify.

Compared to what? I see the article's point about dicts being, like everything else in programming, a tradeoff with benefits and limitations. But the article's needless dramatization of a pretty mundane point (and the button-pushing title) is, to these jaded eyes, a definite turn-off.

Meanwhile I'll keep using dicts when the use case calls for them, thank you. As a sibling commenter put it:

> If you don't know how the data should be used, it's often a different story.

Exactly. The whole point (and benefit) of dicts is that they're squishy. Sometimes you need squishy.

islandert almost 3 years ago

My take is that dicts are fine as long as your code is well tested. Yeah, dataclasses and frozen classes have much better typing support, but if your code is mostly reading and writing JSON, like many modern cloud apps, it can be easier to use plain dicts combined with decent tests to make sure you don't break downstream services.

K0balt almost 3 years ago

I don't usually write software in Python, but when I do I try not to end up with a bag of dicts.

There are better ways to structure data for otherwise reusable functions.

Graffur almost 3 years ago

So types solve this right? Or am I misunderstanding?

lloydatkinson almost 3 years ago

Is it so hard to type dictionary?

singularity2001 almost 3 years ago

dicts are just data, the question of mutability is orthogonal