Here are the big tips I think the article missed:

Use the new string dtype, which needs far less memory; see this video: https://youtu.be/_zoPmQ6J1aE. The object dtype is really memory hungry and the dedicated string dtype is a game changer (rough sketch below).

Use Parquet and leverage column pruning. `usecols` with `read_csv` doesn't give you column pruning; you need a columnar file format and the `columns` argument of `read_parquet`. You can never truly "skip" a column when reading a row-based file format like CSV. The Spark optimizer does column projections automagically; with Pandas you have to do them manually (sketch below).

Use predicate pushdown filtering to limit the data that's read into the DataFrame; here's a blog post I wrote on this: https://coiled.io/blog/parquet-column-pruning-predicate-pushdown/ (sketch below).

Use a technology like Dask (each partition in a Dask DataFrame is a Pandas DataFrame) that doesn't require everything to be stored in memory and can run computations in a streaming manner (sketch below).
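For the string dtype tip, a minimal sketch, assuming pandas >= 1.0 for "string" and pyarrow installed if you want the Arrow-backed variant (the DataFrame here is made up):

    import pandas as pd

    # Made-up frame with a repetitive object-typed column.
    df = pd.DataFrame({"name": ["alice", "bob", "carol"] * 100_000})
    print(df["name"].dtype)                    # object
    print(df.memory_usage(deep=True)["name"])  # bytes used by the object column

    # Dedicated string dtype (pandas >= 1.0); with pyarrow installed you can
    # use the Arrow-backed variant, which is where the big memory savings are.
    df["name"] = df["name"].astype("string")
    # df["name"] = df["name"].astype("string[pyarrow]")  # pandas >= 1.3 + pyarrow
    print(df.memory_usage(deep=True)["name"])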
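For column pruning, a sketch of what it looks like with `read_parquet` (the file name and column names are made up):

    import pandas as pd

    # Only these two columns are read from the Parquet file; the rest are
    # never deserialized. read_csv(usecols=...) still has to scan every row.
    df = pd.read_parquet("transactions.parquet", columns=["user_id", "amount"])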
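For predicate pushdown, a sketch assuming the pyarrow engine, which `pd.read_parquet` forwards the `filters` argument to (same made-up file and columns):

    import pandas as pd

    # Row groups whose min/max statistics can't satisfy the predicate are
    # skipped entirely, so far fewer bytes are read off disk.
    df = pd.read_parquet(
        "transactions.parquet",
        columns=["user_id", "amount"],
        filters=[("amount", ">", 100)],
    )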
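And for the Dask tip, a minimal sketch (again with made-up file paths and columns):

    import dask.dataframe as dd

    # Each partition is a plain pandas DataFrame; nothing is read until
    # .compute(), and work proceeds partition-by-partition instead of
    # loading the whole dataset into memory at once.
    ddf = dd.read_parquet("transactions/*.parquet", columns=["user_id", "amount"])
    result = ddf.groupby("user_id")["amount"].sum().compute()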