How to reduce the memory used by Pandas DataFrames
Thomas Dickson
This post describes a useful script for reducing the memory usage of a pandas DataFrame. There have been a few blog posts and Stack Overflow questions in this area such as here and here.
You might want to reduce the memory footprint of a DataFrame so that you can work with more data or speed up what you are doing. A related problem is getting as much data as you can out of your database (or data warehouse): this article covers a few different ways you can use pandas to load lots of data.
I wanted to include this snippet here as it’s a working version of the functionality that I’m happy with. The function takes a DataFrame and iterates over each column to check whether a data type with a smaller memory requirement can represent the existing data. It returns a modified version of the DataFrame.
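To illustrate the idea, here is a minimal sketch of that kind of column-by-column downcasting. It is one possible version under stated assumptions, not a definitive implementation: the function name `reduce_memory_usage` and the `category_threshold` parameter (text columns become `category` when fewer than half of their values are unique) are hypothetical choices made for this example. Note that downcasting floats to `float32` can lose precision, so check that this is acceptable for your data.

```python
import numpy as np
import pandas as pd
from pandas.api import types as ptypes


def reduce_memory_usage(df: pd.DataFrame, category_threshold: float = 0.5) -> pd.DataFrame:
    """Return a copy of df whose columns use smaller dtypes where possible.

    category_threshold is an illustrative assumption: a text column is
    converted to 'category' only if unique values / rows is below it.
    """
    out = df.copy()  # leave the caller's DataFrame untouched
    for col in out.columns:
        series = out[col]
        if ptypes.is_integer_dtype(series):
            # Picks the smallest signed integer type that holds every value
            out[col] = pd.to_numeric(series, downcast="integer")
        elif ptypes.is_float_dtype(series):
            # Smallest float type pandas will choose here is float32
            out[col] = pd.to_numeric(series, downcast="float")
        elif ptypes.is_object_dtype(series) or ptypes.is_string_dtype(series):
            # Repeated strings are much cheaper to store as categories
            if len(series) > 0 and series.nunique() / len(series) < category_threshold:
                out[col] = series.astype("category")
    return out


# Small example DataFrame to compare memory before and after
example = pd.DataFrame(
    {
        "count": np.arange(100, dtype="int64"),
        "ratio": np.full(100, 0.5, dtype="float64"),
        "label": ["red", "green"] * 50,
    }
)
reduced = reduce_memory_usage(example)
before = example.memory_usage(deep=True).sum()
after = reduced.memory_usage(deep=True).sum()
print(f"before: {before} bytes, after: {after} bytes")
print(reduced.dtypes)
```

Passing `deep=True` to `memory_usage` matters for text columns, because otherwise pandas only counts the pointers to the Python string objects rather than the strings themselves.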