Explicit Schema


Pandas and Mixed Type Columns

One of the bad habits that Pandas enables is mixed-type columns, which are simply labelled with the dtype object. This is generally not allowed in distributed computing frameworks such as Spark, Dask, and Ray, because the data can be spread across multiple machines, and explicit data types guarantee that operations behave consistently wherever they run.
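A minimal sketch of the problem in plain Pandas follows. The column name `a` and the sample values are made up for illustration, and coercing with `pd.to_numeric` is only one possible way to arrive at an explicit type, not necessarily the approach this guide goes on to use.

```python
import pandas as pd

# Hypothetical data: one column holding an int, a str, and a float.
df = pd.DataFrame({"a": [1, "two", 3.0]})

# Pandas accepts this silently and labels the whole column "object".
mixed_dtype = str(df["a"].dtype)

# The single "object" label hides several distinct Python types.
types_found = sorted(t.__name__ for t in df["a"].map(type).unique())

# One way to make the type explicit: coerce to numeric, turning
# values that cannot be parsed into NaN.
clean = pd.to_numeric(df["a"], errors="coerce")
clean_dtype = str(clean.dtype)

print(mixed_dtype, types_found, clean_dtype)
```

After the coercion every value in the column has the same type, which is the property a distributed engine needs in order to process partitions on different machines consistently.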