pyspark.sql.DataFrame A distributed collection of data grouped into named columns. pyspark.sql.Column A column expression in a DataFrame. pyspark.sql.Row A row of data in a DataFrame. pyspark.sql.HiveContext Main entry point for accessing data stored in Apache Hive. pyspark.sql.GroupedData Aggregation methods, returned by DataFrame.groupBy().