PySpark Cookbook

Abstracting Data with RDDs

In this chapter, we will cover how to work with Apache Spark's Resilient Distributed Datasets (RDDs). It contains the following recipes:

  • Creating RDDs
  • Reading data from files
  • Overview of RDD transformations
  • Overview of RDD actions
  • Pitfalls of using RDDs