PySpark Cookbook

Creating RDDs

In this recipe, we start by creating an RDD from data generated within PySpark itself. To create RDDs in Apache Spark, you first need to install Spark, as shown in the previous chapter. You can run these code samples in the PySpark shell, in a Jupyter notebook, or both.