
There's more...

Now that you have Jupyter on your machine, and assuming you followed either the Installing Spark from sources or the Installing Spark from binaries recipe, you should be able to start using Jupyter to interact with PySpark.

To refresh your memory, the Spark installation scripts appended two environment variables to your bash profile file: PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS. We set the former to jupyter and the latter to start a notebook service.
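In other words, your bash profile (for example, ~/.bash_profile or ~/.bashrc) should contain two lines roughly like the ones below. The exact notebook options depend on which installation script you ran, so treat this as an illustrative sketch rather than a verbatim copy; port 6661 matches the URL used later in this recipe:

# use Jupyter as the PySpark driver front end
export PYSPARK_DRIVER_PYTHON=jupyter
# launch a notebook server instead of the plain PySpark shell
export PYSPARK_DRIVER_PYTHON_OPTS='notebook --no-browser --port=6661'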

Now, open your Terminal and type:

pyspark

When you open your browser and navigate to http://localhost:6661, you should see a window not that different from the one in the following screenshot:
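After the notebook landing page appears, you can create a new notebook and run a quick sanity check. This cell is just a minimal example and not part of the original recipe: when PySpark launches the notebook, it creates the spark session and sc context objects for you, so the following should run without any further setup:

# spark (SparkSession) and sc (SparkContext) are created automatically by PySpark
print(spark.version)
print(spark.range(1000).count())  # should print 1000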