PySpark Cookbook
上QQ阅读APP看书,第一时间看更新

How to do it...

There are five major steps we will undertake to install Spark from sources (check the highlighted portions of the code):

  1. Download the sources from Spark's website
  2. Unpack the archive
  1. Build
  2. Move to the final destination
  3. Create the necessary environmental variables

The skeleton for our code looks as follows (see the Chapter01/installFromSource.sh file):

#!/bin/bash
# Shell script for installing Spark from sources
#
# PySpark Cookbook
# Author: Tomasz Drabas, Denny Lee
# Version: 0.1
# Date: 12/2/2017
_spark_source="http://mirrors.ocf.berkeley.edu/apache/spark/spark-2.3.1/spark-2.3.1.tgz"
_spark_archive=$( echo "$_spark_source" | awk -F '/' '{print $NF}' )
_spark_dir=$( echo "${_spark_archive%.*}" )
_spark_destination="/opt/spark"
...
checkOS
printHeader
downloadThePackage
unpack
build
moveTheBinaries
setSparkEnvironmentVariables
cleanUp