Week of August 9th
What I did:
- Monday: Kaggle work with giant spotify dataset
- Tuesday: Added use of the Spotify API,
spotipy
to the notebook. - Weds - Sat: Added recommendation system to the notebook. Added 10 pages to my thesis.
- Saturday: Expored DOGE coin values on Kaggle
What I learned:
- Monday: A sparkSession includes all the possible sparkContexts. I’ve been using the sql context, mostly. A sparkContext connects your application driver (maybe a jupyter notebook) to a Spark cluster with the help of a Resource Manager (could be Spark Standalone, YARN, or Apache Mesos). Here’s a basic example to get you started.
spark = SparkSession.builder.master("local[2]").appName("Spotify-Huge-Dataset").getOrCreate()
sc = spark.sparkContext
sqlContext = SQLContext(sc)
df = spark.read.option("header", True).csv(CSV_FILE)
What I will do Next:
- Thesis writing, writing, writing.
- Also, I have a bug with the deep embeddings song vectors I’d like to fix if I have time.