Week of August 16th
What I did
- As I suggested at the end of last week’s post, I am going through all my notes from the past few months. So far it has been really helpful, and it’s something that I am going to keep forcing myself to do regularly.
- I finally fixed the bug plaguing my word cloud project. I have a good idea for how to improve it, but I’m not sure I should invest the time right now. Basically, I want to create a new branch where, instead of sizing words by raw frequency, I size them by term importance using tf-idf. In my opinion, this will make for much more interesting word clouds. It will definitely take some time to figure out how to include a corpus of Facebook messages. In my very first Facebook message counter program, I wrote a script that went through every file in a directory and extracted the messages, but that was written for the HTML files. I suppose I could modify it to go through my directory of JSON files instead without too much trouble. I am also wondering whether a better corpus exists. I suspect my Facebook Messenger corpus would be the most accurate reflection of how I write, but I’m going to ruminate on this for a while.
- I implemented the UTF-8 encoding fix in the chatbot program. I don’t think that’s enough to fix the program’s issues, though. The main one is that something is wrong with the encoding-decoding sequence, so the responses don’t vary.
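For the tf-idf idea in the word cloud item above, here is a rough sketch of how the scoring could work using only the standard library. It is not the project's actual code. The function names are made up for illustration, and the JSON layout (a top-level "messages" list whose items have a "content" field) is an assumption about the export format that would need checking against the real files. The idf uses a common smoothed form, ln((1 + N) / (1 + df)) + 1.

```python
import json
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def load_messages(directory):
    """Collect message texts from every .json file under `directory`.

    Assumes each file looks like {"messages": [{"content": "..."}, ...]}.
    That layout is an assumption; adjust the keys to match the real export.
    """
    texts = []
    for path in sorted(Path(directory).rglob("*.json")):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        for message in data.get("messages", []):
            content = message.get("content")
            if content:
                texts.append(content)
    return texts


def tfidf_scores(target_text, corpus_texts):
    """Score each word in target_text by term frequency times smoothed idf."""
    target_counts = Counter(tokenize(target_text))
    total_terms = sum(target_counts.values())
    doc_sets = [set(tokenize(doc)) for doc in corpus_texts]
    n_docs = len(doc_sets)
    scores = {}
    for term, count in target_counts.items():
        # Document frequency: how many corpus documents contain the term.
        df = sum(1 for doc in doc_sets if term in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = (count / total_terms) * idf
    return scores
```

Words that show up in nearly every corpus message (like "the") get an idf close to 1, while rarer words are boosted. The resulting dictionary maps each word to a weight, so it could stand in for the raw frequency counts when sizing words in the cloud.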
What I learned
- As I said, I’m not going to recap what I’ve relearned by reviewing my notes. If I did, my blog would end up being twice as large (well, hopefully less than that).
- I listened to a Freakonomics episode (or maybe it was their spin-off show, No Stupid Questions) in which they discussed the difference between “Maximizers” and “Satisficers.”
What I will do next
- About the chatbot: I’ve decided that I should apply transfer learning to it. That way I can start from a more robust, pretrained network and then fine-tune its deeper layers to reflect my personal writing idiosyncrasies.
- Continue to review my older notes, and also keep working through my Bayesian statistics in R course.