Now that we have ingested our data and created our labels, it's time to extract our features to build our binary classification model. As its name suggests, the bag-of-words approach is a very common feature-extraction technique whereby we take a piece of text, in this case a movie review, and represent it as a bag (aka multiset) of its words and grammatical tokens. Let's look at an example using a few movie reviews:
Review 1: Jurassic World was such a flop!
Review 2: Titanic ... an instant classic. Cinematography was as good as the acting!!
For each token (can be a word and/or punctuation), we will create a feature and then count the occurrence of that token throughout the document. Here's what ...