Skip to content

Ajayay/NAIVE_BAYES_on_AFFR

Repository files navigation

Amazon Fine Food Reviews Sentiment Analysis.

This dataset is taken from kaggle.com

Objective:

  • The main goal of the project is to analyze some large dataset and perform sentiment classification on it. Sentiment classification is a type of text classification in which a given text is classified according to the sentimental polarity of the opinion it contains. For the purpose of this project the Amazon Fine Food Reviews dataset, which is available on Kaggle, is being used.

  • Following sections describe the important phases of Sentiment Classification: the Exploratory Data Analysis for the dataset, the preprocessing steps done on the data, learning algorithms applied and the results they gave and finally the analysis from those results.

Context:

  1. This dataset consists of reviews of fine foods from amazon.
  2. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012.
  3. Reviews include product and user information, ratings, and a plain text review.
  4. It also includes reviews from all other Amazon categories.
  • Number of reviews: 568,454
  • Number of users: 256,059
  • Number of products: 74,258
  • Timespan: Oct 1999 - Oct 2012
  • Number of Attributes/Columns in data: 10

Data includes:

  • Reviews from Oct 1999 - Oct 2012
  • 568,454 reviews
  • 256,059 users
  • 74,258 products
  • 260 users with > 50 reviews

Feature information:

  • IdRow Id
  • ProductIdUnique identifier for the product
  • UserIdUnqiue identifier for the user
  • ProfileNameProfile name of the user
  • HelpfulnessNumeratorNumber of users who found the review helpful
  • HelpfulnessDenominatorNumber of users who indicated whether they found the review helpful
  • ScoreRating between 1 and 5
  • TimeTimestamp for the review
  • SummaryBrief summary of the review
  • TextText of the review

Text preprocessing techniques used:

  • BOW
  • TFIDF

Models used:

  1. Naive bayes

Releases

No releases published

Packages

No packages published