Women’s E-Commerce Clothing Reviews Analysis Using Machine Learning: Test Phase, SRS, Design Phase, and Source Code Final Deliverable

Domain / Category

Machine Learning/Information Retrieval/Prototype-based

Abstract / Introduction

E-commerce is the buying and selling of goods and services online. Although the fashion industry developed first in Europe and America, today it is an international and highly globalized industry. The Internet has made the world a smaller place: it has made long-distance communication cheaper, faster, and easier, and it lets businesses cater to a global audience without any major investment. Customer reviews act as powerful social proof. Although online customers cannot see or touch a product, they can read reviews to make well-informed purchase decisions. Reviews create credibility for products, and increased credibility means increased sales.

In this project, students will use sentiment analysis to determine whether a product is recommended or not, and will measure accuracy by applying appropriate machine learning techniques (such as Logistic Regression, Decision Tree, and Random Forest) to the given dataset. Students will also compare the techniques and explain which one performs best and why. The outcomes of this analysis will benefit both researchers and the textile industry.
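As a rough illustration of the comparison described above, the following minimal sketch trains the three named classifiers on TF-IDF features and reports the accuracy of each. It assumes scikit-learn is installed. The toy reviews, the labels, the helper name compare_models, and the choice of TF-IDF are all assumptions made for illustration; they are not taken from the Kaggle dataset or prescribed by the project.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy reviews (NOT from the Kaggle dataset).
# Label 1 = recommended, 0 = not recommended.
REVIEWS = [
    "Lovely dress, fits perfectly and feels comfortable",
    "Beautiful fabric and great colour, highly recommend",
    "Perfect fit, very flattering, love this top",
    "Soft material, comfortable and great quality",
    "Great jeans, comfortable and true to size",
    "Love the colour, beautiful and flattering skirt",
    "Poor quality, the seams ripped after one wash",
    "Too small, itchy fabric, returned it",
    "Cheap material and terrible fit, disappointed",
    "Runs large, unflattering and poor stitching",
    "Itchy sweater, terrible quality, returned",
    "Disappointed, cheap fabric and ripped zipper",
]
LABELS = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def compare_models(texts, labels, test_size=0.25, seed=42):
    """Train each classifier on TF-IDF features and return test accuracy per model."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=20, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        # Each pipeline vectorizes the raw text, then fits the classifier.
        pipeline = make_pipeline(TfidfVectorizer(stop_words="english"), model)
        pipeline.fit(x_train, y_train)
        scores[name] = pipeline.score(x_test, y_test)  # mean accuracy on the test split
    return scores


scores = compare_models(REVIEWS, LABELS)
for name, accuracy in scores.items():
    print(f"{name}: accuracy = {accuracy:.2f}")
```

With only a dozen toy reviews the accuracies carry no meaning; on the real dataset, the same loop would give one number per model as a starting point for the "which technique is best and why" discussion.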

Functional Requirements:

The following are the functional requirements of the project:

  1. For this project, the student can collect data from the Kaggle platform for the classification of reviews. The data set must contain at least 2000 records. A data set is shared in the link below as an example.
  2. The system must set up the environment online/offline (if required).
  3. The system will apply different data processing techniques (tokenization, stop word removal, lemmatization, etc.).
  4. The system must split the given dataset into training and testing sets.
  5. The system must train the specified model.
  6. The user must evaluate the mentioned models in the form of Confusion Matrix, Accuracy, Precision, and Recall.
  7. The user must discuss the results of the given algorithms (Naïve Bayes, Support Vector Machine, Random Forest, and Deep Learning algorithms).
  8. The user must retrain the model if the accuracy is not good (less than 60%) by changing different training parameters (if required).
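Several of the requirements above (preprocessing, splitting, evaluation, and the 60% retraining rule) can be sketched with the Python standard library alone. The snippet below is a minimal sketch, not a prescribed implementation: the function names, the tiny stop-word list, and the 80/20 split ratio are assumptions chosen for illustration, and lemmatization is left out because it needs an external library such as NLTK or spaCy.

```python
import random
import re

# Small illustrative stop-word list (an assumption); a real project would
# use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "i", "to",
              "of", "was", "but", "for", "in", "on"}


def preprocess(text):
    """Lowercase the text, tokenize it into words, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def split_dataset(records, test_ratio=0.2, seed=42):
    """Shuffle a copy of the records and return (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def evaluate(y_true, y_pred, positive=1):
    """Compute the confusion matrix, accuracy, precision, and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        # Rows are actual classes, columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def needs_retraining(accuracy, threshold=0.60):
    """Apply the rule that a model below 60% accuracy must be retrained."""
    return accuracy < threshold


tokens = preprocess("This dress is lovely and fits well!")
train, test = split_dataset(list(range(10)))
metrics = evaluate([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

In a full project, the predictions passed to evaluate would come from the trained model, and needs_retraining(metrics["accuracy"]) would decide whether to adjust the training parameters and train again.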

Tools:

  • Anaconda (Python distribution platform)
  • Spyder, Jupyter Notebook, Google Colab, PyCharm (IDEs)
  • Python (programming language)
  • Machine Learning (Technique)

Prerequisite:

Artificial Intelligence, Machine Learning, and Natural Language Processing concepts. Students will cover a short course relevant to these concepts alongside the initial SRS and design documentation, or can use the links below.

Helping Material

Machine Learning Techniques:

https://towardsdatascience.com/machine-learning-an-introduction-23b84d51e6d0

https://towardsdatascience.com/top-10-algorithms-for-machine-learning-beginners-149374935f3c

https://towardsdatascience.com/10-machine-learning-methods-that-every-data-scientistshould-know-3cc96e0eeee9

https://www.tutorialspoint.com/logistic_regression_in_python/logistic_regression_in_python_tutorial.pdf

https://www.geeksforgeeks.org/ml-logistic-regression-using-python/

https://www.youtube.com/watch?v=sm0NoO5aYC0

https://www.youtube.com/watch?v=fG4e4TUrJ3E

https://www.youtube.com/watch?v=7eh4d6sabA0

https://www.youtube.com/playlist?list=PLfoPJyH8_TpXkcJnuj8CF18-CbXPjw-ny

https://towardsdatascience.com/visualizing-decision-trees-with-python-scikit-learn-graphvizmatplotlib-1c50b4aa68dc

Dataset:

https://www.kaggle.com/nicapotato/womens-ecommerce-clothing-reviews

Supervisor:

Name: Hina Ishaq

 
