NLP-Based Duplicate Bug Report Detection Using Supervised Machine Learning Algorithms
Project Domain / Category
AI / Machine Learning / Prototype-based
Abstract / Introduction
A bug report is a technical document that contains all the necessary information about the bug and the conditions under which it can be reproduced. It is a guide for the developers and the team engaged in fixing the bug. Bug reports are the primary means through which developers triage and fix bugs. To achieve this effectively, bug reports need to clearly describe those features that are important for the developers. However, previous studies have found that reporters do not always provide such features.
Our objective in this project is to classify bug reports as duplicate or non-duplicate using machine learning models trained on the given dataset. Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken and written, referred to as natural language. NLP uses artificial intelligence to take real-world input, process it, and make sense of it in a way a computer can understand. The NLP stage performs data preprocessing (tokenization, stop-word removal, lemmatization, etc.), which cleans the textual data so that a machine can analyze it. To classify the duplicate bug reports, we use supervised machine learning algorithms such as Naïve Bayes, Support Vector Machine, and Random Forest.
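The preprocessing steps named above can be illustrated with a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the stop-word list is tiny, strip_suffix is a crude stand-in for real lemmatization, the sample reports are invented, and the Jaccard overlap is just one possible feature that a classifier such as Naïve Bayes, Support Vector Machine, or Random Forest could take as input for a pair of reports.

```python
import re

# Tiny illustrative stop-word list (an assumption; a real project would use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "on", "in", "to", "of", "and", "when", "it"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_suffix(token):
    """Crude suffix stripping, used here as a stand-in for real lemmatization."""
    for suffix in ("ing", "es", "ed", "s"):
        # Only strip when at least three characters remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then normalize each remaining token."""
    return [strip_suffix(t) for t in tokenize(text) if t not in STOP_WORDS]


def jaccard(tokens_a, tokens_b):
    """Share of distinct tokens that two reports have in common (0.0 to 1.0)."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical bug report summaries, invented for this example.
report_a = "Thunderbird crashes when opening attachments"
report_b = "Crash on opening an attachment in Thunderbird"
report_c = "Calendar reminders are not shown"

tokens_a = preprocess(report_a)
tokens_b = preprocess(report_b)
tokens_c = preprocess(report_c)

print(tokens_a)
print(jaccard(tokens_a, tokens_b))  # high overlap: likely duplicates
print(jaccard(tokens_a, tokens_c))  # no overlap: likely unrelated
```

In a full pipeline, overlap scores like this one (or TF-IDF vectors) would be computed for labeled pairs of reports and passed to the supervised classifier for training.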
Pre-Requisites:
This project is easy and interesting but requires in-depth study of machine learning and natural language processing techniques. The following links may help you better understand the topic:
Text Classification Tutorial: https://www.youtube.com/watch?v=sm0NoO5aYC0
Dataset: https://github.com/logpai/bugrepo/tree/master/Thunderbird
Functional Requirements:
The following are the functional requirements of the project:
…………
Tools:
Supervisor:
Name: Sadeem Ahmad Nafees