AIT Asian Institute of Technology

Identifying duplicate questions on Quora

Author: Akhileshwar, Chennu
Call Number: AIT RSPR no. IM-17-12
Subject(s): Machine learning--Technique

Note: A research study submitted in partial fulfillment of the requirements for the degree of Master of Engineering in Information Management, School of Engineering and Technology
Publisher: Asian Institute of Technology
Series Statement: Research studies project report ; no. IM-17-12
Abstract: Determining whether two questions ask the same thing can be challenging, as word choice and sentence structure may vary significantly. Some natural language processing techniques have had only limited success in separating related questions from duplicate ones. Quora is a valuable platform for users to exchange knowledge, and it also faces this problem of duplicate questions. Quora gives importance to the similar-questions problem because it wants to provide a good experience for both question seekers and writers. Using a data set of question pairs provided by Quora on Kaggle, we extract features with methods such as common word share, Jaccard similarity coefficient, cosine similarity, and TF-IDF. After extracting the features, we use machine learning algorithms to build a model from the training data. We then use this model to obtain the final predicted values for the test data set.
Year: 2017
Corresponding Series Added Entry: Asian Institute of Technology. Research studies project report ; no. IM-17-12
Type: Research Study Project Report (RSPR)
School: School of Engineering and Technology (SET)
Department: Department of Information and Communications Technologies (DICT)
Academic Program/FoS: Information Management (IM)
Chairperson(s): Sumanta Guha
Examination Committee(s): Phan Minh Dung; Bohez, Erik L.J.
Scholarship Donor(s): Asian Institute of Technology Fellowship
Degree: Research studies project report (M. Eng.) - Asian Institute of Technology, 2017
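
The pair features named in the abstract (common word share, Jaccard similarity coefficient, and cosine similarity over TF-IDF weights) can be illustrated with a minimal Python sketch. This is not the report's implementation: the tokenizer, the word-share formula, the smoothed IDF, and all function names (`tokenize`, `word_share`, `jaccard`, `idf_weights`, `tfidf_vector`, `cosine`, `pair_features`) are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase, keep runs of letters, digits, apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def word_share(q1, q2):
    # Assumed definition: shared unique words counted on both sides,
    # divided by the total number of unique words in the two questions.
    a, b = set(tokenize(q1)), set(tokenize(q2))
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(q1, q2):
    # Jaccard similarity coefficient: |A intersect B| / |A union B|.
    a, b = set(tokenize(q1)), set(tokenize(q2))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def idf_weights(corpus):
    # Smoothed inverse document frequency over a list of questions
    # (one common variant; the report's exact weighting is not stated).
    n = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(text, idf):
    # Sparse TF-IDF vector: raw term count times IDF weight.
    # Terms missing from the IDF table get weight 0.
    tf = Counter(tokenize(text))
    return {t: c * idf.get(t, 0.0) for t, c in tf.items()}


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def pair_features(q1, q2, idf):
    # Feature dictionary for one question pair.
    return {
        "word_share": word_share(q1, q2),
        "jaccard": jaccard(q1, q2),
        "tfidf_cosine": cosine(tfidf_vector(q1, idf), tfidf_vector(q2, idf)),
    }
```

In a pipeline like the one the abstract describes, such a feature dictionary would be computed for every question pair and passed to a classifier; the record does not say which algorithms were used, so none is shown here.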

