Toxic Comment Classification Using Neural Networks and Machine Learning

Abstract: Online conversations and interactions between people generate an enormous volume of data. This has contributed considerably to the quality of human life, but it also carries serious dangers: highly toxic online text communication leads to personal attacks, online provocation, and harassment. Over the last few years this has mobilized both the industrial and research communities, and there have been several attempts to identify an efficient model for classifying and predicting toxic online comments. However, these efforts are still in their early stages, and new approaches and frameworks are required. In parallel, the continuing data explosion makes the development of new machine learning tools for managing this data a basic necessity. Fortunately, advances in big data management, hardware, and cloud computing allow the development of Deep Learning approaches, which have shown very promising performance so far. Recently, Convolutional Neural Networks and Recurrent Neural Networks have been applied to text classification. In this work, we use these approaches to find toxic comments in a large pool of records provided by a recent Kaggle competition on Wikipedia talk page edits, which divides toxicity into six labels: toxicity, severe toxicity, obscenity, threat, insult, and identity hate.

Keywords: Long Short-Term Memory, Convolutional Neural Network, Text mining, Word Embedding, Toxic text classification, Text classification

DOI: 10.17148/IARJSET.2018.597