Project 9 : Kaggle Natural Language Processing Competition — Part 2

October 31, 2022

Natural Language Processing with Disaster Tweets: Classification, Part 2.

Sentiment analysis is the automated process of tagging data according to their sentiment, such as positive, negative and neutral. Sentiment analysis allows companies to analyze data at scale, detect insights and automate processes.

`Introduction: Business Problem`

Description: We now have a basic model. The goal is to iterate on this model, starting with a good validation dataset. Building one is a matter of trial and error, since there is no general recipe for it. We therefore have to ensure that we can run many experiments without long training times.

Evaluation: The evaluation metric for this competition is F1.
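To make the metric concrete, here is a minimal sketch of how F1 can be computed for binary labels in plain Python. The function name `f1_score_binary` and the toy labels are illustrative assumptions, not taken from the competition code. It treats `1` as the positive class, and F1 is the harmonic mean of precision and recall.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return the F1 score for binary labels (illustrative helper, not the notebook's code)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up labels): 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, a model that rarely predicts the positive class can have high accuracy and still score poorly here, which is why it is worth tracking F1 directly on the validation set.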

`Methodology`

The project will be executed by completing the following tasks:

Notebook Setup
Imports and EDA (Exploratory Data Analysis)
Training
Create a validation set
Initial model
Use validation set of Part 1
Improving model
Hugging Face login
Final model for part 2 notebook
Saving and Sharing model
Use the model via pipeline
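As an illustration of the "Create a validation set" step, the sketch below shows one common option: a stratified random split that keeps the class balance the same in the training and validation parts. This is an assumption for illustration only, not necessarily the split used in the notebook. The helper name `stratified_split`, the `text`/`target` keys, and the toy rows are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key="target", valid_fraction=0.2, seed=42):
    """Split rows into (train, valid), sampling the same fraction from each class.

    Hypothetical helper for illustration; rows are dicts with a label under `label_key`.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible across runs
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, valid = [], []
    for group in by_label.values():
        group = group[:]  # copy so the caller's list is not shuffled in place
        rng.shuffle(group)
        n_valid = max(1, round(len(group) * valid_fraction))  # at least one row per class
        valid.extend(group[:n_valid])
        train.extend(group[n_valid:])

    rng.shuffle(train)
    rng.shuffle(valid)
    return train, valid


# Toy data (made up): 8 disaster tweets (target=1) and 12 other tweets (target=0).
rows = [{"text": f"tweet {i}", "target": 1 if i < 8 else 0} for i in range(20)]
train_rows, valid_rows = stratified_split(rows, valid_fraction=0.25)
print(len(train_rows), len(valid_rows))
```

Fixing the seed keeps the validation set identical between experiments, so changes in the validation score reflect changes to the model and not to the split.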

Link to GitHub Repository

Link to Blog post

Link to Hugging Face model