End-to-end NLP text classification on Kaggle “Disaster Tweets” — baseline ML + multiple deep learning architectures + transfer learning (USE) in a single reproducible notebook
Built with the following tools and technologies:
Python |
TensorFlow / Keras |
TensorFlow Hub (USE) |
scikit-learn |
pandas |
NumPy |
Matplotlib |
Jupyter Notebook
- Overview
- Problem Statement
- Dataset
- Project Highlights
- Approach
- Models Implemented
- Results
- Getting Started
- How to Reproduce Exactly
- Notes on helper_functions.py
- Common Issues & Fixes
- Future Improvements
- References
- License
- Contact
Disaster Tweets NLP – Model Benchmarking is a portfolio-style NLP project built from a learning notebook and upgraded into a clean, reproducible mini-project.
The goal is to classify tweets into:
- Real disaster (target = 1)
- Not a disaster (target = 0)
This repository is intentionally structured like a mini case study:
- Start with a strong classical ML baseline (fast and competitive)
- Add multiple neural architectures (Dense, RNNs, CNN)
- Add transfer learning using Universal Sentence Encoder (USE) from TensorFlow Hub
- Compare everything using the same evaluation metrics
Tweets are short, noisy, and often ambiguous.
Examples:
- “Forest fire near La Ronge Sask. Canada” → likely a real disaster
- “This exam destroyed me 😭” → contains “disaster-like” words but not a real disaster
The challenge is to build models that learn context rather than reacting to keywords alone.
This project uses Kaggle’s dataset:
- Competition: Natural Language Processing with Disaster Tweets
- Link: https://www.kaggle.com/competitions/nlp-getting-started
- Data page: https://www.kaggle.com/competitions/nlp-getting-started/data
Typical columns:
- id: unique tweet id
- keyword: (optional) keyword for the tweet
- location: (optional) user location
- text: tweet text (main input)
- target: label (only in train.csv) where 1 = disaster, 0 = not disaster
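As a quick sanity check, the columns above can be loaded and inspected with pandas. The snippet below is a sketch using a tiny inline stand-in for train.csv (the real notebook calls pd.read_csv on the actual file):

```python
import pandas as pd

# Tiny stand-in with the same schema as train.csv (id, keyword, location, text, target).
train_df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "keyword": [None, "fire", None, "earthquake"],
    "location": [None, "Canada", "NYC", None],
    "text": [
        "Forest fire near La Ronge Sask. Canada",
        "This exam destroyed me",
        "Just happy about the weekend",
        "Earthquake reported downtown, buildings shaking",
    ],
    "target": [1, 0, 0, 1],
})

# Inspect class balance; the real dataset is moderately imbalanced.
print(train_df["target"].value_counts(normalize=True))
print(train_df[["text", "target"]].head())
```

In the notebook, the same calls (`value_counts`, `head`) run against the full Kaggle CSV.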
Important repo note:
- For licensing and cleanliness, you typically should NOT commit the raw Kaggle dataset into GitHub.
- Instead, place it locally (see Dataset Setup).
- End-to-end workflow in one notebook: data → split → vectorization → multiple models → evaluation → comparison
- Fair comparison:
- consistent train/validation split
- consistent evaluation metrics
- Covers both classic NLP and modern embedding-based workflows
- Includes reusable helper utilities (TensorBoard callback + plotting + metrics)
- Load dataset from CSV files
- Inspect class balance and sample texts
- Split into training and validation sets
- Prepare text for deep learning models using:
- TextVectorization layer (strings → integer token sequences)
- Embedding layer (token IDs → dense vectors)
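The split-then-vectorize steps above can be sketched as follows; the tweets, vocabulary size, and sequence length here are illustrative placeholders, not the notebook's exact settings:

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Hypothetical tweets standing in for train.csv's "text" and "target" columns.
texts = np.array(["Forest fire near La Ronge", "This exam destroyed me",
                  "Flood warning issued for the coast", "Best pizza ever"])
labels = np.array([1, 0, 1, 0])

train_texts, val_texts, train_labels, val_labels = train_test_split(
    texts, labels, test_size=0.25, random_state=42)

# TextVectorization: strings -> integer token sequences.
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=10000, output_sequence_length=15)
vectorizer.adapt(train_texts)  # build the vocabulary from training data only

# Embedding: token IDs -> trainable dense vectors.
embedding = tf.keras.layers.Embedding(input_dim=10000, output_dim=128)

token_ids = vectorizer(train_texts)
vectors = embedding(token_ids)
print(vectors.shape)  # (num_train_tweets, sequence_length, embedding_dim)
```

Adapting the vectorizer on training data only avoids leaking validation vocabulary into preprocessing.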
This notebook progresses from simplest to strongest:
- A strong baseline model first
- Then neural models trained from scratch
- Then transfer learning (USE) as the high-performance benchmark
- Finally a “10% data” experiment to show label-efficiency of transfer learning
For each model we compute:
- Accuracy
- Precision
- Recall
- F1 score
Why F1 matters:
- Tweets can be ambiguous and noisy
- Precision/recall tradeoffs matter in “disaster detection” scenarios
- F1 gives a balanced view of classification quality
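A metrics helper in the spirit of the one used per model might look like the sketch below (the function name and weighted averaging are assumptions; the actual helper in helper_functions.py may differ in detail):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def calculate_results(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 (weighted average) as a dict."""
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted")
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

print(calculate_results(y_true=[1, 0, 1, 0, 1], y_pred=[1, 0, 0, 0, 1]))
```

Using one shared function for every model is what makes the final comparison table fair.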
Baseline pipeline:
- TF-IDF Vectorizer
- Multinomial Naive Bayes classifier
Why this baseline is important:
- Fast to train
- Surprisingly strong for short-text classification
- Sets a minimum bar that neural models must beat
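The baseline pipeline is only a few lines with scikit-learn; the training texts below are toy stand-ins for the real dataset:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# TF-IDF features feeding a Multinomial Naive Bayes classifier.
model_0 = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultinomialNB()),
])

texts = ["forest fire spreading fast", "earthquake hits the city",
         "my day was a disaster lol", "great pizza tonight"]
labels = [1, 1, 0, 0]

model_0.fit(texts, labels)
print(model_0.predict(["fire in the city"]))
```

Because the whole thing trains in seconds, it doubles as a sanity check for the data pipeline before any deep learning runs.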
All neural models follow the pattern:
- Input: raw tweet strings
- TextVectorization: strings → sequences of token IDs
- Embedding: token IDs → trainable dense vectors
- Architecture-specific layers
- Output: sigmoid probability for binary classification
Neural models included:
- simple_dense
- lstm
- gru
- bidirectional
- conv1d
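The shared pattern can be sketched with the conv1d variant; layer sizes and the toy training data here are illustrative, not the notebook's exact configuration:

```python
import tensorflow as tf

texts = tf.constant(["forest fire nearby", "love this song",
                     "flood warning issued", "nice day outside"])
labels = tf.constant([1, 0, 1, 0])

# Strings -> token IDs (vocab and sequence length are placeholders).
vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=5000, output_sequence_length=10)
vectorizer.adapt(texts)

inputs = tf.keras.Input(shape=(), dtype=tf.string)        # raw tweet strings in
x = vectorizer(inputs)
x = tf.keras.layers.Embedding(5000, 64)(x)                # token IDs -> dense vectors
x = tf.keras.layers.Conv1D(32, 5, activation="relu",
                           padding="same")(x)             # local n-gram features
x = tf.keras.layers.GlobalMaxPooling1D()(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # disaster probability

model_conv1d = tf.keras.Model(inputs, outputs)
model_conv1d.compile(loss="binary_crossentropy",
                     optimizer="adam", metrics=["accuracy"])
model_conv1d.fit(texts, labels, epochs=1, verbose=0)
```

Swapping the Conv1D/pooling block for LSTM, GRU, Bidirectional(LSTM), or a pooled Dense layer yields the other from-scratch variants while keeping the input and output ends identical.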
Why benchmark multiple architectures?
- Different inductive biases:
- RNNs capture sequential dependencies
- CNNs capture local n-gram patterns efficiently
- Dense baselines test if simple pooling is sufficient
Universal Sentence Encoder (USE) via TensorFlow Hub:
- Encodes an entire sentence/tweet into a pretrained embedding vector
- A small classifier head is trained on top
Why USE is strong:
- Pretrained sentence embeddings often generalize well on small/medium datasets
- Particularly useful for short texts where handcrafted features can miss semantics
USE model trained on only 10% of training data:
- Shows how transfer learning behaves when labeled data is limited
- Mirrors real-world situations where labels are expensive
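Carving out the 10% subset can be as simple as a seeded random sample; the exact sampling call in the notebook may differ, and the DataFrame below is a placeholder:

```python
import pandas as pd

# Placeholder training frame with 100 rows.
train_df = pd.DataFrame({
    "text": [f"tweet {i}" for i in range(100)],
    "target": [i % 2 for i in range(100)],
})

# Seeded 10% sample so the experiment is repeatable.
train_10_percent = train_df.sample(frac=0.1, random_state=42)
print(len(train_10_percent))  # 10
```

The same USE architecture is then trained on this subset and evaluated on the unchanged validation split, so only the amount of labeled data varies.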
Final benchmark results from this notebook run:
| Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| baseline | 0.792651 | 0.811139 | 0.792651 | 0.786219 |
| simple_dense | 0.776903 | 0.779556 | 0.776903 | 0.774487 |
| lstm | 0.772966 | 0.775649 | 0.772966 | 0.770438 |
| gru | 0.769029 | 0.773154 | 0.769029 | 0.765780 |
| bidirectional | 0.750656 | 0.752468 | 0.750656 | 0.747956 |
| conv1d | 0.780840 | 0.783458 | 0.780840 | 0.778533 |
| tf_hub_sentence_encoder | 0.814961 | 0.815272 | 0.814961 | 0.814225 |
| tf_hub_10_percent_data | 0.784777 | 0.790478 | 0.784777 | 0.781448 |
Key takeaways:
- Best overall model here: tf_hub_sentence_encoder (highest F1)
- Strong classical baseline: baseline (TF-IDF + NB)
- conv1d performs well among from-scratch neural models
- Transfer learning remains competitive even with only 10% training data
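A table like the one above falls out naturally if each model's metrics dict is collected into a single pandas DataFrame; the numbers below are placeholders, not the notebook's results:

```python
import pandas as pd

# One metrics dict per model (placeholder values).
all_results = {
    "baseline": {"accuracy": 0.79, "precision": 0.81, "recall": 0.79, "f1": 0.79},
    "conv1d":   {"accuracy": 0.78, "precision": 0.78, "recall": 0.78, "f1": 0.78},
}

results_df = pd.DataFrame(all_results).transpose()
print(results_df.sort_values("f1", ascending=False))
```

Sorting by F1 makes the best overall model obvious at a glance.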
Recommended repository layout:
disaster-tweets-nlp-model-benchmarks/
├─ NLP.ipynb
├─ helper_functions.py
├─ requirements.txt
├─ screenshots/
│ ├─ results-table.png
│ ├─ dataset-preview.png
│ ├─ label-distribution.png
│ ├─ training-curves-use.png
│ └─ training-curves-baseline.png
├─ .gitignore
├─ LICENSE
└─ README.md
- Python 3.10+ recommended
- pip installed
- Optional: GPU for faster deep learning training
- Clone the repository
git clone https://github.com/brej-29/disaster-tweets-nlp-model-benchmarks.git
cd disaster-tweets-nlp-model-benchmarks
- Create and activate a virtual environment
Windows (PowerShell):
python -m venv .venv
.\.venv\Scripts\Activate.ps1
macOS / Linux:
python3 -m venv .venv
source .venv/bin/activate
- Install dependencies
pip install -r requirements.txt
This notebook expects Kaggle dataset files to be available locally.
Option A: download Kaggle dataset zip
- Download the dataset from: https://www.kaggle.com/competitions/nlp-getting-started/data
- Keep the downloaded zip named exactly: nlp_getting_started.zip
- Place it in the SAME folder as NLP.ipynb
After unzipping, you should have:
- train.csv
- test.csv
- sample_submission.csv
Option B: already extracted files
- Place train.csv/test.csv/sample_submission.csv in the same folder as NLP.ipynb
- Ensure filenames match the notebook expectations
- Start Jupyter Notebook
jupyter notebook
or
jupyter lab
- Open NLP.ipynb
- Run cells top-to-bottom
- Recommended: Restart Kernel & Run All for full reproducibility
If you have an NVIDIA GPU and want faster training:
- Follow official TensorFlow installation guidance for your OS/CUDA setup
- If you are on Google Colab:
- Runtime → Change runtime type → GPU
To reproduce results cleanly:
- Create a fresh virtual environment
- Install dependencies from requirements.txt
- Ensure dataset files are present
- Run the notebook from top to bottom without skipping cells
- Save the results table screenshot into screenshots/results-table.png
Optional:
- Capture package versions for strict reproducibility:
pip freeze > requirements-freeze.txt
helper_functions.py provides reusable utilities commonly used in ML notebooks, such as:
- TensorBoard callback creation
- Training curve plotting
- Metric calculation helpers
Keeping helpers separate makes the notebook easier to read and the evaluation more consistent.
- FileNotFoundError: nlp_getting_started.zip
- Confirm the zip file exists next to NLP.ipynb
- Confirm the filename matches exactly
- Training is slow
- Use GPU if available
- Reduce epochs temporarily while testing
- Use the baseline model for quick sanity checks
- TensorFlow Hub download takes time
- First run may download the model from TF Hub
- Ensure stable internet and rerun the cell if needed
Ideas to upgrade this from “notebook project” to “full ML project”:
- Add an inference script: input tweet → output disaster probability
- Save best model + preprocessing artifacts for deployment
- Add confusion matrix and error analysis:
- inspect false positives/false negatives
- Try transformer baselines (DistilBERT/BERT) and threshold tuning
- Add experiment tracking:
- structured results logging (CSV/JSON)
- TensorBoard organization per model run
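The inference-script idea above could start from a sketch like this; the helper name and threshold are hypothetical, and the tiny demo model stands in for a real trained one:

```python
import tensorflow as tf

def predict_disaster(model, tweets, threshold=0.5):
    """Score raw tweet strings with a trained string-in, sigmoid-out Keras model."""
    probs = model.predict(tf.constant(tweets), verbose=0).flatten()
    return [{"text": t, "probability": float(p), "is_disaster": bool(p >= threshold)}
            for t, p in zip(tweets, probs)]

# Trivially small demo model so the helper can be exercised end to end.
vec = tf.keras.layers.TextVectorization(max_tokens=100, output_sequence_length=5)
vec.adapt(tf.constant(["fire flood", "pizza party"]))
demo = tf.keras.Sequential([
    tf.keras.Input(shape=(), dtype=tf.string),
    vec,
    tf.keras.layers.Embedding(100, 8),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

print(predict_disaster(demo, ["fire spreading near town"]))
```

For deployment, the vectorizer must be saved together with the model (as above, where it is baked into the graph) so training-time and inference-time preprocessing cannot drift apart.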
- Kaggle Competition: https://www.kaggle.com/competitions/nlp-getting-started
- TensorFlow Hub USE tutorial: https://www.tensorflow.org/hub/tutorials/semantic_similarity_with_tf_hub_universal_encoder
- Keras TextVectorization docs: https://www.tensorflow.org/api_docs/python/tf/keras/layers/TextVectorization
- scikit-learn TF-IDF docs: https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
This project is licensed under the MIT License.
See the LICENSE file in this repository for full details.
If you’d like to discuss this project, provide feedback, or connect:
- LinkedIn: Brejesh Balakrishnan
Feel free to fork the repo, open issues, or suggest improvements!
