Text Classification - BERT

Sentiment analysis on IMDB movie reviews using BERT

This case study demonstrates fine-tuning BERT (Bidirectional Encoder Representations from Transformers) for sentiment classification on movie reviews. BERT's bidirectional architecture captures rich contextual understanding, making it highly effective for natural language understanding tasks.

Dataset: IMDB Movie Reviews

  • Source: HuggingFace (stanfordnlp/imdb)
  • Type: Binary text classification
  • Size: 50,000 reviews (25k train, 25k test)
  • Classes: Positive, Negative
  • Average Length: 233 words per review
  • Language: English

Model Configuration

{
  "model": "bert",
  "category": "nlp",
  "subcategory": "text-classification",
  "model_config": {
    "model_name": "bert-base-uncased",
    "num_labels": 2,
    "max_seq_length": 512,
    "batch_size": 32,
    "epochs": 3,
    "learning_rate": 0.00002,
    "warmup_steps": 500
  }
}
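The configuration block above is plain JSON and can be parsed directly; a minimal sketch (the comment showing how the values would be forwarded to a HuggingFace model is illustrative, not part of this case study's code):

```python
import json

config_json = '''{
  "model": "bert",
  "category": "nlp",
  "subcategory": "text-classification",
  "model_config": {
    "model_name": "bert-base-uncased",
    "num_labels": 2,
    "max_seq_length": 512,
    "batch_size": 32,
    "epochs": 3,
    "learning_rate": 0.00002,
    "warmup_steps": 500
  }
}'''

cfg = json.loads(config_json)["model_config"]

# These values would typically be forwarded to a training script, e.g.
# AutoModelForSequenceClassification.from_pretrained(
#     cfg["model_name"], num_labels=cfg["num_labels"])
print(cfg["model_name"], cfg["learning_rate"])
```

Note that `0.00002` is the 2e-5 learning rate recommended later in this document.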

Training Results

Training Progress

Accuracy and loss curves over 3 epochs:

No plot data available

Confusion Matrix

Classification performance on test set:

No plot data available

Prediction Confidence Distribution

How confident is the model in its predictions?

No plot data available

Performance by Review Length

Does review length affect classification accuracy?

No plot data available

Most Important Words

Attention weights for sentiment prediction:

No plot data available

Common Use Cases

  • Customer Feedback Analysis: Classify product reviews, support tickets
  • Social Media Monitoring: Track brand sentiment, crisis detection
  • Content Moderation: Identify toxic or inappropriate comments
  • Market Research: Analyze consumer opinions and trends
  • Political Analysis: Classify political discourse, news sentiment
  • Financial Markets: Sentiment analysis of news for trading signals
  • Healthcare: Analyze patient feedback, clinical notes

Key Settings

Essential Parameters

  • model_name: Pre-trained model variant (base, large, multilingual)
  • max_seq_length: Maximum input tokens (128-512)
  • num_labels: Number of classes (2 for binary)
  • learning_rate: Fine-tuning rate (1e-5 to 5e-5)
  • batch_size: Samples per iteration (16-32)
  • epochs: Training iterations (2-4 typical)

Optimization

  • warmup_steps: Gradual learning rate increase
  • weight_decay: L2 regularization (0.01 typical)
  • adam_epsilon: Optimizer stability (1e-8)
  • max_grad_norm: Gradient clipping (1.0)
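With warmup_steps = 500, the learning rate ramps linearly from 0 to the peak over the first 500 steps; a linear decay back to 0 afterwards is the usual companion schedule for BERT fine-tuning (the decay shape and the total-step count of roughly 25,000/32 × 3 ≈ 2,346 are assumptions for this sketch):

```python
def lr_at_step(step, peak_lr=2e-5, warmup_steps=500, total_steps=2346):
    """Linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase still remaining
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return max(0.0, peak_lr * remaining)

print(lr_at_step(250))   # halfway through warmup: half the peak rate
print(lr_at_step(2346))  # end of training: decayed to 0
```

Warmup avoids large, destabilizing updates while the randomly initialized classification head is still settling.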

Advanced Configuration

  • fp16: Mixed precision training (faster, less memory)
  • gradient_accumulation: Simulate larger batch sizes
  • early_stopping: Stop when validation loss stops improving
  • class_weights: Handle imbalanced datasets
  • attention_probs_dropout: Regularization
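gradient_accumulation lets the effective batch of 32 fit in memory as, say, four micro-batches of 8: gradients are accumulated across micro-batches and the optimizer steps once per window. A framework-free sketch of the bookkeeping (the scalar "gradients" below are made up for illustration):

```python
def accumulate_steps(micro_grads, accumulation_steps):
    """Average micro-batch gradients; one update per accumulation window."""
    updates, running = [], 0.0
    for i, g in enumerate(micro_grads, start=1):
        running += g / accumulation_steps   # scale so the sum equals a mean
        if i % accumulation_steps == 0:
            updates.append(running)          # optimizer.step() would go here
            running = 0.0
    return updates

# 8 micro-batches with accumulation_steps=4 -> 2 optimizer updates
print(accumulate_steps([1.0, 2.0, 3.0, 2.0, 4.0, 4.0, 0.0, 0.0], 4))
# -> [2.0, 2.0]
```

The update values match what a single large batch would have produced, which is why accumulation simulates larger batch sizes without the memory cost.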

Performance Metrics

  • Accuracy: 92.7% on test set
  • Precision: 92.4% (positive class)
  • Recall: 93.1% (positive class)
  • F1 Score: 92.7% (both classes)
  • Training Time: 3.2 hours (NVIDIA RTX 3080)
  • Inference Speed: ~80 reviews/second
  • Model Size: 438 MB (BERT-base-uncased)
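The precision, recall, and F1 figures above follow the standard definitions; as a reference, here is how they fall out of confusion-matrix counts (the counts below are hypothetical, chosen only to illustrate the formulas, not taken from this run):

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Hypothetical counts on a 200-review sample
p, r, f1, acc = classification_metrics(tp=90, fp=8, fn=10, tn=92)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f} accuracy={acc:.3f}")
```

In practice `sklearn.metrics` or `evaluate` would compute these, but the arithmetic is exactly this.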

Tips for Success

  1. Pre-trained Models: Always start with pre-trained BERT
  2. Sequence Length: Truncate intelligently (keep important parts)
  3. Learning Rate: Start small (2e-5), crucial for fine-tuning
  4. Few Epochs: 2-4 epochs usually sufficient
  5. Validation: Monitor validation loss for early stopping
  6. Batch Size: Larger batches more stable but need more memory
  7. Special Tokens: Properly handle [CLS], [SEP], [PAD]
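Tips 2 and 7 interact: truncation must leave room for the special tokens, so a max_seq_length of 512 leaves only 510 positions for review text. A sketch with a toy token list (HuggingFace tokenizers handle this automatically when called with truncation=True):

```python
def truncate_for_bert(tokens, max_seq_length=512):
    """Reserve 2 positions for [CLS] and [SEP], then truncate the text."""
    budget = max_seq_length - 2
    kept = tokens[:budget]          # naive head truncation; keeping head+tail
    return ["[CLS]"] + kept + ["[SEP]"]  # can work better for long reviews

seq = truncate_for_bert([f"tok{i}" for i in range(600)], max_seq_length=512)
print(len(seq), seq[0], seq[-1])  # 512 [CLS] [SEP]
```

"Truncate intelligently" from tip 2 means choosing which 510 tokens to keep; head-only truncation is the simplest baseline.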

Example Scenarios

Scenario 1: Positive Review

  • Input: "This movie is an absolute masterpiece! The acting was brilliant and the plot kept me engaged throughout. Highly recommend!"
  • Prediction: Positive (confidence: 98.7%)
  • Key Tokens: masterpiece, brilliant, highly recommend

Scenario 2: Negative Review

  • Input: "What a waste of time. The plot was confusing, acting was terrible, and I couldn't wait for it to end."
  • Prediction: Negative (confidence: 97.3%)
  • Key Tokens: waste of time, confusing, terrible

Scenario 3: Mixed Review (Challenging)

  • Input: "While the cinematography was stunning, the weak storyline and poor character development ruined the experience."
  • Prediction: Negative (confidence: 68.2%)
  • Reasoning: Negative aspects outweigh positive mention
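The confidence percentages in these scenarios come from a softmax over the model's two output logits; the logit values below are made up to roughly reproduce the mixed-review case, where the two classes' logits are close and confidence drops:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical (negative, positive) logits for a mixed review
neg_p, pos_p = softmax([0.76, 0.0])
label = "Negative" if neg_p > pos_p else "Positive"
print(label, f"confidence: {max(neg_p, pos_p):.1%}")
```

A near-50% confidence is itself a useful signal: mixed or ambiguous reviews can be routed to human review by thresholding on it.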

Troubleshooting

Problem: Model overfitting (train acc >> val acc)

  • Solution: Reduce epochs (use 2 instead of 3-4), add dropout, increase data

Problem: Poor performance on sarcastic reviews

  • Solution: Add sarcasm examples to training, use context-aware features

Problem: Slow training or OOM errors

  • Solution: Reduce batch_size or max_seq_length, use fp16 training

Problem: Biased predictions (favors one class)

  • Solution: Balance dataset, adjust class_weights, check label distribution
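class_weights are commonly derived from the label distribution by weighting each class inversely to its frequency, so the minority class contributes more to the loss (the counts below are hypothetical; IMDB itself is balanced):

```python
def inverse_frequency_weights(label_counts):
    """weight_c = total / (num_classes * count_c); balanced data -> all 1.0."""
    total = sum(label_counts.values())
    k = len(label_counts)
    return {c: total / (k * n) for c, n in label_counts.items()}

# Hypothetical imbalanced split: 4x more negative than positive examples
print(inverse_frequency_weights({"negative": 20000, "positive": 5000}))
# -> {'negative': 0.625, 'positive': 2.5}
```

The resulting weights would typically be passed to the loss function (e.g. the `weight` argument of PyTorch's `CrossEntropyLoss`).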

Problem: Low confidence on short texts

  • Solution: Train on more short examples, consider different models for short text

Model Architecture Highlights

BERT-base consists of:

  • 12 Transformer Layers: Stacked encoder blocks
  • 768 Hidden Units: Dense representation dimension
  • 12 Attention Heads: Multi-head self-attention
  • Parameters: 110 million trainable parameters
  • WordPiece Tokenization: 30,522 vocabulary size
  • Bidirectional Context: Captures left and right context
  • Special Tokens: [CLS] for classification, [SEP] for separation
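The 110 million figure can be reproduced from the dimensions listed above (12 layers, hidden size 768, feed-forward size 4 × 768 = 3072, vocabulary 30,522, plus the pooler over [CLS]):

```python
def bert_base_param_count(vocab=30522, hidden=768, layers=12,
                          ffn=3072, max_pos=512, type_vocab=2):
    """Parameter count of BERT-base from its architectural dimensions."""
    # Embeddings: word + position + token-type tables, plus LayerNorm
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden
    # Per encoder layer: Q/K/V/output projections (weights + biases)
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward: up-projection and down-projection (weights + biases)
    ffn_block = hidden * ffn + ffn + ffn * hidden + hidden
    # Two LayerNorms per layer, each with gamma and beta
    layer = attention + ffn_block + 2 * (2 * hidden)
    # Pooler: dense layer over the [CLS] representation
    pooler = hidden * hidden + hidden
    return embeddings + layers * layer + pooler

print(f"{bert_base_param_count():,}")  # 109,482,240 -> "110 million"
```

The classification head for this task adds only 768 × 2 + 2 = 1,538 parameters on top, which is why fine-tuning reuses essentially the whole pre-trained network.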

BERT Variants Comparison

  Model      | Params | Speed     | Accuracy | Best For
  -----------|--------|-----------|----------|--------------------
  DistilBERT | 66M    | 2x faster | 91.2%    | Production, mobile
  BERT-base  | 110M   | baseline  | 92.7%    | General use
  BERT-large | 340M   | 3x slower | 93.8%    | Maximum accuracy
  RoBERTa    | 125M   | similar   | 93.5%    | Better pre-training

Next Steps

After training your BERT classifier, you can:

  • Deploy as REST API for real-time predictions
  • Fine-tune on domain-specific data (medical, legal, etc.)
  • Multi-task learning (sentiment + emotion + topic)
  • Export to ONNX for faster inference
  • Distill to smaller model (DistilBERT)
  • Ensemble with other models for higher accuracy
  • Build interpretability tools (attention visualization)
  • Adapt to other languages (multilingual BERT)
