How to do sentiment analysis in Python

Learn how to perform sentiment analysis in Python. Discover different methods, tips, real-world applications, and how to debug common errors.

Published on: Thu, Feb 12, 2026
Updated on: Mon, Apr 13, 2026
The Replit Team

Sentiment analysis in Python allows you to gauge opinions from text data. It's a powerful technique for businesses that want to understand customer feedback and broader market trends.

In this article, you'll explore key techniques and practical tips for effective sentiment analysis. You will also find real-world applications and debugging advice to help you build and refine your own models.

Using TextBlob for quick sentiment analysis

from textblob import TextBlob

text = "I really enjoyed the movie. It was absolutely fantastic!"
analysis = TextBlob(text)
print(f"Polarity: {analysis.sentiment.polarity}, Subjectivity: {analysis.sentiment.subjectivity}")

Output:
Polarity: 0.9, Subjectivity: 1.0

The TextBlob library offers a straightforward way to perform sentiment analysis without building a model from scratch. By passing your text to the TextBlob object, you can immediately access its sentiment attribute. This attribute contains two useful scores:

  • Polarity: A value between -1.0 (negative) and 1.0 (positive). The output of 0.9 indicates a very positive sentiment.
  • Subjectivity: A value from 0.0 (objective) to 1.0 (subjective). The score of 1.0 shows the text is entirely opinion-based.
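When you need a label rather than raw scores, you can combine the two values. The sketch below is one illustrative approach, not a TextBlob feature; the 0.2 thresholds are arbitrary choices you would tune for your data:

```python
def label_sentiment(polarity, subjectivity, min_subjectivity=0.2):
    """Map TextBlob-style scores to a label, ignoring polarity on near-objective text."""
    if subjectivity < min_subjectivity:
        return "Neutral"  # too factual to treat as an opinion
    if polarity > 0.2:
        return "Positive"
    if polarity < -0.2:
        return "Negative"
    return "Neutral"

print(label_sentiment(0.9, 1.0))   # the movie review scored above -> Positive
print(label_sentiment(0.3, 0.1))   # positive word inside an objective statement -> Neutral
```

Gating on subjectivity like this helps avoid treating factual statements that happen to contain positive words as opinions.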

Basic sentiment analysis techniques

While TextBlob offers a convenient starting point, other methods provide more specialized capabilities for tackling complex text and achieving greater accuracy in your analysis.

Using NLTK's VADER sentiment analyzer

import nltk
nltk.download('vader_lexicon', quiet=True)
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()
text = "The food was delicious, but the service was terrible."
print(sid.polarity_scores(text))

Output:
{'neg': 0.253, 'neu': 0.451, 'pos': 0.296, 'compound': 0.1779}

Unlike TextBlob, NLTK’s VADER (Valence Aware Dictionary and sEntiment Reasoner) is tuned for social media and can effectively parse mixed sentiments. The polarity_scores() method returns a dictionary with a detailed breakdown:

  • pos, neu, and neg: These values show the proportion of text that falls into positive, neutral, and negative categories.
  • compound: This is a single normalized score from -1 (most negative) to 1 (most positive). The score of 0.1779 reflects a slightly positive overall sentiment, accurately capturing the nuance of the mixed review.

Creating a simple rule-based sentiment analyzer

def simple_sentiment(text):
    positive_words = ['good', 'great', 'excellent', 'love', 'happy']
    negative_words = ['bad', 'terrible', 'awful', 'hate', 'sad']
    words = text.lower().split()
    score = sum(1 for w in words if w in positive_words) - sum(1 for w in words if w in negative_words)
    return "Positive" if score > 0 else "Negative" if score < 0 else "Neutral"

print(simple_sentiment("I love this great product despite some bad reviews"))

Output:
Positive

For more control, you can create a custom rule-based analyzer. The simple_sentiment function works by checking text against predefined lists of positive and negative words to calculate a score.

  • It adds 1 for each positive word found.
  • It subtracts 1 for each negative word.

The function then returns "Positive", "Negative", or "Neutral" based on the final tally. This approach is transparent and easy to customize, though it doesn't capture the contextual nuance that more advanced, pre-trained models can.
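One natural extension is to weight words by intensity instead of counting them all equally. The sketch below uses an illustrative hand-picked lexicon; the weights are arbitrary, not drawn from any standard word list:

```python
# Illustrative weights: stronger words contribute more to the score.
WEIGHTS = {
    'excellent': 2, 'love': 2, 'good': 1, 'great': 1, 'happy': 1,
    'awful': -2, 'hate': -2, 'bad': -1, 'terrible': -1, 'sad': -1,
}

def weighted_sentiment(text):
    """Score text by summing per-word weights instead of flat +1/-1 counts."""
    score = sum(WEIGHTS.get(w, 0) for w in text.lower().split())
    return "Positive" if score > 0 else "Negative" if score < 0 else "Neutral"

print(weighted_sentiment("I love this product despite the bad packaging"))
# love(+2) + bad(-1) = +1 -> Positive
```

Because "love" outweighs "bad", the mixed review still comes out positive, which a flat count would also conclude here but could miss in closer cases.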

Using spaCy with sentiment extensions

import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')
doc = nlp("This product exceeded my expectations. Highly recommended!")
print(f"Polarity: {doc._.blob.polarity}, Subjectivity: {doc._.blob.subjectivity}")

Output:
Polarity: 0.75, Subjectivity: 0.8

You can integrate sentiment analysis into spaCy's powerful natural language processing pipelines using extensions like spacytextblob. This approach combines spaCy's advanced text processing with TextBlob's simple sentiment scoring. After adding the spacytextblob pipe, you process your text and access the sentiment scores through the custom doc._.blob attribute.

  • The polarity of 0.75 signals a strong positive sentiment.
  • The subjectivity of 0.8 confirms the text is highly opinionated.

Advanced sentiment analysis approaches

While pre-built tools are great for quick checks, you'll need more powerful methods like transformers and fine-tuning for nuanced, domain-specific sentiment analysis.

Using transformers with the pipeline API

from transformers import pipeline

sentiment_analyzer = pipeline("sentiment-analysis")
result = sentiment_analyzer("The plot was predictable, but the acting was superb.")
print(result)

Output:
[{'label': 'POSITIVE', 'score': 0.9743}]

The transformers library gives you access to powerful, pre-trained models. Its pipeline API simplifies sentiment analysis by abstracting away complex steps like tokenization and model inference. You just call pipeline("sentiment-analysis") to load a model that's already fine-tuned for this task. This demonstrates why AI coding with Python is so effective for machine learning tasks.

The result is a dictionary containing:

  • label: The predicted sentiment, such as 'POSITIVE'.
  • score: A confidence score showing how certain the model is. A score of 0.9743 indicates very high confidence in the prediction.

When processing multiple texts, you'll often create a list of dictionaries to store the results of batch sentiment analysis.
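The pipeline also accepts a list of strings and returns one result dictionary per input. The sketch below mimics that output shape with a toy keyword scorer so it runs without any model downloads; the `analyze` helper and its keyword list are purely illustrative stand-ins for a real model call:

```python
def analyze(text):
    """Toy stand-in for a model call; returns a dict shaped like pipeline output."""
    score = sum(w in text.lower() for w in ('great', 'superb', 'love'))
    label = 'POSITIVE' if score > 0 else 'NEGATIVE'
    return {'text': text, 'label': label, 'score': score}

texts = ["The acting was superb.", "The plot dragged badly."]
results = [analyze(t) for t in texts]  # list of dictionaries, one per input
for r in results:
    print(f"{r['label']}: {r['text']}")
```

Keeping the original text alongside each label in the dictionary makes it easy to filter, sort, or export the batch results later.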

Fine-tuning a pre-trained model for domain-specific analysis

from transformers import AutoModelForSequenceClassification, AutoTokenizer
from datasets import load_dataset

model_name = "distilbert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_name)
dataset = load_dataset("imdb", split="train[:100]")
print(f"Loaded model and {len(dataset)} samples for fine-tuning")

Output:
Loaded model and 100 samples for fine-tuning

Fine-tuning adapts a general-purpose model to your specific needs, improving its accuracy on niche topics. This code sets the stage for that process.

  • It loads a pre-trained model, distilbert-base-uncased, using AutoModelForSequenceClassification. The num_labels=2 argument configures it for binary classification (like positive/negative).
  • It also loads the corresponding tokenizer and a sample of the imdb dataset.

By training a model in Python on this movie review data, you make its sentiment predictions more accurate for that specific domain. This kind of rapid iteration and experimentation is perfect for vibe coding workflows.

Creating an ensemble model for improved accuracy

from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

vectorizer = CountVectorizer()
ensemble = VotingClassifier(estimators=[
    ('lr', LogisticRegression()),
    ('nb', MultinomialNB())
])
print("Ensemble model created for robust sentiment predictions")

Output:
Ensemble model created for robust sentiment predictions

An ensemble model combines the strengths of several different models to produce more reliable predictions. This approach often leads to better accuracy than using a single model alone. The VotingClassifier acts like a committee, taking votes from each individual model to make a final, collective decision on the sentiment.

  • The code creates an ensemble using two distinct models: a LogisticRegression classifier and a MultinomialNB (Naive Bayes) classifier.
  • Before the models can analyze text, CountVectorizer is used to convert the words into numerical data they can understand.
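To actually use the ensemble, you fit it on labeled examples and then predict. A minimal sketch with a toy dataset follows; the four example texts and their labels are illustrative, and a real model would need far more data:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Illustrative toy data: 1 = positive, 0 = negative
texts = ["great product love it", "terrible awful experience",
         "excellent value very happy", "bad quality hate it"]
labels = [1, 0, 1, 0]

# Chain the vectorizer and the voting ensemble into one pipeline
model = make_pipeline(
    CountVectorizer(),
    VotingClassifier(estimators=[('lr', LogisticRegression()),
                                 ('nb', MultinomialNB())])
)
model.fit(texts, labels)
print(model.predict(["love this excellent product"]))
```

Wrapping the vectorizer and classifier in `make_pipeline` keeps the text-to-numbers step attached to the model, so you can pass raw strings directly to `fit` and `predict`.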

Move faster with Replit

Replit is an AI-powered development platform that comes with all Python dependencies pre-installed, so you can skip setup and start coding instantly.

Instead of piecing together the sentiment analysis techniques you've just seen, you can use Agent 4 to build a complete application. It takes your description and turns it into a working product. For example, you could describe:

  • A dashboard that tracks brand mentions on social media and analyzes their sentiment in real time.
  • A tool that ingests customer reviews and automatically categorizes them as positive, negative, or neutral.
  • An app that scrapes product reviews from an e-commerce site to generate a report on popular features or common complaints.

Simply describe your app, and Replit will write the code, test it, and fix issues automatically, all within your browser.

Common errors and challenges

Even with powerful tools, you might run into a few common roadblocks when performing sentiment analysis in Python.

Forgetting to install required TextBlob dependencies

When you first use TextBlob, you might encounter an error if you haven't downloaded its required data corpora. The library relies on these datasets for tasks like noun phrase extraction, and it can't function properly without them. You can fix this by running a one-time command in your terminal—python -m textblob.download_corpora—to get everything you need.

Addressing negation handling in sentiment analysis

Negation can easily trip up simpler sentiment analyzers. A sentence like "This movie was not good" contains the word "good," but the sentiment is clearly negative. Basic rule-based models often miss this nuance, leading to inaccurate scores. Tools like VADER are better equipped to recognize these context-flipping words and adjust the sentiment accordingly.

Preprocessing text properly for accurate sentiment analysis

The quality of your sentiment analysis depends heavily on the quality of your input text. Raw text is often messy—filled with punctuation, inconsistent capitalization, and irrelevant words that can confuse your model. Proper preprocessing, such as converting all text to lowercase and removing special characters, ensures your analysis is based on the content itself, not the noise surrounding it.

Forgetting to install required TextBlob dependencies

Running TextBlob for the first time can sometimes throw an error if its necessary data corpora aren't installed. The library depends on these for certain NLP tasks, and it won't work without them. The following code demonstrates this common pitfall.

from textblob import TextBlob

text = "The movie was fantastic!"
analysis = TextBlob(text)
print(f"Polarity: {analysis.sentiment.polarity}")

The call to analysis.sentiment will fail if the underlying data models are missing. You can fix this by running the download commands shown in the next example.

import nltk
from textblob import TextBlob

# Download required NLTK data
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

text = "The movie was fantastic!"
analysis = TextBlob(text)
print(f"Polarity: {analysis.sentiment.polarity}")

The solution is to download the specific NLTK data packages that TextBlob depends on. Before you can analyze sentiment, run nltk.download('punkt') for tokenization and nltk.download('averaged_perceptron_tagger') for part-of-speech tagging. This is usually a one-time setup in any new development environment; once the dependencies are in place, your code will execute without errors and TextBlob will function correctly. For robust applications, handling multiple exceptions in Python matters when dealing with missing dependencies, and for larger projects, managing Python dependencies becomes crucial.

Addressing negation handling in sentiment analysis

Negation can easily mislead simple sentiment analyzers that score words individually. A word like 'not' completely flips a sentence's meaning, but a basic model might overlook it and focus only on positive or negative keywords, leading to an incorrect result. The following code demonstrates this pitfall.

from textblob import TextBlob

text = "I am not happy with this product at all."
words = text.split()
positive_words = sum(1 for word in words if TextBlob(word).sentiment.polarity > 0)
negative_words = sum(1 for word in words if TextBlob(word).sentiment.polarity < 0)
print(f"Positive words: {positive_words}, Negative words: {negative_words}")

The code tallies sentiment word by word, identifying 'happy' as positive. It fails to account for the negating word 'not', misinterpreting the overall negative tone. The next example shows a more effective approach.

from textblob import TextBlob

text = "I am not happy with this product at all."
# Analyze the full sentence to capture context and negations
analysis = TextBlob(text)
print(f"Full text polarity: {analysis.sentiment.polarity}")

The solution is to analyze the entire sentence at once, which allows the model to understand context. When you pass the full string to TextBlob, it correctly identifies that 'not' reverses the sentiment of 'happy', leading to an accurate negative score.

  • Always analyze complete sentences to avoid misinterpreting reviews that contain negations or other linguistic nuances.
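To see how negation handling works under the hood, here is a toy sketch in plain Python that flips a word's score when the preceding token is a negator. The lexicon and negator set are illustrative and far simpler than what VADER actually does:

```python
NEGATORS = {'not', 'never', 'no'}
LEXICON = {'good': 1, 'happy': 1, 'great': 1, 'bad': -1, 'terrible': -1}

def negation_aware_score(text):
    """Sum word scores, flipping a word's sign when the previous token negates it."""
    words = text.lower().split()
    score = 0
    for i, w in enumerate(words):
        value = LEXICON.get(w, 0)
        if i > 0 and words[i - 1] in NEGATORS:
            value = -value  # "not good" counts as negative
        score += value
    return score

print(negation_aware_score("the movie was not good"))  # -> -1
print(negation_aware_score("the movie was good"))      # -> 1
```

Even this one-token lookback fixes the "not good" case, though real negation can span several words ("not at all good"), which is why window-based or model-based approaches scale better.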

Preprocessing text properly for accurate sentiment analysis

Raw text is often full of noise. Elements like excessive punctuation, emojis, and inconsistent capitalization can distort sentiment scores. A simple analyzer might misinterpret the text's true emotional weight if it's not cleaned up first. The following code demonstrates this common pitfall.

from textblob import TextBlob

text = "This product is AMAZING!!! I love it :) <3"
analysis = TextBlob(text)
print(f"Polarity: {analysis.sentiment.polarity}")

The code's analysis is unreliable because it doesn't account for noise like !!! and :). Feeding this raw text directly into TextBlob can distort the sentiment score. The next example demonstrates a better approach.

import re
from textblob import TextBlob

text = "This product is AMAZING!!! I love it :) <3"
# Remove special characters and normalize
clean_text = re.sub(r'[^\w\s]', '', text.lower())
analysis = TextBlob(clean_text)
print(f"Polarity: {analysis.sentiment.polarity}")

The solution is to clean the text before analysis. By using re.sub(r'[^\w\s]', '', text.lower()), you first convert the text to lowercase and then strip away all non-alphanumeric characters. This process, known as normalization, ensures the sentiment score is based purely on the words themselves, not on distracting elements like exclamation points or emojis. Additional preprocessing steps like removing stop words in Python can further improve accuracy. This step is crucial when working with user-generated content, which is often unstructured and messy.
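It can help to wrap these steps in a reusable helper. A minimal sketch follows; note that the cleanup rules are illustrative and fairly aggressive, since they also strip emoticons that may themselves carry sentiment:

```python
import re

def preprocess(text):
    """Lowercase, strip non-word characters, and collapse repeated whitespace."""
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)       # drop punctuation and emoticons
    return re.sub(r'\s+', ' ', text).strip()  # normalize whitespace

print(preprocess("Great value!!!   Totally worth it :)"))
# -> great value totally worth it
```

Centralizing preprocessing in one function keeps training and inference consistent, so the model never sees text cleaned differently from the data it learned on.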

Real-world applications

Understanding the technical challenges prepares you to apply sentiment analysis to solve practical business problems.

Analyzing customer reviews with TextBlob

You can use TextBlob to quickly process a list of customer reviews, categorizing each one and calculating an overall sentiment score to gauge feedback at a glance.

from textblob import TextBlob

reviews = [
    "This product is amazing! I love it.",
    "Decent quality, but a bit expensive.",
    "Terrible experience, would not recommend.",
    "Works as expected, good value.",
    "Disappointed with the durability."
]

polarities = [TextBlob(review).sentiment.polarity for review in reviews]
for i, (review, polarity) in enumerate(zip(reviews, polarities)):
    sentiment = "Positive" if polarity > 0.1 else "Negative" if polarity < -0.1 else "Neutral"
    print(f"Review {i+1}: {sentiment} ({polarity:.2f}) - {review}")
print(f"\nAverage sentiment polarity: {sum(polarities)/len(polarities):.2f}")

This code processes a list of reviews by first using a list comprehension to efficiently calculate the sentiment.polarity score for each one. It then iterates through the reviews and their corresponding scores to assign a clear label.

  • A conditional expression classifies each review as "Positive," "Negative," or "Neutral." It uses thresholds of 0.1 and -0.1 to create a neutral buffer, preventing slightly skewed text from being mislabeled.
  • Finally, it computes the average polarity, giving you a single metric to understand the overall sentiment from the entire batch.

Analyzing sentiment trends in product reviews over time

By analyzing reviews from different time periods, you can track whether overall customer sentiment is trending up or down.

from textblob import TextBlob
import pandas as pd

reviews_data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Reviews': [
        "This product is terrible, don't buy it",
        "Some improvements but still not great",
        "Getting better, some issues remain",
        "Good product, highly recommended",
        "Excellent product, exceeded expectations!"
    ]
}

df = pd.DataFrame(reviews_data)
df['Sentiment'] = df['Reviews'].apply(lambda x: TextBlob(x).sentiment.polarity)
print(df[['Month', 'Sentiment']])
print(f"\nSentiment trend: {'+' if df['Sentiment'].is_monotonic_increasing else '-'}")

This script combines pandas and TextBlob to track how sentiment changes over time. It first organizes the monthly review data into a DataFrame, the same table-like structure you work with when reading CSV files in Python.

  • The apply() method processes each review, using a lambda function to calculate its polarity score with TextBlob and saving it to a new 'Sentiment' column.
  • Finally, it checks if these scores are consistently rising with is_monotonic_increasing, giving you a quick summary of whether customer opinion is improving.

Get started with Replit

Turn your knowledge into a functional tool. Give Replit Agent a prompt like "Build a dashboard that analyzes customer reviews from a CSV" or "Create a script that outputs a sentiment score for a text file."

It will write the code, test for errors, and deploy your application directly from your browser. Start building with Replit.

Build your first app today

Describe what you want to build, and Replit Agent writes the code, handles the infrastructure, and ships it live. Go from idea to real product, all in your browser.
