How to do sentiment analysis via Twitter and other media for political elections using Pattern

27 Feb

According to its website: Pattern is a web mining module for the Python programming language. It bundles tools for data retrieval (Google + Twitter + Wikipedia API, web spider, HTML DOM parser), text analysis (rule-based shallow parser, WordNet interface, syntactical + semantical n-gram search algorithm, tf-idf + cosine similarity + LSA metrics) and data visualization (graph networks).

The module is bundled with 30+ example scripts.

Pattern web mining system

Probably the most interesting use of the system is “Belgian elections, June 13, 2010 – Twitter opinion mining”:

In the week before the Belgian 2010 elections, we analyzed approximately 7,600 tweets that mentioned the name of a Belgian politician. What makes this experiment interesting is the fact that Belgium is divided in a Dutch-speaking half (Flanders, 60% of the population) and a French-speaking half (Wallonia, 40% of the population). Flemings can only vote for Flemish politicians, Walloons can only vote for Walloon politicians.

However, the most striking part of the project is that the researchers did nothing specific to Dutch or French. They simply used Google Translate to translate the Dutch and French tweets into English and then applied existing sentiment analysis tools for English:

The sentiment_score() function in the example uses SentiWordNet to rate words. Take the following tweet — chosen for its visible (positive) sentiment: “Danny Pieters, sterke speech voor een gedurfde en degelijke sociale bescherming.” We translate it into English using Google Translate and then weigh the individual words: …
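As a rough illustration of the translate-then-score idea quoted above, here is a minimal Python sketch. It is not the code from the Pattern example: the lexicon values are invented stand-ins for SentiWordNet scores, `translate_to_english` is a hypothetical placeholder for the Google Translate step, and `toy_sentiment_score`, `score_by_name`, and the names used below are made up for this illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon: word -> (positive, negative) weights.
# The real example rates words with SentiWordNet; these numbers are invented.
LEXICON = {
    "strong": (0.75, 0.25),
    "solid": (0.5, 0.0),
    "weak": (0.0, 0.5),
    "failure": (0.0, 0.75),
}


def translate_to_english(text):
    """Hypothetical placeholder for the machine-translation step.

    This sketch assumes the input is already English and returns it unchanged.
    """
    return text


def toy_sentiment_score(text):
    """Sum (positive - negative) over every word found in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON[w][0] - LEXICON[w][1] for w in words if w in LEXICON)


def score_by_name(tweets, names):
    """Add each tweet's score to every name the tweet mentions."""
    totals = defaultdict(float)
    for tweet in tweets:
        score = toy_sentiment_score(translate_to_english(tweet))
        for name in names:
            if name.lower() in tweet.lower():
                totals[name] += score
    return dict(totals)
```

With these made-up weights, `score_by_name(["Alice gave a strong speech", "Bob: a weak failure"], ["Alice", "Bob"])` returns `{"Alice": 0.5, "Bob": -1.25}`. A real system would need a full lexicon and an actual translation call, and summing raw word weights ignores negation and context.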

One wonders whether that would work for other languages, such as Turkish, too. Apparently, their Google Translate-based system predicted the election outcome correctly.

Posted on February 27, 2011 in Linguistics, Programlama, Science


One response to “How to do sentiment analysis via Twitter and other media for political elections using Pattern”

  1. cyhex

    July 13, 2011 at 12:42

    Take a look at this Twitter sentiment analysis tool. It's written in Python and uses a Naive Bayes classifier with semi-supervised machine learning.

