Afghanistan and its Election on Twitter: The Macro Picture

Preview of an Upcoming WEP Report

By Erhardt Graeff
with Seth Woodworth

Data Summary

  • 111,741 tweets about Afghanistan and its presidential election posted between August 11, 2009 and September 9, 2009
  • 11,255 tweets on August 20, 2009, the day of the election
  • 29,642 users talked about Afghanistan in our dataset
  • Top 10% of tweeters contributed 65% of tweets (same as Iran Election)
  • Number of retweets for a user was not correlated to their tweeting volume (same as Iran Election)
  • 483 hashtags were used at least 3 times
  • No single, dominant hashtag (differs from Iran Election)
  • 3 most used hashtags: #Afghan09, #Afghanistan, and #AfghanElection

Introduction

Afghan citizens went to the polls on August 20, 2009 after a controversial delay recommended by Afghanistan’s Independent Election Commission to allow ample time to prepare for fair and safe elections. Incumbent President Hamid Karzai was favored to win amid a large pool of candidates, with the most serious challenge coming from former Foreign Minister Abdullah Abdullah. In pre-election polling, Abdullah gained significant momentum as election day drew nearer and other candidates dropped out of the race.

In a clear reference to the protests following the June presidential election in Iran, Abdullah’s campaign manager was quoted predicting street violence if Abdullah did not win. Here at the Web Ecology Project, we wondered whether Twitter would play as significant a role in reporting the election as it did in Iran. Mobile phone subscriptions in Afghanistan add up to an estimated 50% of the population, but internet access was roughly 1.5% at last estimate, and the status of network expansion is unclear. Could the available ICT infrastructure, together with the awareness of social media prompted by the “Twitter revolution” in Iran, enable a similar phenomenon after August 20?

Tweets

We pulled tweets from Twitter containing 42 “English” search terms:

Zabihullah
“NATO Headquarters”
karzai
kandahar
jalalabad
kabul
herat
kunduz
khost
abdullah
abdulah
abdulla
abdula
taliban
taleban
“mullah omar”
mujahideen
mujahid
ghani
bashardost
mazari
khumri
ghazni
eikenberry
afghanelection
sabari
paktya
Haqqani
Dostum
Pajhwok
hazari
afghan09
parwan
paktika
lashkar
“puli khumri”
khowst
charikar
karzay
aliveinafghanistan
aliveinafghan
afpfail
garmser
garmsir

Using these search terms, we have archived 111,741 tweets posted between 11:00pm EDT on August 11 and 11:00pm EDT on September 9, including a complete set of 11,255 tweets from the day of the election. Our dataset is currently noisy: the common name “Abdullah” in its different forms matches unrelated tweets, as do references to the Taliban in Pakistan. We estimate that about 10,000 tweets are affected by such irregularities, and we are improving our filter to remove them.
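As a rough illustration of how archived tweets could be checked against a term list like the one above, the sketch below matches a small subset of the terms as whole words or phrases, ignoring case. It is only a local filter under stated assumptions: the function names are hypothetical, and it does not reproduce how Twitter's own search matches terms or how our collection tools work.

```python
import re

# A small illustrative subset of the search terms listed above.
SEARCH_TERMS = ["karzai", "kabul", "taliban", "mullah omar", "afghan09"]

def build_term_pattern(terms):
    """Compile one case-insensitive regex matching any term as a whole word or phrase."""
    alternatives = []
    for term in sorted(terms, key=len, reverse=True):
        # Allow any run of whitespace between the words of a quoted phrase.
        words = [re.escape(word) for word in term.split()]
        alternatives.append(r"\s+".join(words))
    # (?<!\w) and (?!\w) keep "taliban" from matching inside a longer word.
    return re.compile(r"(?<!\w)(?:" + "|".join(alternatives) + r")(?!\w)", re.IGNORECASE)

def matched_terms(text, pattern):
    """Return the set of search terms (lower-cased, whitespace-normalized) found in a tweet."""
    return {" ".join(m.group(0).lower().split()) for m in pattern.finditer(text)}
```

Recording which terms matched each tweet makes it easier to inspect ambiguous terms, such as a common personal name, separately from the rest of the corpus.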

(We also compiled a list of Dari and Pashto search terms that correspond to our English search terms. These have yielded only 3,389 tweets over the same time period, 33 of which were also picked up by the English equivalents. More work still needs to be done to prepare this corpus for analysis.)

Using the tools we first developed for our report The Iran Election on Twitter, we have generated a timeline of the volume of tweets pulled using the English search terms over the four weeks of data collection; data points are at one-hour increments.

Afghan Tweet Volume over Time
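A timeline with one-hour data points can be produced by bucketing tweet timestamps by hour. The following is a minimal sketch, not the tooling used for the chart; the function name is hypothetical, and it assumes each timestamp is a timezone-aware `datetime`.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Fixed offset for US Eastern Daylight Time, the zone used in this post.
EDT = timezone(timedelta(hours=-4), "EDT")

def hourly_volume(timestamps):
    """Count tweets per one-hour bucket, keyed by the start of each hour in EDT.

    Assumes every timestamp is timezone-aware; naive datetimes would be
    interpreted in the machine's local zone by astimezone().
    """
    counts = Counter()
    for ts in timestamps:
        local = ts.astimezone(EDT)
        counts[local.replace(minute=0, second=0, microsecond=0)] += 1
    return dict(sorted(counts.items()))
```

Each key is the start of an hour, so the result can be plotted directly as a time series.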

Adjusted for the time difference, August 20, election day, saw the most activity on Twitter: over 10% of all tweets across our 28-day dataset. The spike on that day begins between 10:00pm and 11:00pm EDT (UTC-4:00) on August 19, which coincides with the opening of polls in Afghanistan at 7:00am AFT (UTC+4:30) on August 20. The second largest spike on the graph occurs on August 15, the date the Taliban bombed the NATO Headquarters in Kabul.
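The time-zone adjustment can be checked with a short worked example. This sketch uses fixed UTC offsets (UTC-4:00 for EDT, UTC+4:30 for AFT) rather than a time-zone database, which is an assumption that holds for the dates in question.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")             # UTC-4:00
AFT = timezone(timedelta(hours=4, minutes=30), "AFT")  # UTC+4:30

# Polls opened at 7:00am Afghanistan time on election day.
polls_open_aft = datetime(2009, 8, 20, 7, 0, tzinfo=AFT)

# The same instant expressed in EDT, the zone used for the timeline.
polls_open_edt = polls_open_aft.astimezone(EDT)
print(polls_open_edt.isoformat())  # 2009-08-19T22:30:00-04:00
```

The 8.5-hour difference places the opening of polls at 10:30pm EDT on August 19, inside the hour in which the spike begins.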

Users

Our dataset involves 29,642 users, for an average of about 3.77 tweets per user. Using a Lorenz curve [plotted below], we found a pattern of unequal distribution of Twitter activity across the user population that was similar to our findings for the Iran Election.

Lorenz Curve of Afghan Tweets
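A curve of this kind can be computed from a list of per-user tweet counts. The sketch below is a minimal illustration with hypothetical function names, not the code behind the figure: it builds the Lorenz curve points and also reports the share of tweets contributed by the most active fraction of users.

```python
def lorenz_points(tweet_counts):
    """Return (cumulative user share, cumulative tweet share) pairs, least active users first."""
    ordered = sorted(tweet_counts)
    total = sum(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(ordered, start=1):
        running += count
        points.append((i / len(ordered), running / total))
    return points

def top_share(tweet_counts, fraction=0.10):
    """Share of all tweets contributed by the most active `fraction` of users."""
    ordered = sorted(tweet_counts, reverse=True)
    k = max(1, round(len(ordered) * fraction))  # number of users in the top group
    return sum(ordered[:k]) / sum(ordered)
```

For example, with nine users who tweeted once and one who tweeted eleven times, `top_share` returns 0.55: the single most active user accounts for 11 of 20 tweets. The further the Lorenz points fall below the diagonal, the more unequal the distribution.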

The steepness of the above curve illustrates that the top 10% of users contributed 65.3% of all tweets, which is almost identical to the distribution found for the Iran Election tweets (the top 10% of users contributed 65.5% of all tweets). Another finding similar to our Iran study is the disconnect between a user’s number of tweets and the number of times they were retweeted. (Our list of retweets accounts only for tweets that contained either upper or lower case forms of “RT”, with any form of punctuation, then followed by any number of spaces and @user.)
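The retweet rule described in the parenthetical can be approximated with a regular expression. The pattern below is one reading of that description, not the exact expression used for our counts: it assumes the punctuation after “RT” is optional, that zero spaces are allowed, and that “RT” must not be the tail of a longer word. The function names are hypothetical.

```python
import re
from collections import Counter

# "RT" in any case, optional punctuation, optional whitespace, then @user.
RETWEET = re.compile(r"(?<!\w)rt[^\w\s@]*\s*@(\w+)", re.IGNORECASE)

def retweeted_users(text):
    """Return the lower-cased usernames that a tweet appears to retweet."""
    return [name.lower() for name in RETWEET.findall(text)]

def retweet_counts(tweets):
    """Tally how many times each user is retweeted across an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(retweeted_users(text))
    return counts
```

Comparing these tallies with each user's own tweet count is what exposes the disconnect between volume and retweets.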

The top tweeters on Afghanistan are more heterogeneous in their affiliations than the top retweeted users. A number of high-profile news organizations, individual journalists, and official and semi-official military channels make up the list of top retweeted users. Notable accounts are those of the Pajhwok Afghan News (@pajhwok) and the Alive in Afghanistan project (@aliveinafghan), as well as the latter’s founder Brian Conley (@BaghdadBrian) of Small World News (@smallworldnews). These accounts are the strongest “local” voices offering Afghan perspectives on events. In the same way that individuals with close affiliations in Iran were both prolific and influential sources of information, these represent similar sources for Afghanistan.

Hashtags

One significant difference between Iran and Afghanistan is the lack of a common hashtag like #iranelection. Although multiple top hashtags were often used in tandem, only recently has #Afghan09 shown more dominance. The pie chart below shows the usage of the most common hashtags as a percentage of all tweets containing a hashtag.

Afghan Hashtags
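Shares of this kind can be computed by counting, for each hashtag, the tweets that contain it and dividing by the number of tweets that contain any hashtag. The sketch below is illustrative only: the function name is hypothetical, hashtags are compared case-insensitively, and a hashtag repeated within one tweet is counted once, which may differ from how the chart was produced.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def hashtag_shares(tweets, min_uses=3):
    """Map each hashtag used in at least `min_uses` tweets to its share of hashtagged tweets."""
    per_tag = Counter()
    tagged_tweets = 0
    for text in tweets:
        # Distinct hashtags in this tweet, normalized to lower case.
        tags = {tag.lower() for tag in HASHTAG.findall(text)}
        if tags:
            tagged_tweets += 1
            per_tag.update(tags)
    return {tag: n / tagged_tweets for tag, n in per_tag.most_common() if n >= min_uses}
```

Because one tweet can carry several hashtags, the shares can sum to more than 100%, which matters when hashtags are often used in tandem.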

Out of the 483 different hashtags used at least three times, #Afghan09 was most prominent, first adopted by @aliveinafghan, @smallworldnews, and @BaghdadBrian. #Afghanistan (a non-unique hashtag) had the next highest frequency, followed by #AfghanElection, originally adopted by @pajhwok. Additionally, the extremely generic #news and the support-our-troops hashtag #militarymon were used often. Important to note is that the #Afghan09

Next Steps for this Research

Although no protests have ensued, Twitter activity has been kept alive by regular reports of official vote counts and allegations of voter fraud, sparking a more consistent conversation about the war. The statistics in this post are based on all aggregated tweets captured using English search terms over four weeks’ worth of activity. The next step in this project is to analyze the evolution of the conversation and its key players over time. We have broken down the usage of hashtags, the volume of tweets per user, and retweets, per day. We will be studying how the ranks of the top users in terms of output and “influence” have changed since election day. Beyond quantitative analysis, this effort will require a qualitative classification of users to better understand the nature of the user population (Daniel Bennett’s “Who to follow: Twitter for the Afghanistan election” offers a good starting point). We hope to analyze the Dari and Pashto tweets in the same manner.

The need for a more detailed study of the conversations on Twitter is exemplified by Stephen Colbert (@StephenAtHome) and his single tweet on the Afghan election, which received enough retweets to elevate him to the 38th most retweeted user in our dataset:

“despite rumors of voter fraud in afghanistan, it looks like it went smoothly for new afghan president-elect mahmoud ahmadinejad.”

Expect a full report on this soon, as well as a much-anticipated follow-up to our Iran Election report after that! Please send us your feedback on the progress of this research and ideas for other ways to analyze and interpret these data.

We would like to thank Jon Beilin, Sam Gilbert, and Javed Rezayee for their continuing contributions, feedback, and support.