Deloitte Digital Senior Community Manager Laura Anderson shares how she used Python scripts to monitor online community discussions and research topics, in this case the Internet of Things. How can this programming language help discover who’s joining a conversation, who’s leading it, and what its trends are?

Disclaimer: Before I begin, I want you to know that I am not a coder. Quite the opposite. As a senior community manager, part of my job entails learning more about online communities to understand audiences’ passions, behaviors, and essentially what makes them tick. I recently learned a new trick to help do this through Python, so I dove into a community I know nothing about – the Internet of Things.

With the concept of the Internet of Things (IoT) heating up online, brands and individuals alike are jumping on the bandwagon. But it’s likely that few people, myself included, actually understand what IoT means and the implications it can have for modern society. To deepen my understanding of this latest buzzword (is “buzzphrase” a thing yet?), I used Python scripts to stream and pull tweets using the hashtags #InternetofThings and #IoT via Twitter’s API.

Sounds fancy, right? It was actually so simple that even a Python newbie like me could do it. Here’s how.

First, I authenticated my Twitter profile to gain access to Twitter’s API. I did this by creating an app that spit out a consumer key (API key) and consumer secret (API secret), along with an access token and access token secret, all of which are basically just long strings of letters and numbers. These keys are crucial because they serve as the green light to grab data from Twitter.
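The scripts later in this post import those four values from a local module called twitter_authentication, which isn't shown. Here is a minimal sketch of how such keys might be kept out of the script itself, assuming they live in environment variables. The TWITTER_ prefix and the load_credentials helper are hypothetical names chosen for illustration, not anything Twitter or tweepy requires.

```python
import os

# The four values the scripts in this post import from twitter_authentication
REQUIRED = ("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any is missing.

    Assumes each value is stored under a hypothetical TWITTER_<NAME> variable.
    """
    missing = [name for name in REQUIRED if not env.get("TWITTER_" + name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env["TWITTER_" + name] for name in REQUIRED}
```

Keeping the keys in the environment rather than in the script means the code can be shared without handing out access to the account.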

After completing this authentication process, I used two Python scripts to collect tweets related to the Internet of Things. The first script streamed tweets from the Twitter API for approximately 12 hours, and the second pulled a list of 1,000 tweets from Twitter’s API. Both scripts collected tweets using the hashtags #InternetofThings and #IoT, and recorded each tweet’s author, timestamp, text, and retweet count.

I then modified the Python code to export my datasets to CSV files. The main things I wanted to do were: find out whether an individual or brand authored the tweet, determine current IoT trends, conduct a sentiment analysis, and measure a user’s activity and influence. Please note: “influence” has about a million different definitions and opinions. For my purposes, I measured a user’s influence by the frequency and volume of retweets that person or brand received.
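To make that definition of influence concrete, here is a rough sketch of how retweets could be tallied per author from a CSV with the columns author, timestamp, text, retweet count. The influence_by_author function and the sample rows are made up for illustration; this is one possible way to do the tally, not necessarily how the numbers in this post were produced.

```python
import csv
import io
from collections import defaultdict


def influence_by_author(csv_file):
    """Count tweets and sum retweet counts per author.

    Expects rows of: author, timestamp, text, retweet_count.
    """
    totals = defaultdict(lambda: {"tweets": 0, "retweets": 0})
    for row in csv.reader(csv_file):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        author, _timestamp, _text, retweet_count = row
        totals[author]["tweets"] += 1
        totals[author]["retweets"] += int(retweet_count)
    # Most-retweeted authors first
    return sorted(totals.items(), key=lambda item: item[1]["retweets"], reverse=True)


# Made-up sample rows, for illustration only
sample = io.StringIO(
    "brand_a,2015-06-01 10:00:00,Our new smart thermostat #IoT,12\n"
    "person_b,2015-06-01 10:05:00,Reading about #IoT today,1\n"
    "brand_a,2015-06-01 11:00:00,Wearables update #InternetofThings,8\n"
)
ranking = influence_by_author(sample)
print(ranking[0])
```

With a real export, the io.StringIO sample would be replaced by an opened CSV file.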

Here’s a look at the Python code I used to stream tweets:

# Written against tweepy 3.x; StreamListener was removed in tweepy 4.
import encoding_fix
import tweepy
from twitter_authentication import CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET

# Authenticate with the keys from the Twitter app
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

output_file = open("InternetofThings.csv", "w")

class StreamListener(tweepy.StreamListener):
    def on_status(self, tweet):
        # Flatten newlines and strip commas so each tweet stays on one CSV row
        modified_tweet = tweet.text
        modified_tweet = modified_tweet.replace("\n", " ")
        modified_tweet = modified_tweet.replace(",", "")
        output_file.write(','.join([tweet.author.screen_name, str(tweet.created_at), modified_tweet, str(tweet.retweet_count)]) + '\n')
        print(modified_tweet)

    def on_error(self, status_code):
        print('Error: ' + repr(status_code))
        return False  # stop streaming on error

listener = StreamListener()
streamer = tweepy.Stream(auth=auth, listener=listener)
keywords = ['#InternetofThings', '#IoT']
# Blocks here, writing matching tweets, until the stream is stopped
streamer.filter(track=keywords)
output_file.close()
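The streaming script has no built-in stopping point, so running it for about 12 hours means ending it by hand. One way to automate that is a small timer the listener can check on every tweet. This is a sketch under an assumption: as far as I know, in tweepy 3.x returning False from on_status disconnects the stream, but check that against the tweepy version in use. The TimeLimit class and keep_streaming function are hypothetical helpers, not part of tweepy.

```python
import time


class TimeLimit:
    """Tracks whether a fixed number of seconds has passed since creation."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + seconds

    def expired(self):
        return self._clock() >= self._deadline


limit = TimeLimit(12 * 60 * 60)  # roughly 12 hours


def keep_streaming():
    """Value an on_status method could return: False once the limit has passed."""
    return not limit.expired()
```

In the listener, on_status would end with return keep_streaming() after writing the row.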

And here’s the Python code I used to pull 1,000 tweets:

# Written against tweepy 3.x; api.search was renamed search_tweets in tweepy 4.
import encoding_fix
import tweepy
from twitter_authentication import CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET
import time

# Authenticate with the keys from the Twitter app
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)

output_file = open("InternetofThings_pull.csv", "w")
counter = 0
# Note: a space between search terms means a tweet must contain both
for page in tweepy.Cursor(api.search, '#InternetofThings #IoT', count=100).pages():
    counter = counter + len(page)
    for tweet in page:
        # Flatten newlines and strip commas so each tweet stays on one CSV row
        modified_tweet = tweet.text
        modified_tweet = modified_tweet.replace("\n", " ")
        modified_tweet = modified_tweet.replace(",", "")
        output_file.write(','.join([tweet.author.screen_name, str(tweet.created_at), modified_tweet, str(tweet.retweet_count)]) + '\n')
    # end this loop once we've gotten 1000 (>= in case a page comes back short)
    if counter >= 1000:
        break
    # This page suggests we can do one request every 5 seconds:
    # https://dev.twitter.com/rest/reference/get/search/tweets
    time.sleep(5)
output_file.close()
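Both scripts remove commas from the tweet text so that the hand-built rows stay aligned. An alternative, sketched here on the assumption that keeping the original punctuation matters for later analysis, is to let Python's standard csv module quote the fields instead. The tweet_to_row helper and the sample values are hypothetical.

```python
import csv
import io


def tweet_to_row(screen_name, created_at, text, retweet_count):
    """Build one CSV row; only newlines are flattened, commas are kept."""
    return [screen_name, str(created_at), text.replace("\n", " "), str(retweet_count)]


# io.StringIO stands in for open("InternetofThings.csv", "w", newline="")
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(
    tweet_to_row("example_user", "2015-06-01 10:00:00",
                 "Smart homes, smart cities\nand more #IoT", 3)
)

# Reading the row back shows the comma survived inside a single field
buffer.seek(0)
row = next(csv.reader(buffer))
print(row[2])
```

The csv module wraps any field containing a comma in quotes, so a spreadsheet or csv.reader still sees four columns per tweet.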

From the data those scripts collected, I learned the following about IoT:

  • A succinct definition: The “Internet of Things” is the term used to describe the connectivity between technology and physical objects.
  • Who’s leading the conversation: Based on the tweet data, brands are dominating the online conversation over individuals, and it’s not hard to see why. Think wearable fitness trackers, smart watches, on-demand buttons to reorder household products, etc.
  • Who’s talking: Of 1,000 total tweets, brands accounted for 579 tweets made up of 239 original and 340 retweets. Individuals accounted for 388 tweets made up of 80 original and 308 retweets. Twitterbots accounted for 33 tweets made up of two original and 31 retweets.
  • Conversation volume and activity: These tweet numbers confirmed that the online IoT conversation is very active within Twitter. But based on the tweet data, many users are merely endorsing a thought or opinion through retweets, rather than writing an original tweet.
  • IoT trends: A few key trends emerged based on the tweet data. The most frequently tweeted words included “wearables,” “smarthome,” “mobile,” “smarter cities,” and “big data.”
  • Sentiment: To determine how the people in the tweet data are reacting to an increasingly digital lifestyle, I conducted a sentiment analysis based on my own interpretation of each tweet. (You get really good at distinguishing genuine positivity from snark in my job.) I found 62 percent of the tweets were positive, 31 percent were neutral, and 7 percent were negative.
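Two of the tallies above, original tweets versus retweets and the most frequent terms, could be roughed out in code. This is a sketch under assumptions, not necessarily how the numbers in this post were produced: it treats any text starting with "RT @" as a retweet, it counts single words only, and it uses a tiny, hypothetical stop-word list. The sample tweets are made up.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real one would be longer)
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "rt"}


def split_retweets(texts):
    """Return (originals, retweets), treating a leading 'RT @' as a retweet."""
    originals = [t for t in texts if not t.startswith("RT @")]
    retweets = [t for t in texts if t.startswith("RT @")]
    return originals, retweets


def top_terms(texts, n=5):
    """Count lowercase words and hashtags, skipping mentions, links, and stop words."""
    counts = Counter()
    for text in texts:
        # Drop links and @mentions before splitting into words
        text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
        for word in re.findall(r"#?[a-z][a-z0-9]+", text):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up sample tweets, for illustration only
sample = [
    "RT @brand_a: Wearables are the future of #IoT",
    "Wearables and big data in the smarthome #IoT",
    "Thinking about #IoT security today",
]
originals, retweets = split_retweets(sample)
print(len(originals), len(retweets))
print(top_terms(sample, n=2))
```

Because it counts single words, two-word phrases such as "big data" would show up as separate terms. Sorting authors into brands, individuals, and bots, and judging sentiment, still calls for a human read of each tweet.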


In no way am I considering this tweet data a complete overview of the IoT conversation on Twitter, but it’s pretty awesome to know all this could be discovered with the help of a little Python code.

Laura Anderson is a senior community manager at the Deloitte Digital studio, Deloitte Consulting, LLP in the Pioneer Square neighborhood of Seattle. She recently graduated with a Master of Communication in Digital Media from the University of Washington. Follow her on Twitter at @laanderson.