Evolution 2014 social media wrap-up

Social media use among scientists has been growing. Nature just published the results of a broad survey of various social networks including Twitter. Given this, it seems timely to write up the quick analysis that I did this summer.
During the Evolution 2014 annual meetings I collected tweets using the hashtag #evol2014. I did this out of an interest in network analysis that grew from my time at the Santa Fe Institute’s Complex Systems Summer School. I don’t have any particular agenda with this, mostly curiosity.

Twitter Buzz from 5 days of #Evol2014. Node size is centrality and modular groups are color coded.

What you see in the above network are lines (edges) connecting the sender of each tweet to any users referenced in the tweet. Halos are a side effect of a large number of links back to the original node (i.e. tweets with no other username referenced). I used eigenvector centrality to measure the importance of nodes in this network. It is similar to Google’s PageRank method for web page importance.
A couple of points to take from this network: there were a few really active people. Some of them composed tweets that included other users; others provided information without reference to other users. I know from my own experience that it was difficult to quickly track down a presenter’s username (e.g. @devindrown). I decided I wanted to support social media use at Evolution 2014, so I put my Twitter handle on my title slide. I’m beginning to see this more and more at conferences.
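For readers who want to try the idea without NodeXL or Gephi, here is a minimal Python sketch of the two steps described above: turning tweets into sender-to-mention edges (with a self-loop when nobody is mentioned, which is what produces the halos), and scoring nodes by eigenvector centrality with a simple power iteration. This is not the workflow I used. The function names, the example tweets, and the choice to treat the graph as undirected are all assumptions made for illustration.

```python
import re

# Hypothetical helper names; the usernames and tweets below are made up.
MENTION = re.compile(r"@(\w+)")


def tweet_edges(tweets):
    """Turn (sender, text) pairs into (sender, mentioned_user) edges.

    A tweet that mentions nobody becomes a self-loop on the sender.
    """
    edges = []
    for sender, text in tweets:
        sender = sender.lower()
        mentioned = [name.lower() for name in MENTION.findall(text)]
        if mentioned:
            edges.extend((sender, name) for name in mentioned)
        else:
            edges.append((sender, sender))
    return edges


def eigenvector_centrality(edges, max_iter=200, tol=1e-9):
    """Power iteration on the undirected graph, ignoring self-loops.

    Each step adds a node's own score to the sum of its neighbors'
    scores (iterating on A + I rather than A). That keeps the same
    leading eigenvector but avoids oscillation on bipartite graphs.
    Scores are rescaled so the largest is 1.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    if not neighbors:
        return {}
    score = dict.fromkeys(neighbors, 1.0)
    for _ in range(max_iter):
        new = {n: score[n] + sum(score[m] for m in nbrs)
               for n, nbrs in neighbors.items()}
        top = max(new.values())
        new = {n: v / top for n, v in new.items()}
        converged = max(abs(new[n] - score[n]) for n in new) < tol
        score = new
        if converged:
            break
    return score


example_tweets = [
    ("alice", "Great talk by @bob on coevolution #evol2014"),
    ("carol", "@bob @alice see you at the poster session #evol2014"),
    ("dave", "The coffee line is long #evol2014"),
    ("erin", "Agree with @bob #evol2014"),
]
edges = tweet_edges(example_tweets)
scores = eigenvector_centrality(edges)
for user in sorted(scores, key=scores.get, reverse=True):
    print(f"{user:6s} {scores[user]:.3f}")
```

In this toy example the most-mentioned user ends up with the highest score, while the user who only broadcasts without mentioning anyone gets a self-loop and a score near zero.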
I also collected follower relationships from the conference Twitter account (@evol2014) shortly before the conference started (May 27, 2014). Twitter limits the rate at which you can collect follower relationships, so I only did this once. I then generated a network of all the followers and their connections from this data.

Followers of @evol2014 and their relationships. Node size is centrality and modular groups are color coded.

I labeled only the nodes with a high degree (over 20 connections to other members in the network) because the number of labels became overwhelming. I color coded the different modular groups. These groups were fairly fluid depending on the parameters, so I don’t want to make much out of them. Given that caveat, there seem to be larger groups of people who follow each other compared to the Twitter activity network. This isn’t surprising as I’m not likely to tweet to every individual person I follow.
As part of my data exploration and training with the network tools, I collected tweets that were already using the hashtag (#evol2014) before the conference really started.
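The labeling rule above (show a name only when a node has more than 20 connections) is easy to express in code. This is a rough Python sketch rather than what Gephi does internally. The function name, the `min_degree` parameter, and the made-up follower data are assumptions for illustration.

```python
def labels_for_high_degree(edges, min_degree=20):
    """Return {node: label}, with an empty label for low-degree nodes.

    Degree here is the number of distinct neighbors, ignoring edge
    direction and self-loops. Only nodes with MORE than min_degree
    neighbors keep their name as a label.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: (node if len(nbrs) > min_degree else "")
            for node, nbrs in neighbors.items()}


# Hypothetical data: one well-connected account and 25 of its followers.
follower_edges = [("hub", f"user{i}") for i in range(25)]
follower_edges.append(("user0", "user1"))

labels = labels_for_high_degree(follower_edges)
print([name for name in labels.values() if name])
```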

To this, I added all the followers of the conference twitter handle (@evol2014). I was curious to roughly compare the number of listeners to active users.

Followers of @evol2014 combined with #evol2014 tweets.

The quick answer: while only a few people were engaged in pre-conference social media, a great many people were ready to listen. You can see from the connections and color coding that most users were having very local chats (tight, isolated clusters), in contrast to the full conference network generated by all the conversations at the top of the post.
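The listeners-versus-active-users comparison comes down to a few set operations once you have a list of followers and a list of people who tweeted with the hashtag. Here is a small Python sketch of that bookkeeping. The function name and the usernames are hypothetical, and the numbers are not from the conference data.

```python
def audience_summary(followers, tweeters):
    """Count listeners and active users from two collections of usernames."""
    followers = {name.lower() for name in followers}
    tweeters = {name.lower() for name in tweeters}
    return {
        "followers": len(followers),
        "active": len(tweeters),
        # followers who also tweeted with the hashtag
        "active_followers": len(followers & tweeters),
        # followers who only listened
        "silent_followers": len(followers - tweeters),
        # people who tweeted without following the conference account
        "active_non_followers": len(tweeters - followers),
    }


# Made-up usernames for illustration only.
summary = audience_summary(
    followers=["alice", "bob", "carol", "dave", "erin"],
    tweeters=["Bob", "frank"],
)
for key, value in summary.items():
    print(f"{key:22s} {value}")
```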
One last network: Using the same methods as I did for the full conference network, I generated a network of just the pre-conference tweets. This network highlights some of the people who were really engaging in social media to get the word out about some of the interesting events and topics of the conference.

If you’re getting ready to go to a conference and want to engage in the social media conversation, PLoS Computational Biology has just published this timely guide as part of their Ten Simple Rules series.

Other recent conference social network analyses
Jeremy Yoder over at the Molecular Ecologist wrote up an excellent post about the timing and content of the tweets during the conference. I highly recommend checking there for some more quantitative analytics of the conference.
Tim Poisot wrote up a very detailed comparison of two conferences (ESA2014 and IMCC3) over at his blog. He includes all of the software and scripts used to generate the analysis and I look forward to trying out those tools for future conferences.
Of course Jorge Cham over at PhD Comics knows the real reason why academics use Twitter.

Brief methods: I used a combination of NodeXL and Gephi to collect, process, and visualize the data. NodeXL is a user-friendly plugin for Excel that allowed me to search and collect tweets using the hashtag #evol2014 across the duration of the meeting. Unfortunately it misses most retweets (RT), so the above networks lose this nuance (popular tweets). I exported the data to Gephi, where I calculated the network metrics (e.g. node centrality and modularity). Finally, I used various layout algorithms (e.g. ForceAtlas) to display and organize the networks. I found this webpage very helpful to get started.
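If you collect tweets some other way and still want to use Gephi for the metrics and layout, one option is to hand it a plain CSV edge list. The Python sketch below is not the NodeXL export I used. It assumes an edge table with Source, Target, and Weight columns, which, as far as I understand it, Gephi's spreadsheet import recognizes. The function name and example edges are hypothetical.

```python
import csv
import io
from collections import Counter


def edges_to_csv(edges):
    """Write (source, target) pairs as CSV text, one row per distinct pair.

    Repeated pairs are collapsed into a single row whose Weight is the
    number of times that pair occurred.
    """
    counts = Counter(edges)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(counts.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


# Made-up edges for illustration only.
csv_text = edges_to_csv([
    ("alice", "bob"),
    ("alice", "bob"),
    ("carol", "bob"),
    ("dave", "dave"),
])
print(csv_text)
```

Saving `csv_text` to a file gives you something to load through Gephi's import dialog, after which the centrality, modularity, and layout steps are the same as described above.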