This is the third article in the series called “Twitter analytics”. My goal is to see whether it is possible to discover meaningful social media insights by applying data gathering and data analytics to the social network Twitter. In this part, I want to use the performance score that I created in the previous article and apply it to several attributes of a tweet to answer the question: “What makes a tweet perform well?”
I will use the word “engagement” a lot when analyzing the data set. If you have not read my previous article, the engagement metric simply represents the sum of likes, replies and retweets that a tweet got. For example, a tweet that has 30 likes, 12 replies and 10 retweets scores an engagement of 52. It is also important to know that the figures included in this article are interactive. You can display and hide graphs by clicking on their name on the right. Furthermore, you can use controls like panning and zooming to take a closer look at the graphs. By hovering over a data point, you can view its exact value.
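To make the metric concrete, here is a minimal Python sketch of the engagement calculation. The function and parameter names are my own choice for illustration; only the formula (likes plus replies plus retweets) comes from the definition above.

```python
def engagement(likes: int, replies: int, retweets: int) -> int:
    """Engagement as defined in this series: likes + replies + retweets."""
    return likes + replies + retweets


# The example from the text: 30 likes, 12 replies and 10 retweets.
print(engagement(likes=30, replies=12, retweets=10))  # 52
```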
The data set
The data set that we are looking at today is all about photography. It contains 100 096 tweets which contain the keyword “photography”. Regarding the timeframe, the data set contains tweets from the last 130 days, so roughly the last four months, from the 31st of July 2020 to the 8th of December 2020.
I quickly want to talk about how I prepared and filtered the data set before going into the analysis of the data. As I mentioned in my previous article, I filter out tweets with a very low engagement right from the start. In this case, I filtered out all the posts that did not get any engagement at all by defining that every tweet needs to have at least one like, retweet and reply.
Another thing that I needed to improve was the distribution of engagement in the data set:
If you are wondering where the graph is, it is very close to the y-axis at first and then follows the x-axis to the right. Every Twitter data set I worked with looked similar to this. There is a small number of tweets that go “viral” and therefore outperform all the other tweets by a factor of 100, 1000, or even 10 000. The problem with this uneven distribution is that a very small percentage of tweets gets the majority of the engagement. Therefore, the viral tweets would distort the engagement scores that we are calculating later. In order to remove this distortion, I chose a more evenly distributed subset of the data by filtering out all the tweets that got an engagement of over 1000. This was a good choice because the number of tweets in the data set shrank only slightly (from 100 096 to 97 771, about 2.3 %), while the distribution improved a lot:
You might also be wondering why I am doing the extra step of filtering the tweets for a keyword. After all, it would be easier to just analyze a sample of random tweets. While using a sample of random tweets is possible, the results are inaccurate. There are many different users, topics and niches on Twitter that behave very differently from each other. To get reliable insights into the performance of a tweet, you have to use tweets from the same niche as a reference frame.
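The two preparation steps described above (dropping tweets without engagement and cutting off the viral outliers at 1000) could be sketched roughly like this in Python. The dictionary keys are hypothetical, and I am assuming one possible reading of the minimum filter, namely that a tweet needs at least one like, one reply and one retweet each.

```python
from typing import Dict, List

Tweet = Dict[str, int]  # hypothetical record with "likes", "replies", "retweets"


def engagement(tweet: Tweet) -> int:
    """Sum of likes, replies and retweets."""
    return tweet["likes"] + tweet["replies"] + tweet["retweets"]


def prepare(tweets: List[Tweet], max_engagement: int = 1000) -> List[Tweet]:
    """Keep tweets that pass the minimum filter and are not viral outliers."""
    kept = []
    for tweet in tweets:
        # Assumption: "at least one like, retweet and reply" means one of each.
        has_minimum = (
            tweet["likes"] >= 1 and tweet["replies"] >= 1 and tweet["retweets"] >= 1
        )
        # Tweets with an engagement of over 1000 are filtered out.
        if has_minimum and engagement(tweet) <= max_engagement:
            kept.append(tweet)
    return kept


sample = [
    {"likes": 0, "replies": 0, "retweets": 0},     # no engagement: dropped
    {"likes": 5, "replies": 1, "retweets": 2},     # engagement 8: kept
    {"likes": 900, "replies": 50, "retweets": 60}, # engagement 1010: dropped
    {"likes": 990, "replies": 5, "retweets": 5},   # engagement 1000: kept
]
prepared = prepare(sample)
print([engagement(t) for t in prepared])  # [8, 1000]
```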
In the orange graph above, you can see what type of tweets were posted in the niche “photography” over the last four months. A “text” tweet contains only text, a “photo” tweet contains at least one photo, a “link” tweet contains a link to another website (but is also often used to embed videos), a “photo&link” post includes photos and an external link or embed. As you might have guessed, 68.47 % of tweets were photos. The second most used tweet type was “text” with 16.66 %. Tweets of type “link” or “photo&link” were rarely used and together only made up 14.87 % of tweets.
Knowing what type of content was posted is pretty interesting, but knowing what type of content performs best is even better. Let’s switch to the blue graph and look at the average engagement of the different tweet types. You can do that by clicking on “Amount of Tweets” under the graph. This will hide the orange graph so that you can take a closer look at the blue one. The most used content type “photo” is also the content type that brought in the most engagement on average. Surprisingly, the less used “photo&link” managed to get the second-highest average engagement, beating “text” and “link” tweets. This makes me believe that the most important thing for marketing your photography on Twitter is to directly include photos in your tweet.
I am really happy with the results of the content type analysis. From looking at other broad data sets, I know that usually tweets of the type “text” dominate the other tweet types both in usage and engagement. Nevertheless, in this data set, you can see that setting a niche like “photography” completely changes the behaviour of Twitter and its users.
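For readers who want to reproduce the content type analysis, the grouping could look roughly like the following sketch. The four type names come from the description above; the record fields `has_photo`, `has_link` and `engagement` are assumptions for illustration, since a real tweet object would need to be mapped to them first.

```python
from collections import defaultdict


def tweet_type(has_photo: bool, has_link: bool) -> str:
    """Map a tweet to one of the four content types used in this article."""
    if has_photo and has_link:
        return "photo&link"
    if has_photo:
        return "photo"
    if has_link:
        return "link"
    return "text"


def stats_by_type(tweets):
    """Return the usage share (in percent) and average engagement per type."""
    groups = defaultdict(list)
    for tweet in tweets:
        key = tweet_type(tweet["has_photo"], tweet["has_link"])
        groups[key].append(tweet["engagement"])
    total = len(tweets)
    return {
        key: {
            "share_percent": 100 * len(values) / total,
            "avg_engagement": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


sample = [
    {"has_photo": True, "has_link": False, "engagement": 100},
    {"has_photo": True, "has_link": False, "engagement": 80},
    {"has_photo": False, "has_link": False, "engagement": 20},
    {"has_photo": True, "has_link": True, "engagement": 60},
]
stats = stats_by_type(sample)
print(stats["photo"])  # {'share_percent': 50.0, 'avg_engagement': 90.0}
```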
The orange graph shows us that most tweets in our dataset used up the 280 character limit of Twitter. That is pretty common because many people write tweets that are too long and then shorten them to get under the 280 character limit. The second most used tweet length is around 60 – 100 characters. The tweet lengths in between are not that popular. Since most of the tweets are photos, it makes sense that the length is shifted towards shorter tweets because most users probably only use the text as a short caption for their photo.
By looking at the blue graph, you can see a pretty straight line that tells us: the longer a tweet in the niche “photography” is, the less engagement it gets. This learning goes hand in hand with the previous learning. I think that the shorter tweets are probably “photo” tweets and the longer tweets are probably “text” tweets. Since we already learned that tweets of the type “photo” are more engaging, this explains why shorter tweet lengths are preferred in the photography niche.
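The relation between tweet length and engagement can be explored by grouping tweets into length buckets. The sketch below is an assumption about how such a grouping might be done: the bucket size of 20 characters is arbitrary, and Python's `len()` is only an approximation of how Twitter counts characters (links, for example, are counted differently).

```python
from collections import defaultdict


def avg_engagement_by_length(tweets, bucket_size=20):
    """Average engagement per length bucket, keyed by the bucket's lower bound."""
    buckets = defaultdict(list)
    for tweet in tweets:
        # A text with 45 characters lands in the bucket starting at 40.
        start = (len(tweet["text"]) // bucket_size) * bucket_size
        buckets[start].append(tweet["engagement"])
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}


sample = [
    {"text": "a" * 10, "engagement": 100},
    {"text": "a" * 15, "engagement": 50},
    {"text": "a" * 45, "engagement": 30},
]
length_stats = avg_engagement_by_length(sample)
print(length_stats)  # {0: 75.0, 40: 30.0}
```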
The “Engagement by Day” is my favorite figure in this evaluation. By looking at the orange graph, you can see that around 700 tweets about photography (that have at least 1 like, reply and retweet) get posted on Twitter every day. It also seems like the interest in photography has increased in December, because there are now 900 – 1000 tweets about photography being posted every day. Overall the interest in photography seems pretty stable, with one big exception: the 19th of August. On the 19th of August, the number of tweets that were published increased to over 2100 tweets. At first, I thought that this was caused by a bug in my code, but it turns out it was not. The 19th of August is also known as the “World Photography Day”, which explains the increase in photography tweets on that day. This is another great example of why it is important to focus your analytics on a specific niche.
By looking at the blue graph, you can see that a tweet in the niche “photography” gets an engagement of around 90 on average and there are only small fluctuations between the days. When you look at the average engagement in December, you can see that the number of posts per day went up (orange graph) but the average engagement went down (blue graph). I am guessing that this occurs because more tweets per day also lead to more competition. Therefore the average engagement per tweet drops. Interestingly enough, we do not see this behavior on the 19th of August, so I cannot fully prove this theory.
The “Engagement by Weekday” figure is pretty simple. Both the orange and the blue graph follow a relatively straight line. This means that there is no day of the week on which more photography tweets are posted than on the other days (orange graph). Furthermore, all the tweets got the same average engagement independent of the weekday on which they were published (blue graph).
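Both the per-day and the per-weekday figures boil down to the same operation: group the tweets by a key derived from their timestamp, then count them and average their engagement. A possible sketch, with hypothetical field names `created_at` and `engagement`:

```python
from collections import defaultdict
from datetime import datetime


def count_and_average(tweets, key):
    """Group tweets by key(created_at); return (tweet count, average engagement)."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[key(tweet["created_at"])].append(tweet["engagement"])
    return {k: (len(v), sum(v) / len(v)) for k, v in groups.items()}


sample = [
    {"created_at": datetime(2020, 8, 19, 9, 0), "engagement": 120},
    {"created_at": datetime(2020, 8, 19, 18, 30), "engagement": 60},
    {"created_at": datetime(2020, 8, 26, 12, 0), "engagement": 90},
]

# One group per calendar day.
by_day = count_and_average(sample, key=lambda ts: ts.date())
# One group per weekday, where 0 is Monday and 6 is Sunday.
by_weekday = count_and_average(sample, key=lambda ts: ts.weekday())

print(by_day[datetime(2020, 8, 19).date()])  # (2, 90.0)
print(by_weekday)  # {2: (3, 90.0)} because both dates are Wednesdays
```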
Let’s break the time of publishing down even more and look at the number of tweets per hour. I created this graph from the German perspective by using the timezone MEZ (Central European Time, UTC+1). I am also using the 24-hour notation to get rid of the AM/PM confusion. To convert this graph to CST (UTC-6), you have to subtract 7 hours.
By looking at the orange graph, you can see that the fewest tweets are published from 03:00 to 06:00. After 06:00, there is a slow increase in tweets that plateaus from 11:00 to 23:00.
If you now switch to the blue graph, you can see a pretty interesting insight: the average engagement of a tweet peaks at 08:00. This means that the best publishing time for photography tweets is 08:00. During this time the number of tweets that are being posted increases (and probably also the Twitter usage of users that are interested in photography) but has not reached its maximum yet. This time of the day seems to be the sweet spot for getting a higher engagement on a photography post.
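The hourly analysis and the timezone conversion can be sketched with Python's standard library. This is a simplified illustration, not the code behind the graph: I am assuming that timestamps are stored in UTC and I use fixed offsets (UTC+1 and UTC-6) as in the text, which ignores daylight saving time.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

MEZ = timezone(timedelta(hours=1))   # fixed UTC+1
CST = timezone(timedelta(hours=-6))  # fixed UTC-6


def avg_engagement_by_hour(tweets, tz):
    """Average engagement per hour of day (0-23) in the given timezone."""
    hours = defaultdict(list)
    for tweet in tweets:
        local_time = tweet["created_at"].astimezone(tz)
        hours[local_time.hour].append(tweet["engagement"])
    return {h: sum(v) / len(v) for h, v in sorted(hours.items())}


sample = [
    {"created_at": datetime(2020, 10, 1, 7, 15, tzinfo=timezone.utc), "engagement": 120},
    {"created_at": datetime(2020, 10, 1, 7, 45, tzinfo=timezone.utc), "engagement": 80},
    {"created_at": datetime(2020, 10, 1, 20, 0, tzinfo=timezone.utc), "engagement": 40},
]
hours_mez = avg_engagement_by_hour(sample, MEZ)
hours_cst = avg_engagement_by_hour(sample, CST)
print(hours_mez)  # {8: 100.0, 21: 40.0}
print(hours_cst)  # {1: 100.0, 14: 40.0}
```

The second result shows the conversion rule from the text: every hour in the CST view is the MEZ hour minus 7.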
Hashtags are a very important way to describe and categorize the contents of your tweet. Nevertheless, the orange graph shows that not all Twitter accounts are using hashtags in their tweets. 33.18 % of the tweets in the data set did not contain a hashtag. Second most popular are one, two, three, four or five hashtags with a 5% usage each (38.7 % in total). Starting with six hashtags, the number of tweets drops and steadily declines with an increasing number of hashtags.
A question that I often asked myself when composing a tweet was: “Should I use hashtags or not?” Luckily, we are now able to answer that question (at least for photography tweets) by looking at the average engagement graph. The graph shows that tweets without hashtags performed the worst in terms of engagement. The best number of hashtags according to our data set is five. Tweets containing five hashtags received twice the engagement that tweets without hashtags got on average. Tweets with over six hashtags also seemed to perform quite well. You should take those average engagement rates with a grain of salt since they are based on a pretty small data set (under 5000 tweets). In conclusion, four or five hashtags seem to work well for photography tweets.
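Counting hashtags per tweet and averaging the engagement for each count could be sketched as follows. The regular expression is a deliberate simplification that I am assuming for illustration; Twitter's own hashtag parsing has more rules, and a real data set might already provide the hashtags as a separate field.

```python
import re
from collections import defaultdict

# Simplified pattern: a "#" followed by one or more word characters.
HASHTAG = re.compile(r"#\w+")


def avg_engagement_by_hashtag_count(tweets):
    """Average engagement keyed by the number of hashtags in the tweet text."""
    groups = defaultdict(list)
    for tweet in tweets:
        count = len(HASHTAG.findall(tweet["text"]))
        groups[count].append(tweet["engagement"])
    return {n: sum(v) / len(v) for n, v in sorted(groups.items())}


sample = [
    {"text": "Sunset over the lake", "engagement": 20},
    {"text": "Sunset #photography #nature", "engagement": 50},
    {"text": "Morning fog #photo #landscape", "engagement": 70},
]
hashtag_count_stats = avg_engagement_by_hashtag_count(sample)
print(hashtag_count_stats)  # {0: 20.0, 2: 60.0}
```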
Time for the great finale. In my last analysis, I want to answer a question as old as social media itself: “What hashtags should I use?” Before we can answer this question, I want to explain the graphs you are seeing. The figure shows 125 hashtags of our data set. For every hashtag, I calculated three metrics. “Amount of Tweets” (orange) shows how often a hashtag was used. “Average Engagement” (blue) shows how much engagement a tweet with that hashtag received on average. And finally, “Amount of Accounts using Hashtag” (green) shows how many different accounts used the hashtag in a tweet.
I would suggest hiding all the graphs, except for the orange one. Now you can see the hashtags that were most used in the data set. The hashtag that was used most is “#photography” (obviously). Furthermore, the hashtags “#nature”, “#photo”, “#naturephotography” and “#photooftheday” were used often. By zooming and hovering over the data points, you can get a feeling for the hashtags that were popular over the last three months in the photography niche.
The next thing you should look at is the green graph. It is very similar to the orange graph, but the values are smaller. This graph does not count how often a hashtag was used but how many different accounts used a hashtag. This is very important for filtering the hashtags. Imagine Harry Styles (he currently has one of the highest engagements on Twitter) posts 100 selfies using “#photography” and “#harrystyles”. This would lead my algorithm to believe that “#harrystyles” is a very engaging hashtag for photography tweets when in reality it only makes sense to use it if you actually are Harry Styles. Therefore I defined that a hashtag has to be used by at least 100 different Twitter accounts to be ranked in this graph. If you zoom into the green graph, you will see that no data point drops under 100.
With the filter in place, we can now switch to the blue graph and find out which hashtag got the most engagement on average. The overall winner is “#photos” with an engagement of 335 on average. This hashtag is not as overused (1340 tweets in three months) as “#nature” (8697 tweets in three months) or “#photo” (6135 tweets in three months) but it is still broad enough to be displayed to many accounts. Also popular are “#animals” (with an engagement of 314 on average), “#garden” (with an engagement of 235 on average) and “#night” (with an engagement of 209 on average). Feel free to pan through the graph and you might be able to spot some hashtags that you regularly use.
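The three hashtag metrics and the account filter can be illustrated with a short sketch. The threshold of 100 distinct accounts is taken from the text; everything else (field names, lowercasing of hashtags, the tiny sample with a threshold of 2) is an assumption made for the example.

```python
from collections import defaultdict


def rank_hashtags(tweets, min_accounts=100):
    """Rank hashtags by average engagement, keeping only widely used ones."""
    engagements = defaultdict(list)  # hashtag -> engagement of each tweet
    accounts = defaultdict(set)      # hashtag -> distinct accounts using it
    for tweet in tweets:
        # Count each hashtag once per tweet, ignoring upper and lower case.
        for tag in {t.lower() for t in tweet["hashtags"]}:
            engagements[tag].append(tweet["engagement"])
            accounts[tag].add(tweet["user"])
    rows = [
        {
            "hashtag": tag,
            "tweets": len(values),
            "avg_engagement": sum(values) / len(values),
            "accounts": len(accounts[tag]),
        }
        for tag, values in engagements.items()
        if len(accounts[tag]) >= min_accounts
    ]
    return sorted(rows, key=lambda row: row["avg_engagement"], reverse=True)


sample = [
    {"user": "a", "hashtags": ["#photography", "#nature"], "engagement": 40},
    {"user": "b", "hashtags": ["#photography"], "engagement": 60},
    {"user": "star", "hashtags": ["#photography", "#harrystyles"], "engagement": 900},
    {"user": "star", "hashtags": ["#harrystyles"], "engagement": 800},
    {"user": "c", "hashtags": ["#Nature"], "engagement": 100},
]
ranking = rank_hashtags(sample, min_accounts=2)
print([row["hashtag"] for row in ranking])  # ['#photography', '#nature']
```

In the sample, “#harrystyles” has a very high average engagement but is used by only one account, so it is dropped from the ranking, which is the effect the account filter is meant to have.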
Is it possible to discover meaningful social media insights by applying data gathering and data analytics on the social network Twitter? Yes, definitely!
Is it as easy as many articles and social media guides say? Of course not!
While getting actionable social media insights by analyzing data is possible, I think that it gets extremely oversimplified most of the time. I have seen a lot of articles that claim to know the perfect type of tweet to post, the perfect timing, or the perfect hashtags that work for EVERY tweet. By analyzing different data sets, I was able to learn that there are no global insights. The users, topics, interests, and behavior are so diverse that you simply cannot find any meaningful insights that fit all of them.
In order to find those insights, you have to isolate the niche you want to look at in a smart way. I had to look through a lot of data sets before I found the right parameters to isolate the photography niche and establish a useful engagement distribution. So the next time you see a “one-size-fits-all” social media insight, you should probably be a little skeptical.