OK, so let's start with a little bit of descriptive statistics, shall we? Which videos have the most views in this trending group?
These are the top 20:
We can see that the top video in this trending group is Childish Gambino's "This Is America", with more than 3 billion views, and at least the next 4 videos have more than a billion views each.
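For anyone who wants to reproduce a ranking like this one, here is a minimal sketch. It assumes the data has already been loaded as a list of rows with a title and a view count. The row layout, the `top_n` helper and the sample numbers are all made up for illustration and are not taken from the original dataset.

```python
import heapq

# Hypothetical rows: titles and view counts are invented for illustration.
videos = [
    {"title": "video_a", "views": 1_200},
    {"title": "video_b", "views": 9_800},
    {"title": "video_c", "views": 4_500},
    {"title": "video_d", "views": 7_100},
]

def top_n(rows, column, n=20):
    """Return the n rows with the largest value in `column`, largest first."""
    return heapq.nlargest(n, rows, key=lambda row: row[column])

# Print the three most viewed rows of the toy data.
for row in top_n(videos, "views", n=3):
    print(row["title"], row["views"])
```

The same helper would rank by dislikes, likes or comments by changing the `column` argument, provided the rows carry those fields.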
The next available element we are going to review is the "Dislikes" of a video: which videos have the largest number of dislikes in this 2018 trend analysis?
The video with the largest number of dislikes is "YouTube Rewind: The Shape of 2017", with more than 10 million dislikes. After that comes the video called "So Sorry", and the Childish Gambino video is in third place. Is it possible that there is a relationship between the popularity of a video and the number of dislikes it gets? We will see.
The next element to analyse is the videos with the largest number of likes. Here are the results:
The top video here is again the one from Childish Gambino, the second one is "Fake Love" and the third one is Ariana Grande's "No Tears Left to Cry". The Childish Gambino video shows up again, but it is not clear whether the likes make a video popular or whether viewers click the "Like" button because the video is already popular.
Now we will analyse the videos with the most comments. Here are the results:
The top 3 videos with the most comments are "Fake Love", Childish Gambino's "This Is America" and "So Sorry". We can see a certain pattern here, right? The most popular videos have a lot of interactions. Nevertheless, it is not clear which one is the cause and which one is the effect, so let's keep analysing the data.
We have analysed individual videos so far; now we will look at the relationship between channels and views:
Once again, Childish Gambino's channel is the one with the most views, Marvel Entertainment is 4th and Maluma (a Colombian singer) is 6th: a very good combination of rap music, comics and Spanish-language entertainment. As a side note, Taylor Swift sits in a very distant 15th place, even though she is one of the best-selling singers.
Now that we have analysed the videos, channels, likes, etc., let's analyse the correlation among the variables to identify which ones have a direct relationship and which ones do not.
What I am trying to evaluate with these correlation comparisons is which variables are most strongly correlated with each other among the videos that were trending. The idea is to compare the different pairwise combinations of these variables (views, likes, dislikes and comments).
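The correlation is a number between -1 and 1 (assuming the usual Pearson coefficient): the closer it is to 1, the more strongly the two variables grow together; a value near 0 means little linear relationship; and a negative value means one goes down as the other goes up. All the values in this analysis turned out to be positive.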
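To make the number less abstract, here is a minimal sketch of how such a coefficient can be computed by hand. It assumes the coefficient in question is the standard Pearson correlation, which the post does not state explicitly. The `pearson` function name and the sample values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers, only to show the two ends of the scale.
print(pearson([1, 2, 3, 4], [10, 20, 30, 40]))  # both rise together
print(pearson([1, 2, 3, 4], [40, 30, 20, 10]))  # one rises as the other falls
```

Note that the coefficient only measures how well a straight line describes the relationship; it says nothing about which variable drives the other.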
First, I will compare the relationship between Views and Likes for the trending videos.
Likes vs Views.
See the graphic above. For reference, I have always marked the Childish Gambino video. The numbers shown are the number of likes, and the line in the graph is the trend line that illustrates the correlation. In this case there is a positive correlation (.80), meaning that when one increases, the other increases too. Let's keep analysing the rest of the variables.
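The post does not say how the line in the graph was drawn. A common choice, assumed here, is an ordinary least-squares fit, sketched below. The `fit_line` helper and the sample (views, likes) pairs are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical, so no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up (views, likes) pairs for illustration only.
views = [100, 200, 300, 400]
likes = [12, 18, 33, 37]
slope, intercept = fit_line(views, likes)
print(f"likes = {slope:.3f} * views + {intercept:.2f}")
```

The slope of this line is not the correlation itself, but the two always share the same sign: an upward line goes with a positive correlation.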
Views vs Dislikes.
These two variables have little correlation (.3), meaning that more views do not necessarily bring more dislikes, or vice versa.
Views vs Comments.
In this case, Views and Comments seem to have little correlation (.47).
Likes vs Dislikes.
I was expecting much more correlation between these two, but it is really small (.28).
Comments vs Likes.
This is one of the biggest correlations between two variables in this analysis (.70).
Comments vs Dislikes.
There is little correlation between these two variables too (.49).
Analysing the correlation among the variables, we get the following table:
This table indicates that likes and views have the biggest correlation, followed by likes and comments. Can we conclude that those correlations make a video go trending? No, not yet: we would need to run an ANOVA analysis. BUT even with this information we can draw certain conclusions.
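A pairwise table like the one discussed above can be built along the following lines. This is only a sketch: it assumes Pearson correlation, and the column values are made up rather than taken from the trending dataset. The `correlation_table` helper is a hypothetical name.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_table(columns):
    """Map each unordered pair of column names to its Pearson coefficient."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }

# Made-up columns for illustration only (one value per video).
columns = {
    "views": [100, 200, 300, 400, 500],
    "likes": [12, 18, 33, 37, 55],
    "dislikes": [3, 1, 4, 1, 5],
    "comments": [5, 9, 11, 20, 22],
}

# List the pairs from the strongest positive correlation to the weakest.
table = correlation_table(columns)
for (a, b), r in sorted(table.items(), key=lambda item: -item[1]):
    print(f"{a} vs {b}: {r:.2f}")
```

With four variables there are six pairs, which matches the six comparisons walked through in this post.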
CONCLUSIONS AND PRACTICAL APPLICATIONS:
With all the information we have analysed, we can see that the videos that were trending in 2018 have several elements in common. The number of dislikes seems to be largely independent of the rest (low correlation). The most important correlation is that videos with a high number of views also have a high number of likes; this matters for those who are using bots and want their videos to look more natural. The second important element is that videos with a high number of likes also have a high number of comments. You need to take this into account to send the right signals to the YouTube algorithm, otherwise your video will clearly look manipulated and YouTube will know it.
None of these elements fully explains why a video goes trending, BUT the exercise helps us realise how difficult it is to draw conclusions from tests and analysis, and to separate facts from beliefs based on the data. There are more questions than answers. For example: what if we analyse the videos based on the category they belong to? Do videos behave the same across different categories? Is there a lot of variation? I will add this information if there is interest, along with an ANOVA analysis to identify which of the variables we have carries the most weight in the trendiness of a video.
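As a first step towards the per-category question, one could summarise a variable within each category and compare the groups. The sketch below does that for views. The `category` field, the `summarize_by_category` helper and the sample rows are hypothetical and only stand in for whatever the real dataset provides.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_category(rows, value="views"):
    """Return count, mean and sample standard deviation of `value` per category."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["category"]].append(row[value])
    return {
        category: {
            "count": len(values),
            "mean": mean(values),
            # The sample standard deviation needs at least two points.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        }
        for category, values in groups.items()
    }

# Made-up rows for illustration only.
rows = [
    {"category": "music", "views": 900},
    {"category": "music", "views": 1100},
    {"category": "comedy", "views": 200},
    {"category": "comedy", "views": 400},
    {"category": "comedy", "views": 300},
]

summary = summarize_by_category(rows)
for category, stats in summary.items():
    print(category, stats)
```

Large gaps between the group means relative to the spread inside each group are exactly what an ANOVA would go on to test formally.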