TY - JOUR
T1 - Twitter social bots
T2 - The 2019 Spanish general election data
AU - Pastor-Galindo, Javier
AU - Zago, Mattia
AU - Nespoli, Pantaleone
AU - López Bernal, Sergio
AU - Huertas Celdrán, Alberto
AU - Gil Pérez, Manuel
AU - Ruipérez-Valiente, José A.
AU - Martínez Pérez, Gregorio
AU - Gómez Mármol, Félix
N1 - Funding Information:
This study was partially funded by a grant from the Spanish National Cybersecurity Institute (INCIBE) with code INCIBEI-2015–27353, by the Spanish Government grants FPU18/00304, FJCI2017–34926, and RYC-2015–18210, co-funded by the European Social Fund, by a predoctoral grant from the University of Murcia and by the Irish Research Council, under the government of Ireland post-doc fellowship (grant code GOIPD/2018/466). Authors would also like to acknowledge Prof. Karl Aberer at EPFL, Prof. Albert Blarer at Armasuisse, Héctor Cordobés and IMDEA Networks Institute for their support to this work.
Publisher Copyright:
© 2020 The Authors
PY - 2020/10
Y1 - 2020/10
N2 - The term social bots refers to software-controlled accounts that actively participate in social platforms to influence public opinion toward desired directions. To this end, this data descriptor presents a Twitter dataset collected from October 4th to November 11th, 2019, within the context of the Spanish general election. Starting from 46 hashtags, the collection contains almost eight hundred thousand users involved in political discussions, with a total of 5.8 million tweets. The proposed data descriptor is related to the research article available at [1]. Its main objectives are: i) to enable worldwide researchers to improve the data gathering, organization, and preprocessing phases; ii) to test machine-learning-powered proposals; and, finally, iii) to improve state-of-the-art solutions on social bots detection, analysis, and classification. Note that the data are anonymized to preserve the privacy of the users. Throughout our analysis, we enriched the collected data with meaningful features in addition to the ones provided by Twitter. In particular, the tweets collection presents the tweets’ topic mentions and keywords (in the form of political bag-of-words), and the sentiment score. The users’ collection includes one field indicating the likelihood of an account being a bot. Furthermore, for those accounts classified as bots, it also includes a score that indicates the affinity to a political party and the followers/followings list.
AB - The term social bots refers to software-controlled accounts that actively participate in social platforms to influence public opinion toward desired directions. To this end, this data descriptor presents a Twitter dataset collected from October 4th to November 11th, 2019, within the context of the Spanish general election. Starting from 46 hashtags, the collection contains almost eight hundred thousand users involved in political discussions, with a total of 5.8 million tweets. The proposed data descriptor is related to the research article available at [1]. Its main objectives are: i) to enable worldwide researchers to improve the data gathering, organization, and preprocessing phases; ii) to test machine-learning-powered proposals; and, finally, iii) to improve state-of-the-art solutions on social bots detection, analysis, and classification. Note that the data are anonymized to preserve the privacy of the users. Throughout our analysis, we enriched the collected data with meaningful features in addition to the ones provided by Twitter. In particular, the tweets collection presents the tweets’ topic mentions and keywords (in the form of political bag-of-words), and the sentiment score. The users’ collection includes one field indicating the likelihood of an account being a bot. Furthermore, for those accounts classified as bots, it also includes a score that indicates the affinity to a political party and the followers/followings list.
KW - Machine learning
KW - Sentiment analysis
KW - Social bots classification
KW - Social bots detection
KW - Social network analysis
UR - http://www.scopus.com/inward/record.url?scp=85089097695&partnerID=8YFLogxK
U2 - 10.1016/j.dib.2020.106047
DO - 10.1016/j.dib.2020.106047
M3 - Article
AN - SCOPUS:85089097695
SN - 2352-3409
VL - 32
JO - Data in Brief
JF - Data in Brief
M1 - 106047
ER -