Statistical Modelling of Bot Detection in Social Media Using Logistic Regression and Numerical Algorithms

Arun Kumar Chaudhary; Kapil Shah; Lal Babu Sah Telee; Suresh Kumar Sahani

Published: Apr 21, 2023

Keywords:

Bot Detection, Logistic Regression, Gradient Descent, Statistical Modelling, Classification Algorithms, Numerical Optimization, Social Media Analytics, Behavioral Features, Twitter Bot Dataset, Digital Misinformation

Abstract

The rapid growth of social networking websites has led to a proliferation of automated accounts, or bots, that can be used to manipulate public opinion, spread misinformation, and skew data-driven applications. This study develops a statistical framework for detecting such bots using logistic regression models fitted with numerical optimization techniques. By combining computational mathematics and data science, the paper models user behavior on Twitter and other social media in order to classify accounts as bots or authentic users. The logistic regression model is optimized with gradient-based numerical solvers to improve classification performance. To keep the analysis empirically grounded, data are drawn from real, verified public datasets such as the PAN 2019 bots dataset and Twitter's bot repository. The results confirm that logistic regression is effective at estimating the decision boundary between bots and humans, reaching 89.4% accuracy on test data. In addition, the model's interpretability gives researchers insight into behavioral indicators such as tweet rate, retweet rate, posting time entropy, and friend/follower ratios. The paper presents a mathematically grounded social media monitoring mechanism that contributes to computational statistics and offers an efficient instrument for digital policy and cybersecurity interventions.
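
As an illustration of the kind of model the abstract describes, the following is a minimal sketch of logistic regression fitted by batch gradient descent. It is not the authors' implementation: the data are synthetic, the four feature columns (tweet rate, retweet rate, posting time entropy, friend/follower ratio) are filled with invented values chosen only so that the two classes differ, and the function names, learning rate, and iteration count are assumptions. The accuracy it prints says nothing about the 89.4% reported for the real datasets.

```python
import numpy as np


def sigmoid(z):
    # Clip the argument so np.exp cannot overflow for large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logistic_gd(X, y, lr=0.1, n_iter=2000):
    """Fit weights w and bias b by batch gradient descent on the mean log-loss."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)        # predicted probability of class 1 (bot)
        err = p - y                   # derivative of the log-loss w.r.t. the logit
        w -= lr * (X.T @ err) / n     # gradient step for the weights
        b -= lr * err.mean()          # gradient step for the bias
    return w, b


def predict(X, w, b, threshold=0.5):
    return (sigmoid(X @ w + b) >= threshold).astype(int)


def make_synthetic_accounts(n_per_class=500, seed=0):
    """Hypothetical data. Columns: tweet rate, retweet rate,
    posting time entropy, friend/follower ratio. All values are invented."""
    rng = np.random.default_rng(seed)
    spread = [2.0, 0.1, 0.5, 0.5]
    humans = rng.normal([5.0, 0.2, 3.0, 1.0], spread, size=(n_per_class, 4))
    bots = rng.normal([12.0, 0.6, 1.5, 3.0], spread, size=(n_per_class, 4))
    X = np.vstack([humans, bots])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y


X, y = make_synthetic_accounts()

# Shuffle, then hold out 20% of the accounts for testing.
idx = np.random.default_rng(1).permutation(len(y))
split = int(0.8 * len(y))
train, test = idx[:split], idx[split:]

# Standardize with training-set statistics so one learning rate suits all features.
mu = X[train].mean(axis=0)
sigma = X[train].std(axis=0)
X_train = (X[train] - mu) / sigma
X_test = (X[test] - mu) / sigma

w, b = fit_logistic_gd(X_train, y[train])
accuracy = (predict(X_test, w, b) == y[test]).mean()

print("weights:", w)
print("bias:", b)
print("test accuracy:", accuracy)
```

Because the features are standardized, the sign and relative size of each fitted weight indicate how that feature shifts the predicted odds of an account being a bot, which is the sort of interpretability the abstract refers to.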

Issue

Vol. 22 (2023)

Section

Articles

Announcements

Change of Publisher

February 13, 2025

We are writing to inform all readers and authors of a recent change in the publishing arrangements for Linguistic and Philosophical Investigations (ISSN: 1841-2394, e-ISSN: 2471-0881). Effective 01/01/2025, publishing responsibilities for the journal will transfer from Auricle Global Society of Education and Research to Linguistic and Philosophical Investigations.

The latest h-index of the Linguistic and Philosophical Investigations is 17.

February 1, 2025
