Bot Detection in GitHub Repositories

Contemporary social coding platforms like GitHub promote collaborative development. Many open-source software repositories hosted in these platforms use machine accounts (bots) to automate and facilitate a wide range of effort-intensive and repetitive activities. Determining if an account correspond...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2022-03
Hauptverfasser: Chidambaram, Natarajan, Pooya Rostami Mazrae
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Contemporary social coding platforms like GitHub promote collaborative development. Many open-source software repositories hosted in these platforms use machine accounts (bots) to automate and facilitate a wide range of effort-intensive and repetitive activities. Determining if an account corresponds to a bot or a human contributor is important for socio-technical development analytics, for example, to understand how humans collaborate and interact in the presence of bots, to assess the positive and negative impact of using bots, to identify the top project contributors, to identify potential bus factors, and so on. Our project aims to include the trained machine learning (ML) classifier from the BoDeGHa bot detection tool as a plugin to the GrimoireLab software development analytics platform. In this work, we present the procedure to form a pipeline for retrieving contribution and contributor data using Perceval, distinguishing bots from humans using BoDeGHa, and visualising the results using Kibana.
ISSN:2331-8422
DOI:10.48550/arxiv.2203.16997