OSSInsight: Scalable GitHub Analysis
Published in: | Proceedings of the VLDB Endowment 2024-08, Vol.17 (12), p.4321-4324 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | GitHub is a platform for hosting code, enabling collaboration, and supporting version control for a global community of over 100 million developers. Free tools are crucial for researching open-source software. Our research found that existing tools either lack real-time GitHub data processing or offer limited functionality.
This demonstration presents OSSInsight, an open-source tool for researching and analyzing GitHub repositories. We first present the architecture of the tool, including its access to nearly 7 billion records of archived and real-time data and how it is powered by TiDB. The demonstration shows how OSSInsight analyzes GitHub data along three dimensions: developers, repositories, and organizations. All of these analyses are based on generated SQL queries submitted to a TiDB database. TiDB has HTAP (hybrid transactional/analytical processing) capabilities: it uses its row store for simple SQL queries and its column store for more complex ones. Users can view and edit these SQL queries and also view their execution plans. Finally, OSSInsight provides a tool based on OpenAI that conducts data analysis from English-text input and presents the results as charts and graphs. |
---|---|
ISSN: | 2150-8097 |
DOI: | 10.14778/3685800.3685865 |
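
The summary describes generated SQL queries of two kinds (simple ones suited to a row store, more complex ones suited to a column store) whose execution plans users can inspect. The following is a minimal sketch of what such queries might look like, not the tool's actual queries: the table name `github_events` and the columns `repo_id`, `type`, and `created_at` are assumptions chosen for illustration, and the choice of storage engine is left to the database rather than made in this code.

```python
# Hypothetical table and column names; the summary does not give the schema.
TABLE = "github_events"


def simple_lookup_sql(repo_id: int) -> str:
    """Build a small lookup query, the kind a row store handles well.

    int() coercion keeps the sketch safe; real code should prefer
    parameterized queries from a database driver.
    """
    return (
        f"SELECT type, created_at FROM {TABLE} "
        f"WHERE repo_id = {int(repo_id)} "
        "ORDER BY created_at DESC LIMIT 10"
    )


def monthly_stars_sql(repo_id: int) -> str:
    """Build an aggregation over many rows, the kind a column store suits.

    'WatchEvent' is the GitHub event type recorded when a user stars a repo.
    """
    return (
        "SELECT DATE_FORMAT(created_at, '%Y-%m') AS month, COUNT(*) AS stars "
        f"FROM {TABLE} "
        f"WHERE repo_id = {int(repo_id)} AND type = 'WatchEvent' "
        "GROUP BY month ORDER BY month"
    )


def explain(sql: str) -> str:
    """Prefix a statement with EXPLAIN to request its execution plan."""
    return "EXPLAIN " + sql
```

Passing either query through `explain` yields a statement that, when run against the database, returns the plan instead of the result rows, which corresponds to the plan view mentioned in the summary.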