Generative Models for Source Code: Fine-Tuning Techniques for Structured Pattern Learning

Bibliographic details
Published in: Technologies (Basel) 2024-11, Vol. 12 (11), p. 219
Main authors: Franzoni, Valentina; Tagliente, Silvia; Milani, Alfredo
Format: Article
Language: English
Online access: Full text
Description
Abstract: This study addresses the problem of how to automatically generate source code that is not only functional, but also well-structured, readable, and maintainable. Existing generative models for source code often produce functional code, but they lack consistency in structure and adherence to coding standards, both of which are essential for integration into existing application development projects and for long-term software maintenance. The methodology proposed in this study applies transfer learning techniques to the DeepSeek Coder model, refining the pre-trained model so that it generates code that integrates additional structuring constraints. By training the model on specific code structures, including a dataset with Italian comments, the proposed methodology ensures that the generated code meets both the functional requirements and the required coding structure. Experimental results, evaluated using the perplexity metric, demonstrate the effectiveness of the proposed approach, which supports the goal of reducing errors and ultimately improving software development quality.
ISSN: 2227-7080
DOI: 10.3390/technologies12110219
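
The abstract reports that results were evaluated with the perplexity metric. As a reminder of what that metric measures, the following minimal sketch computes perplexity from per-token log-probabilities using the standard definition (the exponential of the mean negative log-likelihood). The function name and the sample probabilities are hypothetical illustrations; the record does not describe the authors' actual evaluation setup, so this should not be read as their implementation.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over a token sequence.

    `token_log_probs` holds the natural-log probability that a model
    assigned to each observed token. Lower perplexity means the model
    found the sequence less surprising.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical values, not taken from the paper:
# a model that spreads probability evenly over 4 candidate tokens ...
uniform = [math.log(0.25)] * 8
# ... versus a model that assigns 0.9 to every observed token.
confident = [math.log(0.9)] * 8

print(perplexity(uniform))
print(perplexity(confident))
```

In a fine-tuning setting such as the one the abstract describes, a drop in perplexity on held-out code that follows the target structure would suggest that the model assigns higher probability to that structure, although perplexity alone does not verify functional correctness.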