Okkhor-Diffusion: Class Guided Generation of Bangla Isolated Handwritten Characters using Denoising Diffusion Probabilistic Model (DDPM)

Bibliographic details
Published in: IEEE Access 2024-01, Vol.12, p.1-1
Main authors: Fuad, Md Mubtasim, Faiyaz, A., Arnob, Noor Mairukh Khan, Mridha, M.F., Saha, Aloke Kumar, Aung, Zeyar
Format: Article
Language: English
Description
Summary: Bangla has a unique script with a complex set of characters, making it a fascinating subject of study for linguists and cultural enthusiasts. Some of its characters are so similar that they can be distinguished only by subtle differences in their shapes and diacritics, and research on Bangla character recognition and classification using machine learning-based approaches has increased notably. However, Handwritten Bangla Character Recognition (HBCR) training requires an adequate amount of data from a diversely distributed dataset, and building such diverse datasets is a challenging and tedious task. Yet there is limited research on the automatic generation of handwritten Bangla characters. Motivated by this open area of research, this paper proposes 'Okkhor-Diffusion', a novel approach for class-guided generation of isolated handwritten Bangla characters using a Denoising Diffusion Probabilistic Model (DDPM). No prior research has used a DDPM for this purpose, making the proposed approach novel. The DDPM is a generative model that uses a diffusion process to transform noise-corrupted data into diverse samples, even when trained on a small training set. In our experiments, StyleGAN2-ADA performed notably worse than Okkhor-Diffusion in generating realistic isolated handwritten Bangla characters. Experimental results on the BanglaLekha-Isolated dataset demonstrate that the proposed Okkhor-Diffusion model generates realistic isolated handwritten Bangla characters, with a mean Multi-Scale Structural Similarity Index Measure (MS-SSIM) score of 0.178 compared to 0.177 for the real samples. The Fréchet Inception Distance (FID) score for the synthetic handwritten Bangla characters is 5.426. Finally, the newly proposed Bangla Character Aware Fréchet Inception Distance (BCAFID) score of the proposed Okkhor-Diffusion model is 10.388.
ISSN: 2169-3536
DOI:10.1109/ACCESS.2024.3370674
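
The summary describes a DDPM as a model built on a diffusion process over noise-corrupted data. As a rough illustration only, the sketch below shows the standard closed-form forward (noising) step of a generic DDPM, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. It is not the Okkhor-Diffusion implementation: the linear schedule, the 1000 steps, the beta range, and all function names are assumptions chosen for the example, and the class-guided denoising network that would reverse this process is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the paper's actual schedule is not given here."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in one shot for a flattened image x0."""
    signal = math.sqrt(alpha_bar[t])        # how much of the image survives
    sigma = math.sqrt(1.0 - alpha_bar[t])   # how much Gaussian noise is mixed in
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * pixel + sigma * eps for pixel, eps in zip(x0, noise)]
    return xt, noise


# Hypothetical usage: a flattened 28x28 "character image" scaled to [-1, 1].
rng = random.Random(0)
alpha_bar = cumulative_alphas(linear_beta_schedule())
x0 = [rng.uniform(-1.0, 1.0) for _ in range(28 * 28)]
x_early, _ = forward_diffuse(x0, 10, alpha_bar, rng)    # still close to x0
x_late, _ = forward_diffuse(x0, 999, alpha_bar, rng)    # almost pure noise
```

In a class-guided setup, a denoising network would typically receive the noisy image, the step index, and the character class label, and be trained to predict the added noise; how Okkhor-Diffusion does this is described in the full text, not in this record.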