Muharaf-public

Bibliographic details
Main authors: Saeed, Mehreen; Chan, Adrian; Mijar, Anupam; Moukarzel, Joseph; Habchi, Georges; Younes, Carlos; Elias, Amin; Wong, Chau-Wai; Khater, Akram
Format: Dataset
Language: Arabic (ara)
Description
Summary: Manuscripts of Handwritten Arabic dataset (Muharaf) for cursive text recognition. The following files are present in this repository:
public_data_files.zip: Contains the public part of the Muharaf dataset. It has the images and the corresponding annotation files in JSON and XML format.
public_line_images.zip: Contains the line images and their corresponding transcriptions.
public_summary_and_keywords.zip: Contains the summary and keywords extracted from the ground truth transcriptions of each image.
sfr_files.zip: Contains the preprocessed files for the start_follow_read_arabic system, used to train on the public part of the Muharaf dataset.
public_1100_untrained.zip: Contains an initialized trial folder with 3 different random splits of (train, validation, test) to reproduce the experiments reported in the paper on Muharaf-public.
public_1100_trained.zip: Contains the results and model weights after training on Muharaf-public. It has results for three different random splits of (train, validation, test) sets.
trial_15_untrained.zip: Contains an initialized trial folder with 3 different random splits of (train, validation, test) to reproduce the experiments reported in the paper on training with all the files of the Muharaf dataset (1500 training images).
trial_15.zip: Contains the results and model weights after training on Muharaf. It has results for three different random splits of (train, validation, test) sets.
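Several of the archives listed in the summary ship three different random (train, validation, test) splits. As a rough illustration of what such a setup looks like, the sketch below draws three independent partitions of a list of file names using different random seeds. It is not the procedure used by the dataset authors: the function name make_splits, the 10% validation and test fractions, the seed values, and the page_XXXX.jpg file names are all assumptions made for this example.

```python
import random


def make_splits(items, n_splits=3, val_frac=0.1, test_frac=0.1, base_seed=0):
    """Return n_splits independent (train, validation, test) partitions.

    All parameters are illustrative assumptions, not values taken from
    the Muharaf repository.
    """
    items = sorted(items)  # fixed starting order, so results depend only on the seed
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    splits = []
    for k in range(n_splits):
        shuffled = items[:]
        # A separate seeded generator per split keeps each split reproducible.
        random.Random(base_seed + k).shuffle(shuffled)
        test = shuffled[:n_test]
        val = shuffled[n_test:n_test + n_val]
        train = shuffled[n_test + n_val:]
        splits.append((train, val, test))
    return splits


if __name__ == "__main__":
    # Hypothetical file names standing in for page images.
    names = [f"page_{i:04d}.jpg" for i in range(100)]
    for k, (train, val, test) in enumerate(make_splits(names)):
        print(k, len(train), len(val), len(test))
```

Reporting results over several such splits, rather than a single one, gives a sense of how much a score varies with the choice of held-out pages.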
DOI:10.5281/zenodo.11492214