SYSTEMS AND METHODS FOR PARSING AND INGESTING DATA IN BIG DATA ENVIRONMENTS

Bibliographic details
Main authors: Naik, Harish; Arya, Sachin; Manesh, Ajay Paul Singh; Bose, Sandeep; Singh, Neha; Agarwal, Rahul
Format: Patent
Language: English
Description
Summary: The system may validate a data source having a structured format and a grammar that includes tags. The system may identify a tag in the grammar. In response to successful validation, the system may parse the data source to extract attributes and/or values associated with the tags. The system may also write the attributes and/or values to an output file, separated by a preselected delimiter. A configuration file may identify the grammar, the preselected delimiter, and/or the data source. The data source may be in an XML format or a JSON format. The system may generate execution-ready code in response to validating the data source and the grammar. The output file may be a load-ready file for ingestion into a big data storage format. The tag may include a parent tag and a sub tag corresponding to a hierarchy in the data source.
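
Illustrative sketch (not taken from the patent): the validate, parse, and write steps in the summary could look roughly like the Python below for an XML source. The configuration keys (parent_tag, sub_tags, delimiter), the sample data, and the function names validate and to_load_ready are all assumptions made for illustration. Validation here only checks that the source is well-formed and contains the parent tag, which may be much narrower than what the patent describes.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical configuration. The summary says a configuration file may
# identify the grammar, the delimiter, and the data source; these key
# names are invented for this sketch.
CONFIG = {
    "parent_tag": "customer",    # parent tag in the hierarchy
    "sub_tags": ["id", "name"],  # sub tags whose values are extracted
    "delimiter": "|",            # preselected output delimiter
}

# Hypothetical data source in XML format.
SAMPLE_XML = """<customers>
  <customer region="east"><id>1</id><name>Ada</name></customer>
  <customer region="west"><id>2</id><name>Lin</name></customer>
</customers>"""


def validate(source: str, config: dict) -> ET.Element:
    """Check that the source is well-formed and contains the parent tag."""
    root = ET.fromstring(source)  # raises ET.ParseError if malformed
    if next(root.iter(config["parent_tag"]), None) is None:
        raise ValueError(f"tag {config['parent_tag']!r} not found in data source")
    return root


def to_load_ready(source: str, config: dict) -> str:
    """Parse only after validation succeeds; emit one row per parent tag."""
    root = validate(source, config)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=config["delimiter"], lineterminator="\n")
    for parent in root.iter(config["parent_tag"]):
        # Attributes of the parent tag, in sorted key order for stable columns.
        attrs = [parent.attrib[key] for key in sorted(parent.attrib)]
        # Text values of each configured sub tag (empty string if absent).
        values = [parent.findtext(tag, default="") for tag in config["sub_tags"]]
        writer.writerow(attrs + values)
    return out.getvalue()


if __name__ == "__main__":
    print(to_load_ready(SAMPLE_XML, CONFIG), end="")
```

Because parsing happens only after validate returns, a malformed source or a missing tag stops the run before any output row is written. That mirrors the "in response to successful validation" ordering in the summary.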