DISTRIBUTED MACHINE LEARNING ARCHITECTURE WITH HYBRID DATA NORMALIZATION, PROOF OF LINEAGE AND DATA INTEGRITY
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | A system, apparatus and method for processing observational data for training a neural network model for use by a neural network. Observational data is parsed into raw data and metadata components, which are then stored separately on a raw data storage system and a metadata storage system, respectively. A digital fingerprint may be generated and assigned to the raw data and the metadata, and used to verify the integrity of the data. When it is desired to train a neural network model, a DETL query is generated and used to identify any raw data that may be relevant to training the neural network model. The DETL query is processed by the metadata storage system to match any metadata to search terms in the DETL query, which, in turn, identifies raw data stored in the raw data storage system. The identified raw data is used to train the neural network model, and an updated neural network model is produced. Each time the neural network model is trained, the relevant raw data and metadata used for that training run are stored in association with the neural network model version, so that a lineage of the training may be memorialized and used later to identify defects in the neural network model and/or to validate the integrity of the data used to train the neural network model. |
---|
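The flow described in the summary (separate raw and metadata stores, a digital fingerprint, a metadata query that selects raw data, and a per-version lineage record) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Stores` class, its method names, the use of SHA-256 as the fingerprint, and the keyword-equality `query` that stands in for a DETL query, whose actual syntax the record does not describe.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(raw: bytes, metadata: dict) -> str:
    """Hash the raw bytes together with canonically serialized metadata.

    SHA-256 is an assumed choice; the record does not name an algorithm.
    """
    h = hashlib.sha256()
    h.update(raw)
    h.update(json.dumps(metadata, sort_keys=True).encode("utf-8"))
    return h.hexdigest()


@dataclass
class Stores:
    """Hypothetical in-memory stand-ins for the separate storage systems."""
    raw: dict = field(default_factory=dict)           # record id -> raw bytes
    metadata: dict = field(default_factory=dict)      # record id -> metadata dict
    fingerprints: dict = field(default_factory=dict)  # record id -> hex digest
    lineage: dict = field(default_factory=dict)       # model version -> [(id, digest)]

    def ingest(self, record_id: str, raw: bytes, metadata: dict) -> None:
        # Store the two components separately and fingerprint them together.
        self.raw[record_id] = raw
        self.metadata[record_id] = dict(metadata)
        self.fingerprints[record_id] = fingerprint(raw, metadata)

    def query(self, **terms) -> list:
        # Match metadata against search terms; the matching ids identify raw data.
        return sorted(
            rid for rid, md in self.metadata.items()
            if all(md.get(key) == value for key, value in terms.items())
        )

    def verify(self, record_id: str) -> bool:
        # Recompute the fingerprint and compare it with the stored one.
        current = fingerprint(self.raw[record_id], self.metadata[record_id])
        return current == self.fingerprints[record_id]

    def record_training_run(self, model_version: str, record_ids: list) -> None:
        # Memorialize which records (and which fingerprints) fed this version.
        self.lineage[model_version] = [
            (rid, self.fingerprints[rid]) for rid in record_ids
        ]


stores = Stores()
stores.ingest("obs-1", b"\x00\x01", {"sensor": "camera", "site": "A"})
stores.ingest("obs-2", b"\x02\x03", {"sensor": "lidar", "site": "A"})

selected = stores.query(sensor="camera")
if all(stores.verify(rid) for rid in selected):
    # Actual model training is omitted; only the lineage bookkeeping is shown.
    stores.record_training_run("model-v2", selected)
```

Storing the fingerprint alongside each lineage entry means a later audit can recompute the hashes and detect whether the data behind a given model version has changed since that training run.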