Stream processing method and system for unstructured file based on distributed architecture
Main authors: | , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Summary: | The invention provides a stream processing method for unstructured files based on a distributed architecture. The method comprises the following steps: obtaining an unstructured file and placing it on an FTP (File Transfer Protocol) server or in MinIO object storage; designing an FTP connector or a MinIO connector based on the Flink framework to read the unstructured file; dynamically processing the unstructured file on a distributed Flink deployment while recording and storing progress information about the processing; integrating a Format processor into the FTP connector or the MinIO connector to parse and process the unstructured file; and writing Flink SQL that writes the processed data to a storage repository. Using the Flink distributed architecture, a large number of continuously generated unstructured files are read as a stream, and the characteristics in stream processing are ap |
---|
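
The progress-recording step in the summary can be illustrated with a small, framework-free sketch. This is not the patented method and does not use Flink, FTP, or MinIO: a local directory stands in for the file store, a JSON file stands in for the stored progress information, and every name here (`process_new_files`, `FORMATS`, `parse_txt`, `parse_bin`) is hypothetical. It only shows the general idea of picking a format processor per file and skipping files that an earlier run already handled.

```python
import json
import tempfile
from pathlib import Path


# Hypothetical format processors; a real connector would plug in its own parsers.
def parse_txt(raw: bytes) -> dict:
    text = raw.decode("utf-8")
    return {"kind": "text", "lines": len(text.splitlines())}


def parse_bin(raw: bytes) -> dict:
    return {"kind": "binary", "size": len(raw)}


# Assumed mapping from file extension to processor; unknown types fall back to parse_bin.
FORMATS = {".txt": parse_txt}


def load_progress(progress_file: Path) -> set:
    """Return the names of files that were already processed."""
    if progress_file.exists():
        return set(json.loads(progress_file.read_text()))
    return set()


def save_progress(progress_file: Path, done: set) -> None:
    """Persist the processed file names so a restart can resume."""
    progress_file.write_text(json.dumps(sorted(done)))


def process_new_files(source_dir: Path, progress_file: Path, sink: list) -> int:
    """Process files not yet recorded as done; return how many were handled."""
    done = load_progress(progress_file)
    handled = 0
    for path in sorted(source_dir.iterdir()):
        if not path.is_file() or path.name in done:
            continue
        parser = FORMATS.get(path.suffix, parse_bin)
        record = parser(path.read_bytes())
        record["name"] = path.name
        sink.append(record)  # stand-in for writing to the storage repository
        done.add(path.name)
        save_progress(progress_file, done)  # record progress after each file
        handled += 1
    return handled


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    source = root / "incoming"
    source.mkdir()
    (source / "report.txt").write_text("first line\nsecond line\n")
    rows = []
    print(process_new_files(source, root / "progress.json", rows), rows)
```

Calling `process_new_files` again with the same progress file handles only files that arrived since the previous call, which is the behavior the summary attributes to recording and storing progress information.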