File content identification method and equipment

The invention provides a file content identification method and equipment. The equipment comprises a configuration module, which defines a file and the attributes of the collection items in the file, obtains the positions of the collection items in the file, and configures identification...
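The flow described in the abstract (configure collection items, run a preliminary recognition, route the file to a character recognition task or an OCR task, then apply the configured rules) can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `CollectionItem`, `SourceFile`, `preliminary_recognition` and `collect` are hypothetical, the "position" of an item is assumed to be a line index, the "identification rule" is assumed to be a regular expression, the preliminary recognition is assumed to be a simple check for existing text, and the OCR engine is passed in as a plain callable rather than a real library.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional


class TaskKind(Enum):
    CHARACTER = "character"  # file already carries machine-readable text
    OCR = "ocr"              # file is an image and needs OCR first


@dataclass
class CollectionItem:
    """Configuration for one item to collect (all field names are hypothetical)."""
    name: str
    line: int  # assumed position: zero-based line index in the recognized text
    rule: str  # assumed identification rule: a regex with one capture group


@dataclass
class SourceFile:
    text: Optional[str] = None     # embedded text, if any
    image: Optional[bytes] = None  # raw image data, if any


def preliminary_recognition(f: SourceFile) -> TaskKind:
    """Decide which kind of recognition task the file needs (assumed heuristic)."""
    if f.text and f.text.strip():
        return TaskKind.CHARACTER
    return TaskKind.OCR


def collect(f: SourceFile, items: List[CollectionItem],
            ocr: Callable[[bytes], str]) -> Dict[str, Optional[str]]:
    """Run the task chosen by preliminary recognition, then apply each rule."""
    kind = preliminary_recognition(f)
    text = f.text if kind is TaskKind.CHARACTER else ocr(f.image or b"")
    lines = (text or "").splitlines()
    result: Dict[str, Optional[str]] = {}
    for item in items:
        # Look up the content at the configured position, then apply the rule.
        content = lines[item.line] if 0 <= item.line < len(lines) else ""
        match = re.search(item.rule, content)
        result[item.name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    items = [
        CollectionItem("invoice_no", line=0, rule=r"Invoice No\.\s*(\w+)"),
        CollectionItem("total", line=1, rule=r"Total:\s*([\d.]+)"),
    ]

    def fake_ocr(image: bytes) -> str:
        # Stand-in for a real OCR engine; returns fixed text for the demo.
        return "Invoice No. B7\nTotal: 12.50"

    print(collect(SourceFile(text="Invoice No. A1\nTotal: 99.00"), items, fake_ocr))
    print(collect(SourceFile(image=b"\x89PNG"), items, fake_ocr))
```

Keeping the OCR engine behind a callable means the routing and rule logic can be exercised without any image library; a real system would presumably plug in an actual OCR backend and a richer notion of position (for example page and bounding box).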

Detailed description

Bibliographic details
Main authors: SHEN AO, LIN ZANLEI, SHANG LEI, SONG YANG, CHEN BO
Format: Patent
Language: chi ; eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator SHEN AO
LIN ZANLEI
SHANG LEI
SONG YANG
CHEN BO
description The invention provides a file content identification method and equipment. The equipment comprises a configuration module, which defines a file and the attributes of the collection items in the file, obtains the positions of the collection items in the file, and configures identification rules for the collection items; a task module, which creates a recognition task, preliminarily recognizes the content of the file, and divides the recognition task for the file content into a character recognition task and an OCR recognition task according to the preliminary recognition result; and an acquisition module, which collects the content of the collection items in the file according to the identification task and identifies the text in that content according to the rules defined in the configuration module. According to the method, whether character recognition or OCR picture recognition applies can be determined automatically, flexible configurable processing of collection of collection items is
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_CN117688350A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CN117688350A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_CN117688350A3</originalsourceid><addsrcrecordid>eNrjZDBwy8xJVUjOzytJzStRyEwBkplpmcmJJZn5eQq5qSUZ-SkKiXkpCqmFpZkFuUBZHgbWtMSc4lReKM3NoOjmGuLsoZtakB-fWlyQmJyal1oS7-xnaGhuZmFhbGrgaEyMGgB5SyuW</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>File content identification method and equipment</title><source>esp@cenet</source><creator>SHEN AO ; LIN ZANLEI ; SHANG LEI ; SONG YANG ; CHEN BO</creator><creatorcontrib>SHEN AO ; LIN ZANLEI ; SHANG LEI ; SONG YANG ; CHEN BO</creatorcontrib><description>The invention provides a file content identification method and equipment, and the equipment comprises a configuration module which is used for defining a file and attributes of collection items in the file, obtaining the positions of the collection items in the file, and configuring identification rules of the collection items; the task module is used for creating a recognition task, preliminarily recognizing the content of the file, and dividing the recognition task of the file content into a character recognition task and an OCR recognition task according to a preliminary recognition result; an acquisition module; and collecting the content of the collection item in the file according to the identification task, and identifying the text in the content of the collection item according to the rule defined in the configuration module. 
According to the method, character recognition or OCR picture recognition can be automatically recognized, flexible configurable processing of collection of collection items is</description><language>chi ; eng</language><subject>CALCULATING ; COMPUTING ; COUNTING ; DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FORADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORYOR FORECASTING PURPOSES ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS ; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE,COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTINGPURPOSES, NOT OTHERWISE PROVIDED FOR</subject><creationdate>2024</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20240312&amp;DB=EPODOC&amp;CC=CN&amp;NR=117688350A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25543,76294</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20240312&amp;DB=EPODOC&amp;CC=CN&amp;NR=117688350A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>SHEN AO</creatorcontrib><creatorcontrib>LIN ZANLEI</creatorcontrib><creatorcontrib>SHANG LEI</creatorcontrib><creatorcontrib>SONG YANG</creatorcontrib><creatorcontrib>CHEN BO</creatorcontrib><title>File content identification method and equipment</title><description>The invention provides a file content identification method and equipment, and the equipment comprises a configuration module which is used for defining a file and attributes of collection items in the file, obtaining the positions of the collection items in the file, and configuring identification rules of the 
collection items; the task module is used for creating a recognition task, preliminarily recognizing the content of the file, and dividing the recognition task of the file content into a character recognition task and an OCR recognition task according to a preliminary recognition result; an acquisition module; and collecting the content of the collection item in the file according to the identification task, and identifying the text in the content of the collection item according to the rule defined in the configuration module. According to the method, character recognition or OCR picture recognition can be automatically recognized, flexible configurable processing of collection of collection items is</description><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FORADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORYOR FORECASTING PURPOSES</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><subject>SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE,COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTINGPURPOSES, NOT OTHERWISE PROVIDED FOR</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2024</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZDBwy8xJVUjOzytJzStRyEwBkplpmcmJJZn5eQq5qSUZ-SkKiXkpCqmFpZkFuUBZHgbWtMSc4lReKM3NoOjmGuLsoZtakB-fWlyQmJyal1oS7-xnaGhuZmFhbGrgaEyMGgB5SyuW</recordid><startdate>20240312</startdate><enddate>20240312</enddate><creator>SHEN AO</creator><creator>LIN ZANLEI</creator><creator>SHANG LEI</creator><creator>SONG YANG</creator><creator>CHEN BO</creator><scope>EVB</scope></search><sort><creationdate>20240312</creationdate><title>File content identification method and equipment</title><author>SHEN AO ; LIN ZANLEI ; SHANG LEI ; SONG YANG ; CHEN 
BO</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_CN117688350A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>chi ; eng</language><creationdate>2024</creationdate><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FORADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORYOR FORECASTING PURPOSES</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><topic>SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE,COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTINGPURPOSES, NOT OTHERWISE PROVIDED FOR</topic><toplevel>online_resources</toplevel><creatorcontrib>SHEN AO</creatorcontrib><creatorcontrib>LIN ZANLEI</creatorcontrib><creatorcontrib>SHANG LEI</creatorcontrib><creatorcontrib>SONG YANG</creatorcontrib><creatorcontrib>CHEN BO</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>SHEN AO</au><au>LIN ZANLEI</au><au>SHANG LEI</au><au>SONG YANG</au><au>CHEN BO</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>File content identification method and equipment</title><date>2024-03-12</date><risdate>2024</risdate><abstract>The invention provides a file content identification method and equipment, and the equipment comprises a configuration module which is used for defining a file and attributes of collection items in the file, obtaining the positions of the collection items in the file, and configuring identification rules of the collection items; the task module is used for creating a recognition task, preliminarily recognizing the content of the file, and dividing the recognition task of the file content into a character recognition task and an OCR recognition task according to a preliminary recognition result; an acquisition 
module; and collecting the content of the collection item in the file according to the identification task, and identifying the text in the content of the collection item according to the rule defined in the configuration module. According to the method, character recognition or OCR picture recognition can be automatically recognized, flexible configurable processing of collection of collection items is</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier
ispartof
issn
language chi ; eng
recordid cdi_epo_espacenet_CN117688350A
source esp@cenet
subjects CALCULATING
COMPUTING
COUNTING
DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES
ELECTRIC DIGITAL DATA PROCESSING
PHYSICS
SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
title File content identification method and equipment
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T11%3A38%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=SHEN%20AO&rft.date=2024-03-12&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3ECN117688350A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true