Method and device for extracting pptx file content
Main author: | |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Summary: | The invention discloses a method and a device for extracting pptx file content. The method extracts the name of a text file from the text relation file presentation.xml.rels. Because the text on any slide page of the pptx file is recorded in a text file, the text file can be obtained from its name and the text in it extracted. In summary, the method extracts the content of the pptx file from the text files using only the text relation file presentation.xml.rels and the text files themselves, without acquiring the format information of the pptx file, which greatly increases the speed of content extraction. Furthermore, the method extracts the content through SAX, so only part of the data in the decompressed XML file needs to be loaded and processed at a time rather than all file data. The method therefore occupies little memory and extracts quickly.
The invention discloses a method for extracting pptx file content and... |
---|
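
The flow described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the relation file sits at `ppt/_rels/presentation.xml.rels` inside the archive, that slide relationships carry the standard Office Open XML slide relationship type, and that slide text lives in DrawingML `t` elements. The names `SlideTargetHandler`, `SlideTextHandler`, and `extract_pptx_text` are hypothetical and do not come from the patent.

```python
import posixpath
import xml.sax
import zipfile
from xml.sax.handler import ContentHandler, feature_namespaces

# Assumed locations and identifiers from the Office Open XML conventions.
RELS_PATH = "ppt/_rels/presentation.xml.rels"
SLIDE_REL_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide"
)
DRAWINGML_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


class SlideTargetHandler(ContentHandler):
    """Collects the Target of every slide relationship in the relation file."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def startElementNS(self, name, qname, attrs):
        if name[1] == "Relationship" and attrs.get((None, "Type")) == SLIDE_REL_TYPE:
            self.targets.append(attrs.get((None, "Target")))


class SlideTextHandler(ContentHandler):
    """Collects the character data of each DrawingML text run (a:t)."""

    def __init__(self):
        super().__init__()
        self.runs = []
        self._in_text = False

    def startElementNS(self, name, qname, attrs):
        if name == (DRAWINGML_NS, "t"):
            self._in_text = True
            self.runs.append("")

    def characters(self, content):
        # SAX may deliver one run in several chunks, so append to the last run.
        if self._in_text:
            self.runs[-1] += content

    def endElementNS(self, name, qname):
        if name == (DRAWINGML_NS, "t"):
            self._in_text = False


def _sax_parse(stream, handler):
    """Stream-parse one XML part with namespace processing enabled."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(handler)
    parser.parse(stream)


def extract_pptx_text(pptx):
    """Return {slide part name: [text runs]} for a .pptx path or file object.

    Slides appear in relation-file order, which is not guaranteed to match
    the on-screen slide order.
    """
    result = {}
    with zipfile.ZipFile(pptx) as archive:
        rels = SlideTargetHandler()
        with archive.open(RELS_PATH) as stream:
            _sax_parse(stream, rels)
        for target in rels.targets:
            if target.startswith("/"):
                part = target.lstrip("/")  # absolute part name
            else:
                # Relative targets such as "slides/slide1.xml" resolve against ppt/.
                part = posixpath.normpath(posixpath.join("ppt", target))
            text = SlideTextHandler()
            with archive.open(part) as stream:
                _sax_parse(stream, text)
            result[part] = text.runs
    return result
```

Calling `extract_pptx_text("deck.pptx")` (the file name is only an example) would return the text runs grouped by slide part. Because each part is read as a stream from the archive and handed to a SAX parser, no full document tree is built in memory, which is the property the summary attributes to SAX.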