Method and device for obtaining text based on really simple syndication (RSS)
Main authors: | , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Summary: | The invention applies to the technical field of Internet information and provides a method for obtaining text based on Really Simple Syndication (RSS). The method comprises the following steps: fetching the source code of a webpage through a uniform resource locator (URL) in the RSS feed; generating a document object model (DOM) from the source code of the webpage and determining the DIV (division) tags in the DOM; calculating a characteristic value for each DIV tag according to a first preset rule; and extracting the text nodes in the DIV tag with the maximum characteristic value as the text of the webpage. In this method, the source code of the webpage at the URL in the feed is fetched, the DOM is generated from that source code, the characteristic values of all DIV tags in the DOM are calculated, and the text nodes in the DIV tag with the maximum characteristic value are taken as the text. The text is thus obtained without opening the page of the text or adapting, so that the efficiency |
---|
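
The steps in the summary can be illustrated with a minimal Python sketch that uses only the standard library. It is not the patented implementation. The record does not state the "first preset rule", so the characteristic value is assumed here to be the number of text characters whose nearest enclosing DIV is the tag in question. The names `feed_links`, `fetch_source`, `DivTextCollector`, `characteristic_value`, and `extract_text` are hypothetical, and the parser's event stream stands in for a full DOM.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.request import urlopen


def feed_links(rss_xml):
    """Return the <link> URL of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


def fetch_source(url, timeout=10):
    """Download the page source for one feed URL (network access required)."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class DivTextCollector(HTMLParser):
    """Collect text chunks per DIV, crediting text to the innermost open DIV."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.divs = []    # one list of text chunks per DIV, in document order
        self._open = []   # stack of DIVs that are currently open
        self._skip = 0    # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            chunks = []
            self.divs.append(chunks)
            self._open.append(chunks)
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "div" and self._open:
            self._open.pop()
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._open and not self._skip:
            self._open[-1].append(text)


def characteristic_value(chunks):
    """Assumed scoring rule: total character count of the DIV's text chunks."""
    return sum(len(chunk) for chunk in chunks)


def extract_text(html):
    """Return the text of the DIV with the maximum characteristic value."""
    parser = DivTextCollector()
    parser.feed(html)
    parser.close()
    if not parser.divs:
        return ""
    best = max(parser.divs, key=characteristic_value)
    return " ".join(best)
```

A caller would chain the pieces as `extract_text(fetch_source(url))` for each `url` returned by `feed_links`. A different scoring rule, for example one that weighs punctuation or link density, could be swapped in by replacing `characteristic_value` alone.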