Data capturing method, device, and medium

Bibliographic details
Main authors: MA RONGYU, GAO PENGCHAO, BI YUNPENG, LI NING
Format: Patent
Language: Chinese; English
Description
Summary: The invention relates to the technical field of web crawlers and specifically provides a data capturing method. Using a Python environment and the selenium automated testing tool, the method calls a browser to initiate a webpage access request, simulates user operations, opens the page, extracts the target data from the page, obtains the result after the webpage has been rendered, and obtains the data returned to the page. Compared with the prior art, when facing an anti-crawler mechanism, the method can effectively avoid that mechanism through a series of operations, greatly improving the threshold of data acquisition.
本发明涉及爬虫技术领域,具体提供了一种数据抓取方法,利用Python环境和selenium自动化测试工具,调用浏览器发起网页访问请求,模拟用户操作,打开页面,在页面中提取目标数据,得到网页渲染后的结果,获取返回页面中的数据。与现有技术相比,本发明的在面对反爬虫机制时,通过一系列操作,能够有效避免反爬虫机制,大大提高了数据采集的门槛。
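The steps named in the summary (call a browser, open the page, simulate a user operation, read the rendered result, extract the target data) might be sketched in Python with selenium roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names `open_browser` and `capture_page`, the choice of Chrome, the scroll as the simulated user operation, and the CSS selector are all hypothetical, and selenium 4 or later is assumed.

```python
def open_browser(headless=False):
    """Start a Chrome session. Assumes selenium >= 4 and a local Chrome install."""
    # Imported lazily so capture_page() below can be exercised without a browser.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    if headless:
        options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def capture_page(driver, url, selector, wait_seconds=10):
    """Open `url` in the browser, then return the rendered HTML and target texts.

    `driver` is any object exposing the WebDriver methods used here.
    `selector` is a hypothetical CSS selector that marks the target data.
    """
    # Let element lookups wait for content that scripts render after page load.
    driver.implicitly_wait(wait_seconds)

    # The browser itself issues the webpage access request.
    driver.get(url)

    # Simulated user operation (an assumption): scroll to the bottom of the page.
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # "css selector" is the string value of selenium's By.CSS_SELECTOR constant.
    elements = driver.find_elements("css selector", selector)
    texts = [element.text.strip() for element in elements]

    return {
        "html": driver.page_source,          # page as rendered by the browser
        "items": [t for t in texts if t],    # non-empty target texts
    }


# Example use (hypothetical URL and selector):
#     driver = open_browser()
#     try:
#         data = capture_page(driver, "https://example.com", "h1")
#         print(data["items"])
#     finally:
#         driver.quit()
```

Because the page is loaded and rendered by a real browser, `page_source` reflects content produced by scripts, which a plain HTTP request would not return. Which user operations are needed for a given site is not stated in the summary and would have to be chosen case by case.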