Data capturing method and device and medium
Main Authors: | , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Online Access: | Order full text |
Summary: | The invention relates to the technical field of crawlers and provides a data capturing method that uses a Python environment and the Selenium automated testing tool to call a browser to initiate a webpage access request, simulate user operations, open the page, extract the target data from the page, obtain the result after webpage rendering, and obtain the data returned to the page. Compared with the prior art, when facing an anti-crawler mechanism, the method can effectively avoid that mechanism through a series of operations, and the threshold of data acquisition is greatly improved.
本发明涉及爬虫技术领域,具体提供了一种数据抓取方法,利用Python环境和selenium自动化测试工具,调用浏览器发起网页访问请求,模拟用户操作,打开页面,在页面中提取目标数据,得到网页渲染后的结果,获取返回页面中的数据。与现有技术相比,本发明的在面对反爬虫机制时,通过一系列操作,能够有效避免反爬虫机制,大大提高了数据采集的门槛。 |
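
The steps named in the summary (call a browser, open the page, simulate a user operation, read the rendered result, extract the target data) can be sketched with Selenium 4 in Python. This is only an illustrative sketch, not the patented implementation: the function names `extract_texts` and `capture`, the CSS-selector approach, the scroll used as the "user operation", the headless Chrome option, and the example URL are all assumptions that do not appear in the record.

```python
from typing import List

# Same string value as selenium's By.CSS_SELECTOR; kept as a constant so the
# extraction helper can be exercised without Selenium installed.
CSS = "css selector"


def extract_texts(driver, css_selector: str) -> List[str]:
    """Return the stripped, non-empty text of every element matching the selector."""
    elements = driver.find_elements(CSS, css_selector)
    return [el.text.strip() for el in elements if el.text.strip()]


def capture(url: str, css_selector: str, timeout: float = 10.0) -> List[str]:
    """Open url in a real browser and extract text from the rendered page."""
    # Imported lazily; requires the selenium package and a local Chrome install.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    driver = webdriver.Chrome(options=options)
    try:
        # Call the browser to initiate the webpage access request.
        driver.get(url)
        # Wait until the target elements exist, i.e. the page has rendered.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        # Simulate a user operation (here: scrolling to the bottom of the page).
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Extract the target data from the rendered page.
        return extract_texts(driver, css_selector)
    finally:
        driver.quit()


if __name__ == "__main__":
    # Hypothetical target; replace with a page you are permitted to crawl.
    for line in capture("https://example.com", "p"):
        print(line)
```

Because the data is read from the browser's rendered DOM rather than from the raw HTTP response, content produced by client-side scripts is available to the extraction step, which matches the summary's "result after webpage rendering".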