Web Scraping Tool For Newspapers And Images Data Using Jsonify

Web scraping is the process of extracting data from a website in an efficient and fast way. In such a scenario, python programming can offer useful set of methods that help web editors to improve the quality of the provided service. This scraper contains three steps 1) to understand the structure of...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of Applied Science and Engineering 2023-01, Vol.26 (4), p.465-474
Hauptverfasser: Qingli Niu, Irfan Ali Kandhro, Anil Kumar, Shahnawaz shah, Muhammad Hasan, Hifza Mehfooz Ahmed, Fei Liang
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Web scraping is the process of extracting data from a website in an efficient and fast way. In such a scenario, python programming can offer useful set of methods that help web editors to improve the quality of the provided service. This scraper contains three steps 1) to understand the structure of web page, 2) design regular expression pattern and finally use that pattern to get certain data. In this paper, we also used Flask, Request, JSONify library to get the data, after processing, the data is transformed into the JSON form and ready for CSV with help of API. After generated all required regex patterns, the system uses these patterns as a set of rules, and with this, designed scraper tool works efficiently, and achieved outstanding results with help of support libraries to storing and extracting the news and web-based information. The proposed Web scraping tool eliminates the time and effort of manually collecting or copying data by automating the process. It is found that this designed scraper is easy and direct approach to extract the newspapers, websites, blogs, and images data.
ISSN:2708-9967
2708-9975
DOI:10.6180/jase.202304_26(4).0002