Indonesian Journal of Information System
Vol 2, No 2 (2020): February 2020

Web Scraping with HTML DOM Method for Data Collection of Scientific Articles from Google Scholar

Rahmatulloh, Alam
Gunawan, Rohmat



Article Info

Publish Date
26 Feb 2020

Abstract

Google Scholar is a web-based service for searching a broad range of academic literature. It gives access to many types of references, such as peer-reviewed papers, theses, books, abstracts, and articles from academic publishers, professional communities, preprint repositories, universities, and other academic organizations. Google Scholar also lets every researcher, expert, and lecturer create a profile. The number of publications from an academic institution, along with detailed data on each scientific article, can be accessed through Google Scholar. A recap of the scientific articles published by each researcher in an institution or organization is needed to assess research performance collectively. The problem is that no service is available that provides such a recap. To make scientific article publication data usable by academic institutions and organizations, this research retrieves data from Google Scholar and compiles a recap of that data by applying web scraping technology. Web scraping helps extract resources available on the web so that the results can be used by other applications. By scraping Google Scholar, scientific article publication data can be obtained collectively, so the recap can be produced quickly. The experiments in this study succeeded in retrieving data for 236 researchers from Google Scholar, with 9 attributes, and 2,420 articles.
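
As an illustration of the general idea only, the following minimal sketch parses a page into a DOM tree, walks the tree to pull out article records, and summarizes them into a recap. It is not the authors' implementation. The sample markup, the class names (article, title, cited, year), and the function names are all hypothetical and do not reflect Google Scholar's real page structure. The sketch also assumes well-formed markup so that Python's standard xml.dom.minidom can parse it; real web pages would usually need a more tolerant HTML parser.

```python
from xml.dom import minidom

# Hypothetical, well-formed sample page. The structure and class names are
# invented for illustration and are NOT Google Scholar's actual markup.
SAMPLE_PAGE = """<table id="articles">
  <tr class="article">
    <td class="title">Paper A</td>
    <td class="cited">12</td>
    <td class="year">2018</td>
  </tr>
  <tr class="article">
    <td class="title">Paper B</td>
    <td class="cited">3</td>
    <td class="year">2019</td>
  </tr>
</table>"""


def cell_text(cell):
    """Concatenate the text nodes directly under a DOM element."""
    parts = [n.data for n in cell.childNodes if n.nodeType == n.TEXT_NODE]
    return "".join(parts).strip()


def scrape_articles(page):
    """Build a DOM tree and return one dict per <tr class="article"> row."""
    dom = minidom.parseString(page)
    articles = []
    for row in dom.getElementsByTagName("tr"):
        if row.getAttribute("class") != "article":
            continue
        # Use each cell's (hypothetical) class attribute as the field name.
        record = {}
        for cell in row.getElementsByTagName("td"):
            record[cell.getAttribute("class")] = cell_text(cell)
        articles.append(record)
    return articles


def recap(articles):
    """Summarize scraped records: article count and total citations."""
    return {
        "articles": len(articles),
        "citations": sum(int(a["cited"]) for a in articles),
    }


if __name__ == "__main__":
    records = scrape_articles(SAMPLE_PAGE)
    print(records)
    print(recap(records))
```

Keeping extraction (scrape_articles) separate from summarization (recap) means the same recap step could be reused if the page structure, and therefore the extraction step, changes.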

Copyrights © 2020