Integrated : Journal Of Information Technology and Vocational Education
Vol 1, No 1 (2019)

Design process data storage and organize data scraping

Falentino Sembiring (Universitas Nusa Putra)
Dian Permata Sari (Indonesian Education University)



Article Info

Publish Date
01 Apr 2019

Abstract

This study describes a web scraping process that retrieves URLs from similar sites for scraping and stores the URL data in daily, weekly, monthly, and annual databases, so that valid URLs are kept and invalid URLs are filtered out. Filtering is done to simplify the later processes that move data into the database. The next process distinguishes URLs by their available content data, namely title, tags, and keywords, as in SEO. The output of each step is stored in the data warehouse to build a URL data center. This stage is intended as a step toward collecting data for big data. The scope of the problem is limited to designing a web crawler that searches for similar sites and stores the results in the database. From the database, the data is passed to the data warehouse. Once in the data warehouse, the data is processed and presented to the user through an interface, divided by classification.
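
The filtering and content-based steps described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `is_valid_url`, `period_keys`, `PageMetadata`, and `describe` are hypothetical, validity is reduced to a purely syntactic check, and the page HTML is assumed to have been fetched already.

```python
from datetime import date
from html.parser import HTMLParser
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Syntactic check only (assumption): http(s) scheme and a non-empty host."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def period_keys(day: date) -> dict:
    """Keys under which a URL could be filed in daily/weekly/monthly/annual stores."""
    iso_year, iso_week, _ = day.isocalendar()
    return {
        "daily": day.isoformat(),
        "weekly": f"{iso_year}-W{iso_week:02d}",
        "monthly": f"{day.year}-{day.month:02d}",
        "annual": str(day.year),
    }


class PageMetadata(HTMLParser):
    """Collects the <title> text and the keywords <meta> tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords = [k.strip().lower() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def describe(url: str, html: str, day: date):
    """Return a record for a valid URL, or None if the URL is filtered out."""
    if not is_valid_url(url):
        return None
    parser = PageMetadata()
    parser.feed(html)
    return {
        "url": url,
        "title": parser.title.strip(),
        "keywords": parser.keywords,
        "periods": period_keys(day),
    }
```

A record produced by `describe` carries the title and keywords used to distinguish URLs, plus the four period keys; how such records would be loaded into the database and data warehouse is not specified in the abstract and is left out here.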

Copyrights © 2019






Journal Info

Abbrev

integrated

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Education; Engineering; Other

Description

INTEGRATED is a scientific journal published by the Department of PSTI UPI Kampus Purwakarta. This journal contains scientific papers from academics, researchers, and practitioners about research on information systems and vocational education. INTEGRATED is published twice a year in April and ...