This author has published in the following journal:
Intelmatics
Wilda Anggriani
Fakultas Teknologi Industri, Universitas Trisakti

Published: 1 document
Articles

Found 1 document

Perolehan Informasi Kembali (Information Retrieval/IR) Menggunakan Topic Modelling untuk Dataset Tempo (Information Retrieval (IR) Using Topic Modelling for the Tempo Dataset)
Wilda Anggriani; Syandra Sari; Anung B. Ariwibowo; Dedy Sugiarto
Intelmatics Vol. 1 No. 2 (2021): July - December
Publisher : Penerbitan Universitas Trisakti

DOI: 10.25105/itm.v1i2.5030

Abstract

In today's technological era, both technology and the volume of available information are growing. Information technology makes it easy for anyone to find information, usually through search engines such as Google or Yahoo. A search engine is one example of information retrieval (IR): the documents it returns are those judged relevant to the user's request.

This study implements an IR process to find relevant documents for a set of existing queries. The results are compared with the relevant documents identified in previous research on the same dataset, the Tempo dataset from 2000 to 2002, in order to gauge how the method used here performs relative to that earlier work. The method used in this research is doc2vec.

According to the results, the smaller the number of epochs used for the doc2vec model, the smaller the average percentage similarity between the relevant documents it produces and the relevant documents from the previous research. With respect to vector size, the average percentage similarity exceeds 35% for vector sizes beyond 30. Of the epochs tested (25, 50, 75, and 100), epoch 25 produces the highest average percentage similarity. Of the vector sizes tested (10, 20, 30, 40, 50, 60, 70, 80, 90, and 100), vector size 40 produces the highest average percentage similarity. The highest percentage similarity, 41.930%, is obtained by the doc2vec model with epoch 25 and vector size 40.
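The evaluation described in the abstract, comparing the documents a model retrieves against a reference set for each query and then sweeping epochs and vector sizes, can be illustrated with a minimal sketch. The abstract does not define "percentage similarity", so the formula below (share of reference documents that also appear in the retrieved list) is an assumption, not the authors' stated metric. The names `percentage_similarity`, `average_similarity`, `grid_search`, and `run_model` are hypothetical; `run_model` stands in for whatever code trains a doc2vec model with the given settings and returns the retrieved document IDs per query.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hyperparameter values listed in the abstract.
EPOCHS = [25, 50, 75, 100]
VECTOR_SIZES = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


def percentage_similarity(retrieved: List[str], reference: List[str]) -> float:
    """Assumed metric: percentage of reference documents also found in `retrieved`."""
    reference_set = set(reference)
    if not reference_set:
        return 0.0
    overlap = set(retrieved) & reference_set
    return 100.0 * len(overlap) / len(reference_set)


def average_similarity(
    retrieved_by_query: Dict[str, List[str]],
    reference_by_query: Dict[str, List[str]],
) -> float:
    """Average the per-query percentage similarity over all reference queries."""
    scores = [
        percentage_similarity(retrieved_by_query.get(query, []), reference)
        for query, reference in reference_by_query.items()
    ]
    return mean(scores)


# Hypothetical callable: (epochs, vector_size) -> {query_id: [retrieved doc IDs]}.
RunModel = Callable[[int, int], Dict[str, List[str]]]


def grid_search(
    run_model: RunModel,
    reference_by_query: Dict[str, List[str]],
) -> Tuple[Tuple[int, int], float]:
    """Score every (epochs, vector_size) pair and return the best pair with its score."""
    scores = {
        (epochs, size): average_similarity(run_model(epochs, size), reference_by_query)
        for epochs, size in product(EPOCHS, VECTOR_SIZES)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Passing the retrieval step in as a callable keeps the scoring logic independent of any particular doc2vec implementation, so the same sweep could be reused if the model or library changes.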