JOIN (Jurnal Online Informatika)
Vol 5 No 1 (2020)

Two-stage Gene Selection and Classification for a High-Dimensional Microarray Data

Masithoh Yessi Rochayani (Universitas Brawijaya)
Umu Sa'adah (Universitas Brawijaya)
Ani Budi Astuti (Universitas Brawijaya)



Article Info

Publish Date
16 Jul 2020

Abstract

Microarray technology has provided benefits for cancer diagnosis and classification. However, classifying cancer using microarray data is confronted with difficulty since the dataset has high dimensions. One strategy for dealing with the dimensionality problem is to make a feature selection before modeling. Lasso is a common regularization method to reduce the number of features or predictors. However, Lasso remains too many features at the optimum regularization parameter. Therefore, feature selection can be continued to the second stage. We proposed Classification and Regression Tree (CART) for feature selection on the second stage which can also produce a classification model. We used a dataset which comparing gene expression in breast tumor tissues and other tumor tissues. This dataset has 10,936 predictor variables and 1,545 observations. The results of this study were the proposed method able to produce a few numbers of selected genes but gave high accuracy. The model also acquired in line with the Oncogenomics Theory by the obtained of GATA3 to split the root node of the decision tree model. GATA3 has become an important marker for breast tumors.

Copyrights © 2020






Journal Info

Abbrev

join

Publisher

Subject

Computer Science & IT

Description

JOIN (Jurnal Online Informatika) is a scientific journal published by the Department of Informatics UIN Sunan Gunung Djati Bandung. This journal contains scientific papers from Academics, Researchers, and Practitioners about research on informatics. JOIN (Jurnal Online Informatika) is published ...