Tuberculosis (TB), caused by Mycobacterium tuberculosis, is an airborne disease and a global health threat. Factors such as gender, age, and geographical location influence its spread. Indonesia, the country with the second-highest number of TB cases globally, recorded a significant increase in TB cases from 2020 to 2022, especially in Semarang City. To minimize TB's impact, it is crucial to identify the factors that influence its progression. Machine learning techniques such as feature selection (Information Gain) and classification (Random Forest) can be used for this purpose. Information Gain ranks attributes by weight to show which factors most influence TB, while Random Forest performs the classification. The Synthetic Minority Oversampling Technique (SMOTE) is used to handle class imbalance, with the aim of improving classification performance. The study found that the Random Forest model performed best on the original Semarang City TB dataset when using all attributes, listed here from highest to lowest weight: tipe_diagnosis, jenis_fasyankes, usia, kelurahan_kecamatan, riwayat_dm, riwayat_HIV, tahun, paduan_OAT, status_pekerjaan, jenis_kelamin, tipe_TBC, riwayat_TBC, bulan, and sumber_obat. With this configuration, recall and accuracy both reached 75%. This is better than the model trained on the SMOTE-oversampled dataset using only the top 10 to 12 attributes, which reached a recall and accuracy of 74%. This research shows that certain machine learning techniques can help in understanding the factors that influence TB treatment outcomes.
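
The attribute ranking described above can be illustrated with a minimal sketch of Information Gain for categorical attributes. This is not the study's implementation. The helper functions, the `outcome` label, and the four toy records are hypothetical and serve only to show how an attribute weight is derived; the attribute names riwayat_dm and jenis_kelamin are taken from the list above, but their values here are invented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the weighted label entropy after splitting on one attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, label_key):
    """Return (attribute, gain) pairs sorted from highest to lowest gain."""
    labels = [row[label_key] for row in rows]
    attributes = [key for key in rows[0] if key != label_key]
    gains = {a: information_gain([row[a] for row in rows], labels) for a in attributes}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, NOT taken from the Semarang City dataset.
toy_rows = [
    {"riwayat_dm": "no", "jenis_kelamin": "M", "outcome": "success"},
    {"riwayat_dm": "no", "jenis_kelamin": "F", "outcome": "success"},
    {"riwayat_dm": "yes", "jenis_kelamin": "M", "outcome": "failure"},
    {"riwayat_dm": "yes", "jenis_kelamin": "F", "outcome": "failure"},
]

for attribute, gain in rank_attributes(toy_rows, "outcome"):
    print(f"{attribute}: {gain:.3f}")
```

In this toy data, riwayat_dm separates the two outcomes completely and therefore ranks first, while jenis_kelamin carries no information about the outcome. A classifier can then be trained on all attributes or only on the top-ranked ones, which is the comparison the study reports.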