Exploring Feature Pruning Techniques on High-Relevance Datasets for Predictive Analysis
DOI: https://doi.org/10.64803/juikti.v2i1.86

Keywords: Feature Pruning, Predictive Analysis, High-Dimensional Data, Machine Learning, Feature Selection

Abstract
In the era of big data, predictive analytics has become a vital approach for extracting actionable insights from high-relevance datasets across domains such as healthcare, finance, and environmental science. However, the growing dimensionality of modern datasets introduces significant challenges, including overfitting, high computational cost, and reduced model interpretability, all of which can degrade predictive performance. Feature pruning has emerged as an effective strategy for addressing these challenges: it eliminates irrelevant or redundant features while preserving the attributes most informative for model learning. This study systematically evaluates the effectiveness of multiple feature pruning techniques applied to high-relevance datasets for predictive analysis. The research adopts an experimental comparative approach, analyzing filter-based, wrapper-based, embedded, and adaptive pruning methods in conjunction with several widely used predictive models, including Random Forest, Support Vector Machine, and Neural Network classifiers. Performance is evaluated using standard metrics (accuracy, precision, recall, and F1-score) together with training time, capturing both predictive quality and computational efficiency. The experimental results show that feature pruning significantly improves model performance and generalization while reducing computational complexity. Among the evaluated techniques, adaptive pruning methods consistently outperform traditional approaches by dynamically capturing complex feature interactions and minimizing information loss. Moreover, cross-domain analysis reveals that adaptive and embedded pruning techniques exhibit strong scalability and robustness across datasets with differing characteristics. These findings position feature pruning as an integral component of predictive modeling pipelines rather than a mere preprocessing step.
Overall, this study contributes to a deeper understanding of feature pruning dynamics and offers practical guidance for selecting pruning strategies that improve predictive accuracy, efficiency, and interpretability in high-dimensional data environments.
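To make the distinction between pruning families concrete, the following minimal sketch contrasts a filter-based selector (mutual information ranking) with an embedded selector (Random Forest feature importances) on a synthetic high-dimensional dataset. This is an illustrative setup using scikit-learn, not the paper's actual pipeline; the dataset sizes, the choice of top-k = 10, and the classifiers shown are assumptions for demonstration only.

```python
# Sketch: filter-based vs. embedded feature pruning on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score

# 200 samples, 100 features, only 10 of them informative:
# a setting where pruning the remaining 90 should help.
X, y = make_classification(n_samples=200, n_features=100,
                           n_informative=10, random_state=0)

# Filter-based pruning: rank features by mutual information with the
# target, independently of any model, and keep the top 10.
X_filter = SelectKBest(score_func=mutual_info_classif, k=10).fit_transform(X, y)

# Embedded pruning: let a Random Forest's impurity-based importances
# decide which features survive (threshold=-1.0 admits all features,
# so max_features alone caps the selection at the top 10).
embedded = SelectFromModel(RandomForestClassifier(random_state=0),
                           max_features=10, threshold=-1.0).fit(X, y)
X_embedded = embedded.transform(X)

# Compare 5-fold cross-validated accuracy before and after pruning.
clf = RandomForestClassifier(random_state=0)
for name, data in [("all features", X), ("filter", X_filter), ("embedded", X_embedded)]:
    acc = cross_val_score(clf, data, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```

Wrapper-based methods (e.g. recursive feature elimination) would instead repeatedly retrain the model on candidate subsets, trading higher computational cost for subsets tailored to that specific model.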
License
Copyright (c) 2026 Eka Pandu Cynthia, Maulidania Mediawati Cynthia, Dessy Nia Cynthia (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.