Performance Analysis of Machine Learning Regression Techniques to Predict Data Center Power Usage Efficiency

© 2022 by IJETT Journal
Volume-70 Issue-5
Year of Publication : 2022
Authors : Rajendra Kumar, Sunil Kumar Khatri, Mario José Diván
DOI : 10.14445/22315381/IJETT-V70I5P236

How to Cite?

Rajendra Kumar, Sunil Kumar Khatri, Mario José Diván, "Performance Analysis of Machine Learning Regression Techniques to Predict Data Center Power Usage Efficiency," International Journal of Engineering Trends and Technology, vol. 70, no. 5, pp. 328-338, 2022. Crossref, https://doi.org/10.14445/22315381/IJETT-V70I5P236

Abstract
Data centers and cloud hosting services are critical for IT workloads. Data center operators need the latest technologies to estimate power usage efficiency (PUE) so that they can meet their hosting customers' requirements. PUE is one of the major metrics for checking how efficiently a data center consumes power. To understand whether machine learning can forecast PUE more accurately, we applied multiple machine learning regression methods to predict the PUE of a data center and compared their accuracy. The originality of this research lies in the fact that no previous study has examined regression methods for PUE prediction in data centers. Once the accuracies are known, future researchers can select a suitable algorithm for effective PUE prediction. The experimental results show that the decision tree (DT) and k-nearest neighbor (KNN) methods work effectively with the data center's PUE data within the scope of this research. The analysis further shows that DT and KNN predict the PUE with 97% and 98% accuracy, respectively, when compared with the other regression techniques.
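
As an illustration of the kind of workflow the abstract describes, the following minimal sketch computes PUE as total facility power divided by IT equipment power, fits a hand-written k-nearest neighbor regressor, and scores it on held-out data. Everything here is an assumption made for illustration and is not taken from the paper: the synthetic data generator, the feature names (outside temperature and IT load), the choice of k, and the use of the coefficient of determination (R^2) as the "accuracy" measure. The features are also left unscaled for brevity, which a real pipeline would likely address.

```python
import math
import random


def pue(total_facility_kw, it_equipment_kw):
    """PUE = total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def knn_predict(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),
    )[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def make_synthetic_samples(n, seed=0):
    """Hypothetical data: (outside temperature in C, IT load in kW) -> PUE."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        temp_c = rng.uniform(10.0, 35.0)
        it_kw = rng.uniform(200.0, 800.0)
        # Assumed toy model: overhead grows with IT load and outside temperature.
        overhead_kw = 0.2 * it_kw + 4.0 * temp_c + rng.gauss(0.0, 2.0)
        xs.append((temp_c, it_kw))
        ys.append(pue(it_kw + overhead_kw, it_kw))
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_synthetic_samples(300)
    # Simple hold-out split: first 240 samples for training, the rest for testing.
    train_x, test_x = xs[:240], xs[240:]
    train_y, test_y = ys[:240], ys[240:]
    preds = [knn_predict(train_x, train_y, q, k=5) for q in test_x]
    print(f"KNN R^2 on held-out synthetic data: {r2_score(test_y, preds):.3f}")
```

The same split and scoring function could be reused with other regressors (for example a decision tree) to compare methods on equal terms, which is the style of comparison the abstract reports.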

Keywords
Cooling, Data center, Machine learning, Optimisation, Power usage efficiency.
