Using Artificial Intelligence in Amazon EKS Clusters to Detect Faults Percipiently

Authors

  • Babulal Shaik, Cloud Solutions Architect at Amazon Web Services, USA

Keywords:

Amazon EKS, AI-driven fault detection, Kubernetes-native tools

Abstract

Maintaining performance and user satisfaction in cloud-native environments such as Amazon Elastic Kubernetes Service (EKS) requires high availability and minimal downtime. This research investigates a machine learning-based method for proactively identifying and preventing faults in EKS clusters. The system detects early indicators of problems that can cause service interruptions by monitoring parameters including pod performance, node health, and network conditions. Drawing on historical performance data, the algorithm forecasts possible issues and provides automated remediation and real-time alerts. By examining trends in resource utilization, latency, and system faults, teams can fix issues before they affect users. This method reduces operational workload while improving the stability and reliability of EKS clusters. The study demonstrates how incorporating machine learning and AI into Kubernetes operations can improve performance and resilience, offering a more creative approach to cloud infrastructure management.
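
To make the idea of detecting early indicators from past metric data concrete, the following is a minimal sketch and not the method used in the study. It assumes a simple trailing-window z-score rule in place of a trained model, and the function name `detect_anomalies`, the window size, the threshold, and the sample latency values are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing-window mean
    by more than `threshold` standard deviations.

    Hypothetical helper for illustration; a z-score rule stands in for
    whatever model a real system would train on historical metrics.
    """
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if samples[i] != mu:
                flagged.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up pod latency readings in milliseconds, with one spike at index 7.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(detect_anomalies(latency_ms))  # prints [7]
```

In a cluster, the same rule could be applied per pod or per node to whichever metric series are already being collected, with flagged indices feeding an alerting or remediation step.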

Published

07-03-2020

How to Cite

[1] Babulal Shaik, “Using Artificial Intelligence in Amazon EKS Clusters to Detect Faults Percipiently”, Distrib. Learn. Broad Appl. Sci. Res., vol. 6, pp. 894–909, Mar. 2020, Accessed: Mar. 14, 2025. [Online]. Available: https://dlbasr.org/index.php/publication/article/view/31