From Observability to Closed-Loop AIOps: Data-Driven Automation for Secure and Resilient Network Operations

Mohit Bajpai

Abstract

Modern enterprise and service-provider networks are now distributed across cloud, edge, software-defined data centers, mobile access, Internet of Things (IoT), and hybrid work environments. The operational challenge is no longer limited to device availability; teams must interpret high-volume telemetry, fast-changing application paths, user-experience signals, identity context, security events, and configuration drift at machine speed. This updated article expands the original discussion of AIOps, machine learning, observability, and network security by adding a data-centered reference architecture, operational metrics, model-selection considerations, security controls, deployment phases, and governance requirements. The article explains how telemetry from SNMP, streaming telemetry, NetFlow/IPFIX, syslog, OpenTelemetry, endpoint logs, cloud logs, configuration repositories, and security tools can be converted into actionable intelligence through anomaly detection, forecasting, causal correlation, risk scoring, and policy-based automation. It also positions closed-loop AIOps as a practical operating model that improves mean time to detect, mean time to acknowledge, mean time to resolve, service-level compliance, capacity planning, and security response while preserving human approval for high-risk actions.

Article Details

How to Cite
Bajpai, M. (2023). From Observability to Closed-Loop AIOps: Data-Driven Automation for Secure and Resilient Network Operations. The Eastasouth Journal of Information System and Computer Science, 1(02), 260–267. https://doi.org/10.58812/esiscs.v1i02.1090
