LLM-Based Autonomous Remediation for DevSecOps Pipelines

Roshan Kakarla

doi:10.58812/esiscs.v2i02.856

Published: Dec 31, 2024

Keywords:

AI Safety; Autonomous Remediation; Cloud Governance; DevSecOps; Infrastructure as Code; Large Language Models; Platform Engineering; Site Reliability Engineering

Roshan Kakarla

DevOps Engineer, Information Technology, Indiana Wesleyan University

Abstract

Modern DevSecOps pipelines operate at a scale and velocity that exceed the cognitive and operational capacity of traditional rule-based automation and human-centric incident response. While monitoring, alerting, and security scanning tools have matured, remediation remains largely manual, fragmented, and reactive, resulting in prolonged mean time to resolution (MTTR), configuration drift, and governance gaps. This paper proposes an LLM-Based Autonomous Remediation Framework (LLM-ARF) that introduces a risk-aware, policy-governed control plane for automated detection, diagnosis, and remediation across DevSecOps pipelines. Unlike existing approaches that rely on static runbooks or narrow AI classifiers, LLM-ARF integrates large language models as reasoning agents embedded within a constrained, auditable, and human-supervised execution loop. The framework explicitly separates cognition, decision authority, and actuation, enabling scalable autonomy while preserving accountability and compliance. We present the architectural design, lifecycle control flow, and governance mechanisms of LLM-ARF, and evaluate its operational impact using real-world DevOps metrics such as MTTR reduction, alert fatigue mitigation, and toil reduction. The results demonstrate that LLM-ARF enables a step-function improvement in remediation reliability without compromising safety or human oversight, positioning autonomous remediation as a viable next evolution of enterprise DevSecOps systems.
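The separation of cognition, decision authority, and actuation described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: every name here (`Proposal`, `propose`, `decide`, `ControlPlane`), the action allowlist, and the 0.3 risk threshold are hypothetical, and the "LLM" is replaced by a rule-based stub so the example runs on its own.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    NEEDS_HUMAN = "needs_human"
    REJECT = "reject"


@dataclass(frozen=True)
class Proposal:
    """Output of the cognition layer: a suggested fix, never an executed one."""
    incident_id: str
    action: str        # hypothetical action name, e.g. "restart_pod"
    risk_score: float  # assumed scale: 0.0 (benign) to 1.0 (dangerous)


def propose(incident_id: str, alert_text: str) -> Proposal:
    """Cognition: stand-in for an LLM call (rule-based stub, an assumption)."""
    if "CrashLoopBackOff" in alert_text:
        return Proposal(incident_id, "restart_pod", 0.2)
    return Proposal(incident_id, "rollback_release", 0.7)


ALLOWED_ACTIONS = {"restart_pod", "rollback_release"}  # assumed allowlist
AUTO_APPROVE_MAX_RISK = 0.3  # assumed threshold, not taken from the paper


def decide(proposal: Proposal) -> Decision:
    """Decision authority: policy gate that is independent of the proposer."""
    if proposal.action not in ALLOWED_ACTIONS:
        return Decision.REJECT
    if proposal.risk_score <= AUTO_APPROVE_MAX_RISK:
        return Decision.AUTO_APPROVE
    return Decision.NEEDS_HUMAN


@dataclass
class ControlPlane:
    """Wires the three roles together and records every step for audit."""
    executor: Callable[[Proposal], None]  # actuation, injected by the caller
    audit_log: List[str] = field(default_factory=list)

    def handle(self, incident_id: str, alert_text: str,
               human_approves: bool = False) -> Decision:
        proposal = propose(incident_id, alert_text)
        decision = decide(proposal)
        self.audit_log.append(
            f"{incident_id} action={proposal.action} "
            f"risk={proposal.risk_score} decision={decision.value}"
        )
        approved = decision is Decision.AUTO_APPROVE or (
            decision is Decision.NEEDS_HUMAN and human_approves
        )
        if approved:
            self.executor(proposal)
            self.audit_log.append(f"{incident_id} executed {proposal.action}")
        return decision
```

In this sketch the proposer cannot execute anything itself: only `ControlPlane.handle` calls the executor, and only after the policy gate (and, for higher-risk actions, an explicit human approval flag) allows it. A real deployment would presumably replace the stub with a model call and the boolean flag with an approval workflow.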

How to Cite

Kakarla, R. (2024). LLM-Based Autonomous Remediation for DevSecOps Pipelines. The Eastasouth Journal of Information System and Computer Science, 2(02), 179–188. https://doi.org/10.58812/esiscs.v2i02.856

Issue

Vol. 2 No. 02 (2024): The Eastasouth Journal of Information System and Computer Science (ESISCS)

Section

Articles

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

