LLM-Based Autonomous Remediation for DevSecOps Pipelines


Roshan Kakarla

Abstract

Modern DevSecOps pipelines operate at a scale and velocity that exceed the cognitive and operational capacity of traditional rule-based automation and human-centric incident response. While monitoring, alerting, and security scanning tools have matured, remediation remains largely manual, fragmented, and reactive, resulting in prolonged mean time to resolution (MTTR), configuration drift, and governance gaps. This paper proposes an LLM-Based Autonomous Remediation Framework (LLM-ARF) that introduces a risk-aware, policy-governed control plane for automated detection, diagnosis, and remediation across DevSecOps pipelines. Unlike existing approaches that rely on static runbooks or narrow AI classifiers, LLM-ARF integrates large language models as reasoning agents embedded within a constrained, auditable, and human-supervised execution loop. The framework explicitly separates cognition, decision authority, and actuation, enabling scalable autonomy while preserving accountability and compliance. We present the architectural design, lifecycle control flow, and governance mechanisms of LLM-ARF, and evaluate its operational impact using real-world DevOps metrics: MTTR reduction, alert fatigue mitigation, and toil reduction. The results indicate that LLM-ARF enables a step-function improvement in remediation reliability without compromising safety or human oversight, positioning autonomous remediation as a viable next evolution of enterprise DevSecOps systems.
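
The separation of cognition, decision authority, and actuation described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: every name here (Proposal, propose, decide, remediate), the risk threshold of 0.3, and the stubbed model call are hypothetical and chosen only for illustration. The point it tries to show is that the model only proposes, a separate policy function decides, and execution happens through an injected executor after the decision is written to an audit log.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    NEEDS_HUMAN = "needs_human"
    REJECT = "reject"


@dataclass(frozen=True)
class Proposal:
    """Output of the cognition stage: a suggestion, never an executed action."""
    action: str
    target: str
    risk: float  # hypothetical risk score in [0, 1]
    rationale: str


def propose(alert: str) -> Proposal:
    """Cognition. Stand-in for an LLM call (assumption: a real system
    would query a model here and parse its structured output)."""
    if "CrashLoopBackOff" in alert:
        return Proposal("restart_pod", "payments-api", 0.2, "pod is crash-looping")
    return Proposal("rollback_release", "payments-api", 0.7, "unrecognised failure")


def decide(p: Proposal, allowed: frozenset, auto_max_risk: float = 0.3) -> Decision:
    """Decision authority. Plain policy code that does not consult the model."""
    if p.action not in allowed:
        return Decision.REJECT
    if p.risk <= auto_max_risk:
        return Decision.AUTO_APPROVE
    return Decision.NEEDS_HUMAN


def remediate(alert, allowed, executor, audit_log, human_approves=lambda p: False):
    """Control loop: propose, decide, record, then (maybe) actuate."""
    p = propose(alert)
    d = decide(p, allowed)
    audit_log.append((alert, p.action, d.value))  # recorded before any actuation
    approved = d is Decision.AUTO_APPROVE or (
        d is Decision.NEEDS_HUMAN and human_approves(p)
    )
    if approved:
        executor(p)  # actuation is injected, so it can be sandboxed or mocked
    return approved
```

In this sketch a low-risk, allow-listed action runs automatically, a higher-risk one waits on the human_approves callback (which defaults to refusing), and anything outside the allow-list is rejected regardless of what the model suggests.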

Article Details

How to Cite
Kakarla, R. (2024). LLM-Based Autonomous Remediation for DevSecOps Pipelines. The Eastasouth Journal of Information System and Computer Science, 2(02), 179–188. https://doi.org/10.58812/esiscs.v2i02.856
Section
Articles
