Security Testing

LLM Assisted Security Testing - Use Case

Explore a real example of how an LLM identified a vulnerability missed by automated and manual testing.
Thomas Ballin
8 minute read

LLMs (Large Language Models) aren’t often thought of as an effective tool for identifying security vulnerabilities. In this use case, we’ll demonstrate how security teams can use LLMs to flag pull requests (PRs) where vulnerable code may have been introduced. The result is targeted testing, a reduced MTTD (Mean Time to Detection) and fewer vulnerabilities reaching critical production environments.

Introducing the vulnerability

A vulnerability was introduced into a staging environment that allowed privilege escalation within the application. Any user could set their own privileges, including upgrading themselves to an administrative user with full control over the application and its data. This could have resulted in unauthorised access to data stored by the application, including sensitive information related to application users, as well as to administrative functions of the application (e.g. deletion of application users, access to sensitive customer data).
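
The article does not show the vulnerable code, so the following is purely an illustrative sketch of one common way this class of flaw can arise: a profile update that copies every submitted field onto the user record. The `User` class, the `role` field and both function names are hypothetical and are not taken from the application described here.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical user record; 'role' is the privilege-bearing field.
    name: str
    email: str
    role: str = "member"


# Hypothetical allow-list of fields a user may change on their own profile.
SELF_EDITABLE_FIELDS = {"name", "email"}


def update_profile_unsafe(user: User, payload: dict) -> User:
    """Copies every known field from the request, including 'role'."""
    for field, value in payload.items():
        if hasattr(user, field):
            setattr(user, field, value)
    return user


def update_profile_safe(user: User, payload: dict) -> User:
    """Applies only allow-listed fields, so 'role' cannot be self-assigned."""
    for field, value in payload.items():
        if field in SELF_EDITABLE_FIELDS:
            setattr(user, field, value)
    return user
```

Code like the unsafe variant is syntactically unremarkable and passes functional smoke tests, which is one reason logic flaws of this kind are easy to miss.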

Discovering the vulnerability

Due to the nature of the vulnerability, the bug was not flagged by the code reviewer or by the automated smoke tests implemented in the CI/CD (Continuous Integration / Continuous Deployment) pipeline.

The bug was also undetected by SAST / DAST (Static / Dynamic Application Security Testing) tools, since it was a flaw specific to the application itself, namely in its authorisation and permission system.

Instead, the vulnerability was discovered with the help of an LLM. It was provided with an overview of the changes (commits) introduced within the given pull request and asked what vulnerabilities the PR could bring into the application. Its response flagged the issue. After reading the description of the change, the security engineer used the testing plan provided by the LLM to manually check for authorisation vulnerabilities introduced as a result of the change, and the vulnerability was identified.
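
As a rough sketch of the workflow described above, the snippet below assembles a prompt from a PR's commits and hands it to a language model. It makes several assumptions: the `Commit` fields, the prompt wording and the `ask_llm` callable are all hypothetical stand-ins, and the article does not say which model, prompt or tooling was actually used.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Commit:
    # Hypothetical minimal view of a commit within a pull request.
    sha: str
    message: str
    files_changed: List[str]


def build_review_prompt(pr_title: str, commits: List[Commit]) -> str:
    """Summarises the PR's commits and asks for risks plus a testing plan."""
    lines = [
        "You are assisting a security engineer.",
        f"Pull request: {pr_title}",
        "Commits:",
    ]
    for commit in commits:
        files = ", ".join(commit.files_changed)
        lines.append(f"- {commit.sha[:7]}: {commit.message} (files: {files})")
    lines.append(
        "List the vulnerabilities this change could introduce "
        "and propose a manual testing plan for each."
    )
    return "\n".join(lines)


def review_pull_request(
    pr_title: str, commits: List[Commit], ask_llm: Callable[[str], str]
) -> str:
    """ask_llm is any function that sends a prompt to a model and returns text."""
    return ask_llm(build_review_prompt(pr_title, commits))
```

Passing the model in as a plain callable keeps the sketch independent of any particular provider. The output is a lead for a security engineer to verify manually, as happened in this case, not a finding in its own right.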

Remediating the vulnerability

The associated open PR was closed and a write-up of the finding was provided to the developer within 5 hours of the vulnerable code reaching the staging environment. This included a description of the vulnerability, how and where it was introduced, and other supporting material such as screenshots.

The developer was able to review the write-up and the associated code, and made the changes needed to remove the vulnerability within 24 hours.

Business impact of using LLMs to detect run-time vulnerabilities

Security engineering teams are often tasked with working through hundreds, if not thousands, of changes / commits per release cycle or sprint, depending on the size of their organisation. Allowing machine learning, particularly language models, to assist in the classification of changes can significantly speed up and focus this process. Language models also act as an effective safety net, highlighting vulnerabilities missed by DAST / SAST scans.

Targeted testing

A large contributor to a successful security test is understanding the application being tested. In a typical point-in-time assessment, a security engineer is tasked with assessing an entire application in one go. This likely equates to hundreds, or maybe thousands, of changes since the last test. At this scale, it becomes a significant burden to work backwards and understand what these changes actually do, which areas of the application they might affect and what vulnerabilities they could introduce. This is particularly true when the only available resource is a developer who made the changes perhaps months ago and is likely to have forgotten the details.

Capturing changes close to their inception and categorising them as testable or non-testable enables security engineers and penetration testers to deliver their value. It allows them to focus most of their time on targeted testing rather than discovery. This is analogous to a developer using an IDE (Integrated Development Environment) to write code: it's not necessary to use one, but it makes life a lot easier. Similarly, using language models to carry out this initial classification is not a replacement for manual testing, but it is a huge help in ensuring testing is delivered faster and more effectively.
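
The classification step can be sketched as follows. This is a minimal illustration under stated assumptions: the prompt wording, the one-word answer format and the `ask_llm` callable are hypothetical, and routing unclear answers to a human is a design choice of this sketch rather than something the article specifies.

```python
from typing import Callable, Dict, List


def classify_change(summary: str, ask_llm: Callable[[str], str]) -> str:
    """Returns 'testable' or 'non-testable' for one change summary."""
    prompt = (
        "Classify this code change for security testing. "
        "Answer with exactly one word: 'testable' or 'non-testable'.\n\n"
        f"Change: {summary}"
    )
    answer = ask_llm(prompt).strip().lower()
    # Fail safe: anything other than a clear 'non-testable' goes to a human.
    return "non-testable" if answer == "non-testable" else "testable"


def triage(
    changes: List[str], ask_llm: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Splits change summaries into a testing queue and a skip queue."""
    queues: Dict[str, List[str]] = {"testable": [], "non-testable": []}
    for summary in changes:
        queues[classify_change(summary, ask_llm)].append(summary)
    return queues
```

Defaulting to "testable" means a confused or malformed model response costs a little engineer time rather than silently dropping a change from review.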

Improved communication

A secondary benefit of tapping into change data such as pull requests and commits is fewer communication issues between security and software engineers. Remediation advice can be conveyed to a developer more accurately, down to the code level, which builds a deeper shared understanding and keeps both teams fully informed.

Summary

Using LLMs to summarise code changes introduced via PRs significantly improves the speed and effectiveness of vulnerability identification.

Shifting left and incorporating continuous security testing into the SDLC (Software Development Life Cycle) ensures that vulnerabilities are identified well before they reach production environments. This results in the following improvements:

  • Reduced MTTD for vulnerabilities, which can potentially be caught within hours of release rather than weeks or months, preventing them from reaching critical production environments.
  • Targeted, higher-quality testing of the individual application components that require it, reducing the time security engineers spend on non-testable changes.
  • Increased efficiency and performance of security engineers, who spend less time understanding application capabilities and more time testing the areas of applications affected by change.
