Top 10 vulnerabilities in LLM applications such as ChatGPT
Table of Contents
OWASP has published a ranking of the top vulnerabilities in LLM applications to help companies strengthen the security of generative AI
If one technology has captured the public’s attention so far this year, it is undoubtedly LLM applications. These systems use Large Language Models (LLMs) and complex learning algorithms to understand and generate human language. ChatGPT, OpenAI’s proprietary text-generative AI, is the most famous of these applications, but dozens of LLM applications are already on the market.
In the wake of the rise of these AIs, OWASP has just published version 1 of its Top 10 LLM application vulnerabilities. This ranking, compiled by a foundation that has become a global benchmark in risk prevention and the fight against cyber threats, focuses on the main risks that both the companies that develop these applications and the companies that use them in their day-to-day work must take into account.
The OWASP Top 10 LLM Application Vulnerabilities aims to educate and raise awareness among developers, designers, and organizations of the potential risks they face when deploying and managing this disruptive technology. Each vulnerability includes:
- Definition
- Common examples of vulnerability
- Attack scenarios
- How to prevent it
Below, we will break down OWASP’s top 10 LLM application vulnerabilities and how to prevent them to avoid security incidents that could harm companies and their customers.
1. Prompt injections
Prompt injections occupy the first position in the Top 10 LLM application vulnerabilities. Hostile actors manipulate LLMs through prompts that force applications to execute the actions the attacker desires. This vulnerability can be exploited by:
- Direct prompt injections, known as «jailbreaking», occur when a hostile actor can overwrite or disclose the underlying prompt of the system. What does this imply? Attackers can exploit backend systems by interacting with insecure functions and data stores.
- Indirect prompt injections. This occurs when an LLM application accepts input from external sources that can be controlled by hostile actors, e.g., web pages. In this way, the attacker embeds a prompt injection into the external content, hijacking the conversation context and allowing the attacker to manipulate additional users or systems that the application can access.
OWASP points out that the results of a successful attack vary and can range from obtaining confidential information to influencing critical decision-making processes. Moreover, in the most sophisticated attacks, the compromised LLM application can become a tool at the attacker’s service, interacting with plugins in the user’s configuration and allowing the former to gain access to confidential data of the targeted user without the latter being alerted to the intrusion.
1.1. Prevention
The Top 10 vulnerabilities in LLM applications indicate that prompt injections are possible by the very nature of these systems, as they do not segregate instructions from external data. And since LLMs use natural language, they consider both types of inputs to be provided by legitimate users. Hence, the measures proposed by OWASP cannot achieve total prevention of these vulnerabilities, but they do serve to mitigate their impact:
- Control LLM application access to backends. It is advisable to apply the principle of least privilege and restrict LLM access, granting it the minimum level of access so that it can perform its functions.
- Establish that the application has to obtain the user’s authorization to perform actions such as sending or deleting emails.
- Separate external content from user prompts. OWASP gives an example of the possibility of using ChatML for Open AI API calls to indicate to the LLM the input source of the prompt.
- Establish trust boundaries between the LLM application, external sources, and plugins. The application could be treated as an untrusted user, establishing that the end user controls decision-making. However, we should be aware that a compromised LLM application can act as a man-in-the-middle and hide or manipulate information before it is shown to the user.
2. Insecure handling of outputs
The insecure handling of the language model outputs occupies second place in the Top 10 vulnerabilities in LLM applications. What does this mean? The output is accepted without being scrutinized beforehand and transferred directly to the backend or privileged functionalities. In addition, the content generated by an LLM application can be controlled by introducing prompts, as we pointed out in the previous section. This would provide users with indirect access to additional functions.
What are the possible consequences of exploiting this vulnerability? Privilege escalation, remote code execution on backend systems, and even if the application is vulnerable to external injection attacks, the hostile actor could gain privileged access to the target user’s environment.
2.1. Prevention
The OWASP guide to the Top 10 LLM application vulnerabilities recommends two actions to act on this risk:
- Treat the model as a user, ensuring validation and sanitization of model responses directed to backend functions.
- Encrypt the outputs of the model back to the users to mitigate the execution of malicious code.
3. Poisoning of training data
One of the critical aspects of LLM applications is the training data supplied to the models. This data must be large, diverse, and cover various languages. Large language models use neural networks to generate output based on the patterns they learn from the training data, which is why this data is so important.
This is also why they are a prime target for hostile actors who want to manipulate LLM applications. By poisoning training data, it is possible to:
- Introduce backdoors or biases that undermine the security of the model.
- Alter the ethical behavior of the model, which is of paramount importance.
- Cause the application to provide users with false information.
- Degrade the model’s performance and capabilities.
- Damage the reputation of companies.
Hence, training data poisoning is a problem for cybersecurity and the business model of companies developing LLM applications. It can result in the model being unable to make correct predictions and interact effectively with users.
3.1. Prevention
The OWASP Top 10 vulnerabilities in LLM applications proposes four primary measures to prevent the poisoning of training data:
- Verify the legitimacy of the data sources used in training the model and refining it.
- Design different models from segregated training data designed for other use cases. This results in more granular and accurate generative AI.
- Employ more stringent filters for training data and data sources to detect spurious data and sanitize the data used for model training.
- Analyse training models for signs of poisoning. As well as analyzing tests to evaluate model behavior. In this sense, security assessments throughout the LLM application lifecycle and the implementation of Red Team exercises specially designed for this type of application are of great added value.
4. Denial of Service attacks against the model
DoS attacks are a common practice launched by malicious actors against companies’ IT assets, such as web applications. However, denial-of-service attacks can also affect LLM applications.
An attacker interacts with the LLM application to force it to consume a considerable amount of resources, resulting in:
- Degrading the service provided by the application to its users.
- Increased resource costs for the company.
Furthermore, this vulnerability could open the door for an attacker to interfere with or manipulate the LLM context window, i.e., the maximum length of text the model can handle in terms of inputs and outputs. Why could this action be severe? The context window is set when creating the model architecture and stipulates how complex the linguistic patterns the model can understand can be and the size of the text it can process.
Considering that the use of LLM applications is increasing, thanks to the popularisation of solutions such as ChatGPT, this vulnerability is set to become more and more relevant in terms of security as the number of users and the intensive use of resources will increase.
4.1. Prevention
In its Top 10 vulnerabilities in LLM applications, OWASP recommends:
- Implement input validation and sanitization to ensure that inputs comply with the limits defined when creating the model.
- Limit the maximum resource usage per request.
- Set rate limits in the API to restrict user or IP address requests.
- Also, limit the number of queued actions and the total number of activities in the system that react to model responses.
- Continuously monitor LLM application resource consumption to identify abnormal behavior that can be used to detect DoS attacks.
- Stipulate strict limits regarding the context window to prevent overload and resource exhaustion.
- Raise developers’ awareness of the consequences of a successful DoS attack on an LLM application.
5. Supply chain vulnerabilities
As with traditional applications, LLM application supply chains are also subject to potential vulnerabilities, which could affect:
- The integrity of training data.
- Machine Learning models.
- The deployment platforms of the models.
Successful exploitation of vulnerabilities in the supply chain can result in:
- The model generates biased or incorrect results.
- Security breaches occur.
- A widespread system failure that threatens business continuity.
The rise of Machine Learning has brought with it the emergence of pre-trained models and training data from third parties, both of which facilitate the creation of LLM applications but carry with them associated supply chain risks:
- Use of outdated software.
- Pre-trained models are susceptible to be attacked.
- Poisoned training data.
- Insecure plugins.
5.1. Prevention
To prevent the risks associated with the LLM application supply chain, OWASP recommends:
- Verify the data sources used to train and refine the model and use independently audited security systems.
- Use trusted plugins.
- Implement Machine Learning best practices for your models.
- Continuously monitor for vulnerabilities.
- Maintain an efficient patching policy to mitigate vulnerabilities and manage obsolete components.
- Regularly audit the security of suppliers and their access to the system.
6. Disclosure of sensitive information
Addressing the sixth item of the Top 10 LLM application vulnerabilities, OWASP warns that models can reveal sensitive and confidential information through the results they provide to users. This means that hostile actors could gain access to sensitive data, steal intellectual property, or violate people’s privacy.
It is, therefore, important for users to understand the risks associated with voluntarily entering data into an LLM application, as this information may be returned elsewhere. Therefore, companies that own LLM applications need to adequately disclose how they process data and include the possibility that data may not be included in the data used to train the model.
In addition, companies should implement mechanisms to prevent users’ data from becoming part of the training data model without their explicit consent.
6.1. Prevention
Some of the actions that companies owning LLM applications can take are:
- Employ data cleansing and data cleansing techniques.
- Implement effective strategies to validate inputs and sanitize them.
- Limit access to external data sources.
- Adhere to the rule of least privilege when training models.
- Secure the supply chain and control access to the system effectively.
7. Insecure design of plugins
What are LLM plugins? Extensions that the model automatically calls during user interactions. In many cases, there is no control over their execution. Thus, a hostile actor could make a malicious request to the plugin, opening the door to even remote execution of malicious code.
Therefore, plugins must have robust access controls, not unquestioningly trust other plugins, and believe that the legitimate user provided the inputs for malicious purposes. Otherwise, these negative inputs can lead to:
- Data exfiltration.
- Remote code execution.
- Privilege escalation.
7.1. Prevention
The Top 10 vulnerabilities in LLM applications recommends, concerning the design of plugins, to implement the following measures:
- Strictly apply input parameterization and perform the necessary checks to ensure security.
- Apply the recommendations defined by OWASP ASVS (Application Security Verification Standard) to ensure the correct validation and sanitization of data input.
- Carry out application security tests continuously: SAST, DAST, IAST…
- Use authentication identities and API keys to ensure authentication and access control measures.
- Require user authorization and confirmation for actions performed by sensitive plugins.
8. Excessive functionality, permissions or autonomy
To address this item of the Top 10 vulnerabilities in LLM applications, OWASP uses the concept of «Excessive Agency» to warn of the risks associated with giving an LLM excessive functionality, permissions, or autonomy. An LLM that does not function properly (due to a malicious injection or plugin, when poorly designed prompts, or poor performance) it can perform harmful actions.
Granting excessive functionalities, permissions, or autonomy to an LLM may have consequences that affect data confidentiality, integrity, and availability.
8.1. Prevention
To successfully address the risks associated with “Excessive Agency”, OWASP recommends:
- Limit the plugins and tools that LLMs can call and also the functions of LLM plugins and devices to the minimum necessary.
- Require user approval for all actions and effectively track each user’s authorization.
- Log and monitor the activity of LLM plugins and tools to identify and respond to unwanted actions.
- Apply rate-limiting measures to reduce the number of possible unwanted actions.
9. Overconfidence
According to OWASP’s Top 10 LLM application vulnerabilities guide, overconfidence occurs when systems or users rely on generative AI to make decisions or generate content without proper oversight.
In this regard, we must understand that LLM applications can create valuable content but can also generate incorrect, inappropriate, or unsafe content. This can lead to misinformation and legal problems and damage the company’s reputation using the content.
9.1. Prevention
To prevent overconfidence and the severe consequences it can have not only for the companies that develop LLM applications but also for the companies and individuals that use them, OWASP recommends:
- Regularly monitor and review LLM results and outputs.
- Check the results of the generative AI against reliable sources of information.
- Improve the model by making adjustments to increase the quality and consistency of the model outputs. The OWASP guidance states that pre-trained models are more likely to produce erroneous information than models developed for a given domain.
- Implement automatic validation mechanisms capable of contrasting and verifying the results generated by the model with known facts and data.
- Segment tasks into subtasks performed by different professionals.
- Inform users of the risks and limitations of generative AI.
- Develop APIs and user interfaces that encourage accountability and safety when using generative AI, incorporating measures such as content filters, warnings of possible inconsistencies, or labeling AI-generated content.
- Establish safe coding practices and guidelines to avoid the integration of vulnerabilities in development environments.
10. Model theft
The last place in the OWASP Top 10 LLM application vulnerabilities is model theft, i.e., unauthorized access and leakage of LLM models by malicious actors or APT groups.
When does this vulnerability occur? When a proprietary model is compromised, physically stolen, copied, or the parameters needed to create an equivalent model are stolen.
The impact of this vulnerability on companies owning generative AI includes substantial financial losses, reputational damage, loss of competitive advantage over other companies, misuse of the model, and improper access to sensitive information.
Organizations must take all necessary measures to protect the security of their LLM models, ensuring their confidentiality, integrity, and availability. This involves designing and implementing a comprehensive security framework that effectively safeguards the interests of companies, their employees, and users.
10.1. Prevention
How can companies prevent the theft of their LLM models?
- Implementing strict access and authentication controls.
- Restrict access to network resources, internal services, and APIs to prevent insider risks and threats.
- Monitoring and auditing access to model repositories to respond to suspicious behavior or unauthorized actions.
- Automating the deployment of Machine Learning operations.
- Implementing security controls and putting in place mitigation strategies.
- Limiting the number of API calls to reduce the risk of data exfiltration and employing techniques to detect improper extractions.
- Employing a watermarking framework throughout the LLM application lifecycle.
11. Generative AI and cybersecurity
OWASP’s Top 10 LLM application vulnerabilities highlights the importance of having highly skilled and experienced cybersecurity professionals to address the complex cyber threat landscape successfully.
If generative AI becomes established as one of the most relevant technologies in the coming years, it will become a priority target for criminal groups. Therefore, companies must place cybersecurity at the heart of their business strategies.
11.1. Cybersecurity services to mitigate vulnerabilities in LLM applications
To this end, advanced cybersecurity services are available to secure LLM applications throughout their lifecycle and prevent risks associated with the supply chain, which is highly relevant given the development and commercialization of pre-trained models:
- Code audits and application security testing (DAST, SAST, IAST, etc.) from design and throughout the application lifecycle.
- Vulnerability management to detect, prioritize, and mitigate vulnerabilities in all system components.
- Detection of emerging vulnerabilities to remediate problems before hostile actors exploit them.
- Simulation of DoS attacks to test resilience against this attack and improve defensive layers and resource management.
- Red Team services to evaluate the effectiveness of the organization’s defensive capabilities to detect, respond to, and mitigate a successful attack, as well as to recover normality in the shortest possible time and safeguard business continuity.
- Supplier audits to prevent supply chain attacks.
- Training and educating all professionals to implement reasonable security practices and avoid errors or failures that lead to exploitable vulnerabilities.
In short, OWASP’s Top 10 vulnerabilities in LLM applications spotlights the security risks associated with generative AI. These technologies are already part of our lives and are used by thousands of companies and professionals daily.
In the absence of the European Union approving the first European regulation on AI, companies must undertake a comprehensive security strategy capable of protecting applications, their data, and their users against criminal groups.
This article is part of a series of articles about AI and cybersecurity
- What are the AI security risks?
- Top 10 vulnerabilities in LLM applications such as ChatGPT
- Best practices in cybersecurity for AI
- Artificial Intelligence Fraud: New Technology, Old Targets
- AI, deepfake, and the evolution of CEO fraud
- What will the future of AI and cybersecurity look like?
- The Risks of Using Generative AI in Business: Protect Your Secrets
- MITRE ATLAS: How can AI be attacked?