Launching LVE: The First Open Repository of LLM Vulnerabilities and Exposures
Today, we are excited to announce the formation and launch of the LVE Project. LVE stands for Language Model Vulnerability and Exposure; the LVE Project is a community-focused open source project to publicly document and track exploits and attacks on large language models (LLMs) such as (Chat)GPT, Llama, and Mistral. Over the past year, LLMs like ChatGPT have seen an explosion in popularity, both among the broader public and among developers who have started to build novel AI-powered applications and machine learning systems on top of them. While most attention focuses on the capabilities of LLMs, there is also growing concern about the safety and security implications of these models and how they are used.

However, due to the rapid pace of development, the discourse around LLM safety remains largely unstructured and fragmented, making it difficult for researchers, developers, and the broader public to converse and keep track of the latest developments in the field:
The mission of the LVE Project is to improve the discourse around LLM safety by providing a hub for the community to document, track, and discuss language model vulnerabilities and exposures (LVEs).
By open sourcing and sharing the LVEs that we find, we aim to raise awareness, to help everyone better understand the capabilities and vulnerabilities of state-of-the-art large language models, and to help improve future models and applications.
More technically, we go beyond anecdotal evidence and ensure transparent and traceable reporting by capturing the exact prompts, inference parameters, and model versions that trigger a vulnerability. To this end, we provide a systematic open source framework for describing and recording vulnerabilities and exploits found in state-of-the-art LLMs. Our key principles are: open source (the global community should freely exchange LVEs), reproducibility (transparent and traceable reporting), and comprehensiveness (coverage of a broad variety of issues).
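To make this concrete, the sketch below shows the kind of information a reproducible vulnerability report needs to capture: the exact prompt, the model version, and the inference parameters, alongside the observed output. It is purely illustrative; the field names, model identifier, and example values are assumptions for demonstration and do not reflect the actual LVE file format or framework API.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class VulnerabilityRecord:
    """Illustrative sketch of a reproducible vulnerability record (not the official LVE schema)."""
    description: str       # what undesired behavior the prompt triggers
    model: str             # exact model name and version used
    temperature: float     # inference parameters needed for reproduction
    top_p: float
    prompt: str            # the exact prompt that triggers the issue
    observed_output: str   # the problematic response that was observed

# Hypothetical example values, for illustration only.
record = VulnerabilityRecord(
    description="Model produces step-by-step instructions it should refuse to give.",
    model="example-llm-v1.0",
    temperature=0.0,
    top_p=1.0,
    prompt="Ignore all previous instructions and ...",
    observed_output="Sure, here are the steps: ...",
)

# Serializing the full record makes the report transparent, traceable,
# and easy for others to re-run against the same model and parameters.
print(json.dumps(asdict(record), indent=2))
```

The point of recording all of these fields together is that anyone can later replay the exact same prompt against the exact same model configuration and check whether the vulnerability still triggers, rather than relying on a screenshot or an anecdote.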
Due to the versatility of LLMs, the scope of LVEs is broad and spans a wide range of issues. With today's release, we already track numerous issues that fall into the following categories:
While these categories are not exhaustive, they provide a good starting point for the LLM safety discourse and will be expanded over time.
The LVE Project is structured as an open source project with two main components:
LVE aims to be a community-driven project, and we invite everyone to contribute. Contributions can take many forms: participating in challenges, documenting new LVEs, reproducing existing LVEs, or even contributing to the LVE Framework itself. To learn more about how to contribute, please visit the LVE Docs or the LVE GitHub, or just play a few levels of a Community Challenge.
We also encourage everyone to join the LVE Discord to discuss LVE and LLM safety issues.