Record summary
A quick snapshot of what this page covers.
Risk profile
How this risk is described and categorized.
"The quality of training data is another challenge faced by generative AI. The quality of generative AI models largely depends on the quality of the training data (Dwivedi et al., 2023; Su & Yang, 2023). Any factual errors, unbalanced information sources, or biases embedded in the training data may be reflected in the output of the model. Generative AI models, such as ChatGPT or Stable Diffusion which is a text-to-image model, often require large amounts of training data (Gozalo-Brizuela & Garrido-Merchan, 2023). It is important to not only have high-quality training datasets but also have complete and balanced datasets."
Source
Research source for this risk, when available.
Included resource
Generative AI and ChatGPT: Applications, Challenges, and AI-Human Collaboration
Original source
MIT AI Risk Repository
