Creating a Controlled Dataset of Environmental Assessment Texts for Generative AI Models: The Danish EA Hub

Lone Kørnøv, Ivar Lyhne, Karl Rasmus Sveding

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

The relevance and roles of artificial intelligence (AI) within impact assessment (IA) depend on the quality of the data fed into the AI model. This paper explores the development of a specialized dataset of environmental assessment (EA) texts for use in generative AI models. By establishing a controlled dataset, we aim to ensure that AI models draw from relevant, high-quality sources, avoiding the pitfalls of using potentially irrelevant or inaccurate online data. Key considerations include quality control, content selection, copyright, ownership, structure, classification, and keeping the information up to date. Using the Danish EA Hub as a case study, we outline the steps and considerations involved in creating such a dataset. Finally, we discuss the potential of this controlled dataset to enhance the capabilities of generative AI in the field of environmental assessment. The results are relevant to actors who are developing, or intend to develop, AI solutions within IA.
Original language: English
Journal: Impact Assessment and Project Appraisal
ISSN: 1461-5517
Publication status: Submitted - 26 Oct 2024

Keywords

  • Digitalization
  • Environmental Assessment
  • Data-centric artificial intelligence
  • Generative AI
  • EA Hub
  • Socio-Technical system
