BIOSCAN-1M Insect Dataset

  • Zahra Gharaee (Creator)
  • ZeMing Gong (Creator)
  • Nicholas Pellegrino (Creator)
  • Iuliia Zarubiieva (Creator)
  • Joakim Bruslund Haurum (Creator)
  • Scott C. Lowe (Creator)
  • Jaclyn T. A. McKeown (Creator)
  • Chris C. Y. Ho (Creator)
  • Joschka McLeod (Creator)
  • Yi-Yun C. Wei (Creator)
  • Jireh Agda (Creator)
  • Sujeevan Ratnasingham (Creator)
  • Dirk Steinke (Creator)
  • Angel X. Chang (Creator)
  • Graham W. Taylor (Creator)
  • Paul Fieguth (Creator)

Dataset

Description

The BIOSCAN-Insect Dataset is a new large dataset of hand-labelled insect images. Each record is taxonomically classified by an expert and also carries associated genetic information, including raw nucleotide barcode sequences and assigned barcode index numbers, which are genetically based proxies for species classification. This paper presents a curated million-image dataset, intended primarily for training computer-vision models capable of image-based taxonomic assessment. However, the dataset also has characteristics that would be of interest to the broader machine learning community. Owing to its biological nature, the dataset exhibits a characteristic long-tailed class-imbalance distribution. Furthermore, taxonomic labelling is a hierarchical classification scheme, which presents a highly fine-grained classification problem at the lower levels.
Date made available: 12 Jun 2023
Publisher: Zenodo
  • A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset

    Gharaee, Z., Gong, Z., Pellegrino, N., Zarubiieva, I., Haurum, J. B., Lowe, S. C., McKeown, J. T. A., Ho, C. C. Y., McLeod, J., Wei, Y-Y. C., Agda, J., Ratnasingham, S., Steinke, D., Chang, A. X., Taylor, G. W. & Fieguth, P., Sep 2023, (Accepted/In press) Advances in Neural Information Processing Systems. Vol. 37.

    Publication: Contribution to book/anthology/report/conference proceeding › Article in proceedings › Research › Peer-reviewed
