Integrated genome-wide investigations of the housefly, a global vector of diseases reveal unique dispersal patterns and bacterial communities across farms

Data set


Abstract

Background: Houseflies (Musca domestica L.) live in intimate association with numerous microorganisms and are vectors of human pathogens. In temperate areas, houseflies overwinter in environments constructed by humans and recolonize surrounding areas in early summer. However, their dispersal patterns and associated bacteria across seasons and locations are unclear. We used genotyping-by-sequencing (GBS) for the simultaneous identification and genotyping of thousands of single nucleotide polymorphisms (SNPs) to establish dispersal patterns of houseflies across farms. In addition, we used 16S rRNA gene amplicon sequencing to characterize the variation in bacterial communities and their association with the housefly across farms.

Results: Using GBS, we identified 18,000 SNPs across 400 individuals sampled within and between 11 dairy farms in Denmark. There was evidence of sub-structuring of Danish housefly populations, with genetic structure that differed across season and sex. There was also a strong isolation-by-distance (IBD) effect, but with large variation, suggesting that other, hidden geographic barriers are important. The community structure of the microbiome varied widely between individuals and depended on location, sex, and collection time. Furthermore, the relative prevalence of putative pathogens was highly dependent on location and collection time.

Conclusion: Using novel next-generation sequencing (NGS) techniques, we were able to identify SNPs for determining the spatiotemporal genetic structure of the housefly, and to establish the variation in bacterial communities and their association with the housefly across farms. These results are important for disease prevention, given the fine-scale population structure and IBD of the housefly and given that individual houseflies carry location-specific bacteria, including putative pathogens.
Date made available: 2020