Automated calculation of genomewide CADD scores as accumulants of functional annotations

Project leader(s):
Marco Bink | marco.bink@hendrix-genetics.com

Project summary

Genomic prediction (GP) has revolutionized animal breeding, but the lack of functional genomic information currently hampers further development. Although successful, GP treats the genome as a black box: it uses a set of markers distributed across the genome to predict the performance of an animal. Recent developments in animal genomics have produced frameworks that score genomic variants on their likely functionality, which will facilitate the rapid discovery of novel functional variants and improve genomic prediction. The Combined Annotation Dependent Depletion (CADD) framework predicts the impact of a mutation by integrating multiple annotations into one metric. Accurate impact prediction of mutations is extremely valuable for understanding the genotype-phenotype link, one of the major research topics in the life sciences. CADD scores are built on layers of annotations that include sequence context, conservation scores, gene expression data, non-synonymous mutation scores, and epigenomic data. CADD was originally developed for human, and in 2020 we produced and published CADD for two livestock species, pig and chicken. However, as more functional data are generated, regular updates are needed to integrate this new information into the CADD scores. We recently generated a highly improved reference genome and functional genomic information (RNA-seq data) for many tissues and developmental stages for turkey. Furthermore, a wealth of additional sequence, genotypic, and phenotypic information for several elite turkey populations is available at Hendrix Genetics. This new information now enables developing and deploying a CADD approach in turkey as well.
With the concurrent development of a versatile bioinformatics pipeline to calculate and compare CADD scores, we plan to generate a resource that will drive the future development of functional genomics resources in turkey and other livestock species.

Project goal

The overall goal of the project is to use functional genomic information to help identify causal variants for health and robustness traits in turkey, and to improve genomic prediction with specific emphasis on these traits.

Motivation

The proposed activities fit within KIA MMIP S2 Biotechnology and Breeding. In this proposal we will develop a bioinformatics tool (based on our established knowledge base, genomics data, and AI approaches) for precision breeding in poultry, with specific emphasis on turkey.

Planned results

1. A database with CADD C-scores for every possible single nucleotide variant in the turkey genome.
2. A standardized bioinformatics pipeline for regular updates of CADD C-scores in livestock with turkey and chicken as use cases.
3. Implementation of poultry breeding strategies that include the use of CADD scores for identification of novel causal genetic markers.
4. A peer-reviewed scientific publication describing the development and use of turkey CADD scores for the identification of causal variants.
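
To illustrate what "every possible single nucleotide variant" in result 1 means in practice, the following minimal Python sketch enumerates the three alternative alleles at each unambiguous reference position. It is an illustration only, not the project's pipeline: the function name `enumerate_snvs`, the chromosome label, and the toy sequence are hypothetical, and the sketch assumes that positions with ambiguity codes such as `N` are skipped and that no scoring is performed at this stage.

```python
from typing import Iterator, Tuple

BASES = "ACGT"


def enumerate_snvs(
    chrom: str, ref_seq: str, start: int = 1
) -> Iterator[Tuple[str, int, str, str]]:
    """Yield (chrom, pos, ref, alt) for every possible SNV in ref_seq.

    Positions are 1-based by default. Reference bases outside A/C/G/T
    (for example N) are skipped, which is an assumption of this sketch.
    """
    for offset, ref in enumerate(ref_seq.upper()):
        if ref not in BASES:
            continue
        for alt in BASES:
            if alt != ref:
                yield chrom, start + offset, ref, alt


# Toy example: 4 unambiguous bases give 4 x 3 = 12 candidate variants.
snvs = list(enumerate_snvs("chr1", "ACNGT"))
for record in snvs[:3]:
    print(record)
```

Each reference position yields three candidate variants, so a database of this kind holds roughly three times as many entries as there are unambiguous bases in the genome.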

Results

There are no results for this project yet.

Impact

There is no impact for this project yet.