Section 09 – Dairy Cattle Genetic Evaluation
Revision as of 13:00, 29 July 2024
Background
The present Guidelines aim to provide a general view of the current practices in place regarding genetic and genomic evaluations of dairy cattle at both national and international level. The overview is based on the genetic/genomic evaluation systems (GES) currently provided by the National Genetic Evaluation Centres (NGECs) participating in Interbull evaluations.
International bull evaluations for dairy cattle offered by Interbull are of three main types (https://interbull.org/ib/cop_chap5):
- Conventional evaluation based on the exchange of national EBVs using the MACE (Multiple-trait Across Country Evaluation) methodology for Holstein, Brown Swiss, Simmental, Jersey, Guernsey and Red Dairy Cattle populations;
- Genomic evaluation based on the exchange of genotypes between Interbull and several countries for Brown Swiss and (small) Holstein cattle populations using the InterGenomics methodology;
- Genomic evaluation based on the exchange of GEBVs for young genotyped Holstein bulls only (and only until they have enough daughters to qualify for official MACE proofs), using the GMACE (Genomic MACE) methodology.
The requirements provided in these guidelines are solely intended for participation in the international genetic/genomic evaluation services offered by Interbull. They deal mostly with production traits but the same principles can in most cases be equally well applied to other traits.
In this document Genetic Evaluation System (GES) is meant to include all aspects from population structure and data collection to publication of results. Each and every statistical treatment of the data that has a genetic breeding motivation or justification is an integrated part of GES.
The purpose of this set of guidelines is to facilitate a higher degree of harmonisation in the things that can be harmonised and to encourage documentation of the things that cannot be harmonised at this time. These guidelines should increase the quality and accuracy of evaluations at the national and international level. The aim is also to increase clarity in showing the biological and statistical reasons for what is done in national GES.
Recommendations presented here should also be viewed holistically as a coherent system. Every specific recommendation presupposes acceptance and adherence to many other such specific recommendations. Therefore, and as an example, when “unique identification of all animals” is recommended in one section, then all further reference to “animals” is to be interpreted as “uniquely identified animals”.
National genetic evaluation centres (NGECs) should keep official, up-to-date and detailed documentation on all aspects of their GES. This documentation should also be made available on the Interbull Centre website (www.interbull.org) and updated as soon as any changes have taken place.
Pre-Evaluation steps
Introduction
All countries are recommended to establish national GES for all of their locally and internationally recognised breeds. Assignment of an animal to a specific breed is justified if 75% of the animal’s genes originate from that breed (or both sire and maternal grandsire are from the breed of evaluation).
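The breed assignment rule above can be sketched as a small check. This is only an illustration: the function name, the argument names and the reading of "75%" as "at least 75%" are assumptions, and the breed codes in the usage lines are examples only.

```python
def qualifies_for_breed(breed_fraction, sire_breed, mgs_breed, target_breed):
    """Return True if an animal may be assigned to target_breed.

    breed_fraction: share of the animal's genes from target_breed (0.0 to 1.0).
    sire_breed, mgs_breed: breed codes of the sire and the maternal grandsire.
    """
    # Criterion 1: 75% of the genes originate from the breed
    # (interpreted here as "at least 75%", which is an assumption).
    has_enough_genes = breed_fraction >= 0.75
    # Criterion 2: both sire and maternal grandsire are of the breed of evaluation.
    sire_and_mgs_match = sire_breed == target_breed and mgs_breed == target_breed
    return has_enough_genes or sire_and_mgs_match


# Usage with illustrative breed codes
print(qualifies_for_breed(0.80, "HOL", "JER", "HOL"))  # enough genes
print(qualifies_for_breed(0.50, "HOL", "HOL", "HOL"))  # sire and MGS match
print(qualifies_for_breed(0.50, "HOL", "JER", "HOL"))  # neither criterion met
```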
For the sake of international evaluation, bulls should be classified under one of the following breed groups:
- Brown Swiss-type.
- Guernsey-type.
- Holstein-Friesian-type.
- Jersey-type.
- Red Dairy Cattle-type (including Milking Shorthorn and several Red-and-White breeds).
- Simmental-type (including Montbeliarde).
This classification follows the definition given in each country and is based on the direction the population has taken in that country. Individual countries should identify the breed groups to which their populations belong. In the case of cross-breeding, the breed with the highest percentage should be considered.
Animal Identification
All animals should be identified and registered in accordance with ICAR’s General Rules (Section 01 – General Rules).
Each animal’s ID should be unique to that animal, be given to the animal at birth, never be used again for any other animal, and be used throughout the life of the animal, both in the country of birth and in all other countries. For the exchange of information with Interbull, the following information should be provided for each animal:
- Breed code: character, length 3 (ICAR breed codes)
- Code of country of birth: character, length 3 (ISO 3166)
- Sex code: character, length 1 (M/F)
- Animal registration: alphanumeric, length 12
All parts of an animal ID should be kept intact. If, for any reason, modification of the original animal ID is necessary, it should be treated as a re-registration and fully documented by a cross-reference record relating the original (and intact) animal ID to the new animal ID. For international evaluations, such a record shall be uploaded into the Interbull Centre database.
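A minimal sketch of a format check on the four ID components listed above. The class and function names are hypothetical, the check covers only lengths and character types (not whether a code actually exists in the ICAR or ISO 3166 lists), and treating 12 as an upper bound for the registration field is an assumption, since padding rules are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnimalId:
    breed: str         # 3 characters, ICAR breed code
    country: str       # 3 characters, ISO 3166 code of the country of birth
    sex: str           # 1 character, "M" or "F"
    registration: str  # alphanumeric, at most 12 characters (assumed upper bound)


def validation_errors(animal_id):
    """Return a list of format problems; an empty list means the format is accepted."""
    errors = []
    if len(animal_id.breed) != 3:
        errors.append("breed code must have 3 characters")
    if len(animal_id.country) != 3:
        errors.append("country code must have 3 characters")
    if animal_id.sex not in ("M", "F"):
        errors.append("sex code must be M or F")
    reg = animal_id.registration
    if not (reg.isalnum() and len(reg) <= 12):
        errors.append("registration must be alphanumeric, up to 12 characters")
    return errors


# Usage with made-up identifiers
print(validation_errors(AnimalId("HOL", "NLD", "M", "000123456789")))
print(validation_errors(AnimalId("HO", "NLD", "X", "12 34")))
```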
Pedigree information
The parentage of an animal shall be recorded by identifying and recording the service sire and the served animal at the time of service, as provided for by ICAR’s General Rules (Section 01 – General Rules).
NGECs should, in co-operation with other interested parties, keep track of and report the percentage of animals with missing ID and pedigree information. The overall quantitative measures of data quality should include the percentage of animals with identified sire and dam or, alternatively, the percentage of missing IDs.
Doubtful pedigree and birth information should be set to unknown (parent ID set to zero).
To ensure sufficient pedigree information, evaluations should include a minimum of 3 generations of pedigree, even if phenotype records are not available for all such animals.
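The pedigree completeness measures described in this section could be computed along the following lines. The record layout (animal, sire, dam), the use of the string "0" for an unknown parent and the names in the returned dictionary are assumptions made for the sketch.

```python
UNKNOWN = "0"  # parent ID used for unknown or doubtful parentage


def pedigree_completeness(pedigree):
    """Percentages of animals with identified sire, dam, and both parents.

    pedigree: list of (animal_id, sire_id, dam_id) tuples.
    """
    total = len(pedigree)
    if total == 0:
        raise ValueError("pedigree is empty")
    sire_known = sum(1 for _, sire, _ in pedigree if sire != UNKNOWN)
    dam_known = sum(1 for _, _, dam in pedigree if dam != UNKNOWN)
    both_known = sum(
        1 for _, sire, dam in pedigree if sire != UNKNOWN and dam != UNKNOWN
    )
    return {
        "sire_identified_pct": 100.0 * sire_known / total,
        "dam_identified_pct": 100.0 * dam_known / total,
        "both_identified_pct": 100.0 * both_known / total,
    }


# Usage with a made-up pedigree of four animals
example = [("A", "S1", "D1"), ("B", "S1", "0"), ("C", "0", "0"), ("D", "S2", "D2")]
print(pedigree_completeness(example))
```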
Genetic defects
The information that an animal is a carrier of genetic defects, as defined by the International Breed Association, should be made available internationally as soon as possible after such information is discovered. For most breed associations the transfer of such information currently happens through bilateral (in most cases manual) exchange of data. To help share such information internationally, Interbull Centre has collaborated with the World Holstein Friesian Federation (WHFF), which has agreed to share its harmonised codes and nomenclatures pertaining to true genetic tests using the Interbull Centre’s database (IDEA) and its dedicated module for sharing animal information (AnimInfo). International breed associations are strongly encouraged to work on the standardisation and harmonisation of the genetic defects most relevant to their breeds, so that sharing of such information can also be improved through use of the IDEA AnimInfo module.
Sire categories
Countries should clearly and correctly describe the different sire categories, that is, distinguish between:
- domestically proven bulls;
- imported bulls;
- young bulls genomically tested but not yet selected for AI;
- young bulls with first batch of daughters;
- proven bulls with second batch of daughters;
- bulls with only parent average and genomic information; and
- most importantly, natural service (NS) bulls versus AI bulls.
Quantitative measures should be employed to define AI bulls. Responsible organisations are recommended to strive for establishing daughters in a large number of herds (preferably > 10) for young AI bulls.
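One possible quantitative measure is the number of distinct herds in which a bull has daughters. The sketch below counts herds per bull from (bull, herd) pairs; the record layout and identifiers are hypothetical, and the threshold simply mirrors the "preferably > 10" stated above.

```python
from collections import defaultdict


def herds_per_bull(daughter_records):
    """Count distinct herds with daughters for each bull.

    daughter_records: iterable of (bull_id, herd_id) pairs, one per daughter.
    """
    herds = defaultdict(set)
    for bull, herd in daughter_records:
        herds[bull].add(herd)
    return {bull: len(herd_set) for bull, herd_set in herds.items()}


# Usage with made-up data: bull B1 has daughters in 11 herds, B2 in 2 herds
records = [("B1", f"H{i}") for i in range(11)]
records += [("B2", "H1"), ("B2", "H1"), ("B2", "H2")]
counts = herds_per_bull(records)
well_spread = {bull for bull, n in counts.items() if n > 10}
print(counts, well_spread)
```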
Type of proof
Countries should clearly and correctly describe the different types of proof used, distinguishing between proofs:
- based on first crop sampling daughters;
- based on first and second crop daughters;
- based on parent average and genomic information only;
- based on imported semen of proven bull, second crop daughters only;
- based on more than 50% imported daughters or daughters born from imported embryos.
Official publication of proof
Countries should clearly and correctly describe the publication status of proofs, that is, distinguish between cases where:
- the bull proof meets national standards for official publication in the country sending the information;
- the bull is part of a simultaneous progeny-testing program, but the proof does not yet meet national standards for official publication.
Young bulls may be used in simultaneous progeny testing in two or more countries, with a large enough number of daughters in each country to warrant an independent official evaluation. These bulls should be clearly classified as “simultaneously progeny tested bulls”.
Traits of evaluation
Direct measurement of traits and use of the metric system are encouraged. Recording organisations should adopt recording schemes that ensure accurate collection and reporting of all data. It is recommended that national genetic evaluation centres provide detailed definitions of traits on their web sites when possible, in line with the ICAR atlas. The definitions should include all data checks and edits, such as the range of acceptable phenotypic values, age, parity, etc.
Data requirements for various traits of interest
Records of all animals with known Animal ID should be included in the genetic evaluations.
All records should be accompanied by relevant dates (birth, calving, etc.).
All records should be accompanied by sufficient information for formation of contemporary groups, such as herd and geographical location of the herd (e.g. region). Information on internationally standardised methods of recording should be included. An example for the production traits is ICAR A4, A6, B4, etc.
All other relevant information, depending on the trait of interest, should accompany the records, e.g. number of milkings per day, production system (e.g. Alpine pasture, total mixed ration (TMR) or grazing), methods for estimation of 24-hour and 305-day yields, extension methods, adjustment methods etc.
The number of years of production data included in the evaluations should preferably be at least 3 generation intervals (i.e. 15 years) of consistently recorded data.
Number of lactations included
Number of lactations to be included in the evaluations is recommended to be at least three. Breeding values should be produced for the whole lactation period, separately for different lactations. Separate breeding values should then be combined into one single composite breeding value for each trait for the whole life, in which different lactations are given separate weights based on each lactation’s economic value.
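Combining lactation-specific breeding values into one composite value amounts to a weighted sum. The following is a minimal sketch: the weights and breeding values are invented for illustration, and requiring the weights to sum to 1 is an assumption rather than something stated in these guidelines.

```python
def composite_breeding_value(lactation_ebvs, weights):
    """Weighted sum of lactation-specific breeding values.

    lactation_ebvs: breeding values for lactation 1, 2, 3, ...
    weights: one weight per lactation (e.g. reflecting economic value);
             assumed here to sum to 1.
    """
    if len(lactation_ebvs) != len(weights):
        raise ValueError("one weight is needed per lactation")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return sum(w * ebv for w, ebv in zip(weights, lactation_ebvs))


# Usage with hypothetical values for three lactations
print(composite_breeding_value([100.0, 120.0, 90.0], [0.5, 0.3, 0.2]))
```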
Data quality
It is desirable that all data related to all animals (herd book, insemination, milk recording, veterinary practices, etc.), irrespective of their sources, be available to the genetic evaluation centres in the form of an integrated database. Complete documentation of data checks, including data edits conducted by milk recording organisations, is essential. All member organisations / countries should adopt quantitative measures for assessing data quality. National genetic evaluation centres should devise simple checks to detect outliers and exclude logical inconsistencies in the input data. Biological improbabilities should also be checked. Extra precautions should be taken so that no inadvertent selection of data or introduction of bias becomes possible. Poor quality data should be excluded from genetic evaluations. Complete documentation of all procedures to check and edit the data is very important. National genetic evaluation centres are encouraged to implement quality assurance systems.
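The three kinds of check mentioned above (logical inconsistencies, biological improbabilities and outliers) can be illustrated with a simple record check. The thresholds below are hypothetical placeholders, not recommended values; each centre would document its own limits.

```python
from datetime import date

# Hypothetical limits, for illustration only
MILK_KG_RANGE = (1000.0, 25000.0)  # accepted range for 305-day milk yield
MIN_AGE_AT_CALVING_DAYS = 500      # younger ages are treated as improbable


def check_record(birth, calving, milk_kg):
    """Return a list of problems found in one lactation record."""
    issues = []
    if calving <= birth:
        # Logical inconsistency: an animal cannot calve before it is born
        issues.append("calving date not after birth date")
    elif (calving - birth).days < MIN_AGE_AT_CALVING_DAYS:
        # Biological improbability
        issues.append("age at calving biologically improbable")
    low, high = MILK_KG_RANGE
    if not low <= milk_kg <= high:
        # Outlier relative to the accepted phenotypic range
        issues.append("milk yield outside accepted range")
    return issues


# Usage with made-up records
print(check_record(date(2020, 1, 1), date(2022, 3, 1), 9000.0))
print(check_record(date(2020, 1, 1), date(2020, 6, 1), 50000.0))
```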
Inclusion and extension of records
Different kinds of lactations, i.e. records in progress, records from culled cows, records of dried off cows (i.e. lactations of cows remaining in the herd but terminated artificially because of a new pregnancy or any other management reasons), naturally terminated lactations shorter than 305 days and finally, lactations longer than 305 days should be identified in the system and treated differently.
All records with ≥ 45 days in milk (DIM) or two test days should be included in the evaluations. Whether or not to extend should be decided after sufficient scientific/empirical justification has been established for each kind of lactation. Records in progress and short lactations from culled cows should normally be extended. Lactations of cows dried off before 305 days and naturally terminated lactations shorter than 305 days may be extended, provided adjustment for days open and/or current calving interval has not been satisfactory. Data from lactations longer than 305 days should be cut at 305 days.
Extension methods and factors should be re-evaluated continually to ensure that they are up to date and that no unplanned selection of data occurs. Extension factors should be re-estimated at least every 5 years. Different kinds of lactations should be extended using the same extension method but different extension factors. Extension rules and methods should be the same across lactations. Whenever the data span many years, the extension rules and factors should be appropriate and specific to the various time periods.
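The inclusion rule and the 305-day cut-off described in this section can be sketched as follows. The function names and the (DIM, kg) layout of test-day records are assumptions, and "two test days" is read here as "at least two".

```python
MIN_DIM = 45        # minimum days in milk for inclusion
MIN_TEST_DAYS = 2   # alternative criterion (read as "at least two")
CUT_OFF_DIM = 305   # data beyond this point are not used


def include_record(days_in_milk, n_test_days):
    """Apply the inclusion rule: >= 45 DIM or two test days."""
    return days_in_milk >= MIN_DIM or n_test_days >= MIN_TEST_DAYS


def cut_at_305(test_days):
    """Keep only test-day records up to and including day 305.

    test_days: list of (days_in_milk, milk_kg) tuples.
    """
    return [(dim, kg) for dim, kg in test_days if dim <= CUT_OFF_DIM]


# Usage with made-up values
print(include_record(30, 2))   # included through the test-day criterion
print(include_record(30, 1))   # excluded
print(cut_at_305([(10, 30.0), (300, 18.0), (340, 15.0)]))
```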
Pre-adjustment of records
All effects should preferably be accounted for in the evaluation model. If records are to be pre-adjusted, it is more justifiable to do so for those environmental effects that need multiplicative adjustments. Effects that need additive adjustments should be considered in the model. In any case, adjustment should be made to the population mean and not to an extreme class. Pre-adjustment factors should be updated as often as possible (at least once per generation) and be specific to different time periods.
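As a simple illustration of adjusting to the population mean rather than to an extreme class, a multiplicative factor for each class can be taken as the ratio of the overall mean to the class mean. This is one elementary way of deriving such factors and an assumption of this sketch; the class labels and values are invented, and operational procedures may be considerably more elaborate.

```python
from collections import defaultdict
from statistics import mean


def multiplicative_factors(records):
    """Factor per class that scales the class mean to the overall mean.

    records: list of (class_label, value) tuples, e.g. age class and yield.
    """
    overall = mean(value for _, value in records)
    by_class = defaultdict(list)
    for label, value in records:
        by_class[label].append(value)
    return {label: overall / mean(values) for label, values in by_class.items()}


def pre_adjust(records, factors):
    """Apply the class-specific multiplicative factor to every record."""
    return [(label, value * factors[label]) for label, value in records]


# Usage with made-up records for two age classes
data = [("young", 80.0), ("young", 100.0), ("mature", 110.0), ("mature", 130.0)]
factors = multiplicative_factors(data)
print(factors)
print(pre_adjust(data, factors))
```

After adjustment both classes have the same mean as the whole data set, so no class serves as the reference level.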
Adjustment for Genomic Reliability values
Theoretical genomic reliabilities depend on the assumptions of the conventional or genomic models. They tend to be higher than the realised reliabilities, which are calculated from validation R² values derived from genomic validation with truncated data. Therefore, the theoretical model genomic reliabilities must be adjusted to the level of the realised ones. An adjustment procedure for genomic reliability values has been developed using genomic validation results following Interbull’s GEBV Test (Mäntysaari et al. 2010). Interbull recommends following the procedure put together by Liu et al. (2017), available at https://interbull.org/static/web/A_technical_document_on_derivation_and_application_of_adjustment.pdf and https://interbull.org/static/web/A_supplementary_document_to_the_Interbull_genomic_reliability_method-1.pdf
Evaluation steps
Statistical treatment and effects in the genetic evaluation model
Organisations responsible for national GES should strive for simplicity of the analysis model and avoid amendments that reduce simplicity and clarity of the analysis model. The best model should be decided upon considering the fit and predictive ability of the model.
Decisions on statistical treatments and effects in the model should take into consideration several factors, such as:
- How large are (contemporary) group sizes?
- Are the estimates of parameters constant over time?
- Are multiplicative adjustment factors necessary?
- What are the consequences of the environmental effects being adjusted for or included in the model for components of variance?
- Is the effect to be estimated from data or from the main random effects included in the model (breeding values, residuals)?
- What are the effects of different combinations of parameters on the degrees of freedom and on the fit of the model?
In considering an effect as fixed or random the following should be taken into consideration:
- If there is enough evidence to suggest that the effect is non-randomly associated with the main random effect;
- If the number of levels is small;
- If the size of groups is large;
- If the effect has a repeating nature;
- If the effect is used to elucidate the time trend.
For the choice of evaluation model for milk production traits the following set of priorities is recommended:
- An animal model in contrast to a sire model;
- A within lactation multiple trait model in contrast to a within lactation single trait model;
- A multiple lactation model in contrast to a single lactation model;
- A multiple trait multiple lactation model in contrast to a single trait repeatability model;
- A test day model in contrast to a lactation model.
Explanatory note
The above recommendation deals almost exclusively with milk production traits and does not take into consideration many aspects of genetic analysis models for other traits (see below for calving traits). The guiding principle is to choose a model that is more capable of utilising (or exposing) the genetic variation. This translates into a choice of models that either are theoretically superior or enable us to obtain an estimate of an animal’s breeding value that encompasses a larger proportion of the animal’s genome and/or lifetime. Interbull recommends adherence to superior theoretical models and encourages identification of the practical circumstances under which the theoretical expectations are not realised.
Statistical treatment and effects in the genetic evaluation model for calving traits
For calving traits, whenever possible, calving ease (CE) and stillbirth (SB) proofs should be estimated by:
- Multi-trait: considering CE (first parity), CE (later parities), SB (first parity), and SB (later parities).
- Animal model.
- Fitting both direct and maternal genetic effects.
If at the national level the data structure prevents the application of an animal model, a Sire-Maternal-GrandSire (S-MGS) model can be fitted instead. In this case, the SIRE and MGS genetic effects are, by definition, predicted transmitting abilities (PTA). Denoting direct (D) and maternal (M) genetic effects with the related subscripts:
SIRE genetic effect = $\mathrm{PTA}_D$
MGS genetic effect = $\tfrac{1}{2}\,\mathrm{PTA}_D + \mathrm{PTA}_M$
Deriving $\mathrm{PTA}_D$ and $\mathrm{PTA}_M$ as functions of the SIRE and MGS genetic effects, the following proof definitions are recommended for the calving traits evaluation:
- for direct calving ease (DCE) and direct stillbirth (DSB): $\mathrm{PTA}_D = \mathrm{SIRE}$
- for maternal calving ease (MCE) and maternal stillbirth (MSB): $\mathrm{PTA}_M = \mathrm{MGS} - \tfrac{1}{2}\,\mathrm{SIRE}$
- The effective daughter contribution (EDC) for these data submissions should be consistent with these linear functions and computed using multiple-trait EDC methods (Sullivan, 2007, Interbull Bulletin 37, 78-81; Sullivan et al., 2006, Interbull Bulletin 35, 112-116).
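The conversion from S-MGS solutions to direct and maternal PTAs given above is a simple linear transformation, sketched here with a made-up pair of solutions. The function name is hypothetical; the same transformation applies to both calving ease and stillbirth.

```python
def pta_from_s_mgs(sire_effect, mgs_effect):
    """Convert S-MGS model solutions into direct and maternal PTAs.

    Uses SIRE = PTA_D and MGS = 0.5 * PTA_D + PTA_M, solved for the two PTAs.
    """
    pta_direct = sire_effect
    pta_maternal = mgs_effect - 0.5 * sire_effect
    return pta_direct, pta_maternal


# Usage with invented solutions: sire effect 2.0, MGS effect 1.5
pta_d, pta_m = pta_from_s_mgs(2.0, 1.5)
print(pta_d, pta_m)  # 2.0 0.5
```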
Model’s unbiasedness
For the purpose of international genetic evaluations, unbiasedness should be considered the single most important criterion, although some degree of compromise can be envisaged for the national genetic evaluation, for example to avoid high prediction error variance. For this purpose, Interbull has put forward five different validation tests to assess the level of bias in the genetic model. More information about the tests is available in Guidelines chapter 4 - Post-evaluation steps “system validation”.
Genetic parameters
Phenotypic and genetic parameters should be estimated as often as possible, and at least once per generation. All aspects of the procedures for estimating variance components (data structure, method and model of estimation, effects included in the model and so on) should be as similar as possible to the procedures used for estimating breeding values.
Use of phantom parent groups
The evaluation procedure should group unknown parents according to breed, country of origin, selection path and birth date, or use some other method to establish time trends. The procedures used to form phantom parent groups must give special attention to imported animals so that these are evaluated correctly in the national GES. Phantom parent groups should have a minimum size of 10-20 animals, although larger groups may be necessary for traits with low heritability.
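A quick way to monitor the minimum group size is to count the unknown parents assigned to each group and list the groups that fall short. In this sketch the group key (breed, country, selection path, birth year) and the default threshold of 10, the lower end of the range given above, are assumptions for illustration.

```python
from collections import Counter


def undersized_groups(group_keys, min_size=10):
    """Return phantom parent groups with fewer than min_size members.

    group_keys: one key per unknown parent, e.g. a tuple of
                (breed, country_of_origin, selection_path, birth_year).
    """
    counts = Counter(group_keys)
    return {key: n for key, n in counts.items() if n < min_size}


# Usage with made-up assignments: one group of 12 and one group of 3
keys = [("HOL", "USA", "sire", 2010)] * 12 + [("HOL", "NLD", "dam", 2010)] * 3
print(undersized_groups(keys))
```

Groups returned by such a check are candidates for merging with a neighbouring group, for example an adjacent birth-year band.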
Use of Single Step evaluation
Interbull’s EBVs are used as inputs to national genomic evaluations, as pseudo-phenotypes to predict genotype effects. It is therefore extremely important that such EBVs do not include any type of genomic information; otherwise the related national and international genomic evaluations would accumulate bias that would increase exponentially from evaluation to evaluation.
NGECs using a single-step approach in their national GES are encouraged to apply one of the following Interbull recommendations before sending their national conventional EBVs for a MACE Interbull evaluation:
- Generate EBVs from pre-adjusted phenotypes, using estimates of environmental effects from the single-step model;
- Generate EBVs from a BLUP evaluation excluding genotypes.
NGECs are, however, allowed to submit single-step national evaluation results (GEBVs) as input to the Interbull genomic (GMACE) evaluation.