Milk Analysis

Field of application

These guidelines concern methods for fat, protein, lactose, urea and somatic cell count determinations in individual cow, goat and ewe milk. Milk samples are in most cases preserved with chemical substances. This will be taken into account in the procedures. These guidelines define:

Authorised reference methods.
Operation of validated routine methods.
Recommendations for controlling sample quality.
Recommendations for quality control of analyses.

Reference methods

The wording "reference methods" designates the methods used to calibrate the routine methods.

The reference methods should be internationally standardised methods (i.e. ISO, IDF, AOAC methods), although practical arrangements are permitted (see note below). The reference methods are listed in clause 8 (Appendix 2. Methods) below.

Note: Reference transfers

Rapid chemical methods can be used instead of a more time-consuming reference method provided that their results have been shown to be equivalent to those from reference methods (e.g. Gerber method for fat, Amido Black method for protein, enzymatic method for lactose).
Master instruments may be used to produce "reference values" for other instruments and for other laboratories in a system with centralised calibration. Instrumental values may be considered equivalent to the values of the method used as reference for the calibration. Application of a centralised calibration concept must take into account the sensitivity of routine methods to matrix effects (milk composition).

Routine (instrumental) methods

Routine methods should be methods which are fit for purpose on the basis of a performance evaluation by an expert laboratory using a standardised protocol, or methods certified at the international level by ICAR. In this respect, the conditions and procedure of evaluation, as well as the requirements for ICAR certification, are defined in a standard protocol certified by ICAR (Procedure 1 of Section 12 of the ICAR Guidelines) as relevant for the purpose of milk recording.

Specific recommendations for controlling the quality of DHI milk samples

Refer to Section 11 for guidelines on devices for collecting milk samples and sample sizes.

The quality of the sample is the first major requirement for a consistent analytical result. Good quality samples are a prerequisite to establish whether analytical quality requirements are met.

Bottles

In general terms, vials and stoppers must be suitable for their purpose (to bring milk to laboratories without loss or damage). For instance, too large an empty volume above the milk may facilitate churning during transport, especially with non-refrigerated milk. Too small an empty volume above the milk may give rise to problems with mixing. Fat loss may occur if stoppers are not perfectly tight.

Preservatives

Preservation of milk recording samples using chemical compounds should:

Maintain the physical and chemical properties of the milk during the period between sampling and analysis under the locally applicable temperature and transport conditions.
Not prevent reference analysis from being performed, so that laboratories retain the possibility of comparative analysis.
Have no effect on the results of analysis with the reference methods and no or only a limited but consistent effect on the reference method and routine method responses. A limited but consistent effect can be compensated for through calibration and/or applying a fixed correction.
Be innocuous to DHI and laboratory staff according to local health and safety regulations.
Be innocuous to environment according to local environmental regulations.

Notes

Sample preservation is promoted by working with clean milking and sampling equipment and by storing samples at cool temperatures for a limited time with a minimum of handling.
Appropriate preservatives are mentioned in relevant standards with guidance (ISO 9622 | IDF 141 and ISO 13366 | IDF 148). Nevertheless, care must be taken in general with:
- the preservative excipient: depending on the excipient (generally salts), various effects can be observed for applied formulations where none exists in the pure form (case of potassium dichromate and bronopol in milk by mid infra-red spectrometry);
- some dyes used as colour tracers, which may interfere with the instrumental response (absorption of light or dye-binding with DNA). The accuracy or the sensitivity of a method may therefore be reduced. These dyes should be avoided.

Quality control in DHI laboratories

Quality control on reference methods

Any systematic error with the reference method leads to an overall systematic error on routine results. This type of error, which may exist between laboratories within a country (or organisation) and between countries co-operating within international frameworks such as ICAR, justifies performance evaluations at both levels, national and international.

External control

Every DHI routine laboratory should participate or otherwise be linked in with an interlaboratory proficiency testing (IPT) scheme. Proficiency testing should be organised preferably by a national reference or pivot laboratory appointed for that by the national DHI organisation. The reference laboratory will provide analytical precision traceability by its regular participation in international proficiency trials.

Note:

In situations where there are not sufficient laboratories to implement a national scheme, the laboratory can join PT schemes organised by a national or an international PT provider or the national DHI PT scheme of a neighbouring country.

The minimum frequency for participation in interlaboratory proficiency testing should be 2 times a year.

National reference laboratories should take part in international proficiency tests at a minimum frequency of twice per year. A more frequent participation is advised.

These trials are to be organised according to international standards, or failing that, international guidelines or agreements as indicated in this section.

Internal control

Where available, reference materials (RMs) should be used to check the exactness and the stability of the reference methods by comparison with nominal values. They should preferably be used when reference analyses for calibration of routine methods are carried out.

These can be:

Certified reference materials (CRMs) produced by a recognised official organisation.
Secondary reference materials (SRMs) prepared by an external supplier.
In-house reference materials (IRMs) prepared by the laboratory itself, where traceability is established with CRMs, SRMs or interlaboratory proficiency tests.

Whatever the choice made by the laboratory, CRMs and SRMs are to be produced and provided in QA conditions and according to international standards, or failing that, international guidelines or agreements as indicated in clause 5.1.1 above.

Quality control on routine methods

Routine methods provide the results effectively used for DHI purposes and, therefore, their consistency has to be checked.

For this, reference is made to the standard ISO 8196-2 | IDF 128-2.

External control

A periodic check of the accuracy must be applied by a national expert laboratory, either through individual external control (IEC), by comparison of routine methods to reference analysis on samples representative of the laboratory area, or through participation in interlaboratory proficiency testing when it has been clearly demonstrated that a single calibration can be used for all the laboratories. In the latter case, recommendations in clause 5.1.1 above are to be followed. The minimum frequency recommended is 2 times a year.

Repeatability and suitability of calibration are the main parameters to be checked. Depending on the experimental design, additional aspects can be evaluated such as sample preservation and instrumental parameters such as linearity, intercorrections (with MultiLinear Regression (MLR)-based calibration models) and intra-laboratory reproducibility.

Internal control

Irrespective of the parameter, internal quality control on routine methods has to be carried out during routine testing at the laboratory.

In general the standard ISO 8196 | IDF 128 does not define limits to fulfil for each method and/or milk component. Therefore specific standards have to be applied where they exist:

Fat, protein, lactose and urea (mid infra-red spectrometry): ISO 9622 | IDF 141.
Somatic cell count: ISO 13366-2 | IDF 148-2.

Preparation of control or pilot samples, used for monitoring instrument stability, should be made under quality assurance (i.e. quality control for homogeneity and stability), thereby referring to relevant indications of international standards/guides for reference materials.

According to ISO 8196 | IDF 128, the major checks in quality control are on:

Repeatability.
Daily and short-term stability of instrument.
Calibration.

In addition, checks are recommended for:

Carry-over effect (all methods).
Linearity (all methods).
Zero-setting (all methods).
Intercorrections (with MLR-based calibration models).
Homogenisation (infra-red).

It is advised to fulfil requirements about frequencies and limits as in clause 7 below.

Requirements for analytical quality control and quality assurance tools

Interlaboratory proficiency tests

Interlaboratory proficiency trials are to be organised in quality assurance conditions, according to international standards, or failing that, international guidelines or agreements:

ISO 17043.
ILAC-G13.
International Harmonized Protocol for Proficiency Testing of (Chemical) Analytical Laboratories (IDF Bulletin 342:1999).
ISO Standard 13528.

Reference materials

Reference materials used for DHI analytical purposes are to be produced in quality assurance conditions, according to international standards, or failing that, international guidelines or agreements:

ISO 17034
ILAC-G9.
ILAC-G12.

Choice of AQA service suppliers

The choice of Analytical Quality Assurance (AQA) service suppliers (i.e. for proficiency testing and reference materials) by DHI laboratories is to be made in close connection with the overall DHI AQA system.

Service suppliers should operate under quality assurance and be able to provide documented proof of that.

Service suppliers should submit themselves to a periodic independent audit, i.e. by a third party, in order to have the conformity of their QA system judged. These audits can be carried out by accreditation assessors, commissions of user representatives or experts acting on behalf of the national DHI organisation, provided that their competence and independence are guaranteed and that the audits are conducted in line with ISO and ILAC recommendations.

Appendix 1. Analytical quality control in milk testing laboratories

It is to be expected that meeting these requirements will provide a satisfactory minimum quality level for analytical measurements, as well as comparability between laboratories and countries. If the following scheme cannot be immediately applied, it should be considered as a target.

Components of quality control and recommended minimum frequencies

Table 1. Components of quality control and recommended minimum frequencies.
Control	Frequencies	Mode
Reference methods
- External control	Half-yearly	IPT
- Internal control	Weekly (check of mean bias)	CRMs, SRMs, IRMs
Routine methods
- External control	Half-yearly	IPT/IEC
- Internal control	(see 1.1)	IRMs

IPT: Interlaboratory Proficiency Testing.

CRMs: Certified Reference Materials.

IEC: Individual External Control.

SRMs: Secondary Reference Materials.

IRMs: In-house Reference Materials.

Frequencies and limits for routine methods

The frequencies and limits stated hereafter in Table 2 are in part defined in existing ISO | IDF standards or derived from the recommendations they contain. Other values are indicative as they are not defined in a standard. Experience will show whether or not the latter are suitable for all laboratories.

Limits stated below in Table 2 are proposed as "action limits" for internal instrument management. They should only be considered as targets to users and not be used for external evaluations for which other (larger) values can appear more suitable.

Table 2. Minimum frequencies and limits for checking routine methods.
Checks	Frequencies	F P L limits (fat, protein, lactose)		SCC limits
Instrumental fittings
Homogenization	Monthly	≤ 1.0 % relative	(a)	Not applicable
Carry-over	Monthly	≤ 1 % relative	(a)	≤ 2 % relative	(b)
Linearity (curving)	Quarterly	≤ 2 % of range	(a)	≤ 2 % of range	(b)
Intercorrection	Quarterly	±0.02 % units	(a)	Not applicable
Calibration
Mean bias	Weekly	±0.02 % units	(c)	±5 % relative	(c)
Slope	Monthly	1.00±0.03	(c)	1.00±0.05	(c)
Overall daily stability
Repeatability (sr)	Daily/every start-up	0.014 % units	(a)	6 % relative at 150 000 cells/ml; 5 % relative at 300 000 cells/ml; 4 % relative at 450 000 cells/ml; 3 % relative at >750 000 cells/ml	(b)
Daily/short-term stability	≥ 3/hour	±0.05 % units	(c)	±10 % relative	(c)
Zero-setting	≥ 4/day	±0.03 % units	(c)	≤ 8 000 cells/ml	(b)

(a): Limit stemming from ISO 9622 | IDF 141

(b): Limit stemming from ISO 13366-2 | IDF 148-2

(c): Indicative limit as there is no value specified in corresponding international standards

Note 1:

In case calculated values are out of limits but do not differ from a statistical point of view, adjustments in instrumental settings are not justified. Therefore, representative and/or adequate sample sets should be used in such a way that any outside value is significantly different. Relevant aspects in this are type and number of samples, number of replicates and level of concentration.

Note 2:

a) Milk with high fat and protein concentrations (milk of buffaloes, ewes, and specific cow and goat breeds): because of the variable and high fat and protein contents, reliable limits for repeatability and short-term stability can be determined by multiplying the limits for cows by the ratio of the average level for buffaloes (or ewes) to the average level for cows.
b) Goat's milk: limits can be the same as for cow's milk in case of similar fat and protein content. In case of high fat and protein contents, one will operate according to a).
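
The scaling described in Note 2 can be sketched as follows. This is a minimal illustration only: the function name and the average fat levels used in the example are hypothetical and are not taken from any standard.

```python
def scaled_limit(cow_limit, species_mean, cow_mean):
    """Scale a limit defined for cow's milk by the ratio of average levels.

    cow_limit    -- limit stated for cow's milk (e.g. a repeatability limit)
    species_mean -- average component level for the other species
    cow_mean     -- average component level for cows (same unit)
    """
    if cow_mean <= 0:
        raise ValueError("cow_mean must be positive")
    return cow_limit * species_mean / cow_mean


# Hypothetical averages: 7.0 g/100 g fat for ewes, 4.0 g/100 g for cows.
# 0.014 % units is the repeatability limit given for F P L in Table 2.
ewe_repeatability_limit = scaled_limit(0.014, species_mean=7.0, cow_mean=4.0)
print(round(ewe_repeatability_limit, 4))
```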

Note 3:

The criteria are calculated according to the ICAR protocol on milk analyser evaluation and ISO 8196-3 | IDF 128-3.

Checking

Check on homogenisation (only applicable with IR instruments): In infra-red analysis, the natural size of fat globules strongly affects the measurement of fat. Therefore the fat globule size is reduced by homogenisation before the measurement. Inefficient homogenisation results in poor repeatability and drifts of the signal.
Check on carry-over: In case of successive samples with strong differences of component concentrations, the result for a sample may have been affected by the former milk sample, e.g. by the residual volume of milk in the flow system or by the contamination by the stirrer and the pipette. The error is a proportion of the difference of concentration with the previous sample. The overall carry-over effect should be minimised, should not exceed limits stated and can be corrected for in routine operation by applying carry-over compensation factors.
Check on linearity: Specific sets of samples are prepared in order to cover the whole range of concentration and check that the instrumental measurement is proportional to the concentration of the component measured. The percentage of the bending can be estimated by the ratio (range of the residuals observed) x 100 /(range of the levels).
Check on intercorrections (with MLR-based calibration models): Specific sets of samples are prepared in order to create independent modifications in the respective components and verify that changes in one particular component do not significantly affect the measurement of the other components. Intercorrections are set in order to compensate for the natural interactions due to an incomplete specificity of methods. The larger the range of concentrations of the correcting channel, the bigger the potential error due to an inadequate intercorrection adjustment for the corrected channel.
Check on the mean bias: Representative milk samples are used to check the validity of the calibration at a medium level and to indicate whether any drift has occurred due to changes in milk composition or progressive wear of instruments.
Check on the slope: Specific sets of (calibration) samples are prepared in order to cover the whole range of levels and check that the slope is within the stated limits. The larger the range of concentrations, the bigger the error for extreme values in case of an inadequate slope adjustment.
Repeatability: A repeatability check indicates whether or not the instrument is working stably. Repeatability is evaluated at the start-up of each instrument on the basis of a 10-fold replicate analysis of one (control) milk sample. During routine testing a regular test can be made by replicate analysis of the control sample. The estimate of the standard deviation of repeatability should meet the stated limits.
Daily and short-term stability: Every day, and regularly throughout the working day, the so-called control samples (or pilot samples) are used to check instrument functioning at one or more concentration levels. Differences observed against assigned values should not exceed the stated limits +/-L. It is advised to complete the control by calculating the cumulative mean of the n successive differences, which should not exceed the limits +/-L/√n, see ISO 8196 | IDF 128.
Zero-setting: Rinsing the flow system and checking the "zero value" are periodically required to check for fouling on the walls of the measurement cells and/or (depending on instruments) to detect any drift of the basic signal.
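
Two of the checks above lend themselves to a short worked sketch: the cumulative-mean control of short-term stability (limits +/-L and +/-L/√n) and the linearity ratio (range of residuals x 100 / range of levels). The function names and the example figures below are illustrative assumptions, not part of the guidelines.

```python
import math


def stability_check(differences, limit):
    """Check successive control-sample differences against +/-L and +/-L/sqrt(n).

    differences -- successive (measured - assigned) values for the control sample
    limit       -- action limit L for a single difference
    Returns a list of (n, difference, cumulative_mean, within_limits) tuples.
    """
    results = []
    running_sum = 0.0
    for n, diff in enumerate(differences, start=1):
        running_sum += diff
        cumulative_mean = running_sum / n
        single_ok = abs(diff) <= limit
        cumulative_ok = abs(cumulative_mean) <= limit / math.sqrt(n)
        results.append((n, diff, cumulative_mean, single_ok and cumulative_ok))
    return results


def linearity_percent(residuals, levels):
    """Return (range of residuals) x 100 / (range of levels)."""
    level_range = max(levels) - min(levels)
    if level_range <= 0:
        raise ValueError("levels must cover a non-zero range")
    return (max(residuals) - min(residuals)) * 100.0 / level_range


# Hypothetical fat differences (% units) with L = 0.05: every single difference
# is within +/-L, but the cumulative mean leaves +/-L/sqrt(n) at the 4th check.
for row in stability_check([0.02, 0.03, 0.03, 0.03], limit=0.05):
    print(row)
```

The cumulative mean tightens the tolerance as checks accumulate, so a small but persistent one-sided deviation is flagged even though no individual difference exceeds L.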

Appendix 2. Methods

International reference methods

Table 3. International reference methods.
Fat
Gravimetry (Röse-Gottlieb)	ISO 1211 | IDF 1
Gravimetry (modified Mojonnier)	AOAC 989.05
Crude (or total) Protein
Titrimetry (Kjeldahl)	ISO 8968 | IDF 20, parts 1 and 3
	AOAC 991.20
	AOAC 991.21
	AOAC 991.22
	AOAC 991.23
Casein
Titrimetry (Kjeldahl)	ISO 17997 | IDF 29, parts 1 and 2
	AOAC 998.05
	AOAC 998.06
	AOAC 998.07
Lactose
HPLC	ISO 22662 | IDF 198
Urea
Differential pH-method	ISO 14637 | IDF 195
Somatic cell count
Direct microscopic somatic cell count	ISO 13366-1 | IDF 148-1

Secondary international reference methods

Table 4. Secondary international reference methods.
Fat
Butyrometric method (Gerber)	ISO 19662 | IDF 238
	AOAC 2000.18
Babcock	AOAC 989.04
Protein
Dye-binding (Amido Black)	AOAC 975.17
Lactose
Enzymatic	ISO 5765 | IDF 79
	AOAC 984.15
Differential pH-method	ISO 26462 | IDF 214

Standardized routine methods

Table 5. Standardized routine methods.
Fat
Automated turbidimetry I	AOAC 969.16
Automated turbidimetry II	AOAC 973.22
Protein
Automated dye-binding (Amido Black)	AOAC 975.17
Fat-protein-lactose-urea
Mid infrared (MIR) spectrometry	ISO 9622 | IDF 141
	AOAC 972.16
Somatic cell count
Fluoro-opto-electronic methods	ISO 13366-2 | IDF 148-2
	AOAC 978.26

Procedure 1: Protocol for evaluation of milk analysers for granting ICAR certification

Foreword

The present protocol has been produced by the Working Group on Milk Testing Laboratories.

Though various standards or normative documents already treat the subject of the evaluation of instrumental or indirect or alternative methods, there are as yet no documents with sufficient practical indications on the way to execute, and on the specific technical requirements to fulfil, in the evaluation of analytical routine methods for the particular aspect of the Certification for milk recording by an (official) international body such as ICAR.

Therefore, it is the aim of the present document to define an overall procedure starting from the request for the certification, the procedure for the certification, the description of the technical evaluation needed, providing at the end the elements for a decision on certification.

The present document complies with ISO Standard 8196 (equivalent of IDF standard 128) and will concern milk of various species within the scope of ICAR (cows, goats, ewes, buffaloes) and the various components of interest for milk recording (fat, protein, lactose, somatic cell count, urea).

Introduction

Before being used for milk recording, a new analytical method or new equipment is to be submitted to an evaluation and must be approved for use by a competent body. At present, evaluations are carried out individually with, as a consequence, possible multiplication of evaluations in numerous countries. Moreover, the absence of a common protocol for such evaluations can result in incomplete and inaccurate technical information and numerous reports with non-comparable or partly comparable results.

The objective of this protocol is to define all relevant analytical parameters to be evaluated, providing respective limits to comply with in the relevant ranges for various animal species.

On the basis of this protocol, a limited number of evaluations should suffice to decide about an international certification on common ICAR rules for the application of analytical methods and/or equipment in milk recording.

Rules of the certification

Stages of the evaluation and general principles

Phase I: Every new instrument will be evaluated under specific test-bed conditions, within the period of time necessary to assess all the technical requirements prescribed in the present protocol. This part of the evaluation must be carried out by an expert laboratory specialised in analytical evaluations as well as experienced in the reference method(s) required. This laboratory should be accredited for this activity or be recognised as competent for this task by a competent body (national milk recording organisation and/or ICAR).
Phase II: The second phase of the evaluation starts once the first has been completed successfully. At least two new instruments will be used for a two-month period of observation in routine conditions in two different milk recording laboratories. They should fulfil the day-to-day quality control and satisfactorily respond to general convenience needs.
National certification: A request for an evaluation should be submitted by manufacturers (or suppliers) to an official organisation (e.g. national milk recording organisation, ministry, etc.), which should appoint the laboratories to be involved in the evaluation and give them an assignment for the work. Reports of both phases I and II will be examined by an official committee. Then, on the basis of the technical reports produced by the laboratories, a national certification can be pronounced.
International certification: For an international certification by ICAR, the total evaluation should be renewed successfully in three ICAR countries and on similar bases as defined in the protocol. Collation of reports and the request for ICAR certification should be made by manufacturers to ICAR. Milk analyser files will be submitted to the relevant ICAR Sub-Committee (Milk Analysis) for technical advice to the Board.

The ICAR Board will then decide on the request for certification.

Field of validity of the certification

A certification is given only:

For the field of application in which the instrument has been evaluated (component, concentration range, animal species, etc.)
- In case milks of different animal species are to be analysed, specific evaluations for every species concerned have to be carried out to assess that the instrument is appropriate for the expected use. Refer to Table 6 for species-specific component ranges.
- In the case of a breed with unusual milk fat and protein contents (e.g. Jersey breed with high fat and protein contents), the evaluation should be carried out within the same component range with milk of the specific breed.
For the specific instrument configuration used during the evaluation.
- In case of configuration changes, proof should be provided that they do not affect the precision and the accuracy beyond acceptable limits.

Animal species and particularities of configuration(s) assessed should be carefully noted in the evaluation report.

Table 6. Indicative milk component ranges at least to be covered by an evaluation.
	Cows	Goats	Ewes	Buffaloes	Units
Fat	2.0 – 6.0	2.0 – 5.5	5.0 – 10.0	5.0 – 14.0	g/ 100 g
Protein	2.5 – 4.5	2.5 – 5.0	4.0 – 7.0	4.0 – 7.0	g/ 100 g
Lactose	4.0 – 5.5	4.0 – 5.5	4.0 – 5.5	4.0 – 5.5	g/ 100 g
Urea	10.0 – 70.0	10.0 – 70.0	10.0 – 70.0	10.0 – 70.0	mg / 100 g
Cells	0 – 2000	0 – 2000	0 – 2000	0 – 2000	10³ cells/ml
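
As a small illustration of how Table 6 might be used when planning an evaluation, the sketch below stores the indicative ranges and checks whether a set of sample levels spans them. The names `TABLE_6` and `covers_range` are hypothetical, and reading "covered" as "lowest sample at or below the lower bound and highest sample at or above the upper bound" is an assumption.

```python
SPECIES = ("cows", "goats", "ewes", "buffaloes")

# Indicative ranges copied from Table 6, as (low, high) per species.
# Fat, protein, lactose in g/100 g; urea in mg/100 g; cells in 10^3 cells/ml.
TABLE_6 = {
    "fat": {"cows": (2.0, 6.0), "goats": (2.0, 5.5),
            "ewes": (5.0, 10.0), "buffaloes": (5.0, 14.0)},
    "protein": {"cows": (2.5, 4.5), "goats": (2.5, 5.0),
                "ewes": (4.0, 7.0), "buffaloes": (4.0, 7.0)},
    "lactose": {s: (4.0, 5.5) for s in SPECIES},
    "urea": {s: (10.0, 70.0) for s in SPECIES},
    "cells": {s: (0.0, 2000.0) for s in SPECIES},
}


def covers_range(values, component, species):
    """Return True if the sample levels span the indicative Table 6 range."""
    low, high = TABLE_6[component][species]
    return min(values) <= low and max(values) >= high


# Hypothetical fat levels (g/100 g) of an evaluation sample set for cow's milk.
print(covers_range([1.8, 3.9, 6.2], "fat", "cows"))
```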

Course of operations of a technical evaluation:

Introduction to the principle of the evaluation (explanatory note)

Whatever the indirect method, a standard measurement process can be represented by the scheme in Figure 1. Not every step necessarily exists in every instrument. This depends on the manufacturer's choices in relation to the principle of the measurement and the component measured: a step may have little or negligible effect (for instance step 3 in somatic cell counting in cow's milk), or steps may be merged (for instance steps 2, 3 and 4 in particular infra-red devices). Nevertheless, in theory the different steps of the signal processing can be set up in the instrument and remain available to be activated or not, through active or neutral mathematical matrices. On the other hand, interactions of major components or carry-over effects can be eliminated by the method or the physical device (physical treatment, chemical reagents, tube length) and therefore no longer need numerical corrections.

Figure 1. Example of a theoretical measurement process in conventional analysers. Every step of the measurement process corresponds to an element of the breakdown of overall accuracy of the method. Minimising the overall error is achieved through minimising every component thereby optimising every step of the measurement process. Then the experimental design for the evaluation of a milk analyser is defined in order to assess that every measurement step is correctly adjusted.

1 Measurement: Zero/blank, repeatability, stability, reproducibility.
2 Amplification: Sensitivity, measurement lower limit; repeatability.
3 Linearisation: Linearity range; upper limit; accuracy.
4 Interactions: Effect of other milk components; accuracy.
5 Calibration: Suitability of manufacturer calibration system; accuracy.
6 Carry-over effect: Effect of previous milk intake; repeatability, accuracy.

Every step of the evaluation described in the following paragraphs can be required to fulfil the appropriate limits for each analytical criterion (component) before the next step is started.

Minimum necessary assessments for an evaluation

This part defines and describes the elements of the evaluation which are compulsory to evaluate.

Whatever the method and precision element assessed, an evaluation is to be carried out on the displayed test results, expressed in standardised units, and no prior data transformation should be performed (e.g. log or square root for somatic cell counting). Evaluation results should comply with the specifications stated in the following paragraphs.

Assessment of preliminary instrumental fittings

Before starting any further assessment, one has to verify basic criteria that indicate a proper functioning of the method or the instrument. These criteria are daily precision (including repeatability and short-term stability), carry-over and linearity.

Daily precision (repeatability and short-term stability)

Basically, a milk analyser should present a signal stability which complies with the precision requirements. If not, the analyser is either malfunctioning (and should not be used) or its precision is not suitable for the objective of the analysis. Therefore, the instantaneous stability (repeatability) and the signal level stability have to be assessed prior to any other parameters.

Throughout a whole working day, every 15-20 minutes, analyse the same milk sample in triplicate on the instrument, without any change in the adjustment of the calibration, in order to obtain a minimum of 20 check test series. The test should preferably be run in conditions as close as possible to routine. Therefore, a sufficient number of samples should be planned to keep the instrument running between the periodic checks.

The precision will be evaluated at three different concentrations of each component: low, medium and high. To achieve this, three different milk samples can be split into as many identical sub-samples as necessary for the analyses.

Using a one-way ANOVA, calculate the estimate of the standard deviation of repeatability (Sr), the standard deviation between check series (Sc) and the standard deviation of daily reproducibility (SR), referring to Appendix 1 (section 6 on page 19):

${\text{SR}}=({\text{Sc}}^{2}+{\text{Sr}}^{2})^{1/2}$

The values Sr and SR obtained should comply with the limits stated for milk recording analysis (see Table 2 and Table 3).

One can check the significance of the non-stability using an F-test. Alternatively, a one-way analysis of variance can be carried out to confirm the non-stability of the signal.
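
The calculation of Sr, Sc and SR can be sketched with a balanced one-way ANOVA. This is an illustrative sketch rather than the protocol's prescribed procedure: it assumes the usual variance-component estimates (Sr² from the within-series mean square, Sc² from the difference of mean squares divided by the number of replicates, floored at zero), and the function name and example data are hypothetical.

```python
import math


def daily_precision(series):
    """Estimate Sr, Sc, SR and the F statistic from check series.

    series -- list of q check series, each a list of n replicate results
              (the protocol uses triplicates and at least 20 series).
    """
    q = len(series)
    n = len(series[0])
    if q < 2 or n < 2 or any(len(s) != n for s in series):
        raise ValueError("need at least 2 series of equal size >= 2")

    series_means = [sum(s) / n for s in series]
    grand_mean = sum(series_means) / q

    # Within-series and between-series sums of squares.
    ss_within = sum((x - m) ** 2 for s, m in zip(series, series_means) for x in s)
    ss_between = n * sum((m - grand_mean) ** 2 for m in series_means)

    ms_within = ss_within / (q * (n - 1))
    ms_between = ss_between / (q - 1)

    s_r = math.sqrt(ms_within)                                # repeatability
    s_c = math.sqrt(max(0.0, (ms_between - ms_within) / n))   # between series
    s_R = math.sqrt(s_c ** 2 + s_r ** 2)                      # daily reproducibility
    f_stat = ms_between / ms_within if ms_within > 0 else float("inf")
    return s_r, s_c, s_R, f_stat


# Hypothetical fat results (g/100 g); only three series are shown for brevity.
checks = [[3.50, 3.51, 3.52], [3.53, 3.54, 3.55], [3.50, 3.51, 3.52]]
s_r, s_c, s_R, f_stat = daily_precision(checks)
print(round(s_r, 4), round(s_c, 4), round(s_R, 4), round(f_stat, 2))
```

The F statistic returned is the ratio of the between-series to the within-series mean square, to be compared with the critical F value for (q - 1) and q(n - 1) degrees of freedom.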

Carry-over effect

Strong differences in component contents between two successive milk samples may influence the result of the latter one. This can happen because of incomplete rinsing of the flow system and the measuring cell by the circulating milk and/or contamination from the former sample via the stirring device. The overall carry-over effect (including both sources of error) will be evaluated on the one hand and the rinsing efficiency of the flow system on the other hand.

Automated analysers often allow on-line corrections to be applied to compensate for the overall carry-over effect when necessary. Therefore:

Rinsing efficiency of the flow system must be assessed by running tests without any correction (correction factor set to zero) in manual mode (bypassing the automated stirrer). Rinsing efficiency should not be less than 99 %, i.e. the internal carry-over should not exceed 1 %.
Overall carry-over effect will be assessed including the correction factors either set in the instrument or obtained using the method supplied by the manufacturer. It should not exceed the values stated for the component for milk recording purposes.

Method

Analyses: Replicate, as many times (n) as necessary, the analytical sequence (L_L, L_L, L_H, L_H), where L_L is a sample with a low component concentration and L_H is a sample with a high component concentration.
Samples
- A sufficient number of sub-samples of each sample L_L and L_H must be prepared prior to analysis in order to analyse each sub-sample only once.
- L_L and L_H should preferably be milks or liquids of similar viscosity to milk.
- Respective component concentrations must differ considerably. Depending on the component and the method, this can be achieved by using natural separation (creaming for fat), artificial separation (ultra-filtration for protein, micro-filtration for somatic cells) or addition (lactose and urea).
- For biochemical component determinations, the concentrations of L_L and L_H should preferably be extreme values in the measuring range. In contrast, for somatic cell count, one will assess the carry-over for three different high cell contents (500, 1000 and 1500 × 10³ cells/ml) and a single low cell content, preferably a zero-cell milk.
Calculation
- Calculate the means and the standard deviations of the differences d_Li = L_1i - L_2i and d_Hi = H_2i - H_1i, respectively $\bar{d}_{L}$, Sd_L and $\bar{d}_{H}$, Sd_H.
- Calculate the mean difference of concentration.

Then carry-over ratios C.O.R. and their standard deviations SC.O.R. are obtained using the following formulas:

As well, C.O.R. can be obtained by the equivalent formulas:

The two values obtained should not significantly differ from each other and should not exceed the limit (L_c.o.r.) stated for the component.
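
A minimal sketch of the carry-over calculation follows. The formulas are not reproduced in the text above, so the sketch relies on assumptions that should be checked against ISO 8196-3 | IDF 128-3: the mean difference of concentration is taken as mean(H_2) - mean(L_2), each C.O.R. is the corresponding mean difference expressed as a percentage of it, and its standard deviation is taken as the standard error of the mean difference on the same scale. The function name, dictionary keys and data are hypothetical.

```python
import math
from statistics import mean, stdev


def carry_over_ratios(sequences):
    """Compute carry-over ratios (%) from (L1, L2, H1, H2) results.

    sequences -- one (L1, L2, H1, H2) tuple per analytical sequence
                 (L_L, L_L, L_H, L_H).
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are needed")
    low_1, low_2, high_1, high_2 = (list(col) for col in zip(*sequences))

    d_low = [a - b for a, b in zip(low_1, low_2)]     # d_Li = L_1i - L_2i
    d_high = [b - a for a, b in zip(high_1, high_2)]  # d_Hi = H_2i - H_1i

    # Assumption: mean difference of concentration = mean(H2) - mean(L2).
    delta_c = mean(high_2) - mean(low_2)
    if delta_c <= 0:
        raise ValueError("high samples must read higher than low samples")
    scale = 100.0 / delta_c

    return {
        "cor_low": mean(d_low) * scale,
        "s_cor_low": stdev(d_low) / math.sqrt(n) * scale,   # assumed form
        "cor_high": mean(d_high) * scale,
        "s_cor_high": stdev(d_high) / math.sqrt(n) * scale,  # assumed form
        # Sum-based form, algebraically identical to cor_low under the
        # assumption made for delta_c.
        "cor_low_sums": (sum(low_1) - sum(low_2))
        / (sum(high_2) - sum(low_2)) * 100.0,
    }


# Hypothetical fat results (g/100 g) for three sequences.
runs = [(0.12, 0.10, 5.88, 5.90), (0.13, 0.10, 5.87, 5.90), (0.11, 0.10, 5.89, 5.90)]
for name, value in carry_over_ratios(runs).items():
    print(name, round(value, 3))
```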

Note

Acceptable limit for conformity: At worst, the carry-over effect should not produce, in the extreme case of the lowest and highest concentrations of the measuring range (ΔC), an error higher than the repeatability admitted for the method, r = 2·√2·Sr. Therefore, the limit of c.o.r. can be defined as:

${\text{Lc.o.r.}}=\left({\frac {r}{\Delta C}}\right)\times 100$

A 1-2 % limit is generally recommended in standards.

Number (n) of analytical sequences: n can be chosen so that C.O.R. values are estimated with a maximum relative confidence interval of ± 20 % (e.g. 1 ± 0,2 %), that is 2.SC.O.R. ≤ 0,20.(C.O.R.)

Between 10 and 20 analytical sequences are generally recommended in standards.
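As a worked illustration of the two notes above, the sketch below computes the limit L_c.o.r. from Sr and ΔC and checks the ± 20 % criterion used to decide whether enough sequences were run. Sr = 0.014 is the repeatability limit quoted for fat in Table 7; the 4 g/100 g range is a hypothetical value chosen only for the example, and the function names are not part of the guideline.

```python
from math import sqrt

def carry_over_limit(s_r, delta_c):
    """L_c.o.r. (%) = (r / delta_c) * 100, with r = 2 * sqrt(2) * Sr."""
    r = 2 * sqrt(2) * s_r
    return 100 * r / delta_c

def enough_sequences(cor, s_cor):
    """True when 2 * S_C.O.R. <= 0.20 * C.O.R. (relative interval within 20 %)."""
    return 2 * s_cor <= 0.20 * cor

limit = carry_over_limit(0.014, 4.0)  # hypothetical measuring range of 4 g/100 g
print(round(limit, 2))  # 0.99
```

The result falls within the 1-2 % range generally recommended in standards.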

Linearity

According to the classical definition of an indirect method, the instrument signal should result from a characteristic of the component measured, so that a simple relationship with component concentration can be defined.

Nevertheless, newly developed indirect methods can be based on much less specific signals and still provide consistent results by combining multiple signals through multivariate statistical approaches. For these analysers linearity is no longer an absolute requirement in every case (though it remains one in some specific uses in the dairy industry, e.g. on processed milk with progressive contents stemming from concentration or dilution). For those methods, and depending on the analytical objectives, the step of linearity assessment can therefore be omitted. The quality of the relationship with the reference will be assessed when evaluating overall accuracy. In such a case, any routine measurement outside the calibration concentration range should be considered of doubtful quality and preferably not be used.

Linearity expresses the constancy of the ratio between the increase of the milk component measured and the corresponding increase of the instrument measurement. Linearity of the instrument signal is therefore in most cases essential to maintain a constant sensitivity along the measuring range and to allow easy handling of calibration and fittings. Moreover, it allows (to some extent) routine measurements beyond the concentration range of calibration through a linear extrapolation of the calibration within the assessment range. It can thus help to cope with particular limitations of reference methods (e.g. somatic cell count for goat’s milk).

It can be assessed using sets of samples (n = 8 to 15) with component concentrations regularly distributed over the whole measuring range:

Samples should preferably be milks, or liquids with physical characteristics (i.e. density, viscosity) similar to those of milk, obtained by accurate dilution (by weighing) of a high content sample with a low content one.
Concentrations should vary at regular intervals. Depending on the component, this can be obtained in various ways such as natural separation (creaming for fat), artificial separation (ultra-filtration for protein, micro-filtration for somatic cells) and pure solutions (lactose and urea).
The assessment concentration range should cover at least the ranges stated in Table 1, §2.2. Nevertheless, it is up to the evaluator to extend the linearity assessment range in order to determine the upper limit for acceptable measurements.
The reference for linearity will be either the volume mixing ratio (volume/volume or mass/volume) or theoretical concentrations calculated from the concentrations of the initial samples (one can refer to Annex A of IDF Standard 141).

Note

Independently of the units of expression, the reference for linearity should follow the intake measurement principle, which is volumetric in all milk analysers developed to date, milk weighing being quite impracticable. The theory therefore requires a volume/volume or mass/volume ratio.

Nevertheless, using mass/mass ratios provides identical figures when mixing liquids with the same density.

Analyse each sample in triplicate, first in order of increasing concentration and then in order of decreasing concentration. Calculate the linear regression equation y = bx + a (y = instrument, x = dilution ratio) and the residuals e_i (e_i = y_i - (b.x_i + a)) from the means of replicates and the dilution ratios. Plot the residuals e_i (y axis) against the dilution ratio (x axis) on a graph. A visual inspection of the data points will usually yield sufficient information about the linearity of the signal.

Calculate the ratio of the residual range to the range of signal values:

${\frac {De}{DC}}={\frac {e_{max}-e_{min}}{C_{max}-C_{min}}}$

where:

e_max and e_min = the upper and lower residuals, respectively

C_max and C_min = the upper and lower signal values, respectively

De/DC should not exceed the limit stated for the component (generally 1-2 %):

Criteria	F	P	L	Urea	SCC
Limits for De/DC	0.01	0.01	0.02	0.02	0.02
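The residual-range check described above can be sketched as follows. The regression is an ordinary least-squares fit of the instrument means (y) on the dilution ratios (x); the function name and the data are hypothetical and serve only to show the arithmetic.

```python
def linearity_ratio(x, y):
    """Return (De/DC, residuals) for instrument means y at dilution ratios x."""
    q = len(x)
    mean_x, mean_y = sum(x) / q, sum(y) / q
    sce_x = sum((xi - mean_x) ** 2 for xi in x)
    spe_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = spe_xy / sce_x            # slope
    a = mean_y - b * mean_x       # intercept
    residuals = [yi - (b * xi + a) for xi, yi in zip(x, y)]
    # Range of residuals divided by the range of signal values
    ratio = (max(residuals) - min(residuals)) / (max(y) - min(y))
    return ratio, residuals

# Hypothetical signal that flattens at the top of the range
x = [0.0, 0.25, 0.5, 0.75, 1.0]
y = [0.0, 1.0, 2.0, 3.0, 3.6]
ratio, residuals = linearity_ratio(x, y)
print(round(ratio, 3))  # about 0.089, above the 0.01 limit given for fat
```

Plotting `residuals` against `x` gives the graph used for the visual inspection.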

Alternatively, a one-way analysis of variance can be carried out to confirm the statistical significance of non-linearity, and statistical tests of comparison of variances can be applied to confirm the significance of the difference between residual variances (see Annex).

One way is to calculate polynomial regressions of progressively increasing degree in order to find the most appropriate adjustment of the signal, that is the one providing the minimum residual standard deviation Sy,x^k (the degree of the polynomial should preferably not exceed 3, with significant coefficients), and to compare the estimate Sy,x^k with Sy,x of the linear regression by means of a ratio or F-test.

Using the statistical test for comparison of residual variances or standard deviations (see Appendix 1: Usual statistical formulas for method evaluations), the final judgement on the linearity adjustment of the instrument is:

Good if Sy,x ≤ Sy,x^k
Correct if Sy,x > Sy,x^k and De/DC ≤ limit %
Incorrect if Sy,x > Sy,x^k and De/DC > limit %

Measurement limits

Limits of an instrumental method measurement exist at both extremities of the analytical range, lower limit and upper limit.

It is not required to determine these limits where the natural concentration ranges for the respective components and species normally lie far from zero (the general case for biochemical components, i.e. fat, protein, lactose, urea) and within the range of linearity of the method. Determination and assessment of measurement limits are carried out together with the evaluation of linearity.

Lower limits

The evaluation of lower limits is not covered by ISO 8196 (equivalent to IDF 128); reference can therefore be made to standard EN ISO 16140:2000, which is dedicated to alternative microbiological methods, for the definition and general principles.

At the time of writing, only somatic cell counting is concerned by a lower limit evaluation for milk recording.

Definition

Lower limits are defined in three ways depending on the risk of error accepted and a priori precision requirements:

• Critical level (CL) or decision limit, which is the smallest amount that can be detected (not null) but not quantified as an exact value (risk β = 50 %). Below this level it cannot be assumed that the value is not null:

CL = u_1-α . σ or CL = 1.645 . σ with α = 5 % (1)

• Detection limit (DL), for which the second type of error is reduced to a defined level, generally equal to the level of risk α (5 %). It is the lowest result that differs significantly from zero (first type error α) and that can be produced with a sufficiently low probability (second type error β) of including the blank value (zero) and with a sufficient confidence interval:

DL = (u_1-α + u_1-β) . σ or DL = 3.29 . σ with α = β = 5 % (2)

• Quantification limit (QL) or determination limit, which is the smallest amount of analyte that can be measured and quantified with a defined relative standard deviation SD% (or coefficient of variation CV%):

QL = kq . σ and SD% = σ / QL => kq = 1 / SD%

QL = DL => kq = 3.29 => SD% = 30 % (3)
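The three limits can be reproduced numerically as a check on the coefficients 1.645 and 3.29. The sketch below uses the standard normal quantiles from Python's `statistics.NormalDist`; the value σ = 1500 cells/ml is a hypothetical repeatability standard deviation near zero, and the function names are illustrative.

```python
from statistics import NormalDist

def lower_limits(sigma, alpha=0.05, beta=0.05):
    """Critical level and detection limit from one-sided normal quantiles."""
    u_alpha = NormalDist().inv_cdf(1 - alpha)
    u_beta = NormalDist().inv_cdf(1 - beta)
    cl = u_alpha * sigma                 # equation (1)
    dl = (u_alpha + u_beta) * sigma      # equation (2)
    return cl, dl

def quantification_limit(sigma, sd_percent):
    """QL = kq * sigma with kq = 1 / SD% (SD% given in percent)."""
    return sigma / (sd_percent / 100)

sigma = 1500.0  # hypothetical, cells/ml
cl, dl = lower_limits(sigma)
print(round(cl / sigma, 3), round(dl / sigma, 2))  # 1.645 3.29
print(round(dl))  # 4935, just below the 5000 cells/ml limit
```

For this assumed σ, a QL at SD% = 30 % would be `quantification_limit(sigma, 30)`, that is about 5000 cells/ml.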

Limit values to fulfill

In somatic cell counting, the DL of milk cell counters should not be higher than 5000 cells/ml and SD% (CV%) at the lower level (close to zero) should not exceed 30 %, with QL equal to DL.

Standard deviations

In milk recording analysis, where only single determinations are carried out in routine, σ is the standard deviation of the random error of the measurement, that is, in the best case, the repeatability standard deviation close to zero content.

The standard deviation σ can be estimated in different ways:

Repeatability is dependent on concentration levels: standard deviation of repeatability (Sr) of the blank (zero) or estimated standard deviation at concentration values close to zero;
Repeatability is not dependent on concentration levels: standard deviation of repeatability (Sr) estimated by making use of the replications at different levels in the linearity assessment;
Repeatability and sample variance are not dependent on concentration levels: standard deviation Sy₍₀₎ of the single estimate y₍₀₎ for x = 0, using the linear regression equation calculated in a linearity assessment on a linear part close to zero:

$S_{y(0)} = S_{y,x} \cdot \left(1 + \frac{1}{q} + \frac{\bar{x}^{2}}{SCE_X}\right)^{1/2}$

Note

In that case, Sy₍₀₎ slightly overestimates σ as it takes into account sample errors and line estimation error in addition to repeatability.

Upper limit

Upper limit corresponds to the threshold where the signal or the measurement deviates significantly from linearity (cf. linearity assessment).

An upper limit met on the range of concentration concerned by the evaluation will produce a ratio De/DC exceeding accepted limits (see linearity). Plotting linearity assessment results on a graph will provide necessary information on the shape of the curve response.

One can check whether a measured upper value y_U deviating from linearity differs significantly from the value y(x_U) predicted by the linear equation calculated on the linear range without taking that result into account:

$t_{obs} = \frac{|y_U - y(x_U)|}{S_{y(x_U)}}$

with $S_{y(x_U)} = S_{y,x} \cdot \left(1 + \frac{1}{q} + \frac{(x_U - \bar{x})^{2}}{SCE_X}\right)^{1/2}$, q-2 d.f. and α = 0,05

if t_obs ≤ t_1-α/2 => no deviation from linearity at that point

if t_obs > t_1-α/2 => significant deviation from linearity at that point
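A minimal sketch of this test is given below. The line is fitted on the points of the linear range only, and the candidate upper point is then compared with its prediction. The data and the function name are hypothetical. The critical value is not computed here and has to be read from a Student table; 2.776 is the two-sided 5 % value for the 4 degrees of freedom of this example.

```python
from math import sqrt

def upper_point_t(x_lin, y_lin, x_u, y_u):
    """t_obs for a point (x_u, y_u) against the line fitted on the linear range."""
    q = len(x_lin)
    mean_x, mean_y = sum(x_lin) / q, sum(y_lin) / q
    sce_x = sum((xi - mean_x) ** 2 for xi in x_lin)
    spe_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x_lin, y_lin))
    b = spe_xy / sce_x
    a = mean_y - b * mean_x
    # Residual standard deviation S_y,x with q-2 degrees of freedom
    s_yx = sqrt(sum((yi - b * xi - a) ** 2 for xi, yi in zip(x_lin, y_lin)) / (q - 2))
    # Standard deviation of a single estimate at x_u
    s_pred = s_yx * sqrt(1 + 1 / q + (x_u - mean_x) ** 2 / sce_x)
    return abs(y_u - (b * x_u + a)) / s_pred

x_lin = [1, 2, 3, 4, 5, 6]
y_lin = [1.02, 1.98, 3.01, 3.99, 5.02, 5.98]
t_obs = upper_point_t(x_lin, y_lin, x_u=8, y_u=7.5)
T_CRIT = 2.776  # Student t, 1 - alpha/2 = 0.975, 4 d.f.
print(t_obs > T_CRIT)  # True: significant deviation at that point
```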

Evaluation of the overall accuracy

One can refer to IDF Standard 128 for general information on this part of the evaluation.

The overall accuracy is the sum of the repeatability error, the accuracy error (or error of estimates versus reference) and the error of calibration which occur in routine analytical conditions.

Each part of the overall accuracy is measured through the analysis of individual milk samples and herd milk samples of the specified animal species. Herd milk samples are to be collected in addition to individual milk samples in order to measure more accurately the part of variance related to herd effects.

The evaluation is to be performed on the instrument in the same state (working parameters, speed, calibration) as the manufacturer intends to supply it to customers (users).

Where different analytical speeds are available, the parts of the overall accuracy will be assessed for the highest and the lowest ones.

a. Calibration

A preliminary calibration (or pre-calibration) is required and should be set in the instrument (or supplied with it) by the manufacturer with a detailed calibration procedure appropriate to the instrument.

Where the instrument is to be used directly without any local calibration (set-in), the instrumental analyses of the evaluation will be performed directly on appropriate (representative) milk samples.

Where local calibration is necessary, calibration will be performed beforehand according to the manufacturer's recommendations and using the instrument facilities, before starting the evaluation.

b. Samples

Milks have to be sampled and collected in optimum conditions so that no damage occurs which could produce an erroneous repeatability estimate. Individual milks should cover the maximum concentration range of the component according to Table 1.

- Calibration samples. These will be prepared according to the recommendations of the relevant standards for the criteria or, if no standardised procedure exists, in a similar way to the prediction samples (one half for calibration and the other half for prediction).

- Prediction samples. A minimum of 100 individual milk samples collected in 4-6 different herds and 50 herd milk samples should be used.

c. Reference methods

Reference methods should be standardised methods and, in all cases, the method used should be in a close agreement with one or more of the international reference methods (ISO, IDF, AOAC).

Assessment of repeatability

Repeatability is the main criterion indicating whether or not an instrument provides results suitable for user requirements, and it is a major element of internal quality control. Every new instrument must therefore comply with the maximum repeatability value stated in the relevant international standard in order to satisfy the certification criteria.

Milk samples are to be analysed on the instrument calibrated according to the manufacturer's recommendations, preferably in duplicate. This minimum number of replicates keeps close to true conditions of repeatability and prevents possible damage to fat. Series of 15-20 milk samples are analysed twice in succession, after recovering the initial analytical conditions (i.e. temperature, by heating) when necessary.

The standard deviation of repeatability will then be calculated from the duplicate results of the whole data set and, for criteria covering a wide range of concentration, that is more than one log unit (the case of somatic cell count), part by part after splitting the whole concentration range into at least three parts (i.e. low, medium and high).

The standard deviation of repeatability will be calculated with the formula of IDF Standard 128 (see Appendix 1: Usual statistical formulas for method evaluations):

Sr = ( ∑ w_i² / 2q )^1/2

where w_i is the difference between the duplicates of sample i (w_i = x_1i - x_2i) and q is the number of samples.

Compare the values obtained (Sr) with the standardised repeatability values (σr) defined for the criteria and the application in Tables 2 and 3. It is expected that

Sr ≤ σr . (Χ²_1-α / q)^1/2.
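The duplicate formula and the conformity check can be sketched as follows. The duplicate results are invented, and the chi-square quantile is passed in as an argument because it has to be read from a table (9.488 is the 0.95 quantile for 4 degrees of freedom, matching the 4 samples of this toy example; a real series would use 15-20 samples).

```python
from math import sqrt

def repeatability_sd(duplicates):
    """Sr = sqrt(sum(w_i^2) / (2q)) with w_i = x_1i - x_2i."""
    q = len(duplicates)
    return sqrt(sum((x1 - x2) ** 2 for x1, x2 in duplicates) / (2 * q))

def repeatability_conforms(s_r, sigma_r, q, chi2_quantile):
    """Check Sr <= sigma_r * sqrt(chi2_(1-alpha) / q)."""
    return s_r <= sigma_r * sqrt(chi2_quantile / q)

# Hypothetical duplicate fat results (g/100 g)
duplicates = [(4.00, 4.02), (3.50, 3.49), (4.21, 4.20), (3.80, 3.82)]
s_r = repeatability_sd(duplicates)
print(round(s_r, 4))  # 0.0112
print(repeatability_conforms(s_r, 0.014, len(duplicates), 9.488))  # True
```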

Assessment of accuracy of the mean

According to IDF Standard 128, the error of accuracy of the mean is broken down into the error of exactness of calibration and the error of accuracy (accuracy of estimates).

The statistical parameters to be used are those indicated in IDF Standard 128 and summed up in Appendix 1: d̄; Sd; Sy,x; slope (b); Student t-test for d̄ and b.

They are obtained from a simple linear regression calculated using the means of duplicate instrumental results (x) and the so-called reference results (y) obtained with a reference method recognised by ICAR (analysed in duplicate).

Assessment of accuracy

Accuracy is assessed for individual animal milks and herd milks separately.

It is measured through the residual standard deviation Sy,x of the simple linear regression of instrumental results (x) and reference results (y).

The differences to the regression line are expected to be normally distributed, so any outlying result should be carefully scrutinised. In case of outlying results, another split sample of the same milk should be reanalysed by the reference method and by the analyser when possible. When this is not possible, or if the outlying figure remains, the report should present the Sy,x estimates and graphs including all data (with the outliers identified, their number and their respective biases) and the same Sy,x calculation after discarding the outliers. The statistical methods used to identify outliers should be specified in the evaluation report. The proportion of outliers should not exceed 5 %.

The estimate value of Sy,x should fulfil respective limits σy,x defined for individual milk samples and herd milk samples in Tables 2 and 3.

It is expected that

Sy,x ≤ σ_y,x .[X²_1-α / (q-2)]^1/2.

For criteria covering a wide range of concentration, that is more than one log unit (the case of somatic cell count), accuracy should be evaluated for the whole range and for successive parts of it, after splitting the whole concentration range into at least three parts (i.e. low, medium and high).
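The residual standard deviation used for the accuracy assessment can be computed as in the sketch below, with instrument means as x and reference results as y. The data are invented and far fewer than the 100 individual samples required; the function name is illustrative.

```python
from math import sqrt

def residual_sd(x, y):
    """Sy,x of the simple linear regression of y (reference) on x (instrument)."""
    q = len(x)
    mean_x, mean_y = sum(x) / q, sum(y) / q
    sce_x = sum((xi - mean_x) ** 2 for xi in x)
    spe_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = spe_xy / sce_x
    a = mean_y - b * mean_x
    # Sum of squared residuals divided by q-2 degrees of freedom
    return sqrt(sum((yi - b * xi - a) ** 2 for xi, yi in zip(x, y)) / (q - 2))

# Hypothetical fat results (g/100 g)
instrument = [3.0, 3.5, 4.0, 4.5, 5.0]
reference = [3.05, 3.45, 4.05, 4.45, 5.05]
s_yx = residual_sd(instrument, reference)
print(round(s_yx, 3))  # 0.063
```

The value obtained is then compared with σy,x multiplied by the chi-square factor given above, taken from a table for q-2 degrees of freedom.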

Table 7. Precision values for medium content milk samples (cows, goats).
	ICAR limits
Criteria (units)	F (g/100 g)	P (g/100 g)	L (g/100 g)	Urea (mg/100 g)	SCC (%)
Repeatability
Average Sr	0.014 ⁽¹⁾	0.014 ⁽¹⁾	0.014 ⁽¹⁾	1.4 ⁽²⁾	4 % ⁽¹⁾
Sr at L / M / H					8 % / 4 % / 2 %
Reproducibility
Average SR (SR%)	0.028 ⁽¹⁾	0.028 ⁽¹⁾	0.028 ⁽¹⁾	2.8 ⁽²⁾	5 % ⁽¹⁾
SR (SR%) at L / M / H					10 % / 5 % / 2.5 %
Accuracy
Animals Sy,x	0.10 ⁽¹⁾	0.10 ⁽¹⁾	0.15 ⁽²⁾	6.0 ⁽²⁾	10 % ⁽²⁾
Herds Sy,x	0.07 ⁽¹⁾	0.07 ⁽¹⁾	0.07 ⁽²⁾	4.0 ⁽²⁾	10 % ⁽²⁾

L / M / H: low / medium / high concentration levels.

⁽¹⁾ Limits in accordance with IDF Standards 141C and 148A.

⁽²⁾ Limits derived from experimental results and IDF 141C (SR ≈ 2.Sr).

Note

For lactose, IDF Standard 141C recommends the same limits for Sy,x as for fat and protein; these are difficult to fulfil given the poor chemical method available as a reference at that date.

Table 8. Precision values for high content milk samples (ewes, buffaloes, particular cow/goat species). Derived from the medium level limits by applying the relevant level ratios: 2 for F and P; 1 for L and urea.
	ICAR limits
Criteria (units)	F (g/100 g)	P (g/100 g)	L (g/100 g)	Urea (mg/100 g)	SCC (%)
Repeatability
Average Sr (Sr%)	0.028 (0.35 %)	0.028 (0.4 %)	0.014 (0.3 %)	1.4 (2 %)	4 %
Sr (Sr%) at L / M / H					8 % / 4 % / 2 %
Reproducibility
Average SR (SR%)	0.056 ⁽¹⁾ (0.7 %)	0.056 ⁽¹⁾ (0.8 %)	0.028 ⁽¹⁾ (0.6 %)	2.8 ⁽²⁾	5 %
SR (SR%) at L / M / H					10 % / 5 % / 2.5 %
Accuracy
Animals Sy,x (Sy,x%)	0.20 (2.5 %)	0.20 (3.0 %)	0.15	6.0	10 %
Herds Sy,x (Sy,x%)	0.14 (1.75 %)	0.14 (2.0 %)	0.07	4.0	10 %

Assessment of exactness of calibration

Prior to the analyses, the instrument is calibrated according to the procedure recommended by the manufacturer and its results are expressed in the same units as the reference method used for the evaluation. Raw signals are therefore not concerned, and further statistical comparisons can be made on the same scale for both instrumental and reference values, allowing classical tests of identity and assessments against standardised target values. For this purpose, individual animal and herd samples will be analysed to provide the relevant information on the quality of the adjustment.

Depending on the principle of the method, the quality of calibration can be more or less influenced by the representativeness of the calibration samples, in addition to the calibration technique applied (i.e. mathematical model, experimental design, process). Sources of error of representativeness shall therefore be reduced as far as possible, for instance by sampling calibration samples in conditions close or identical to those of the prediction milk samples.

Exactness of calibration is to be assessed using the parameters of the regression y = b.x + a: the mean bias d̄ and the slope b (see Appendix 1 and IDF Standard 128), taking care of possible outlying results as in Assessment of accuracy. The estimates d̄ and b should normally fulfil the limits in Table 9. Failing that should normally lead to further investigations or explanations.

Table 9. Tentative indicative ICAR limits for exactness of calibration assessment.
a. Medium level (cows, goats)
Criteria	F	P	L	Urea	SCC
Mean bias	±0.05 ⁽¹⁾	±0.05 ⁽¹⁾	±0.05 ⁽¹⁾	±2.5 ⁽²⁾	±5 % ⁽²⁾
Slope b	1±0.05 ⁽¹⁾	1±0.05 ⁽¹⁾	1±0.05 ⁽¹⁾	1±0.05 ⁽²⁾	1±0.05 ⁽²⁾
b. High level (ewes, buffaloes, goats)
Mean bias	±0.10 ⁽¹⁾	±0.10 ⁽¹⁾	±0.10 ⁽¹⁾	±2.5 ⁽²⁾	±7 % ⁽²⁾
Slope b	1±0.05 ⁽¹⁾	1±0.05 ⁽¹⁾	1±0.05 ⁽¹⁾	1±0.05 ⁽¹⁾	1±0.07 ⁽²⁾

⁽¹⁾ Limits in accordance with IDF Standard 141C.

⁽²⁾ Limits derived from experimental results.
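A comparison of the two estimates with the Table 9 limits can be sketched as below. Only the direct comparison with the indicative limits is shown; the Student tests of Appendix 1 are not included. The data are invented, the function name is illustrative, and the limits used are those for fat at medium level (±0.05 for the mean bias, 1±0.05 for the slope).

```python
def calibration_exactness(x, y, bias_limit, slope_tolerance):
    """Mean bias (d = x - y) and slope b of y = b.x + a, checked against limits.

    x: instrument means, y: reference results, both in the same units.
    """
    q = len(x)
    mean_x, mean_y = sum(x) / q, sum(y) / q
    sce_x = sum((xi - mean_x) ** 2 for xi in x)
    spe_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = spe_xy / sce_x
    mean_bias = mean_x - mean_y
    conform = abs(mean_bias) <= bias_limit and abs(slope - 1) <= slope_tolerance
    return mean_bias, slope, conform

# Hypothetical fat results (g/100 g)
instrument = [3.0, 3.5, 4.0, 4.5, 5.0]
reference = [3.05, 3.45, 4.05, 4.45, 5.05]
mean_bias, slope, conform = calibration_exactness(instrument, reference, 0.05, 0.05)
print(round(mean_bias, 3), round(slope, 3), conform)  # -0.01 1.0 True
```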

Additional informative investigations

The following items are not compulsory elements of the evaluation, even though they are of interest as possible parts of the overall accuracy of the method, and the knowledge gained about the method may have implications for milk sample handling (sampling, preservation, shipping, etc.). They can therefore be considered as only informative for a proper use of the method once it has obtained ICAR certification on the basis of the former part. Nevertheless, for ICAR certification and for common knowledge, it would be very useful to evaluate them once when the information is not available from manufacturers.

Ruggedness

Ruggedness is the ability of an instrument not to be influenced by external elements other than the component measured itself. Possible effects can come from variations in the concentration of major milk components or interactions (depending on the instrument, they can be compensated by intercorrections), from biochemical changes of milk components related to preservation (lipolysis, proteolysis, lactic souring) or from chemicals added to milk such as preservatives.

The principle of ruggedness measurement is to produce a significant change in the concentration of each interacting component separately and to measure the corresponding change in the measurement of the influenced component. The ratio (difference observed)/(change introduced) is then calculated and expressed in the relevant units.
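As a one-line illustration of this ratio, the sketch below takes a hypothetical case in which the protein reading moves from 3.20 to 3.22 g/100 g when the fat content of the same milk is raised from 3.5 to 5.5 g/100 g. The function name and the figures are assumptions made for the example.

```python
def interaction_ratio(reading_before, reading_after, level_before, level_after):
    """(difference observed) / (change introduced) for one interacting component."""
    return (reading_after - reading_before) / (level_after - level_before)

# Protein reading (g/100 g) before and after raising fat from 3.5 to 5.5 g/100 g
ratio = interaction_ratio(3.20, 3.22, 3.5, 5.5)
print(round(ratio, 3))  # 0.01 g protein per g fat
```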

Effect of major milk components (interactions)

For milk composition (fat, protein, lactose), one will refer to Annex B of IDF Standard 141 for sample preparation and calculation procedures: single variation method or multiple variation method by recombination of non-correlated milk sample sets.

The effect of urea on other component measurements will be evaluated by addition of urea to milk, as is proposed for lactose.

The effect of high fat and protein content on somatic cell count in milk (ewes, goats and buffaloes) will be evaluated using cream (natural creaming) and milk retentate, recombined in a similar way as in IDF 141.

The effect should preferably be measured at three relevant levels (i.e. low, medium and high) for the component under interaction and the species. A minimum of two strongly different levels is required.

Effect of biochemical changes in components

Biochemical changes in milk usually result in damage to the milk component affected. They can be produced by bacterial growth in milk or by enzyme activity which affects, directly or not, the components measured in milk. Unless souring has progressed to milk clotting, there is no quick and easy way to distinguish such milks from well preserved milk samples, and they are normally analysed. The sensitivity of the measurement method to such forms of deterioration can therefore be of interest, in particular in order to evaluate the suitability of the sampling and shipping conditions of routine milk recording, on which the preservation quality of samples depends.

Clotting, churning and oiling are more evident defects of milk. Their effects on analytical results are drastic for the first (no analysis possible) or, for the others, depend essentially on the homogeneity of the milk and the representativeness of intakes. In those cases, defects can be easily identified and samples discarded.

Lipolysis

One will relate modifications in the measurement to the most appropriate indicator of lipolysis (milk fat acidity) after an artificial induction of increased lipolysis (cooling and action of native lipase, or addition of bacterial lipase, e.g. from Pseudomonas). One will raise the lipolysis level up to a minimum of 5 meq/100 g fat.

At least 5 levels are required. The effect exists if the variation ratio calculated (slope of linear regression) is significantly different from 0.00.

Proteolysis

One will relate modifications in the measurement to the most appropriate indicator of proteolysis (whey protein or soluble nitrogen SN) after having induced proteolysis (e.g. using microflora proteases). One will try to obtain a minimum range of 0.8 % SN in milk. At least 5 levels are required. The effect exists if the variation ratio calculated (slope of linear regression) is significantly different from 0.00.

Lactic souring

One will proceed by adding increasing amounts of lactic acid to milk. At least 5 levels are required. Check that the highest level does not clot at the water-bath temperature, in order not to damage the liquid system of the instrument.

One will relate modifications in the measurement to the amount of lactic acid added. The effect exists if the variation ratio calculated (slope of linear regression) is significantly different from 0.00.
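The same slope test applies to lipolysis, proteolysis and lactic souring. A minimal sketch follows, using the Appendix 1 expressions b = SPE_XY / SCE_X, Sb = Sy,x / SCE_X^1/2 and t_obs = |b| / Sb. The five levels and the readings are invented, and the critical value 3.182 (two-sided 5 %, 3 d.f.) is taken from a Student table rather than computed.

```python
from math import sqrt

def slope_test(levels, readings):
    """Return (b, t_obs) for the regression of readings on indicator levels."""
    q = len(levels)
    mean_x, mean_y = sum(levels) / q, sum(readings) / q
    sce_x = sum((xi - mean_x) ** 2 for xi in levels)
    spe_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(levels, readings))
    b = spe_xy / sce_x
    a = mean_y - b * mean_x
    s_yx = sqrt(sum((yi - b * xi - a) ** 2 for xi, yi in zip(levels, readings)) / (q - 2))
    s_b = s_yx / sqrt(sce_x)          # standard deviation of the slope
    return b, abs(b) / s_b

# Hypothetical fat readings (g/100 g) at five lipolysis levels (meq/100 g fat)
acidity = [1, 2, 3, 4, 5]
fat = [4.00, 3.99, 3.97, 3.96, 3.94]
b, t_obs = slope_test(acidity, fat)
T_CRIT = 3.182  # Student t, 1 - alpha/2 = 0.975, q-2 = 3 d.f.
print(round(b, 3), t_obs > T_CRIT)  # -0.015 True
```

In this invented case the reading drops by about 0.015 g/100 g per meq of acidity and the effect is significant.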

Effect of sample history and handling conditions

The conditions for optimal preservation of milk samples are well known but are often not fully met for economic reasons. For instance, the combination of cooling and storage at about 4°C with a preservative such as bronopol (2-bromo-2-nitro-1,3-propanediol) is known to give quite good preservation of clean (uncontaminated) milk samples. These optimal conditions are in most cases applied to calibration and control milk samples. Different conditions of sample preservation may exist in one laboratory depending on the origin of the samples (different types of chemical preservatives, sample ages and temperatures).

It is therefore of interest to determine how far differences in preservation conditions can affect the analytical results obtained by the instrument, and to provide the relevant information to milk recording organisations and laboratories so that they can make good choices of analytical apparatus and sample handling methods.

For each item, to allow practical conclusions, component concentrations should cover the usual range met in routine, and the number of samples per series should be chosen so that real effects can be detected as statistically significant differences (30 to 40 samples are generally sufficient).

Effect of chemicals added (preservatives)

Differences in analytical results will be measured by comparing identical parallel series of milk samples preserved with the different chemical preservatives used in routine conditions. Other preservation parameters must be kept equal so as not to bias the results. The effect of both the nature and the concentration of the preservative is to be evaluated.

Effect of milk intake temperature

Analytical instruments may be sensitive to environmental conditions related to their analytical principle (i.e. humidity, temperature, vibrations) and have systems to compensate for these sources of dysfunction. Manufacturers give indications on the precautions to be taken by users, in particular for sample temperature with respect to the internal instrument temperature. It is therefore useful to know how large the effect is within the range of temperature of the milk samples analysed in routine, so as to refine sample preparation before analysis (i.e. heating temperature and time). Comparing the effect of the two extreme limits (lower and upper) advised by the manufacturer on an identical set of different milk samples will provide sufficient information.

Effect of storage conditions (i.e. time and temperature)

Sample temperature can determine the physical aspect of milk components (i.e. crystallisation of fat glycerides; solubility of casein and mineral fraction).

Besides, storage time can determine the ability of milk to recover its native physical and chemical state before being analysed. It is often the case that cream separated from skim milk becomes so firm that it is difficult to reincorporate it uniformly into the milk. In such cases, fat globule clusters can remain and be a source of trouble in the instrument (i.e. milk homogenisation in infra-red devices). The effect of various combinations (time × temperature) can be measured by comparison with an optimal preservation method defined as the reference.

Practical conveniences (Phase II)

These consist of various elements on which depends the ability of the laboratory to produce analytical results within the expected time and at the expected or required cost. These practical and economic elements are evaluated during Phase II of the total evaluation, over the period of time and in the number of laboratories stated in Stages of the evaluation and general principles.

Speed

Speed announced by the manufacturer will be verified. Precision performances should be reported with the information on the speed used when different speeds are available and were successfully tested in Phase I.

Robustness

The frequency of troubles and servicing operations will be recorded together with the nature of the incidents that occurred.

Monitoring and servicing facilities

The convenience of the calibration procedure will be noted, together with the user-friendliness of interfaces and software. The ease of troubleshooting, repairs and servicing will be noted, as well as the weak points of the devices, so that users are aware of them and able to cope with them.

Validation of precision in routine conditions

Through the application of internal quality control according to the recommendations of the relevant ICAR guidelines, routine checks will be applied to instruments during Phase II of the evaluation, and the results will be recorded and reported to complement the report of Phase I.

Report and certification delivery

Evaluation results for both Phases I and II will be duly reported in specific documents with all the necessary information on the course of the evaluation, tables of results of the analytical performances measured, discussion or comments, and summaries.

Raw results will be available on paper and as computer records (i.e. magnetic supports, CD, DVD) in a record format compatible with usual data calculation programmes.

The report material will be provided to ICAR by the organisation asking for ICAR certification according to the conditions defined in Stages of the evaluation and general principles.

Appendix 1: Usual statistical formulas for method evaluations

Application in the assessment of the precision

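Standard deviation of repeatability: (q levels and n replicates)

Sr = (∑ Sr(i)² / q)^1/2

where Sr(i) is the standard deviation of the n replicates at level i (formula as applied in Appendix 2).

Standard deviation of daily reproducibility: (q check tests and n replicates)

SR² = Sm² + Sr².(1-1/n)

and

SR² = Sc² + Sr²

where Sm is the standard deviation of the q check means.

Standard deviation between control test checks

Sc = (Sm² - Sr²/n)^1/2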

Application in the assessment of the accuracy

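Means

$\bar{x} = \frac{\sum x_i}{q}$ and $\bar{y} = \frac{\sum y_i}{q}$

Sums of squares and of products

$SCE_X = \sum (x_i - \bar{x})^2$, $SCE_Y = \sum (y_i - \bar{y})^2$, $SPE_{XY} = \sum (x_i - \bar{x})(y_i - \bar{y})$

Slope

$b = \frac{SPE_{XY}}{SCE_X}$

Intercept

$a = \bar{y} - b\,\bar{x}$

Estimate for x

$y(x) = b\,x + a$

Conditional mean for x

$\bar{y}(x) = b\,x + a$

Residual (e)

$e = y - \bar{y}(x) = y - b\,x - a$

Difference (d)

$d = x - y$

Correlation coefficient

$r = \left(\frac{SPE_{XY}^{2}}{SCE_Y \cdot SCE_X}\right)^{1/2}$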

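Standard deviations of

· differences (d)

$S_d = \left(\frac{SCE_d}{q-1}\right)^{1/2}$

$S_d = \left(\frac{SCE_Y + SCE_X - 2\,SPE_{XY}}{q-1}\right)^{1/2}$

· residuals (e_i)

$S_{y,x} = \left(\frac{\sum (y_i - b\,x_i - a)^{2}}{q-2}\right)^{1/2}$

$S_{y,x} = \left(\frac{SCE_Y - SPE_{XY}^{2} / SCE_X}{q-2}\right)^{1/2}$

$S_{y,x} = \left(\frac{SCE_Y\,(1-r^{2})}{q-2}\right)^{1/2}$

· slope (b)

$S_b = \frac{S_{y,x}}{SCE_X^{1/2}}$

· intercept (a)

$S_a = S_{y,x} \cdot \left(\frac{1}{q} + \frac{\bar{x}^{2}}{SCE_X}\right)^{1/2}$

· conditional mean $\bar{y}(x_0)$

$S_{\bar{y}(x_0)} = S_{y,x} \cdot \left(\frac{1}{q} + \frac{(x_0 - \bar{x})^{2}}{SCE_X}\right)^{1/2}$

· estimate $y(x_0)$

$S_{y(x_0)} = S_{y,x} \cdot \left(1 + \frac{1}{q} + \frac{(x_0 - \bar{x})^{2}}{SCE_X}\right)^{1/2}$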

Conformity tests

• conformity of an estimate:

· slope b versus 1,000: $t_{obs} = |b - 1| / S_b \le t_{1-\alpha/2}$

with q-2 d.f. and α = 0,05

· slope b versus 0,00: $t_{obs} = |b| / S_b \le t_{1-\alpha/2}$

with q-2 d.f. and α = 0,05

· mean difference $\bar{d}$ versus 0,00: $t_{obs} = |\bar{d}| / (S_d / \sqrt{q}) \le t_{1-\alpha/2}$

with q-1 d.f. and α = 0,05

· or (when b ≠ 1,000) $\bar{y}$ versus $\bar{x}$: $t_{obs} = |\bar{y} - \bar{x}| / (S_{y,x} / \sqrt{q}) \le t_{1-\alpha/2}$

with q-2 d.f. and α = 0,05

· intercept a versus 0,00: $t_{obs} = |a| / S_a \le t_{1-\alpha/2}$

with q-2 d.f. and α = 0,05

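· conditional mean $\bar{y}(x_0)$ versus reference value $y_0$, or residual $e_0$ versus 0,00

$e_0 = y_0 - \bar{y}(x_0) = y_0 - (b_{q-1}\,x_0 + a_{q-1})$

$S_{\bar{y}(x_0)} = S_{y,x\,(q-1)} \cdot \left(\frac{1}{q-1} + \frac{(x_0 - \bar{x}_{q-1})^{2}}{SCE_{X\,(q-1)}}\right)^{1/2}$

$t_{obs} = |e_0| / S_{\bar{y}(x_0)} \le t_{1-\alpha/2}$

with q-3 d.f. and α = 0,05

For outlier detection or departure from linearity: one checks whether the point M_0(x_0, y_0) belongs to the line calculated without that point. The subscript q-1 denotes values calculated on the q-1 remaining points.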

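· conformity of a standard deviation S versus σ

- Method 1 (Chi²)

$\sigma^{2} \ge \frac{k\,S^{2}}{\chi^{2}_{1-\alpha}} \Rightarrow S \le \sigma \cdot \left(\frac{\chi^{2}_{1-\alpha}}{k}\right)^{1/2}$

with k d.f. and α = 0,05

- Method 2 (standard error) (can replace method 1 for k > 50)

$S - u_{1-\alpha} \cdot \frac{S}{\sqrt{2k'}} \le \sigma \Rightarrow S \le \frac{\sigma}{1 - u_{1-\alpha} / \sqrt{2k'}}$

with k’ data and α = 0,05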

· Linearity test

- comparison of a line with a polynomial of degree k (reduction of residual error):

$F_{obs} = \frac{(n-2)\,S_{y,x}^{2} - (n-k-1)\,(S_{y,x}^{k})^{2}}{(k-1)\,(S_{y,x}^{k})^{2}} < F_{1-\alpha}$

or

$\frac{S_{y,x}}{S_{y,x}^{k}} < \left(\frac{F_{1-\alpha}\,(k-1) + (n-k-1)}{n-2}\right)^{1/2}$

with: n samples, k the degree of the polynomial,

k1 = k-1, k2 = n-k-1 and α risk of error.

Ø Sample or level effect interpreted as linearity compared to repeatability:

Fobs = (n.Ss² + Sr²)/ Sr² = (n.(S² - Sr²/n) + Sr² )/ Sr² = n.S ²/ Sr² < F1-α

or S / Sr < (F_1-α /n)^1/2

with: n replicates, = means difference of replicates, k1 = q-2, k2 = q.(n‑1) and α risk of error.

Note

k1 = q-1 when testing the effect of a source of variation with no regression (1 way- ANOVA)

Appendix 2: Examples of calculation and presentation

Assessment of preliminary instrumental fittings

Daily precision: Example of fat analysed by infra red spectroscopy (cf. IDF 141)

Test No q	Replicates	Sum	Mean (m)	Mean bias (d)	Test number (n)	Sum of squares (SOS)	Variance (Var)	Within check Sr(i)
1	4,00 4,03 4,01	12,04	4,013	0,008	3	0,000467	0,000233	0,015
2	4,02 4,03 4,02	12,07	4,023	0,018	3	0,000067	0,000033	0,006
3	4,01 4,00 4,00	12,01	4,003	-0,002	3	0,000067	0,000033	0,006
4	3,99 4,00 4,02	12,01	4,003	-0,002	3	0,000467	0,000233	0,015
5	3,99 4,01 4,01	12,01	4,003	-0,002	3	0,000267	0,000133	0,012
6	3,97 3,99 4,00	11,96	3,987	-0,018	3	0,000467	0,000233	0,015
7	4,01 4,00 3,98	11,99	3,997	-0,008	3	0,000467	0,000233	0,015
8	4,02 4,02 3,99	12,03	4,010	0,005	3	0,000600	0,000300	0,017
9	4,01 4,00 4,03	12,04	4,013	0,008	3	0,000467	0,000233	0,015
10	3,99 3,99 4,01	11,99	3,997	-0,008	3	0,000267	0,000133	0,012
Sum Average SD	120, 150	120, 150	40,050 4,005 0,010	0,000 0,000 0,010	30	0,00360 0,000180	0,00180 0,000180	0,013

Check homogeneity of variances within checks

Using:

Cochran Index = Var(max) / Sum of Var < Cochran limit => SD limit = (Cochran limit x Sum of Var)^1/2

=> Cochran limit (P=0,95 ; 2 ; 10) = 0,445 => SD limit = 0,0283, which none of the observed SD values exceeds => variance homogeneity admitted

Daily reproducibility:	SR=(Sm² + Sr².(1-1/n))^1/2	SR = 0,015	< 0,028 => conform to IDF 141
Variation between checks:	Sc = (Sm² - Sr²/n)^1/2	Sc = 0,007
Repeatability:	Sr = (Sum Sr(i)² / q)^1/2	Sr = 0,013	< 0,014 => conform to IDF 141

Source of variation	df	Sum of squares	Mean squares	SD	F
Between tests	9	0,002950	0,00032778	0,018	1,821
Within tests	20	0,003600	0,00018	0,013
Total	29	0,00655	0,00022586	0,015

Conclusions

From Fobs = 1,82 smaller than F0,95 = 2,39, stability is assessed positively: no significant shift of instrument response observed
From residual SD =0,013 smaller than Sr=0,014, instrument functioning is assessed positively: no abnormal individual fluctuation
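The figures of this example can be recomputed from the replicate values in the table. The sketch below is illustrative only; variable names are hypothetical, and the daily reproducibility is taken as SR = (Sc² + Sr²)^1/2, which reproduces the tabulated value of 0,015.

```python
import math
from statistics import mean, variance

# Triplicate fat results of the 10 checks, copied from the table above
checks = [
    (4.00, 4.03, 4.01), (4.02, 4.03, 4.02), (4.01, 4.00, 4.00), (3.99, 4.00, 4.02),
    (3.99, 4.01, 4.01), (3.97, 3.99, 4.00), (4.01, 4.00, 3.98), (4.02, 4.02, 3.99),
    (4.01, 4.00, 4.03), (3.99, 3.99, 4.01),
]
q = len(checks)        # number of checks
n = len(checks[0])     # replicates per check

variances = [variance(c) for c in checks]        # within-check variances
s_m2 = variance([mean(c) for c in checks])       # variance of the check means, Sm²

s_r = math.sqrt(sum(variances) / q)              # repeatability Sr
s_c = math.sqrt(max(s_m2 - s_r ** 2 / n, 0.0))   # variation between checks Sc
s_R = math.sqrt(s_c ** 2 + s_r ** 2)             # daily reproducibility SR
cochran = max(variances) / sum(variances)        # Cochran index
f_obs = n * s_m2 / s_r ** 2                      # between / within mean squares
```

With these data the sketch gives Sr ≈ 0,013, Sc ≈ 0,007, SR ≈ 0,015, a Cochran index of about 0,17 and Fobs ≈ 1,82, in line with the table and the ANOVA above.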

Carry over effect: Example of fat analysed by infra red spectroscopy (cf. IDF 141)

	Concentrations				Differences
Sequence N°	LL1	LL2	HL1	HL2	dL	dH
1	0,00	-0,01	3,98	3,99	0,010	0,010
2	0,01	-0,01	3,99	4,01	0,020	0,020
3	0,00	-0,02	3,97	3,99	0,020	0,020
4	-0,01	-0,02	3,97	3,98	0,010	0,010
5	-0,01	-0,02	3,96	3,98	0,010	0,020
6	0,01	0,00	3,98	4,00	0,010	0,020
7	0,00	-0,02	3,99	4,01	0,020	0,020
8	0,01	-0,01	3,97	3,99	0,020	0,020
9	-0,01	-0,02	3,98	3,99	0,010	0,010
10	0,01	-0,01	3,99	4,00	0,020	0,010
Mean	0,001	-0,014	3,978	3,994	0,015	0,016
Std dev.	0,009	0,007	0,010	0,011	0,005	0,005
N	10	10	10	10	10	10
t-Student					9,00	9,80
Minimum	-0,01	-0,02	3,96	3,98	0,01	0,01
Maximum	0,01	0,00	3,99	4,01	0,02	0,02
D=Max-Min	0,02	0,02	0,03	0,03	0,01	0,01

Mean biases dL and dH are significant according to the Student t-test (t0,975 = 2,26 with 9 d.f.)

	Value	Conf. min	Conf. max
C.O.R. (H/L)	0,37	0,28	0,49	C.O.R. lower than 1 % => conform
C.O.R. (L/H)	0,40	0,31	0,47	C.O.R. lower than 1 % => conform
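The t values and carry-over ratios of this example can be recomputed as sketched below. The carry-over ratio formula is an assumption here (sum of differences divided by the sum of HL2 - LL2, in per cent); it is not spelled out in this example but it reproduces the tabulated 0,37 and 0,40. Variable names are hypothetical.

```python
import math
from statistics import mean, stdev

# Readings of the 10 sequences (low, low, high, high), copied from the table above
ll1 = [0.00, 0.01, 0.00, -0.01, -0.01, 0.01, 0.00, 0.01, -0.01, 0.01]
ll2 = [-0.01, -0.01, -0.02, -0.02, -0.02, 0.00, -0.02, -0.01, -0.02, -0.01]
hl1 = [3.98, 3.99, 3.97, 3.97, 3.96, 3.98, 3.99, 3.97, 3.98, 3.99]
hl2 = [3.99, 4.01, 3.99, 3.98, 3.98, 4.00, 4.01, 3.99, 3.99, 4.00]

d_l = [a - b for a, b in zip(ll1, ll2)]   # dL = LL1 - LL2
d_h = [b - a for a, b in zip(hl1, hl2)]   # dH = HL2 - HL1

def t_obs(d):
    """Student t for the mean of d versus zero."""
    return abs(mean(d)) / (stdev(d) / math.sqrt(len(d)))

span = sum(hl2) - sum(ll2)                # assumed denominator of the ratio
cor_h_to_l = 100 * sum(d_l) / span        # C.O.R. (H/L) in %
cor_l_to_h = 100 * sum(d_h) / span        # C.O.R. (L/H) in %
```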

Assessment of linearity: Example of fat analysed by infra red spectroscopy (cf. IDF 141)

Sample set of progressive dilution of a 10 % fat milk by skim milk

Test No	% dilution (m/v) X	Replicate 1	Replicate 2	Replicate 3	Mean concent. C Y	Mean residual e	Std Dev. Sr
1	15,50	1,54	1,52	1,53	1,530	-0,023	0,010
2	20,35	2,02	2,02	2,02	2,020	-0,013	0,000
3	25,64	2,55	2,56	2,55	2,553	-0,003	0,006
4	31,18	3,10	3,11	3,12	3,110	0,005	0,010
5	34,80	3,49	3,48	3,49	3,487	0,024	0,006
6	39,80	3,97	3,99	4,00	3,987	0,029	0,015
7	45,15	4,50	4,50	4,51	4,503	0,016	0,006
8	50,50	5,02	5,02	5,01	5,017	0,000	0,006
9	56,65	5,61	5,63	5,62	5,620	-0,006	0,010
10	61,95	6,11	6,13	6,12	6,120	-0,030	0,010
N (levels)	10,0	10,0	10,0	10,0	10,0	10,0
Mean	38,152	3,791	3,796	3,797	3,795	0,000
SD	16,428	1,622	1,631	1,626	1,626	0,020	0,009
Minimum	15,500	1,540	1,520	1,530	1,530	-0,030
Maximum	61,950	6,110	6,130	6,120	6,120	0,029
D=Max-Min	46,450	4,570	4,610	4,590	4,590	0,059

Linear regression on	Replicates	Means
Slope	0,09898	0,09898
Intercept	0,01856	0,01856
N	30	10

SD of residual means:	Se = 0,0203
SD of repeatability :	Sr = 0,0088
SD of level bias:	Sl = 0,0197 (calculated by Sl = (Se²-Sr²/n)^1/2)

Tests

a - Ratio

De = 0,059

DC = 4,590

De/DC = 0,013 > 0,01 => Conclusion: Linearity default

b - Bias from linearity test using SD of residual means

Fobs = (Sr² + n.Sl²) / Sr² = n.Se²/Sr² should be lower than F0,95 = 2,45 with k1 = q-2 = 8 and k2 = q.(n-1) = 20
Fobs = 16,17 > F0,95 = 2,45 => Conclusion: Linearity default
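The ratio test and the F test can be reproduced from the summary figures of this example, as in the sketch below. Variable names are hypothetical. Because the rounded values of Se and Sr are used as inputs, Fobs comes out slightly below the 16,17 obtained from unrounded data; the conclusion is unchanged.

```python
import math

n = 3            # replicates per level
q = 10           # number of levels
s_e = 0.0203     # SD of residual means
s_r = 0.0088     # SD of repeatability
d_e = 0.059      # range of the mean residuals
d_c = 4.590      # range of the mean concentrations
f_crit = 2.45    # F0,95 with k1 = q-2 = 8 and k2 = q.(n-1) = 20, as quoted above

s_l = math.sqrt(max(s_e ** 2 - s_r ** 2 / n, 0.0))   # SD of level bias Sl
ratio = d_e / d_c                                    # De/DC
f_obs = n * s_e ** 2 / s_r ** 2                      # equals (Sr² + n.Sl²) / Sr²

ratio_conform = ratio < 0.01
f_conform = f_obs < f_crit
```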

c - ANOVA from linear regression on the individual data (equivalent to b)

Source of variation	df	Sum of squares	Mean squares	SD	F
Regression	1	63,4522972	63,4522972	7,966	827638,66
Between levels	8	0,009916	0,001239516	0,035	16,17
Within levels	20	0,001533	7,66667E-05	0,009
Total	29	63,46374667	2,188405057	1,479

Fobs = 16,17 > F0,95 = 2,45 => Conclusion: Linearity default

d - Compliance with polynomials of 2nd and 3rd degree

Using:

Sy,x / Sy,xk < ((F_1-α .(k-1) + (n-k-1)) / (n-2))^1/2

with: n samples, k polynomial degrees, k1 = k-1, k2 = n-k-1 and α risk of error.

Polynome	b3	b2	b1	a
Degree 3	-0,000001	0,000014	0,102190	-0,056563
Degree 2		-0,000087	0,105744	-0,093564
Degree 1			0,098975	0,018563

Polynome	Sy,xk	d.f.	Sy,xk/Sy,x3	F0,95	Limit	Sy,xk/Sy,x2	F0,95	Limits
Degree 3	0,010	26	1,00
Degree 2	0,010	27	1,01	4,23	1,21	1,00
Degree 1	0,020	28	2,07	3,37	1,26	2,05	4,21	1,18

Conclusions:

Both 2nd and 3rd degree polynomial adjustments improve the fit significantly according to the limits defined by the F-tests => linearity default. Nevertheless, a 2nd degree adjustment is sufficient, as no significant improvement is noted between the 2nd and 3rd degree polynomial adjustments.

Examples: Assessment of preliminary instrumental fittings

Assessment of linearity

Sample set of progressive dilution of a high cell content milk by a low cell content milk

Test No	% dilution (m/v) X	Mean concent. Y	Residuals e regr. 1-21	Residuals e regr. 1-9	Ratio De/DC	Std. dev. prediction Sy,xi	t-test Student from line
1	0,0	7,2	-25,2	-4,9		5,538	-1,006
2	5,4	131,2	-18,2	-2,2	0,022	5,255	-0,464
3	10,1	238,8	-12,4	-0,2	0,021	5,054	-0,039
4	15,2	356,5	-5,1	3,0	0,023	4,884	0,645
5	19,7	461,7	2,6	7,1	0,026	4,776	1,559
6	24,4	564,0	3,1	3,8	0,022	4,704	0,849
7	30,2	689,7	3,2	-0,7	0,018	4,675	-0,163
8	35,0	800,2	9,7	2,0	0,015	4,699	0,433
9	39,9	900,5	3,9	-7,8	0,017	4,770	-1,714
10	44,9	1013,5	8,6	-7,1	0,015	4,889	-1,541
11	49,6	1122,8	16,1	-3,4	0,013	5,045	-0,719
12	55,3	1249,3	19,1	-4,9	0,012	5,292	-1,020
13	59,8	1348,5	20,8	-6,8	0,011	5,530	-1,379
14	64,5	1441,7	12,2	-19,1	0,018	5,821	-3,804
15	69,9	1561,0	14,6	-21,1	0,018	6,208	-4,066
16	74,6	1653,5	5,3	-34,2	0,025	6,590	-6,389
17	79,4	1766,5	14,3	-29,0	0,023	7,024	-5,248
18	84,6	1865,2	0,4	-47,1	0,029	7,545	-8,226
19	89,7	1983,8	8,5	-43,0	0,027	8,105	-7,253
20	95,5	2074,8	-26,1	-82,3	0,043	8,804	-13,312
21	100,0	2143,0	-55,4	-115,2	0,057	9,391	-18,038
Number of levels	21	21	21	9		9	9
Mean	49,89	1113,02	0,00	0,00
Std. dev.	30,95	670,56	18,957	4,905
Minimum	0,00	7,20	-55,39	-7,80	0,02	4,67	-1,71
Maximum	100,00	2143,00	20,84	7,10	0,03	5,54	1,56
D=Max-Min	100,00	2135,80	76,23	14,90	0,01	0,86	3,27

Upper limit for the t-test: t0,975 = 2,365 with P = 5 % and 7 d.f.

Assessment of measurement limits: Example of a somatic cell counter (cf. IDF 148)

a - Lower limit: 10 measurements close to zero

Data	3	5	4	3	5
Data	4	5	3	5	4
Mean	4,100
Std. Dev.	0,876
CV%	21,4	< 30 % => conform
DL	2,881	< 5000 => conform
N	10
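The lower-limit statistics can be recomputed from the ten readings, as in the sketch below. The factor 3,29 used for the detection limit DL is an assumption: it is not stated in this example, but 3,29 x SD reproduces the tabulated 2,881. Variable names are hypothetical.

```python
from statistics import mean, stdev

# The 10 measurements close to zero, copied from the data rows above
counts = [3, 5, 4, 3, 5, 4, 5, 3, 5, 4]

m = mean(counts)       # mean
s = stdev(counts)      # standard deviation (q-1 in the denominator)
cv = 100 * s / m       # coefficient of variation in %
dl = 3.29 * s          # detection limit, ASSUMED to be 3,29 x SD

cv_conform = cv < 30   # limit quoted in the example
```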

b - Upper limit: Regression: Slope b = 22,4603 Intercept a = 12,1324

From the figure, identification of the linear part; calculation of the regression equation y = b.x + a on the linear part (levels 1 to 9).

On the whole range, calculation of:

residuals: ei = yi - ŷ(xi) = yi - b.xi - a
t test on residuals: tobs = | ei | / (Sy,x . (1/q + (xi - m(x))² / SCEx)^1/2)

Conclusion

From level n°14 onwards, a departure from linearity is observed, with tobs significant at P = 0,95
Level n°14 also corresponds to the increase of the residual range/concentration range ratio

Assessment of linearity: Example of a somatic cell counter (cf. IDF 148)

SD of residual means: Se =19,0 (measured)

SD of repeatability : Sr =16,4 (measured )

SD of level bias: Sl = 16,4231 (calculated by Sl = (Se² - Sr²/n)^1/2)

Linearity tests

a. Ratio (on the whole range, i.e. levels 1 to 21)

De/DC = 0,036 > 0,02 => Conclusion: Linearity default

Note

This test is simple to apply and is generally advised for quick checks in routine. Nevertheless, due to the irregular scattering of residuals with SCC, it is important to confirm the result by examining a plot of the residuals.

b. Bias from linearity test

Fobs=(Sr² + n.Sl²) / Sr² = n.Se²/Sr² should be lower than F0,95 = 1,84

(with triplicates on 21 levels)

Fobs = 4,01 > F0,95 = 1,84 => Conclusion: Linearity default

Note

This test assumes that replicates are performed at every level and that the variance of residuals is uniform throughout the range. The latter is rarely observed with SCC due to the very large scale (4 log paths: 10³ to 10⁶), so the test is not strictly exact.

It is more suitable for chemical analyses, but it can nevertheless be considered as sufficiently informative for SCC.

c. Compliance with polynomials of 2nd and 3rd degree

=> Test: Sy,x / Sy,xk < ((F_1-α .(k-1) + (n-k-1)) / (n-2))^1/2

with: n samples, k polynomial degrees,

k1 = k-1, k2 = n-k-1 and α risk of error.

Mean concent. Y	% dilution (m/v) X	% dilution (m/v) X2	% dilution (m/v) X3	Residuals X2	Residuals X3
7,2 131,2 238,8 356,5 461,7 564,0 689,7 800,2 900,5 1013,5 1122,8 1249,3 1348,5 1441,7 1561,0 1653,5 1766,5 1865,2 1983,8 2074,8 2143,0	0,0 5,4 10,1 15,2 19,7 24,4 30,2 35,0 39,9 44,9 49,6 55,3 59,8 64,5 69,9 74,6 79,4 84,6 89,7 95,5 100,0	0,0 29,2 102,0 231,0 388,1 595,4 912,0 1225,0 1592,0 2016,0 2460,2 3058,1 3576,0 4160,3 4886,0 5565,2 6304,4 7157,2 8046,1 9120,3 10000,0	0 157 1030 3512 7645 14527 27544 42875 63521 90519 122024 169112 213847 268336 341532 415161 500566 605496 721734 870984 1000000	5,4 2,6 0,7 0,7 2,8 -1,8 -6,8 -3,5 -11,7 -8,4 -1,4 2,1 5,2 -1,3 4,6 -0,7 13,3 5,8 21,2 -4,0 -25,0	-5,9 -1,6 1,1 4,4 8,3 4,7 -0,4 2,0 -7,6 -6,2 -1,2 0,0 1,4 -6,5 -1,7 -7,2 7,5 1,8 20,4 0,8 -14,3
21 1113,02 670,56	21 49,89 30,95	21 3401,16 3207,87	21 260958 310802	21 0,00 9,14	21 0 7
7,20 2143,00 2135,80	0,00 100,00 100,00	0,00 10000,00 10000,00	0 1000000 1000000	-24,98 21,20 46,18	-14 20 35

Polynome	b3	b2	b1	a
Degree 3	-0,000256	0,019324	22,068420	13,063507
Degree 2		-0,019194	23,580701	1,847156
Degree 1			21,660009	32,390894

Polynome	Sy,xk	d.f.	Sy,xk/Sy,x3	F0,95	Limit	Sy,xk/Sy,x2	F0,95	Limit
degree 3	7,78	17	1,00			0,81
degree 2	9,63	18	1,24	4,45	1,09	1,00
degree 1	18,96	19	2,44	3,59	1,15	1,97	4,41	1,09

Note

This test can be run with all data (replicates ) or only the mean values (the example) depending on the sensitivity needed.

It normally requires the variance of residuals to be uniform throughout the range which is generally not achieved with cell counting. Nevertheless, provided with the residual plotting it can be considered as sufficiently informative for SCC linearity assessment.

Conclusions

Significant improvement by 2nd and 3rd degree polynomials, which confirms a linearity default

Assessment of overall accuracy: Example for fat

Analysed by infra-red spectroscopy (cf. IDF 141).

Set of individual cow milk samples

Test No	Reference method Y	Replic. 1 X1	Replic. 2 X2	Mean X	Estimates Y/xi	Repeatability w=|X1-X2|	Differences d=X-Y	Residual e
1	1,89	1,92	1,94	1,930	1,90	0,02	0,04	-0,006
2	1,98	2,05	2,06	2,055	2,03	0,01	0,07	-0,045
3	2,48	2,55	2,56	2,555	2,54	0,01	0,07	-0,061
4	2,66	2,56	2,56	2,560	2,55	0,00	-0,10	0,114
5	3,10	3,16	3,13	3,145	3,15	0,03	0,04	-0,049
6	3,23	3,20	3,22	3,210	3,22	0,02	-0,02	0,014
7	3,37	3,31	3,34	3,325	3,33	0,03	-0,04	0,035
8	3,57	3,51	3,50	3,505	3,52	0,01	-0,06	0,050
9	3,53	3,51	3,50	3,505	3,52	0,01	-0,02	0,010
10	3,52	3,57	3,57	3,570	3,59	0,00	0,05	-0,067
11	4,02	4,00	4,01	4,005	4,04	0,01	-0,01	-0,016
12	4,15	4,05	4,09	4,070	4,10	0,04	-0,08	0,047
13	4,59	4,52	4,51	4,515	4,56	0,01	-0,08	0,028
14	4,61	4,59	4,57	4,580	4,63	0,02	-0,03	-0,019
15	5,10	5,06	5,06	5,060	5,12	0,00	-0,04	-0,024
16	5,23	5,18	5,19	5,185	5,25	0,01	-0,04	-0,022
17	5,49	5,44	5,44	5,440	5,52	0,00	-0,05	-0,025
18	5,61	5,48	5,47	5,475	5,55	0,01	-0,14	0,058
19	5,80	5,74	5,76	5,750	5,84	0,02	-0,05	-0,035
20	5,89	5,80	5,78	5,790	5,88	0,02	-0,10	0,014
N	20	20	20	20	20	20	20	20
Mean	3,991	3,960	3,963	3,962	3,991	0,014	-0,030	0,000
SD	1,260	1,223	1,219	1,221	1,259	0,011	0,059	0,047
Minimum	1,890	1,920	1,940	1,930	1,896	0,00	-0,14	-0,07
Maximum	5,890	5,800	5,780	5,790	5,876	0,04	0,07	0,11
D=Max-Min	4,000	3,880	3,840	3,860	3,980	0,04	0,21	0,18

	Parameter	Estimate	Limits	Conformity
Repeatability	Sr	0,012	0,014	Yes
Accuracy	Mean d	-0,030	+/-0,050	Yes
	Sd (=Sx-y)	0,059	0,100	Yes
	N	20
	t obs	2,218	t0,975 = 2,093	P < 0,05
	df	19
Regression	Slope b	1,0311	1+/-0,05	Yes
	Sb	0,0088
	tobs b vs. 1	3,511	t0,975 = 2,101	P < 0,001
	Intercept a	-0,0935
	Sa	0,037
	tobs a vs. 0	2,556	t0,975 = 2,101
	df	18
	Sy,x	0,047	0,100	Yes

Conclusion

Instrument accuracy complies with limits defined for the component analysed, in the example fat in cow milk.
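The repeatability and accuracy estimates of this example can be recomputed from the table data, as sketched below. The repeatability from duplicates is assumed to be Sr = (∑w² / (2q))^1/2, which is not stated in this example but reproduces the tabulated 0,012. Variable names are hypothetical.

```python
import math
from statistics import mean, stdev

# Reference results Y and duplicate instrumental results X1, X2 from the table above
y_ref = [1.89, 1.98, 2.48, 2.66, 3.10, 3.23, 3.37, 3.57, 3.53, 3.52,
         4.02, 4.15, 4.59, 4.61, 5.10, 5.23, 5.49, 5.61, 5.80, 5.89]
x1 = [1.92, 2.05, 2.55, 2.56, 3.16, 3.20, 3.31, 3.51, 3.51, 3.57,
      4.00, 4.05, 4.52, 4.59, 5.06, 5.18, 5.44, 5.48, 5.74, 5.80]
x2 = [1.94, 2.06, 2.56, 2.56, 3.13, 3.22, 3.34, 3.50, 3.50, 3.57,
      4.01, 4.09, 4.51, 4.57, 5.06, 5.19, 5.44, 5.47, 5.76, 5.78]
q = len(y_ref)

# Repeatability from duplicates (ASSUMED formula, see lead-in)
s_r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)) / (2 * q))

# Accuracy: differences d = X - Y on the duplicate means
x_mean = [(a + b) / 2 for a, b in zip(x1, x2)]
d = [x - y for x, y in zip(x_mean, y_ref)]
d_bar = mean(d)                               # mean difference
s_d = stdev(d)                                # SD of differences
t_obs = abs(d_bar) / (s_d / math.sqrt(q))     # compare with t0,975 at q-1 d.f.
```

With these data the sketch gives Sr ≈ 0,012, a mean difference of about -0,030, Sd ≈ 0,059 and tobs ≈ 2,22, matching the summary table.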

Procedure 2: Guidance on application of EC JRC Certified Reference Material for SCC in milk

Background

Milk somatic cell count (SCC) is a widely used indicator for monitoring the udder health of several mammalian species and is relevant in food hygiene regulations, milk payment testing, farm management and breeding programmes^[1]. In February 2020, the European Commission Joint Research Centre (EC JRC) launched a new certified reference material (CRM ERM^®-BD001) for somatic cell counting in milk. The launch is one of the tangible outcomes of a close cooperation between the International Dairy Federation (IDF), the International Committee for Animal Recording (ICAR) and EC JRC in developing solutions and tools to promote a better global equivalence in somatic cell counting in milk.

Description of the EC JRC Certified Reference Material

Sets of CRM ERM^®-BD001 are available from EC JRC and its authorised distributors and are delivered in sachets, see https://crm.jrc.ec.europa.eu/p/ERM-BD001. Each sachet contains two bottles, each with 14 g of spray-dried cow milk in an inert gas atmosphere (argon): ERM^®-BD001a with a low SCC and ERM^®-BD001b with a high SCC. Following the accompanying instruction protocols for reconstitution in double-distilled or ultrapure/type 1 quality water at 40 °C, the resulting samples will contain about 60 000 and 1 200 000 cells/mL respectively. The stated certified reference values for the resulting two liquid samples are based on direct microscopic counting according to ISO 13366-1|IDF 148-1^[5] or Chapter 10 in the Standard Methods for the Examination of Dairy Products^[2] and fluoro-opto-electronic counting according to ISO 13366-2|IDF 148-2^[3] or Chapter 11.032 in the Standard Methods for the Examination of Dairy Products^[2]. For the first batch, a total of thirty-two laboratories around the globe participated in the characterisation exercise. The stated certified reference values are to be considered as robust estimates of the true SCC values of these materials.

Upon reconstitution, the samples can be used as such, but the two reconstituted samples can also be mixed in different ratios to arrive at samples with values that lie between the SCC values stated for the two reconstituted samples. More background on the materials and their characterisation can be found in the certification report^[4], which is available through the website mentioned above.

Application of the EC JRC Certified Reference Material

CRM ERM^®-BD001 can be used for multiple purposes, as described in the following paragraphs.

Method performance verification

The CRM ERM^®-BD001 materials and their certified reference values, with stated uncertainty information, can be used by both reference method users and routine method users to verify whether their method operates correctly. With method performance verification of the reference method, the stated certified reference values and related uncertainty information based on the accepted reference method data sets are to be used. These stated values are based solely on the accepted results from ISO 13366-1|IDF 148-1^[5] compliant measurements. With method performance verification of a routine method, the recommendation is to work with the values based on the 50/50 merged data pool stemming from the reference method data sets and the randomly selected routine method data sets. These assigned values come with a lower uncertainty.

Before verifying the performance of a method with CRM ERM^®-BD001 materials, it should be verified that the method is functioning properly. For the reference method, this means fulfilling the requirements with regard to the correct dimensions of the microscope field and the repeatability. For the commonly applied routine method with fluoro-opto-electronic counting, this means fulfilling the requirements with regard to blank checks, carry-over, linearity effect, other method-specific critical aspects and repeatability. For guidance, see ISO 13366-2|IDF 148-2^[3].

For verifying method performance, the procedure according to the certification report^[4] can be applied:

a. Measure each sample in at least duplicate and calculate the mean measured value, ȳ_i.

b. Calculate for each sample the absolute difference, Δ_i,meas, between ȳ_i and the certified value, x_i,CRM:

Δ_i,meas = | ȳ_i - x_i,CRM | (1)

c. Combine the measurement uncertainty, u_i,meas, of ȳ_i with the uncertainty of the certified value, u_i,CRM, on the certificate:

u_i,Δ = (u_i,meas² + u_i,CRM²)^1/2 (2)

d. Calculate the expanded uncertainty, U_i,Δ, from the combined uncertainty, u_i,Δ, using an appropriate coverage factor, corresponding to a level of confidence of approximately 95%:

U_i,Δ = 2 * u_i,Δ (3)

e. If Δ_i,meas < U_i,Δ then no significant difference exists between the measurement result and the certified value at a confidence level of approximately 95%.
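Steps a. to e. can be sketched as a small Python function. This is an illustration under stated assumptions: the function name and all numerical values are hypothetical (they are not certified values), the measurement uncertainty u_i,meas is supplied by the user, and the two uncertainties are assumed to combine as the root of the sum of squares.

```python
import math
from statistics import mean

def verify_against_crm(measurements, x_crm, u_crm, u_meas, k=2.0):
    """Compare the mean of replicate results with a certified value."""
    y_mean = mean(measurements)                      # step a
    delta = abs(y_mean - x_crm)                      # step b, equation (1)
    u_delta = math.sqrt(u_meas ** 2 + u_crm ** 2)    # step c, assumed quadratic combination
    U_delta = k * u_delta                            # step d, equation (3) with k = 2
    return delta, U_delta, delta < U_delta           # step e

# Hypothetical example in 1 000 cells/mL; none of these figures comes from the certificate
delta, U_delta, no_significant_difference = verify_against_crm(
    [61.0, 65.0], x_crm=60.0, u_crm=2.0, u_meas=3.0
)
```

How u_i,meas is estimated (for example from the laboratory's own precision data) is left to the user and is not prescribed by this sketch.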

Verification/adjustment of calibration settings with routine methods.

Large scale somatic cell counting relies on the application of routine methods, such as fluoro‑opto-electronic counting. Results with these methods are subject to variation between different methods, between different laboratories, between different instruments, and in time. Therefore, it is relevant to have routine methods properly calibrated against a reference and to frequently verify whether the calibration settings are still correct. Where results of the direct microscopic reference method for somatic cell counting come with limited precision, a calibration sample set prepared from CRM ERM^®-BD001 materials with stated reference values may serve as a stable and robust alternative.

The advice is to follow a multi-step procedure according to Figure 1.

*Figure 1: Flow chart showing verification of calibration settings of routine methods.*

Check on proper functioning of the routine method

Before verifying the calibration settings of a routine method, it should be assessed that the routine method is properly functioning and fulfilling the requirements with regard to blank checks, carry-over effect, other method-specific critical aspects and repeatability. For further guidance on these checks with fluoro-opto-electronic counters, see ISO 13366-2|IDF 148-2^[3].

Preparation of a sample set for verification/adjustment of the calibration settings

A sample set for the verification/adjustment of the calibration settings of a routine method can be prepared by mixing reconstituted sample material from ERM^®-BD001a and ERM^®‑BD001b in different ratios according to the instruction protocols provided with the materials. This will result in a sample set with at least five cell count levels with an equidistant distribution from the certified reference value of ERM^®-BD001a up to the certified reference value of ERM^®-BD001b. The recommendation to use the stated reference values based on the 50/50 merged data pool stemming from the reference method data sets and the randomly selected routine method data sets^[4] is in accordance with the guidance in ISO 13366-2|IDF 148-2^[3] on the use of suitable calibration materials. The reference value for each sample can be calculated from the relative amount of ERM^®-BD001a and ERM^®‑BD001b in each sample. The resulting sample set allows the verification and, if necessary, the adjustment of the calibration settings in accordance with the recommendations of ISO 13366-2|IDF 148-2^[3] and ISO 8196-2|IDF 128-2^[6] , see next paragraph.

Verification/adjustment of calibration settings

NOTE 1

The approach presented in this paragraph is based on an ordinary least square regression model with the underlying prerequisite of an approximate constancy in the distribution of the residuals throughout the calibration range. Otherwise, transform the data so that the residual variance throughout the range is equalized. From experience, a square root transformation could be a good option for this.

NOTE 2

When working with certified reference materials, it is assumed that the error in the calculated reference values for the samples in the calibration sample set is negligible. Therefore, the calculated reference values are to be plotted on the x-axis and the mean of the routine method values on the y-axis. It is noted that this is contrary to the situation in routine testing, where the true value is estimated from a routine method measurement through applying slope and intercept settings:

x_est = slope * y + intercept (4)

with

x_est = estimate of the true value

y = instrument read out

a. For the verification of slope and intercept settings of a routine method, the calibration sample set is to be measured with the current settings of slope and intercept.

b. Measure each sample of the calibration sample set with the routine method in at least duplicate. Calculate the mean value, ȳ_i, for each sample.

c. Plot the calculated reference values, x_i, and the individual mean values obtained with the routine method, ȳ_i , of the q samples in an XY-axis diagram. Check the distribution of the data points, which should appear linear, regular and homogeneous. If one or more data points deviate considerably from the linear tendency, check the preparation of the sample set, the calculation of the reference values and the functioning of the routine method. If necessary, repeat the measurements.

d. For the collected values with the calibration sample set, calculate the regression equation, ȳ_i = b * x_i + a, using the ordinary least squares method.

e. Calculate the residual standard deviation from the regression, s_yx:

s_yx = (∑(ȳ_i - b * x_i - a)² / (q-2))^1/2 (5)

f. Collect or calculate the values for b_c and a_c, corresponding with the current settings for slope and intercept:

b_c = 1/slope_c (6)

and

a_c = -intercept_c/slope_c (7)

with

slope_c = current setting of the slope

intercept_c = current setting of the intercept

g. Test whether the calculated value for b differs statistically significantly from the value of b_c.

For that, calculate the standard deviation of b:

s_b = s_yx / (SCE_x)^1/2 (8)

with

SCE_x = ∑(x_i - x̄)² (9)

Applying a 95% confidence interval, the current slope is still correct if:

| b - b_c | ≤ t_0,975 * s_b (10)

where t_0,975 is the 0,975 quantile of the Student distribution with q-2 degrees of freedom.

h. Test whether the mean bias | x -y(x) | differs statistically significant from the value of a with the current calibration, a_c.

For that, calculate the standard error of the regression equation:

(11)

where q is the number of calibration samples.

The mean bias is still correct if in agreement with:

(12) where t_0,975 is the 0,975 quantile of the Student distribution with q-2 degrees of freedom and ȳ(xbar) being the predicted value from

If both the test for slope and mean bias are negative, a does not statistically differ from a_c if

(13) with

(14)

For more background information on this verification procedure, see ISO 8196-2|IDF 128-2^[6].

i. Adjustment of the calibration is necessary if one of the tests under g. or h. is negative, that is, if one of the requirements is not fulfilled. In such a case, the resulting values for the new slope, slope_n , and new intercept, intercept_n are:

slope_n = 1/b (15)

and

intercept_n = -a/b (16)

Record the calculated values for a, b , and slope_n and intercept_n for future verification.

For commonly applied fluoro-opto-electronic methods according to ISO 13366-2|IDF 148-2^[3], expect slope values of 1.00 ± 0.10 and intercept values of 0 ± 50 000 cells/mL in case of ordinary least square regression with untransformed data.
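Part of the verification procedure can be sketched in Python as follows. This is a hedged illustration, not the procedure of ISO 8196-2|IDF 128-2 itself: only the regression, the residual standard deviation, the slope test and the new settings are covered; the standard deviation of b is assumed to be s_yx divided by the square root of the sum of squared deviations of x; the mean-bias and intercept tests are left out. The function name, the five-level data set (in 1 000 cells/mL) and the current settings are hypothetical, and the critical t value has to be supplied by the user.

```python
import math

def verify_slope(x_ref, y_mean, slope_c, intercept_c, t_crit):
    """Regress routine means on reference values and test b against b_c."""
    q = len(x_ref)
    mx = sum(x_ref) / q
    my = sum(y_mean) / q
    sce_x = sum((x - mx) ** 2 for x in x_ref)
    b = sum((x - mx) * (y - my) for x, y in zip(x_ref, y_mean)) / sce_x
    a = my - b * mx
    s_yx = math.sqrt(sum((y - b * x - a) ** 2 for x, y in zip(x_ref, y_mean)) / (q - 2))
    s_b = s_yx / math.sqrt(sce_x)     # ASSUMED form of the standard deviation of b
    b_c = 1 / slope_c                 # equation (6)
    a_c = -intercept_c / slope_c      # equation (7)
    return {
        "b": b, "a": a, "s_yx": s_yx, "s_b": s_b, "b_c": b_c, "a_c": a_c,
        "slope_ok": abs(b - b_c) <= t_crit * s_b,
        "slope_n": 1 / b,             # equation (15)
        "intercept_n": -a / b,        # equation (16)
    }

# Hypothetical calibration set: calculated reference values and routine means
x_ref = [60.0, 345.0, 630.0, 915.0, 1200.0]
y_mean = [66.0, 352.0, 640.0, 921.0, 1212.0]
# t_0,975 with q-2 = 3 degrees of freedom is 3.182
result = verify_slope(x_ref, y_mean, slope_c=1.0, intercept_c=0.0, t_crit=3.182)
```

In this hypothetical case the slope test passes, so the new settings returned by the function would only be applied if one of the other tests failed.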

Assigning reference values to Secondary Reference Material (SRM)

In many situations, so-called SRMs could provide more practical and/or more economical means for a traceable anchoring of routine test results. SRMs can be used for calibration purposes or as pilot samples between measurements. Proper reference values can be assigned to each SRM by applying a comparative approach in SRM characterization^[7].

For this, a similar SRM with about an equal SCC is to be analysed simultaneously with a CRM using a well-functioning routine method under constant conditions. The SCC of the CRM material can be tuned by mixing ERM^®-BD001a and ERM^®-BD001b in an appropriate ratio and calculating the corresponding reference value, see paragraph 5.2.

The uncertainty of the mixed CRM material, u_mixed CRM, can be calculated from:

u_mixed CRM = ((f_a * u_a)² + (f_b * u_b)²)^1/2 (17)

with

u_a = uncertainty with the certified value for CRM ERM^®-BD001a

u_b = uncertainty with the certified value for CRM ERM^®-BD001b

f_a = volume fraction of CRM ERM^®-BD001a in the mixed sample

f_b = volume fraction of CRM ERM^®-BD001b in the mixed sample
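The reference value and uncertainty of a mixed CRM sample can be sketched as below. The sketch assumes that the reference value mixes linearly with the volume fractions and that the two certified uncertainties are independent and combine in quadrature, weighted by those fractions. The function name and all numbers are hypothetical; in particular the uncertainties are not the certified ones.

```python
import math

def mixed_crm(x_a, u_a, x_b, u_b, f_a):
    """Reference value and standard uncertainty of a mix of BD001a and BD001b."""
    f_b = 1.0 - f_a                                        # volume fractions sum to 1
    x_mix = f_a * x_a + f_b * x_b                          # assumed linear mixing
    u_mix = math.sqrt((f_a * u_a) ** 2 + (f_b * u_b) ** 2)  # assumed independent terms
    return x_mix, u_mix

# Hypothetical values in 1 000 cells/mL (approximate levels, invented uncertainties)
x_mix, u_mix = mixed_crm(x_a=60.0, u_a=3.0, x_b=1200.0, u_b=40.0, f_a=0.5)
```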

a. Prepare an amount of (mixed) CRM and SRM that allows for at least 15 (= n) replicate pair wise measurements with the same instrument in the same laboratory, directly after one another, providing paired values of C_i,CRM and C_i,SRM coming from two subsequent measurements.

b. Calculate for each pairwise measurement the difference E_i between the two results:

(18)

c. Assuming additivity of bias for about equal SCC values, the reference value for the SRM, x_SRM, can now be calculated:

(19)

with C_crm being the reference value of the (mixed) CRM and

(20)

d. The standard uncertainty with the reference value for the SRM, u_SRM, is:

(21)

with u_CRM being the standard uncertainty of the value carried by the CRM and

(22)

e. The expanded uncertainty, U_SRM, amounts to:

U_SRM = 2u_SRM (23)
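The comparative approach of steps a. to e. can be sketched as follows. Several points are assumptions made for illustration only: the difference is taken as E_i = C_i,SRM - C_i,CRM, the SRM value as the CRM reference value plus the mean difference, and the standard uncertainty as the quadratic combination of u_CRM with the standard deviation of the mean difference. The function name and all data are hypothetical.

```python
import math
from statistics import mean, stdev

def assign_srm_value(c_crm, c_srm, x_crm, u_crm):
    """Assign a reference value to an SRM from paired CRM/SRM measurements."""
    e = [s - c for c, s in zip(c_crm, c_srm)]       # ASSUMED sign: SRM minus CRM
    n = len(e)
    e_bar = mean(e)                                 # mean pairwise difference
    s_e = stdev(e)                                  # SD of the differences
    x_srm = x_crm + e_bar                           # ASSUMED: additive bias
    u_srm = math.sqrt(u_crm ** 2 + s_e ** 2 / n)    # ASSUMED combination
    return x_srm, u_srm, 2 * u_srm                  # equation (23): U_SRM = 2 u_SRM

# 15 hypothetical paired readings in 1 000 cells/mL
c_crm = [400.0 + (i % 3) for i in range(15)]
c_srm = [c + 10.0 + (1.0 if i % 2 else -1.0) for i, c in enumerate(c_crm)]
x_srm, u_srm, U_srm = assign_srm_value(c_crm, c_srm, x_crm=405.0, u_crm=8.0)
```

The referenced paper by Kuselman et al. should be consulted for the exact expressions before applying such a calculation in practice.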

Use in proficiency testing

CRM ERM^®-BD001 is stable and comes with certified reference values. It can therefore serve as a robust anchor sample when included in sample sets for proficiency testing. As an alternative, one or more SRMs with assigned reference values according to the earlier described procedure according to Kuselman et al. (2002)^[7] can be included for this purpose.

NOTE 3

An Excel calculation file that performs the above described calculations for the verification of method performance, the verification of the calibration settings of a routine method and the assignment of reference values to SRMs can be downloaded at https://www.icar.org/Excel-templates-with-guidance-on-use-ECJRCCRMSCC.xls.

This Excel calculation file thereby provides the means to combine the verification of method performance, the linearity and the calibration settings in one procedure. The only extra prerequisite is that each sample of the sample set for verifying the calibration settings is measured in at least 15 replicates.

Transition with calibration settings of routine methods

Applying CRM ERM^®-BD001 for calibration of routine methods means a change in the anchoring of somatic cell counting in milk. This might come with a significant change in calibration settings with routine methods and, as a consequence, in routine measurement results. The extent of this will differ between laboratories and geographies, depending on anchoring systems that have been applied thus far. It is to be noted that in case of an expected bigger shift, regulatory limits, limit values in milk payment systems and/or in udder health monitoring programmes might also need reassessment. It is therefore advised that laboratories, which notice a large shift in counting level when switching to the use of CRM ERM^®-BD001, contact the relevant regulatory and supervising bodies and other stakeholders in order to arrange for an optimal transition.

Aspects for consideration amongst others might be:

Extent of the shift in the counting level when applying CRM ERM^®-BD001 materials with their stated certified reference values.
Involvement of somatic cell counting entities (laboratories, veterinarians, farmers) in the concerned geography and need for alignment in transition.
Involvement of regulatory and supervising bodies. Need and possibilities for accompanying reassessment of regulatory limits, limit values in milk payment and/or udder health programmes.
Consequences for the performance of laboratories in proficiency testing. What anchoring system do other participating laboratories apply? Are changes in proficiency testing and the assessment of performance therein necessary?
Consequences for the laboratory protocols.
Timing of the transition. When is the transition best made? Will it be done in one step or in multiple steps?
Communication to relevant stakeholders. What? To whom?

As situations will considerably differ, tailored approaches are to be developed locally.

If the choice is made for a transition in multiple steps, the local counting level will be gradually adapted to a counting level that fully coincides with the stated reference values of CRM ERM^®-BD001. It is expected that, when applicable, a transition in two or three steps will suffice.

A pragmatic procedure for making these steps is to measure the reconstituted CRM ERM^®-BD001 materials several times under the current calibration settings, under closely controlled and properly safeguarded conditions, and, after removal of possible outlying results, to calculate the mean value for both samples ERM^®-BD001a and ERM^®-BD001b, respectively x̄_a,local and x̄_b,local. From these values and the stated reference values with CRM ERM^®‑BD001, x_a,CRM and x_b,CRM, temporary local reference values, x_a,CRM intermediate and x_b,CRM intermediate, can be calculated and assigned to the two reconstituted reference materials.

Example: In case of a two-step transition, intermediate local reference values can be set at:

(24)

Such materials with intermediate locally assigned reference values can during a transition phase be used according to referenced international standards and the guidance in this document, thereby noting the intermediate status of these assigned reference values and avoiding any suggestion that resulting counting levels in routine will be traceable to CRM ERM^®-BD001.

Acknowledgments

Harrie van den Bijgaart, Qlip B.V., The Netherlands, bijgaart@qlip.nl
Silvia Orlandini, International Committee on Animal Recording, The Netherlands, silvia@icar.org
Werner Luginbühl, ChemStat, Switzerland, info@chemstat.ch

[1] IDF. (2008) Towards A Reference System For Somatic Cell Counting In Milk. Bull IDF 427.
[2] American Public Health Association. (2012) Standard methods for the examination of dairy products, 17th ed. APHA Press.
[3] ISO 13366-2|IDF 148-2. (2006) Milk – Enumeration of somatic cells – Part 2: Guidance on the operation of fluoro-opto-electronic counters.
[4] EC Joint Research Centre. (2020) Certification report. The certification of the concentration of somatic cells (somatic cell count, SCC) in cow's milk: ERM^®-BD001. Available through: https://crm.jrc.ec.europa.eu/p/ERM-BD001
[5] ISO 13366-1|IDF 148-1. (2008) Milk – Enumeration of somatic cells – Part 1: Microscopic method (Reference method).
[6] ISO 8196-2|IDF 128-2. (2009) Milk – Definition and evaluation of the overall accuracy of alternative methods of milk analysis – Part 2: Calibration and quality control in the dairy laboratory.
[7] Kuselman, I., Weisman, A. & Wegscheider, W. (2002) Traceable property values of in-house reference materials. Accred Qual Assur 7 p122–124.

