NIRS Analysis Has Long and Credible History

By Bill Mahanna

I had the pleasure of attending a presentation by John Shenk, one of the early pioneers in the application of near-infrared (NIR) technology to the analysis of forage and grains.

Shenk spoke at the Feb. 13-14 joint conference of the NIRS Forage & Feed Consortium, FeedAC and the National Forage Testing Assn. (NFTA; Feedstuffs, April 14).

Listening to presentations by Shenk and Paolo Berzaghi (NIRSC technical consultant) about NIR calibration development, and watching their interaction with the laboratory managers and technicians in attendance, made me wish more nutritionists had been in the audience so they could walk away with a sense of confidence in the NIR values being generated by commercial and university laboratories.

What is often forgotten about NIR-predicted values is that NIR is a secondary method based on a regression against a primary (or reference) method. Consequently, the NIR value can never be more accurate than the primary reference method. We all need to remember the limitations and laboratory errors associated with methods such as neutral detergent fiber (NDF) so that we are not unfairly blaming NIR spectroscopy (NIRS) prediction models for errors associated with the original reference chemistry.

NIRS is not new
NIRS has been discussed in literature since 1939, but it was not until 1968 that Karl Norris and co-workers with the Instrumentation Research Lab of the U.S. Department of Agriculture first applied the technology to agricultural products.

They observed that cereal grains exhibited specific absorption bands in the NIR region and suggested that NIR instruments could be used to measure grain protein, oil and moisture. Research in 1976 demonstrated that absorption of other specific wavelengths was correlated with chemical analysis of forages (Shenk, 2001).

Shenk and his research team utilized a custom-designed spectro-computer system in 1977 to provide a rapid and accurate analysis of forage quality. Early in 1978, their group developed a portable instrument for use in a mobile van to deliver nutrient analysis of forages directly on farm and at hay auctions. This evolved into the use of university extension mobile NIR vans in Pennsylvania, Minnesota, Wisconsin and Illinois.

In 1978, the USDA NIRS Forage Network was founded to develop and test computer software to advance the science of NIRS grain and forage testing. By 1983, several commercial companies had begun marketing NIR instruments and software packages for forage and feed analysis (Shenk, 2001).

How NIRS works
NIRS is based on the interaction of physical matter with light in the near-infrared spectral region (700-2,500 nm). Sample preparation and presentation to the NIR instrument vary widely. Though dried, finely ground samples are often employed, whole grains or fresh, unground samples also can be scanned. Instruments can be stationary in a laboratory or mobile (e.g., on board a silage chopper).

Monochromatic light produced by an NIR instrument interacts with plant material in a number of ways, including reflection, refraction, absorption and diffraction. Vibrations of hydrogen bonded with carbon, nitrogen or oxygen cause molecular "excitement" responsible for absorption of specific amounts of radiation at specific wavelengths. This allows labs to relate specific chemical bond vibrations (generating specific spectra) to the concentration of a specific feed component (e.g., starch).

Spectroscopy is possible because molecules react the same way each time they are exposed to the same radiation.

NIR instruments are much less sensitive in quantifying individual inorganic elements (e.g., calcium, phosphorus or magnesium) or mixtures (e.g., ash) because they are measuring the influence of these "contaminating materials" on the covalent bonds.

Building a 'calibration'
The individual laboratory or consortium that develops a prediction model uses software packages to perform the mathematical calculations necessary to associate the NIR spectra of reference samples with the reference chemistry of those reference samples. This mathematical process is called "chemometrics." The mathematical equations developed are termed "prediction models," although they are also called "calibrations."
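
To make the idea of a "prediction model" concrete, here is a deliberately minimal sketch in Python. It assumes a single hypothetical wavelength and fits a straight line by ordinary least squares; real chemometrics software works with full spectra and multivariate methods, so this is an illustration of the principle, not of any laboratory's procedure. The absorbance readings and crude protein values are invented for the example, and the function names are hypothetical.

```python
# Illustrative only: invented absorbance readings at one hypothetical wavelength
# and invented reference (wet chemistry) crude protein values, % of dry matter.
absorbance = [0.42, 0.47, 0.51, 0.56, 0.60, 0.66]
reference_cp = [7.1, 7.9, 8.4, 9.2, 9.8, 10.7]


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x_new):
    """Apply the fitted line (the 'calibration') to a new absorbance reading."""
    intercept, slope = model
    return intercept + slope * x_new


# "Calibration": relate the spectra of reference samples to their reference chemistry.
model = fit_line(absorbance, reference_cp)

# "Prediction": scan an unknown sample and apply the stored equation.
print(f"predicted crude protein: {predict(model, 0.54):.1f}% of dry matter")
```

The point of the sketch is the division of labor: the reference chemistry is needed only to build the equation, after which a scan alone yields a predicted value.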

The robustness of an NIR prediction model is, in part, determined by the size and representative nature of the calibration population samples that will be analyzed by reference methods. The sample population should represent the full diversity of plant materials to be scanned.

For instance, if the goal is to develop a prediction model for crude protein in corn grain, then samples of corn from diverse genetic and environmental backgrounds need to be included in the population to be analyzed by the chosen reference method in a lab with high NFTA performance statistics. When a particular analytical methodology may not exist (e.g., for prediction of ethanol yield from corn fermentation), laboratories may develop an entirely new reference method.

Numerous samples should be scanned by NIR and assayed by wet chemistry procedures to obtain good calibration statistics. A "proof-of-concept" model will utilize 50-60 samples; fully developed prediction models require no fewer than 80-100 samples, and the number can run into the thousands depending upon the error terms associated with each analyte. The final number of samples required is dependent upon the analytical and spectral diversity within the reference samples selected for developing the prediction model (Sevenich, 2008).

To reduce total error, it is desirable to have multiple replicates of the sample analyzed by the reference method and scanned multiple times with the specific NIR instrument. If the calibration set is being developed for a dried, ground sample NIR instrument, then drying conditions must be standardized. Spectra production is quite sensitive to differences in sample particle size and shape. As a direct result, consistent sample preparation (e.g., grinding) is critical.

One of the largest sources of error in NIR predictions between labs that use the same calibration is a result of differences in how the labs prepare the sample (e.g., different type or worn grinders; Berzaghi, 2008).

To develop robust NIRS prediction models and valid results, laboratories must:

minimize sources of error in the entire process
obtain values for reference samples using analytical methods that have high precision and accuracy
standardize sample preparation and analytical procedures
standardize the NIRS instrument
perform routine instrument maintenance
analyze only samples representative of the original population and
obtain routine diagnostics of all associated instruments and undergo yearly prediction model (calibration) updates (Sapienza, 2008; Berzaghi, 2008).

Questions for the lab
As users of NIR-predicted values, we should all feel comfortable asking our chosen analytical partners questions about NIR prediction model and wet chemistry statistics. A better understanding of standard errors or confidence intervals for lab values can help instill confidence in how to use these values, in much the same way that other statistics (e.g., P-values) help us determine the confidence we put in research trial summaries.

Here is a list of statistics that reputable NIR laboratories will be able to discuss (Ruser, 2007; Allen, 2008; Sevenich, 2008; Owens, 2008):

Number of samples in the calibration set (N) is influenced by the natural variation in the trait of interest. The narrower the range, the more difficult it is to detect differences. Typically, 80-100 samples are required for developing an initial calibration, with up to several hundred samples in a "mature" calibration.
Standard error of calibration (SEC) defines how well the NIRS prediction model predicts the reference values of the calibration sample set that was used to build the model. Low SEC values are desired. For example, if the reference value is 30 and the SEC is three, about 68% (roughly two-thirds) of the NIR-predicted values should fall within the range of 27-33.
Standard error of prediction (SEP) defines how well the NIRS prediction model predicts values for an independent (validation) sample set. Low SEP values are desired.
Standard error of cross validation (SECV) is an index of how well the prediction model predicts the reference values in the calibration set when samples are selectively removed from the calibration process. Low SECV values are desired. SECV should closely mirror SEC. Values that differ significantly indicate that the prediction model is weak.
Relative prediction deviation (RPD) is the standard deviation (SD) of the entire population divided by SEC. This is sometimes referred to as "relative percent difference" in the older literature. High RPD values are desired. For example, if SD of the calibration population is nine and SEC is three, then RPD = nine divided by three = three. RPD values of two to three allow for adequate screening. Values between three and five allow for improved separation. Values exceeding five indicate that the prediction model is almost perfect.
Coefficient of determination (R2 or RSQ, often loosely called the regression coefficient) indicates how closely the points follow the best fit line when predicted values are plotted against the associated reference values. High R2 values are desired. An R2 of 1.0 means 100% of the analyte variance is explained by the prediction equation.
SD (or standard error) of the reference method is determined from replicate analyses of reference samples. Low SD values are desired. The error of the reference method depends upon the chemical method being employed.
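
For readers who like to see the arithmetic, the statistics in the list above can be sketched in a few lines of Python. This is a simplified illustration: the error statistic is computed here as a plain root mean square difference, whereas commercial software may adjust for degrees of freedom or bias, and R2 is taken as the squared correlation between predicted and reference values. The function names are hypothetical. The last lines reproduce the two worked examples from the list (an SD of nine with an SEC of three, and a reference value of 30 with an SEC of three).

```python
import math


def standard_error(reference, predicted):
    """Root mean square difference between predicted and reference values.

    Applied to the calibration set this approximates SEC; applied to an
    independent validation set it approximates SEP. Simplified: commercial
    software may adjust for degrees of freedom or bias.
    """
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def rpd(population_sd, sec):
    """RPD as described in the text: population SD divided by SEC."""
    return population_sd / sec


def r_squared(reference, predicted):
    """Squared Pearson correlation between predicted and reference values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    sxy = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    sxx = sum((r - mean_r) ** 2 for r in reference)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


def one_se_range(value, se):
    """Range of plus or minus one standard error around a value."""
    return value - se, value + se


# Worked examples taken from the list above.
print("RPD:", rpd(9, 3))
print("Range within one SEC of 30:", one_se_range(30, 3))
```

When comparing laboratories, it is worth asking which variant of each statistic is being reported, because the adjustments mentioned above can shift the numbers slightly.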

To characterize reference methods, specific categories (loose, moderate and tight) can be used. Digestibility, with an SD of about two units, is an example of a loose fit. NDF as a percentage of dry matter with an SD of 1.0-1.5 is an example of a moderate fit. Crude protein as a percentage of dry matter with an SD of 0.3-0.5 is an example of a tight fit.

When the reference method is imprecise, the precision of predicting composition of unknown samples also will be imprecise. This also will be reflected as greater NIR SEP and lower R2 values (Sapienza, 2008).
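
A small simulation can illustrate this point. The Python sketch below is built entirely on assumptions: the "true" composition, the size of the NIR error and the two reference-method SDs (0.3 for a tight method, 2.0 for a loose one) are invented, and the NIR prediction is simulated as the true value plus a small fixed error rather than being calibrated against the noisy reference. Even with the NIR error held constant, the apparent SEP rises and R2 falls when the reference chemistry is less precise.

```python
import math
import random


def sep(reference, predicted):
    """Root mean square difference between predicted and reference values."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def r_squared(reference, predicted):
    """Squared Pearson correlation between predicted and reference values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    sxy = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    sxx = sum((r - mean_r) ** 2 for r in reference)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


def simulate(reference_sd, n=500, seed=1):
    """Return (SEP, R2) for a simulated validation set.

    All numbers are invented for illustration: true values centered on 40
    with an SD of 4, and an NIR error with an SD of 0.5.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_values = [rng.gauss(40.0, 4.0) for _ in range(n)]
    predicted = [t + rng.gauss(0.0, 0.5) for t in true_values]
    reference = [t + rng.gauss(0.0, reference_sd) for t in true_values]
    return sep(reference, predicted), r_squared(reference, predicted)


tight = simulate(reference_sd=0.3)
loose = simulate(reference_sd=2.0)
print(f"tight reference method: SEP = {tight[0]:.2f}, R2 = {tight[1]:.2f}")
print(f"loose reference method: SEP = {loose[0]:.2f}, R2 = {loose[1]:.2f}")
```

In this simulation the NIR predictions are identical in both runs; only the reference values they are compared against have changed.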

The Table illustrates a sliding scale of how "robustness" or "goodness of fit" of an NIRS prediction model varies with SD of the reference method. The categories that describe the goodness of fit of the prediction models are favorable, moderately favorable and unfavorable (Sapienza, 2008).

Table - Goodness of fit - Reference method versus NIRS prediction model

The size of SEP generally varies directly with SD of the reference method. A reference method must have a low SD if NIR is expected to provide useful information or be a stand-alone analytical method. An example of a favorable prediction model would be predicting crude protein as a percent of dry matter with an SEP of 0.3-0.5 and an R2 of 0.95.

In contrast, an example of an unfavorable prediction model might be NDF as a percentage of dry matter with an SEP of two to three and an R2 of 0.80 (Sapienza, 2008).

The Bottom Line
NIR analysis as an analytical technique has a long and credible history. NIR is a secondary method that never can be more accurate than the reference method upon which it is based. Statistically robust prediction models allow for a rapid and repeatable assay procedure for nutritional values that helps the livestock industry detect and manage variability in composition among and within feedstuffs.

The cost effectiveness of NIR analysis allows the total analytical error (sampling and laboratory) to be reduced because a larger number of subsamples or sequential samples can be assayed with a limited analytical budget than is possible using the more expensive wet chemistry approaches.
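
The budget argument can be sketched with some simple arithmetic in Python. Every number below is hypothetical (the budget, the cost per assay, the sampling SD and the laboratory SD for each method), and the model assumes that sampling and laboratory errors are independent and that the results of all assayed subsamples are averaged. Under those assumptions, the standard error of the average shrinks with the square root of the number of subsamples, so a cheaper assay with a somewhat larger laboratory error can still give a tighter estimate.

```python
import math


def samples_affordable(budget, cost_per_assay):
    """Whole number of assays that fit within the budget."""
    return int(budget // cost_per_assay)


def standard_error_of_mean(sampling_sd, lab_sd, n):
    """Standard error of the average of n subsamples, assuming independent
    sampling and laboratory errors for each subsample."""
    return math.sqrt((sampling_sd ** 2 + lab_sd ** 2) / n)


# Hypothetical figures, chosen only to illustrate the trade-off.
BUDGET = 120.0      # total analytical budget
SAMPLING_SD = 2.0   # variation among subsamples of the same feed
WET_COST, WET_LAB_SD = 40.0, 1.0   # wet chemistry: costly but precise
NIR_COST, NIR_LAB_SD = 12.0, 1.5   # NIR: cheaper, slightly less precise

wet_n = samples_affordable(BUDGET, WET_COST)
nir_n = samples_affordable(BUDGET, NIR_COST)
wet_se = standard_error_of_mean(SAMPLING_SD, WET_LAB_SD, wet_n)
nir_se = standard_error_of_mean(SAMPLING_SD, NIR_LAB_SD, nir_n)

print(f"wet chemistry: {wet_n} subsamples, standard error {wet_se:.2f}")
print(f"NIR:           {nir_n} subsamples, standard error {nir_se:.2f}")
```

With different costs or error terms the comparison could come out differently, which is one more reason to ask the lab for its actual statistics.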

To enhance trust, nutritionists, producers and laboratories are encouraged to communicate more fully and openly so that NIR prediction model and wet chemistry statistics are understood more clearly.

References

Allen, R. 2008. Pioneer NIR spectroscopist. Personal communications.
Berzaghi, P. 2008. NIRSC technical consultant. University of Padova, Italy. Personal communications.
Owens, F. 2008. Pioneer research scientist. Personal communications.
Ruser, B. 2007. Pioneer research spectroscopist. Personal communications.
Sapienza, D. 2008. Sapienza Analytica LLC. Personal communications.
Sevenich, D. 2008. Pioneer NIR spectroscopist. Personal communications.
Shenk, J.S. 2001. Chapter 16: Application of NIR Spectroscopy to Agricultural Products. In: D.A. Burns and R.W. Ciurczak. Handbook of Near-Infrared Analysis. CRC Press.

This article was originally published in the June 2008 issue of Feedstuffs and is reproduced with their permission.
