Data Science Spotlight: High Throughput Quantification of Bean Nutrition

This week's Data Science Spotlight highlights Darren Drewry, one of TDAI's core faculty members, and his work studying the makeup of beans used in many plant-based protein products.
- What big challenges does this research aim to solve?
Legumes are the major source of protein in the developing world, where animal-based protein sources are limited, and they are central to the emerging plant-based food industry. Phaseolus beans, also known as “common beans” or “dry beans”, play an important nutritional and economic role in the United States and throughout the world, largely because of their high protein content, their amino acid profiles that complement those of cereals, and their value as a significant source of fiber, vitamins and other micronutrients. This project has focused on developing a proof-of-concept demonstration of high-throughput phenotyping methods for quantifying the nutritional resources held in genebanks. Genebanks hold valuable stores of genetic variability in key crop species, but knowledge of the traits that matter to crop breeders developing the next generations of crop varieties is often limited.
- What is your technological approach?
This project focuses on demonstrating spectroscopic approaches to illuminate the nutritional variability contained within the common/dry bean (Phaseolus) genebank maintained by the USDA-ARS in Pullman, Washington. We have developed a hyperspectral reflectance library of more than 1,000 common bean accessions across the main bean species used by the commercial production and breeding communities. We are now evaluating the spectral and nutritional variation within our sample set and developing machine learning models to relate spectral variability to nutritional content.
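To make the idea of relating spectral variability to nutritional content concrete, here is a minimal sketch in Python. It is not the project's method: the interview does not say which models are being developed, so this uses the simplest possible stand-in, an ordinary least-squares line relating reflectance at a single band to a single trait. The function names, the choice of one band, and every number are hypothetical and synthetic.

```python
# Illustrative sketch only. All data below are synthetic placeholders,
# not measurements from the bean spectral library.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical reflectance (0 to 1) at one assumed band for five samples,
# paired with made-up protein contents (percent of dry weight).
reflectance = [0.30, 0.33, 0.35, 0.38, 0.41]
protein = [26.1, 24.8, 23.9, 22.5, 21.4]

slope, intercept = fit_line(reflectance, protein)
fit_quality = r_squared(reflectance, protein, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={fit_quality:.3f}")
```

A single-band fit like this only illustrates the shape of the problem. A spectrum spanning 350 to 2500 nm has far more bands than a line can use, so a model over the whole spectrum would need a multivariate method and held-out samples for validation.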
- How has this research advanced what has already been done before?
Our work uses the full visible through shortwave infrared (VSWIR; 350–2500 nm) spectrum to evaluate the spectral reflectance variability of Phaseolus beans, spanning multiple species and the genetic diversity contained in the USDA-ARS genebank.
- What do you plan on doing with this new technology?
We hope to expand our current dataset to develop a complete spectral library of the approximately 18,000 common bean accessions held in the USDA-ARS genebank. This will allow us to develop an understanding of the spectral and nutritional variability contained within this repository. Beyond this current work, we hope to expand our spectral library development to other high-throughput techniques that can further our understanding of common bean nutrition.
- How can this research be applied in everyday practices?
These datasets will help bean breeders and researchers identify common bean lines to use in their efforts to study and develop more nutritious bean varieties.