Brief description: 

Big data-driven genomic prediction and selection (GS) algorithms are designed to overcome the main limitations of current breeding approaches, which are costly and time-consuming. The intention is to offer a value proposition to plant breeding entities locally and globally. The working principle is to bring about both incremental and disruptive change in the primary industry through advanced, smart crop improvement technologies. We leverage advances in data science and genomics to dramatically accelerate the development of superior cultivars, cutting costs in a way farmers and breeders have not been able to do in the past. The core of GS is yearly incremental improvement: technologies are implemented so that crops perform better than they did in previous breeding cycles. GS is thus well placed to provide farmers with superior, consumer-preferred cultivars year after year, allowing them to increase profits and stay in business for as long as possible.

Main Features: 

The big data-driven genomic selection and prediction analytics have been tested and validated in several crops, including sorghum ecosystems, using high-density genetic markers and big data (IoT)-aided crop management to predict the performance of the crops of interest. The predicted performance is then used in place of direct phenotyping to evaluate and select among individuals. As a result, selection is no longer constrained by the time required to develop a cultivar through phenotypic evaluation. Big data-driven genomic selection and prediction technology is expected to significantly increase genetic gain (productivity) per unit of time and cost. Key performance indicators for this purpose include reducing the time and the cost of cultivar development by factors of four and five, respectively, relative to conventional practices.
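The idea of selecting on predicted rather than measured performance can be illustrated with a minimal sketch. It assumes a ridge-regression marker model (one common genomic prediction approach; the text above does not say which models were used), simulated genotypes and phenotypes, and a hypothetical shrinkage value `lam`. All function names are illustrative only.

```python
import numpy as np

def fit_marker_effects(X, y, lam):
    """Ridge estimates of marker effects from a phenotyped training set."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                                   # centre marker codes
    p = Xc.shape[1]
    # Solve (Xc'Xc + lam*I) beta = Xc'(y - mean) for the marker effects.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - y_mean))
    return x_mean, y_mean, beta

def predict_performance(X_new, x_mean, y_mean, beta):
    """Predict performance of unphenotyped individuals from markers alone."""
    return y_mean + (X_new - x_mean) @ beta

def select_top(predicted, k):
    """Indices of the k candidates with the highest predicted performance."""
    return np.argsort(predicted)[::-1][:k]

def simulate(n_lines=500, n_markers=200, h2=0.6, seed=0):
    """Simulated data (an assumption): markers coded 0/1/2, additive effects."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)
    g = X @ rng.normal(0.0, 1.0, n_markers)           # true genetic values
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)     # noise sized to reach h2
    y = g + rng.normal(0.0, noise_sd, n_lines)
    return X, y

X, y = simulate()
train, candidates = slice(0, 400), slice(400, 500)

# Train on phenotyped lines, then rank candidates using markers only.
x_mean, y_mean, beta = fit_marker_effects(X[train], y[train], lam=100.0)
predicted = predict_performance(X[candidates], x_mean, y_mean, beta)
selected = select_top(predicted, k=10)

# Predictive ability: correlation between predicted and observed performance.
accuracy = np.corrcoef(predicted, y[candidates])[0, 1]
print(f"predictive ability: {accuracy:.2f}")
```

In this sketch the candidates' phenotypes are used only to check predictive ability; the selection step itself needs marker data alone, which is what removes the phenotyping time from the breeding cycle.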

Areas of Application: 


Market Trends and Opportunities: 

There is a need for: (1) cost-effective, high-resolution solutions that expedite breeding activities by simplifying the breeding scheme, shortening the time to cultivar development, and selecting for genetic merit estimated through genomic modelling, in order to sustainably improve productivity and profits; (2) farmer-customized GS for a single trait of interest, or for several traits of interest to the farmer aggregated in an index; (3) closing the gap between agricultural business planning and the responsible, sustainable maximization of profit, which derives mainly from increased crop productivity and resource-use efficiency, resilience to climate change, and reduced uncertainty in management decisions, while accounting for environmental standards and regulations.
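Aggregating several traits into an index, as in item (2), can be sketched as a weighted sum of standardized predicted trait values. This is only one simple way to build such an index; the function name, the example traits, and the weights are hypothetical, and the sketch assumes every trait varies across candidates.

```python
import numpy as np

def selection_index(trait_predictions, weights):
    """Weighted sum of standardized trait predictions, one value per candidate.

    trait_predictions: array of shape (n_candidates, n_traits)
    weights: farmer-defined importance of each trait (negative to select
             against a trait); assumed values, not taken from the source.
    """
    P = np.asarray(trait_predictions, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Standardize each trait so that units do not dominate the index.
    Z = (P - P.mean(axis=0)) / P.std(axis=0)
    return Z @ w

# Hypothetical predictions for three candidates: yield (t/ha), days to maturity.
predictions = np.array([[4.0, 120.0],
                        [5.0, 110.0],
                        [6.0, 130.0]])

# Favour yield, penalize late maturity.
index = selection_index(predictions, weights=[1.0, -0.5])
best = int(np.argmax(index))
```

Candidates are then ranked on `index` instead of on any single trait, so the weights are where a farmer's priorities enter the selection decision.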

The DataBio Genomics Prediction Models (GS) have been implemented as a solution to overcome the technological limitations of current breeding approaches. From a historical perspective, phenotypic selection (PS) and marker-assisted selection (MAS) are the main approaches on which world agriculture currently relies. Although PS enabled the early Green Revolution in the mid-twentieth century, it is now recognized that its contribution has reached a plateau. On the other hand, the thousands of marker-trait associations uncovered through MAS have not been routinely exploited, mainly because of intrinsic limitations of that technology. It is in this context that the solution was designed. GS is a new paradigm in agriculture and has shown superior results relative to the other approaches implemented thus far. Different assumptions about the distribution of marker effects are accommodated in order to account for different models of genetic variation, including but not limited to: (1) the infinitesimal model, (2) the finite loci model, and (3) algorithms extending Fisher's infinitesimal model of genetic variation to account for non-additive genetic effects.
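The difference between these assumptions can be written down with a generic whole-genome regression. The notation below is a common textbook formulation, not one given in the text above: $y_i$ is the phenotype of individual $i$, $x_{ij}$ its genotype code at marker $j$, $\beta_j$ the marker effect, and $\pi$ an assumed proportion of markers with no effect.

```latex
% Generic marker model
y_i = \mu + \sum_{j=1}^{p} x_{ij}\,\beta_j + e_i,
\qquad e_i \sim N(0, \sigma_e^2)

% (1) Infinitesimal model: every marker has a small effect
\beta_j \sim N(0, \sigma_\beta^2) \quad \text{for all } j

% (2) Finite loci model (one possible prior): only some markers have an effect
\beta_j \sim
\begin{cases}
0 & \text{with probability } \pi \\
N(0, \sigma_\beta^2) & \text{with probability } 1 - \pi
\end{cases}

% (3) Non-additive extension (dominance shown as one example):
% w_{ij} = 1 if individual i is heterozygous at marker j, else 0
y_i = \mu + \sum_{j=1}^{p} x_{ij}\,\beta_j + \sum_{j=1}^{p} w_{ij}\,d_j + e_i
```

Under this reading, the choice among (1) to (3) amounts to a choice of prior on the marker effects and of which effect terms enter the model.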

Customer Benefits: 

The potential benefits of the GS technology contribute directly to three of the 17 UN Sustainable Development Goals (SDGs) in particular. SDG 2 (zero hunger): GS will sustainably and responsibly increase yields to produce healthy foods and other products for human consumption, targeting specialized and niche markets to improve incomes and foster community resilience and food security. SDG 12 (responsible consumption and production): GS will develop new breeding technologies and plant ideotypes that make the development of climate change-resilient cultivars four times faster and five times cheaper; these cultivars require low inputs and hence impose a reduced environmental load. SDG 13 (climate action): GS implements genomic, phenomic, and next-generation technologies in crop physiology to develop modern cultivars that require fewer inputs and hence both adapt to and mitigate climate change. GS intends to develop novel climate-resilient ideotypes and breeding technologies to sustain agricultural productivity and value chains in Europe and in the world.

The genomic data and relevant analytics will be directly useful to farmers and farming cooperatives, owing not only to increased yields but also to reduced farming costs and risks and to earlier variety release. Higher yields are expected because of the high predictive ability of the genomic algorithms deployed to select superior and resilient genotypes. Breeding costs will be drastically cut through simplified breeding schemes, fewer phenotypic evaluations, a shorter breeding cycle, and the use of fewer agricultural inputs. Early variety release is an intrinsic property of GS, particularly because intercrosses are driven by genetic merit. Farmers can therefore grow a better variety sooner thanks to rapid variety development and release, and so earn more income. The implementation of accurate models under appropriate breeding scenarios, together with the use of high-quality next-generation sequencing (NGS) machines and an HPC system, is expected to ensure that the proposed solutions are dependable and replicable in other crop types and market segments.

Technological novelty: 

The main problem modelled is the performance of new, unphenotyped crop populations, predicted from whole-genome molecular information by integrating quantitative and population genetics, and driven by big data streaming from large-scale, high-throughput genomics and phenomics platforms. This technology is expected to significantly improve genetic gain per unit of time and cost, allowing farmers to grow a better variety sooner than with conventional approaches, minimize production risks, earn more income, and stay in business longer.



Ephrem Habyarimana
BDV Reference categories: 
Data Analytics
Agriculture, forestry and fishing
Readiness Level: