Abstract

   Sorghum, a genetically diverse C[4] cereal, is an ideal model to study
   natural variation in photosynthetic capacity. Specific leaf nitrogen
   (SLN) and leaf mass per leaf area (LMA), as well as maximal rates of
   Rubisco carboxylation (V[cmax]), phosphoenolpyruvate (PEP)
   carboxylation (V[pmax]), and electron transport (J[max]), quantified
   using a C[4] photosynthesis model, were evaluated in two field-grown
   training sets (n = 169 plots including 124 genotypes) in 2019 and 2020.
   Partial least squares regression (PLSR) was used to predict V[cmax] (R^2
   = 0.83), V[pmax] (R^2 = 0.93), J[max] (R^2 = 0.76), SLN (R^2 = 0.82),
   and LMA (R^2 = 0.68) from tractor-based hyperspectral sensing. Further
   assessments of the capability of the PLSR models for V[cmax], V[pmax],
   J[max], SLN, and LMA were conducted by extrapolating these models to
   two trials of genome-wide association studies adjacent to the training
   sets in 2019 (n = 875 plots including 650 genotypes) and 2020 (n = 912
   plots with 634 genotypes). The predicted traits showed medium to high
   heritability and genome-wide association studies using the predicted
   values identified four QTL for V[cmax] and two QTL for J[max].
   Candidate genes within 200 kb of the V[cmax] QTL were involved in
   nitrogen storage, which is closely associated with Rubisco, although not
   directly associated with Rubisco activity per se. The J[max] QTL were
   enriched for candidate genes involved in electron transport. These
   outcomes suggest that the methods presented here hold great promise for
   effectively screening large germplasm collections for enhanced
   photosynthetic capacity.

1. Introduction

   Sorghum (Sorghum bicolor L. Moench), a C[4] pathway species and the
   world's fifth most produced cereal [[43]1], is adapted to a range of
   environments and retains high photosynthetic efficiency in diverse
   conditions [[44]2–[45]4]. These characteristics make it a crop of
   interest for the dual challenge of meeting increasing demands for food
   and adapting to the effects of climate change [[46]5, [47]6]. In
   addition to the C[4] pathway, which confers adaptation to hot and dry
   environments, the natural genetic diversity of sorghum provides
   potential to identify genotypes or genetic loci associated with greater
   photosynthetic capacity [[48]7]. However, in order to select the
   photosynthetically favourable genotypes adapted to contrasting
   environments, tools are required to quantify the biochemical parameters
   underpinning photosynthetic capacity in a high-throughput manner,
   removing the phenotyping bottleneck imposed by the traditional gas exchange
   approach.

   Photosynthesis is the process of converting captured solar radiation
   into chemical energy by fixing carbon dioxide (CO[2]) to form
   carbohydrates and biomass. Improving photosynthetic capacity is seen as
   a major target to further improve crop yields [[49]2, [50]3, [51]8].
   Screening germplasm to directly breed for improved photosynthetic
   responses to environmental conditions is constrained by the complexity of
   measuring such responses and requires development of higher-throughput
   indirect phenotyping techniques.

   In the C[4] photosynthetic pathway, the biochemical processes in the
   mesophyll cells are coordinated with a CO[2] concentrating mechanism in
   the bundle-sheath cells [[52]9, [53]10]. In the mesophyll, CO[2] is
   initially fixed by phosphoenolpyruvate (PEP) carboxylase into C[4]
   acids, which are then decarboxylated in the bundle sheath cells leading
   to high CO[2] levels and hence more efficient carboxylation of
   Ribulose-1,5-bisphosphate (RuBP) by Ribulose 1,5-bisphosphate
   carboxylase-oxygenase (Rubisco) [[54]11, [55]12]. The energy for the
   regeneration of RuBP in the bundle sheath and PEP in the mesophyll
   comes from chloroplast electron transport [[56]11]. Due to their key
   roles in the photosynthetic pathway, the maximal rates of Rubisco
   carboxylation (V[cmax], μmol m^−2s^−1), PEP carboxylation (V[pmax],
   μmol m^−2s^−1), and maximal electron transport rate (J[max], μmol
   m^−2s^−1) largely determine photosynthetic capacity of C[4] plants and
   therefore underpin crop productivity. Simulations using a diurnal
   canopy photosynthesis model predict that canopy growth rate of C[4]
   cereals responds largely to changes in J[max] [[57]13]. Quantification
   of these biochemical parameters is hence of value for selecting for
   enhanced photosynthesis and growth. This is traditionally achieved by
   conducting gas exchange measurements and fitting observed
   photosynthetic responses to CO[2] or light with the Rubisco-activity or
   electron-transport limited equations in the C[4] photosynthesis model
   [[58]11, [59]14]. However, this method is very time-consuming and not
   suitable for high-throughput screening of large germplasm collections.

   The capacity of leaves to convert absorbed CO[2] and radiation into
   biomass also depends on key leaf physiological and structural
   properties [[60]15]. Two such properties are specific leaf nitrogen
   (SLN, g m^−2) and leaf mass per leaf area (LMA, g m^−2), and both of
   these are known to be closely associated with photosynthetic capacity
   [[61]16, [62]17]. Because nitrogen is a key element in photosynthetic
   machinery, such as chloroplasts, plant nitrogen status closely links
   with leaf photosynthetic rates and canopy radiation use efficiency
   [[63]18–[64]20] and is hence an important parameter in canopy
   performance modelling [[65]13, [66]21]. The relationship between leaf
   nitrogen content and maximal net photosynthesis rate is influenced by
   LMA, which is strongly associated with leaf lifespan and thus affects
   the rates of the photosynthetic parameters [[67]15, [68]16, [69]22].
   However, conventional measurements of SLN and LMA are destructive and
   slow, limiting their potential to identify germplasm with higher
   photosynthetic capacity in large breeding programs.

   High-throughput plant phenotyping technologies enable the collection of
   plant biochemical and physiological traits rapidly and nondestructively
   at large scale [[70]23–[71]26]. Various vegetation indices, which are
   usually calculated using a few selected wavelengths, have been
   correlated with plant structural traits (e.g., leaf area index and
   biomass) or leaf pigment concentration (e.g., chlorophyll). Typical
   canopy size indicators include normalized difference vegetation index
   (NDVI) [[72]27, [73]28] and optimized soil adjusted vegetation index
   (OSAVI) [[74]29]. Chlorophyll content, on the other hand, has been
   indicated by indices, such as normalized difference red edge (NDRE)
   [[75]30] and chlorophyll vegetation index (CVI), which is an indirect
   measure of nitrogen content [[76]31]. Adjustments to these vegetation
   indices have also been reported. For example, replacing red bands with
   red edge when calculating some indices exhibited better performance in
   estimating chlorophyll content [[77]32].
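
   As a rough illustration of how such indices are formed from a few
   bands, the sketch below computes NDVI, NDRE, and one common form of
   OSAVI from red, red-edge, and near-infrared reflectance. The function
   names and the example reflectance values are assumptions made for
   illustration only; the band centres chosen for each index differ
   between sensors and studies, and 0.16 is the soil-adjustment constant
   commonly used for OSAVI.

```python
def normalized_difference(a, b):
    """Generic normalized-difference form: (a - b) / (a + b)."""
    return (a - b) / (a + b)


def ndvi(nir, red):
    """Normalized difference vegetation index (canopy size indicator)."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized difference red edge (chlorophyll indicator)."""
    return normalized_difference(nir, red_edge)


def osavi(nir, red, soil_factor=0.16):
    """Optimized soil adjusted vegetation index, without the 1.16 scaling
    that some formulations apply."""
    return (nir - red) / (nir + red + soil_factor)


# Hypothetical reflectance values for a green canopy (not from the study).
example = {"red": 0.05, "red_edge": 0.20, "nir": 0.45}

indices = {
    "NDVI": ndvi(example["nir"], example["red"]),
    "NDRE": ndre(example["nir"], example["red_edge"]),
    "OSAVI": osavi(example["nir"], example["red"]),
}
```

   Swapping the red band for the red-edge band, as in NDRE, is the kind of
   adjustment referred to above.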

   More recently, hyperspectral imaging sensors with wavelengths in the
   visible (400-700 nm), near infrared (700-1000 nm), and shortwave
   infrared (1000-2500 nm) domain have advanced the development of
   high-resolution spectroscopy techniques. This has led to significant
   increases in the accuracy and the types of physiological properties
   that can be retrieved [[78]26, [79]33]. The linkage between
   photosynthetic capacity and hyperspectral features therefore
   constitutes a promising avenue to predict photosynthetic performance of
   plants across broad scales [[80]20, [81]34–[82]36]. Various studies
   have exploited the plethora of bands (>270) and the much narrower band
   width (<6 nm) available from current hyperspectral sensors to better
   quantify biochemical and physiological properties in crops [[83]35,
   [84]37]. However, most of the studies so far use hyperspectral
   reflectance to estimate leaf photosynthetic capacity in C[3] crops
   [[85]34, [86]35, [87]37–[88]41], and similar studies are much rarer for
   C[4] crops. At least one study has predicted V[cmax], V[pmax], leaf
   nitrogen content, and specific leaf area from full-spectrum reflectance
   (500-2400 nm) using partial least squares regression (PLSR) in the C[4]
   crop maize [[89]42]. However, J[max], which quantifies the
   electron-transport-limited rate of photosynthesis [[90]11] and is also
   important in determining daily biomass growth [[91]13], has not
   previously been targeted.

   A more comprehensive study on quantifying the key parameters of
   photosynthesis V[cmax], V[pmax], and J[max] in a C[4] crop species is
   proposed. In addition, a high-throughput method to predict key
   parameters linked to photosynthetic capacity from canopy-level
   hyperspectral measurements will aid in the selection of genetic
   material with improved photosynthetic capacity at a large scale. To our
   knowledge, there are no previously published attempts to estimate the
   full set of key parameters known to limit C[4] photosynthesis, at
   canopy level, using hyperspectral reflectance. Additionally, next
   generation sequencing techniques have provided a high-throughput and
   cost-efficient tool for detecting genomic regions associated with crop
   traits of interest via genome-wide association studies (GWAS)
   [[92]43–[93]45]. Combining the techniques of hyperspectral sensing and
   GWAS would greatly facilitate the improvement of photosynthetic
   capacity and ultimate crop performance, which to date has rarely been
   explored.

   The main objective of this study was to estimate traits associated with
   photosynthetic capacity from proximal hyperspectral sensing of sorghum
   canopies. Specifically, we aimed to (i) develop algorithms to predict
   photosynthetic parameters (V[cmax], V[pmax], and J[max]), SLN, and LMA
   from proximal hyperspectral canopy reflectance captured with a
   spectrometer attached to a mobile phenotyping platform in two
   field-grown training sets; (ii) extrapolate the algorithms to GWAS
   trials grown adjacent to the training sets using a fully genotyped
   sorghum diversity panel; (iii) evaluate the heritability of the
   predicted traits; and (iv) undertake GWAS to detect genomic loci
   associated with the key photosynthetic parameters and identify
   potential candidate genes to assess the usefulness and robustness of
   the approaches used in this study.

2. Materials and Methods

2.1. GWAS Trials

   Two field experiments were conducted during two consecutive summer
   seasons (2019 and 2020) at Gatton Research Station (GAT), Gatton,
   Queensland, Australia (27°33′S, 152°20′E, 94 m above sea level). GAT1
   and GAT2 were sown on 14 January 2019 and 12 November 2019,
   respectively. Both trials were designed using partial replication with
   spatially randomised genotypes arranged in rows and columns. There were
   875 plots, including 650 genotypes in GAT1, and 912 plots, including
   634 genotypes in GAT2, with 70 genotypes in common between trials
   ([94]Table 1). GAT1 comprised 649 inbred lines from a sorghum diversity
   panel of world-wide collections [[95]43], plus one hybrid. In GAT2, 89%
   of genotypes were hybrids from the Queensland breeding program, and the
   rest were inbred lines from the sorghum diversity panel. Each plot (4.5 m
   length and 3 m
   width) sown to a genotype consisted of four rows. Both trials were
   planted with a GPS precision planter at a population density of 108,000
   plants ha^−1. For both trials, 150 kg of nitrogen per hectare was
   applied preplanting, and plots were irrigated regularly to provide
   nutrient and water nonlimiting conditions. The temperature,
   photosynthetic photon flux (PPF), and relative humidity (RH) from 6 am
   to 6 pm for the duration of each trial are shown in [96]Table 1.

Table 1.

   Top: mean and maximum daily temperatures, mean daily photosynthetic
   photon flux, and relative humidity during the two GWAS trials and two
   training sets in 2019 and 2020; bottom: number of plots and genotypes
   used in each experiment (diagonal cells) and number of genotypes in
   common between trials (off-diagonal cells).

   Year   Temperature (°C)      PPF (μmol s^−1m^−2)   RH (%)
          Mean      Maximum     Mean
   2019   26.84     38.98       743.11                62.86
   2020   29.22     38.52       1000.95               56.1

   Trials  TS1                      TS2                       GAT1                       GAT2
   TS1     80 plots (60 genotypes)  19 genotypes              60 genotypes               36 genotypes
   TS2     -                        108 plots (93 genotypes)  30 genotypes               92 genotypes
   GAT1    -                        -                         875 plots (650 genotypes)  70 genotypes
   GAT2    -                        -                         -                          912 plots (634 genotypes)

   Note: PPF, photosynthetic photon flux; RH, relative humidity. The 2019
   trials comprised the training set TS1 and the GWAS trial GAT1; the 2020
   trials comprised the training set TS2 and the GWAS trial GAT2.

2.2. Training Sets

   Adjacent to each of the GWAS trials, a training set comprising a
   representative sample of the lines in the GWAS trials was used to
   collect ground truth data for association with hyperspectral
   measurements. Completely randomised block designs (row-column) were
   also used in the training sets. The middle two rows (0.63 m row
   spacing) of each four-row plot were used for the ground truth data
   collection while the outside two rows (0.75 m row spacing) were guard
   rows. The training set in 2019 (TS1) consisted of 80 plots comprising
   60 genotypes which were all inbred lines and also included in GAT1. In
   the training set of 2020 (TS2), there were 108 plots with 93 genotypes
   of which 63 (68%) were hybrids. There were 19 genotypes in common
   between TS1 and TS2. Due to differences in germination and vigour of
   the diverse germplasm used, there was substantial variability in final
   plant establishment in both trials. The ground truth measurements were
   only taken from plots which had good establishment, which reduced the
   number of possible observations that could be used to develop the
   models. To maximise the number and the range of observations, the
   ground truth data from TS1 and TS2 were pooled.

2.3. Ground Truth Measurements in the Training Sets

   In both trials, gas exchange measurements were taken under mostly
   cloudless conditions (between 9 am and 12 pm) between 35 and 50 days
   after sowing (DAS), which was during the active vegetative growth
   period for all genotypes and hence before the switch to reproductive
   growth which may introduce physiological and metabolic changes, but
   after full canopy closure. This period is known to be the most critical
   period for grain production in sorghum [[98]46]. In total, 75 CO[2]
   (ACi) and 75 light (Ai) response curves were collected across TS1 (n =
   31 plots comprising 29 inbred lines) and TS2 (n = 44 plots comprising
   30 hybrid and 10 inbred lines) with six inbred lines in common between
   TS1 and TS2. One plant per plot was randomly selected for gas exchange
   measurements. The ACi curves were performed on the last or second last
   fully expanded leaf using a LI-6400 (LI-COR, Inc., Lincoln, Nebraska
   USA) with a 6400-02B Red/Blue LED light source illuminating a leaf
   chamber of 6 cm^2. To measure ACi curves, photosynthetically active
   radiation (PAR) was set at 1800 μmol photons m^−2s^−1, flow rate
   through the chamber at 500 μmol s^−1, and temperature was set to leaf
   temperature measured at the commencement of each curve. Vapour-pressure
   deficit (VPD) was generally held at around 3.0 kPa, by adjusting the
   scrubbing of the incoming air via the desiccant. For each ACi curve,
   the reference CO[2] level was stepped through the sequence 200, 100, 50,
   250, 400, 650, 800, 1000, 1200, and 1400 ppm, with a duration of 1-5
   min for each step. Measurements were made at each CO[2] supply point
   when gas exchange had equilibrated, at which point, the coefficient of
   variation for the CO[2] concentration differential between the sample
   and reference analysers was below 1%. The light levels for the Ai
   curves were set at 2000, 1500, 1000, 500, 250, 120, 60, 30, 15, and 0
   μmol m^−2s^−1. The other controls were set as follows: reference CO[2]
   constant at 400 μmol mol^−1, flow at 500 μmol s^−1, temperature at the
   measured leaf temperature, and humidity controlled by scrubbing
   incoming air to maintain a VPD around 3.0 kPa. The duration for every
   light level was 1-3 min. Sample and reference analysers were matched
   before each data point was logged.
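
   The response-curve protocol above can be summarised as a small
   configuration plus a stability check. The sketch below is illustrative
   only: the variable and function names are assumptions, and the
   stability test is a plain reading of the stated criterion (coefficient
   of variation of the sample-minus-reference CO[2] differential below
   1%), not the instrument's own logic.

```python
from statistics import mean, stdev

# Setpoint sequences as listed in the text.
ACI_CO2_STEPS_PPM = [200, 100, 50, 250, 400, 650, 800, 1000, 1200, 1400]
AI_LIGHT_STEPS_UMOL = [2000, 1500, 1000, 500, 250, 120, 60, 30, 15, 0]


def coefficient_of_variation(values):
    """Return the sample coefficient of variation in percent."""
    centre = mean(values)
    if centre == 0:
        return float("inf")  # CV is undefined for a zero mean
    return stdev(values) / abs(centre) * 100.0


def is_stable(delta_co2_readings, max_cv_percent=1.0):
    """True when recent CO2 differential readings vary by less than the limit."""
    return coefficient_of_variation(delta_co2_readings) < max_cv_percent


# Hypothetical recent readings of the CO2 differential (umol mol^-1).
settled = [20.0, 20.1, 19.9, 20.0, 20.05]
drifting = [20.0, 22.0, 18.0, 21.0, 19.0]
ready_to_log = is_stable(settled)        # small scatter around 20
still_waiting = not is_stable(drifting)  # scatter of several percent
```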

   A small square section of the leaf (1.6 cm^2) was collected with a leaf
   punch from the same leaf section as was used for gas exchange
   measurements. The leaf sections were dried at 80°C and weighed to
   calculate LMA (g m^−2). Percent nitrogen of each sample was determined
   with a continuous flow isotope ratio mass spectrometer (CF-IRMS), and
   SLN (g m^−2) was calculated by multiplying the nitrogen fraction
   (percent nitrogen divided by 100) by LMA.
   Across the two training sets, 129 SLN and 169 LMA observations (plots)
   were obtained, involving 124 unique genotypes.
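
   The two calculations above amount to a unit conversion and a product.
   A minimal sketch follows, assuming the 1.6 cm^2 punch area given in the
   text; the function names and the example dry mass and nitrogen
   percentage are hypothetical values chosen only to show the arithmetic.

```python
PUNCH_AREA_CM2 = 1.6  # leaf punch area stated in the text


def leaf_mass_per_area(dry_mass_g, area_cm2=PUNCH_AREA_CM2):
    """LMA in g m^-2 from the dry mass of a leaf punch of known area."""
    area_m2 = area_cm2 / 10_000  # 1 m^2 = 10,000 cm^2
    return dry_mass_g / area_m2


def specific_leaf_nitrogen(percent_n, lma_g_m2):
    """SLN in g m^-2: nitrogen mass fraction multiplied by LMA."""
    return percent_n / 100.0 * lma_g_m2


# Hypothetical sample: 8 mg of dry matter containing 3% nitrogen.
lma = leaf_mass_per_area(0.008)         # about 50 g m^-2
sln = specific_leaf_nitrogen(3.0, lma)  # about 1.5 g m^-2
```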

   To generate a maximised dataset and enhance robustness of associating
   the ground truth data taken in a plot with hyperspectral measurements
   obtained from the same plot, individual plots, rather than genotypes,
   were considered as an observational unit.

2.4. Canopy Hyperspectral Measurements

   Hyperspectral data captured before anthesis and around the same time as
   the ground-truthing data (at 58 and 52 DAS in 2019 and 2020,
   respectively) was paired with the ground truth data. At this
   stage of sorghum crop growth, canopies are fully closed and nitrogen
   content of individual leaves is expected to be at a maximum as all
   mainstem leaves are fully expanded but prior to any translocation of
   nitrogen during senescence [[99]47]. A tractor-based field phenotyping
   platform (GECKO; developed at The University of Queensland) which
   enables simultaneous crop canopy proximal sensing was used [[100]48].
   The tractor moves at a constant 1.1 metres per second and is integrated
   with a GPS real-time kinematic system with 2 cm accuracy to locate
   sampling plots (individual size of 4.5 × 3 m). A microhyperspectral
   imager (Micro-Hyperspec VNIR model, Headwall Photonics, Fitchburg, MA,
   USA) mounted on this phenotyping platform (3 m above ground and ~1.7 m
   above the canopy) was used to obtain the spectral response of each
   pixel (5 × 5 mm) at 272 spectral wavelengths between 395 and 997 nm
   (visible and near infrared). The resolution was approximately 2.2 nm
   with a 6.0 nm full width at half maximum (FWHM). A radiometric calibration (dark
   signal calibration) of the hyperspectral camera was performed weekly. A
   spectral calibration using the nominal white and spectral diffusers
   with specific band sets focused on the highest possible spectral
   resolution was conducted every three months by comparing their
   respective responses in almost identical illumination conditions. An
   automated software data calibration pipeline was used to convert raw
   digital numbers to reflectance values at each pixel. Pixel reflectance
   was calculated by the ratio between pixel radiance from the
   microhyperspectral imager and the reference pixel radiance from an
   upward sensor measuring incoming radiance. To segment plants from soil
   and remove background noise from lower canopy levels, a threshold of
   NDVI > 0.5 was applied for each pixel based on the fractional
   vegetation cover [[101]27, [102]36, [103]49]. This ensures that only
   spectral information from green leaves is retained for the reflectance
   calculations and that shadows and other background noise are excluded
   from the hyperspectral images. After masking by NDVI > 0.5, plant pixels
   within a plot were averaged to calculate reflectance of each plot. All
   hyperspectral data was collected from 9 am to 12 pm to minimise the
   effects of relative orientation of the sun, and no adjustments were
   made for the sensor or the distribution of leaf angles in the masking.
   As an example, images, radiance, and reflectance pre- and postmasking
   by NDVI > 0.5 for plot 361 in 2020 are shown in [104]Figure 1.
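
   The per-plot processing described above (radiance to reflectance, NDVI
   masking, then averaging) can be sketched as follows. This is not the
   authors' calibration pipeline: the function names, the evenly spaced
   band centres, and the red (670 nm) and near-infrared (800 nm)
   wavelengths used for NDVI are assumptions made for illustration. Only
   the 272 bands between 395 and 997 nm and the NDVI > 0.5 threshold come
   from the text.

```python
N_BANDS = 272
# Assumed evenly spaced band centres, which gives a spacing of about 2.2 nm.
WAVELENGTHS_NM = [395 + i * (997 - 395) / (N_BANDS - 1) for i in range(N_BANDS)]


def nearest_band(target_nm, wavelengths=WAVELENGTHS_NM):
    """Index of the band whose centre is closest to target_nm."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))


def to_reflectance(pixel_radiance, reference_radiance):
    """Band-wise ratio of pixel radiance to incoming (reference) radiance."""
    return [p / r for p, r in zip(pixel_radiance, reference_radiance)]


def pixel_ndvi(reflectance, red_nm=670, nir_nm=800):
    """NDVI of one pixel spectrum, using assumed red and NIR band centres."""
    red = reflectance[nearest_band(red_nm)]
    nir = reflectance[nearest_band(nir_nm)]
    total = nir + red
    return (nir - red) / total if total else 0.0


def plot_reflectance(pixels, reference_radiance, ndvi_threshold=0.5):
    """Mean spectrum of the pixels in a plot that pass the NDVI mask.

    Returns None when no pixel exceeds the threshold.
    """
    spectra = [to_reflectance(p, reference_radiance) for p in pixels]
    plant = [s for s in spectra if pixel_ndvi(s) > ndvi_threshold]
    if not plant:
        return None
    count = len(plant)
    return [sum(s[b] for s in plant) / count for b in range(len(plant[0]))]
```

   Averaging after masking means that soil and shadow pixels do not dilute
   the plot spectrum, at the cost of basing sparse plots on fewer pixels.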

Figure 1.

   An example (plot 361 in training set 2) of plant canopy area (a) before
   and (c) after masking by (b) NDVI > 0.5; averaged plot radiance and
   reflectance before and after masking by NDVI > 0.5 (d).

   A set of hyperspectral vegetation indices known to be associated with
   photosynthesis was computed from the plot reflectance involving 16
   wavelengths as shown in [106]Figure 1. The equations used to calculate
   the indices in this study are summarised in [107]Table 2.

Table 2.

   Summary of the equations for the set of vegetation indices associated
   with photosynthesis.
   Acronym Indices Traits associated Equations References