
Maize yield forecasting by linear regression and artificial neural networks in Jilin, China

Published online by Cambridge University Press:  20 May 2014

K. MATSUMURA*
Affiliation:
Department of Applied Informatics, School of Policy Studies, Kwansei Gakuin University, Hyogo 669-1337, Japan
Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, 2020-2207 Main Mall, Vancouver, BC V6T 1Z4, Canada
C. F. GAITAN
Affiliation:
Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, 2020-2207 Main Mall, Vancouver, BC V6T 1Z4, Canada
K. SUGIMOTO
Affiliation:
Graduate School of Environmental Studies, Nagoya University, Furocho D2-1, Chigusaku, Nagoya 464-8603, Japan
A. J. CANNON
Affiliation:
Pacific Climate Impacts Consortium, University of Victoria, PO Box 3060 Stn CSC, Victoria, BC V8W 3R4, Canada
W. W. HSIEH
Affiliation:
Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, 2020-2207 Main Mall, Vancouver, BC V6T 1Z4, Canada
* To whom all correspondence should be addressed. Email: kanichi1@mbox.kyoto-inet.or.jp

Summary

Forecasting the maize yield of China's Jilin province from 1962 to 2004, with climate conditions and fertilizer as predictors, was investigated using multiple linear regression (MLR) and non-linear artificial neural network (ANN) models. Yield was modelled as a function of precipitation from July to August, precipitation in September and the amount of fertilizer used. Fertilizer emerged as the dominant predictor and was non-linearly related to yield in the ANN model. Given the difficulty of acquiring fertilizer data for maize, the models were also tested using the previous year's yield in place of fertilizer data. Forecast skill scores computed under both cross-validation and retroactive validation showed that ANN models significantly outperformed MLR and persistence (i.e. forecast yield is identical to last year's observed yield). As the data were non-stationary, cross-validation was found to be less reliable than retroactive validation in assessing forecast skill.
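The comparison against persistence under retroactive validation can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the skill score is taken in one common form (1 minus the ratio of model RMSE to reference RMSE), the model is a one-predictor least-squares line rather than the three-predictor MLR or ANN of the study, and the numbers are synthetic, not the Jilin data.

```python
import math

def rmse(pred, obs):
    """Root mean squared error between forecasts and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def skill_score(pred, ref, obs):
    """Assumed form: 1 is perfect, 0 matches the reference, negative is worse."""
    return 1.0 - rmse(pred, obs) / rmse(ref, obs)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def retroactive_forecasts(x, y, first):
    """For each year t >= first, fit on earlier years only, then forecast year t."""
    preds = []
    for t in range(first, len(y)):
        slope, intercept = fit_line(x[:t], y[:t])
        preds.append(slope * x[t] + intercept)
    return preds

# Synthetic, hypothetical series: a rising predictor and a noisy rising yield.
fertilizer = [float(i) for i in range(12)]
yield_obs = [2.0 + 0.5 * f + (0.2 if i % 2 else -0.2)
             for i, f in enumerate(fertilizer)]

first = 6                                  # first forecast year (index)
model = retroactive_forecasts(fertilizer, yield_obs, first)
persistence = yield_obs[first - 1:-1]      # forecast = previous year's yield
obs = yield_obs[first:]
score = skill_score(model, persistence, obs)
print(f"skill score v. persistence: {score:.2f}")
```

Because each forecast uses only data from earlier years, the procedure never trains on the future, which is the property that makes retroactive validation better suited to a trending (non-stationary) series than cross-validation.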

Information

Type
Climate Change and Agriculture Research Papers
Copyright
Copyright © Cambridge University Press 2014 

Fig. 1. China's Jilin province, with temperature and precipitation data obtained near the capital Changchun (43°53′N, 125°19′E, 212 m a.s.l.) (map produced from ESRI Data and Maps, ESRI 2008).

Fig. 2. Standardized annual time series (1961–2004) used in the current study: (a) maize yield of Jilin province (solid), average temperature from April to October (dashed) and fertilizer use (dotted) and (b) precipitation from July to August (dotted) and precipitation in September (dashed).

Table 1. Predictors used in the multiple linear regression (MLR) and artificial neural network (ANN) models. Precipitation from July to August is denoted by prcp7+8 and precipitation in September by prcp9, while t−1 denotes the previous year

Table 2. Regression coefficients (with corresponding P-values underneath) for the various predictors in model MLR1, during each of the four validation periods, with the correlation between the observed and the model values presented in the rightmost column. Note the listed years are the validation years, with the model trained using data from the non-validation years

Table 3. Regression coefficients and P-values in model MLR2 computed for the four validation periods, with the forecast correlation score given in the rightmost column

Fig. 3. The root mean squared error skill score of the four artificial neural network (ANN) models as the number of hidden neurons (HN) varied from 1 to 5 (HN1–HN5), with the 95% confidence intervals obtained from bootstrapping.
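A bootstrap confidence interval for an RMSE skill score can be sketched as follows. This is only an illustration under stated assumptions: the resampling scheme (forecast years drawn with replacement, percentile interval), the skill score form and all numbers are hypothetical and are not taken from the study.

```python
import math
import random

def rmse(pred, obs):
    """Root mean squared error between forecasts and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def skill_score(pred, ref, obs):
    """Assumed form: 1 - RMSE(model) / RMSE(reference)."""
    return 1.0 - rmse(pred, obs) / rmse(ref, obs)

def bootstrap_ci(pred, ref, obs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval from resampling forecast years with replacement."""
    rng = random.Random(seed)
    n = len(obs)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(skill_score([pred[i] for i in idx],
                                  [ref[i] for i in idx],
                                  [obs[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_boot)]
    upper = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical observed yields, model forecasts and reference forecasts.
obs = [3.1, 3.4, 3.0, 3.9, 4.2, 4.0, 4.8, 5.1]
pred = [3.0, 3.5, 3.2, 3.7, 4.3, 4.2, 4.6, 5.0]
ref = [2.8, 3.1, 3.4, 3.0, 3.9, 4.2, 4.0, 4.8]

lower, upper = bootstrap_ci(pred, ref, obs)
print(f"95% interval for the skill score: [{lower:.2f}, {upper:.2f}]")
```

With only a few dozen forecast years, such intervals tend to be wide, which is why overlapping intervals between models should be read cautiously.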

Fig. 4. Maize yield forecasts by multiple linear regression (MLR) and ANN v. observed yield (1962–2004), where the predictors used were (a) prcp7+8, prcp9 and fertilizer and (b) the same but with fertilizer replaced by the previous year's yield.

Fig. 5. Maize yield from ANN as a function of a single predictor (as other predictors are held constant at their mean values), with the varying predictor being (a) prcp7+8, (b) yield(t−1) and (c) fertilizer. In (d), the MLR1 yield is shown as a function of fertilizer. ANN1 was used in (a) and (c) and ANN2 in (b). The straight line in (b) is when the current yield equals the previous year's yield.

Fig. 6. Schematic diagram illustrating (a) the fitting of a linear regression model (dashed line) to data described by a non-linear relation (solid curve) and (b) the fitting of the linear model to the training data to the left of the vertical line, and then extrapolating to the right (dotted line) for the validation data, resulting in forecast yields that exceed the true relation (solid curve) by an even larger margin. Here x represents the amount of fertilizer and y the maize yield, and data points (not plotted) are assumed to be scattered around the solid curve.
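The effect in the schematic can be reproduced numerically with a small sketch. The saturating curve y = 1 − exp(−x) is a hypothetical stand-in for the true fertilizer–yield relation, chosen only because it levels off; the split point x = 2 and the noise-free data are likewise assumptions.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def true_yield(x):
    """Hypothetical saturating response of yield to fertilizer."""
    return 1.0 - math.exp(-x)

xs = [0.25 * i for i in range(17)]           # x from 0 to 4
ys = [true_yield(x) for x in xs]

# (a) linear fit to the whole range
slope_all, icpt_all = fit_line(xs, ys)

# (b) linear fit to the left part only (x <= 2), then extrapolated to the right
train = [(x, y) for x, y in zip(xs, ys) if x <= 2.0]
slope_left, icpt_left = fit_line([x for x, _ in train], [y for _, y in train])

x_end = xs[-1]
overshoot_all = slope_all * x_end + icpt_all - true_yield(x_end)
overshoot_left = slope_left * x_end + icpt_left - true_yield(x_end)
print(f"overshoot at x = {x_end}: full fit {overshoot_all:.2f}, "
      f"extrapolated fit {overshoot_left:.2f}")
```

Because the curve is concave, a line fitted to the left portion lies above the curve everywhere to the right of the training range, and the gap grows with distance from the split.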