Alan Ker, University of Guelph
Is There Too Much History in Historical Yield Data?
Date and Location
Thursday, December 7, 2017, 4:10 PM - 5:30 PM
ARE Library Conference Room, 4101 Social Sciences and Humanities
County crop yield data from the United States Department of Agriculture National Agricultural Statistics Service (USDA-NASS) have been, and continue to be, used extensively in the literature as well as in practice. The most notable example is crop insurance: the Risk Management Agency (RMA) uses these data to rate and conduct claims for its area yield and revenue programs. Examples from the literature include investigation of rating methodologies, issues related to land use, modeling the climate-yield relationship, and productivity analysis. In many of these applications, and certainly with respect to RMA and the crop insurance literature, yield data are detrended and adjusted for possible heteroskedasticity and then assumed to be independent and identically distributed. For most major crop-region combinations, county yield data exist from 1955 onwards and reflect very significant innovations in both seed and farm management technologies. Even after correcting for changes in the first two moments of the yield data generating process (dgp), these innovations raise doubt regarding the identically distributed assumption. This manuscript considers the question of how much historical yield data should be used in empirical analyses. The answer obviously depends on the empirical application, crop-region combination, econometric methodology, and chosen loss function. Nonetheless, we attempt to tackle this question in three ways using county-level yield data for corn, soybean, and winter wheat. First, we use distributional tests to assess if and when the adjusted yield data may result from different dgps. Second, we consider the application to crop insurance by using an out-of-sample rating game, commonly employed in the literature, to compare rates from the full versus restricted data sets. Third, we estimate flexible time-varying dgps and then simulate to quantify the additional error when the identically distributed assumption is imposed.
Overall, the results indicate that, despite accounting for time-varying movements in the first two moments, using yield data more than 30 years old can substantially increase estimation error. Given that discarding data is unappealing, particularly in applications with relatively small T, we investigate three methodologies that can re-incorporate the discarded data while both explicitly acknowledging that the unknown dgps are different and not requiring knowledge about the extent or form of those differences. Our results suggest that gains in efficiency can be realized by using these methodologies. While our results are most applicable to the crop insurance literature, we believe they suggest proceeding with caution when using historical yield data in other applications as well.
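To make the first approach in the abstract concrete, the following is a minimal sketch of one way a distributional comparison on adjusted yields could be set up. It is not the speaker's method: the linear trend, the assumption that yield variability scales with the trend level, the 30-year split, the permutation-based Kolmogorov-Smirnov test, and the synthetic data are all illustrative assumptions, and every function name is hypothetical.

```python
import random


def adjusted_yields(yields):
    """Detrend with an OLS linear trend and rescale to the final year's level.

    Illustrative assumption: the standard deviation of yields is
    proportional to the trend level, so each observation is scaled
    by (final fitted value / own fitted value).
    """
    n = len(yields)
    t_mean = (n - 1) / 2
    y_mean = sum(yields) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(yields))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    fitted = [intercept + slope * t for t in range(n)]
    return [fitted[-1] * y / f for y, f in zip(yields, fitted)]


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def permutation_pvalue(a, b, n_perm=500, seed=0):
    """Permutation p-value for the KS statistic under exchangeability."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic county yield series (NOT real USDA-NASS data): 60 years with
# an upward trend and noise that grows with the yield level.
rng = random.Random(42)
synthetic = [(80 + 1.5 * t) * (1 + rng.gauss(0, 0.12)) for t in range(60)]

adjusted = adjusted_yields(synthetic)
older, recent = adjusted[:-30], adjusted[-30:]  # split at 30 years (assumed)

stat = ks_statistic(older, recent)
pval = permutation_pvalue(older, recent)
print(f"KS statistic: {stat:.3f}, permutation p-value: {pval:.3f}")
```

A small p-value would suggest that the older and more recent adjusted yields do not share a distribution, which is the kind of evidence that would argue against pooling the full history. The split point and the test itself are placeholders; any two-sample distributional test could be substituted.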