Volume 11, Issue 1


Why ecosystem characteristics predicted from remotely sensed data are unbiased and biased at the same time – and how this affects applications

Göran Ståhl (a), Terje Gobakken (b), Svetlana Saarela (b), Henrik J. Persson (a), Magnus Ekström (a), Sean P. Healey (c), Zhiqiang Yang (c), Johan Holmgren (a), Eva Lindberg (a), Kenneth Nyström (a), Emanuele Papucci (a), Patrik Ulvdal (a), Hans Ole Ørka (b), Erik Næsset (b), Zhengyang Hou (d), Håkan Olsson (a), Ronald E. McRoberts (e)

(a) Department of Forest Resource Management, Swedish University of Agricultural Sciences, 901 83, Umeå, Sweden
(b) Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
(c) USDA Forest Service, Rocky Mountain Research Station, Ogden, UT, USA
(d) The Key Laboratory for Silviculture and Conservation of Ministry of Education, Beijing Forestry University, Beijing, 100083, China
(e) Department of Forest Resources, University of Minnesota, St. Paul, MN, USA

Abstract

Remotely sensed data are frequently used for predicting and mapping ecosystem characteristics, and spatially explicit wall-to-wall information is sometimes proposed as the best possible source of information for decision-making. However, wall-to-wall information typically relies on model-based prediction, and several features of model-based prediction should be understood before extensively relying on this type of information. One such feature is that model-based predictors can be considered both unbiased and biased at the same time, which has important implications in several areas of application. In this discussion paper, we first describe the conventional model-unbiasedness paradigm that underpins most prediction techniques using remotely sensed (or other) auxiliary data. From this point of view, model-based predictors are typically unbiased. Secondly, we show that for specific domains, identified based on their true values, the same model-based predictors can be considered biased, and sometimes severely so.

We suggest distinguishing between conventional model-bias, defined in the statistical literature as the difference between the expected value of a predictor and the expected value of the quantity being predicted, and design-bias of model-based estimators, defined as the difference between the expected value of a model-based estimator and the true value of the quantity being predicted. We show that model-based estimators (or predictors) are typically design-biased, and that there is a trend in the design-bias from overestimating small true values to underestimating large true values. Further, we give examples of applications where this is important to acknowledge and to potentially make adjustments to correct for the design-bias trend. We argue that relying entirely on conventional model-unbiasedness may lead to mistakes in several areas of application that use predictions from remotely sensed data.
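The design-bias trend described above can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a toy population in which a true value y depends linearly on one auxiliary variable x plus Gaussian noise, fits ordinary least squares, and then averages the prediction error within classes formed from the true values. All function names and parameter values are hypothetical choices made for illustration only.

```python
import random
import statistics


def simulate_predictions(n=20000, beta=1.0, sigma_x=1.0, sigma_e=1.0, seed=0):
    """Toy population (an assumption, not data from the paper):
    y = beta * x + noise, predicted from x with ordinary least squares."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, sigma_e) for x in xs]

    # Closed-form simple linear regression of y on x.
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    preds = [intercept + slope * x for x in xs]
    return ys, preds


def mean_error_by_true_value_class(ys, preds, n_classes=3):
    """Mean of (prediction - true value) within classes of equal size,
    where the classes are formed by sorting units on their TRUE values."""
    order = sorted(range(len(ys)), key=lambda i: ys[i])
    size = len(order) // n_classes
    errors = []
    for k in range(n_classes):
        start = k * size
        stop = (k + 1) * size if k < n_classes - 1 else len(order)
        idx = order[start:stop]
        errors.append(statistics.fmean([preds[i] - ys[i] for i in idx]))
    return errors


if __name__ == "__main__":
    ys, preds = simulate_predictions()
    overall = statistics.fmean([p - y for p, y in zip(preds, ys)])
    low, mid, high = mean_error_by_true_value_class(ys, preds)
    print(f"overall mean error:       {overall:+.4f}")
    print(f"small true values (low):  {low:+.4f}")
    print(f"middle true values:       {mid:+.4f}")
    print(f"large true values (high): {high:+.4f}")
```

Under these assumptions the overall mean error is essentially zero, while the class of small true values shows a positive mean error (overestimation) and the class of large true values shows a negative one (underestimation). This is the regression-to-the-mean pattern that the abstract refers to as a trend in the design-bias.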

Keywords: Bias, Design-based inference, Model-based inference


Publication history

Received: 02 August 2023
Revised: 21 November 2023
Accepted: 23 December 2023
Published: 03 January 2024
Issue date: February 2024

Copyright

© 2024 The Authors.

Acknowledgements

This work is part of the programme Mistra Digital Forests and of the Center for Research-based Innovation SmartForest: Bringing Industry 4.0 to the Norwegian forest sector (NFR SFI project no. 309671, smartforest.no).

Rights and permissions

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
