26 August 2020 Assess citizen science based land cover maps with remote sensing products: the Ground Truth 2.0 data quality tool
Joan Masó, Núria Julia, Alaitz Zabala, Ester Prat, Johannes van der Kwast, Cristina Domingo-Marimon
Proceedings Volume 11524, Eighth International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2020); 115241M (2020)
Event: Eighth International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2020), 2020, Paphos, Cyprus
Abstract

One of the main concerns in adopting citizen science is data quality. Derived products inherit the intrinsic limitations of the capture methodology as well as the uncertainties in the observations. OpenStreetMap (OSM) tools are designed to minimize uncertainties in positional accuracy by ensuring good co-registration of the observations with imagery or by direct use of GPS. Uncertainty increases when citizens thematically annotate the objects they contribute. During the H2020 Ground Truth 2.0 project, two land-cover products derived from OSM were analyzed: one created by the University of Heidelberg and another elaborated by the University of Coimbra. To assess the quality of both maps, a third product derived from remote sensing was introduced as a reference map. In Ground Truth 2.0, a tool to show and compare maps was developed as part of the MiraMon Map Browser. The objective was to allow final users to evaluate by themselves the quality of the maps in their region of interest. The confusion matrix was used to derive overall, commission and omission estimators as well as the Kappa coefficient. Most of the discrepancies between OSM and remote sensing (RS) derived maps are related to the different approaches used during data capture. The data quality tool assesses the quality of individual observations exposed using the OGC standard and describes the quality in an interoperable way based on QualityML.



Introduction

Metadata was introduced to make it easy to find and identify geospatial datasets coming from different sources such as local, regional or national governments, the private sector, etc. [1]. Discovery metadata focuses on the pieces of information that catalogues need to handle simple queries and make data known. However, once data is discovered, the answer to the question “which dataset is most appropriate for a specific purpose?” depends on the data quality information included in the metadata [2]. Providers try to convey data quality as a set of comparable quality indicators describing different aspects of the dataset, including the geometric, thematic and temporal dimensions [3]. These dimensions, combined with the diversity of geospatial products, result in long lists of commonly used quality indicators. ISO 19157 tries to standardize these measurements in an annex that describes about 80 indicators. This standard defines the quality component to be measured, the individual uncertainties or errors to be collected or measured, and the statistics or metrics used to create the indicator. UncertML provides a list of uncertainties that can be used for individual measurements, while QualityML, which collects the statistics and metrics from ISO 19157 and other sources, is used to elaborate the indicators and organize them in a rational way [4]. Therefore, QualityML is both a dictionary of quality measurements and an encoding system to expose them.

Sometimes the data quality description in the producers’ metadata is not enough for the final user, who often needs a more practical approach based on user-centric descriptions of previous usage experiences and associated difficulties [5]. The Open Geospatial Consortium (OGC) has published a standard, Geospatial User Feedback (GUF), to formalize user feedback about geospatial artifacts [6].

In citizen science, users are considered feedback providers but also data producers, so production becomes distributed. Despite the lack of a central responsible party and the heterogeneity of the participants in data collection, some simple best practices can dramatically improve the quality of the dataset. Simple filtering techniques can be applied to validate data input (e.g. syntactic validation by the data input user interface or the use of controlled vocabularies). More complex procedures involve creating a social network that reviews the data and associates a level of perceived trustworthiness with contributors [7]. The lack of data quality assessment by the party responsible for the repository, combined with the lack of simple data visualization tools (that make errors prominent), can also reduce confidence in citizen science datasets [8].


Data quality measures for land-cover maps

Commonly, data quality focuses on the positional accuracy of the measured objects, as can be seen from the number of positional measures detailed in the ISO 19157 standard for geospatial data quality. Land-cover (LC) maps are different in nature. Typically they define a list of land-use or land-cover classes and cover the territory continuously, assigning a class to all the space. Thus, thematic quality is as important as positional quality, and sometimes the two cannot be decoupled. The procedures to quality control a land-use and land-cover (LULC) map have been clearly defined and are mainly based on comparing the map with a set of measures considered correct and collected on the ground [9]. The basic practice is the application of a confusion or error matrix. This matrix is a table where the class names are the titles of both rows and columns. Columns express the ground truth values and rows the categories assigned in the map. The confusion matrix provides a great deal of information, not only on the average quality of the map but also on the individual classes that are more prone to mistakes because they are more often confused with other classes. In addition, the confusion matrix is relatively easy for the producer to create and for the user to interpret. One problem with the confusion matrix lies precisely in its nature: it is too verbose and detailed to be easily compared. As a remedy, several indicators that can be computed from the confusion matrix have been suggested. One of the best known is the Kappa index. Even though a recent publication argued that Kappa indices are useless and misleading for practical applications in remote sensing [10], that publication and others still propose better indices based on the confusion matrix numbers [11].
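As an illustration of how summary indicators follow from a confusion matrix, the sketch below computes overall accuracy, per-class commission and omission errors, and the Kappa coefficient. It assumes the layout described above (rows are map classes, columns are ground truth classes); the function name, the dictionary keys and the example counts are hypothetical and are not taken from the tool.

```python
from typing import Dict, List, Optional


def accuracy_measures(matrix: List[List[int]]) -> Dict[str, object]:
    """Summarize a square confusion matrix.

    Rows are the classes assigned in the map, columns the ground-truth classes.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    agreed = sum(matrix[i][i] for i in range(n))

    overall = agreed / total
    # Commission error: share of a mapped class that is another class on the ground.
    commission: List[Optional[float]] = [
        1 - matrix[i][i] / row_totals[i] if row_totals[i] else None
        for i in range(n)
    ]
    # Omission error: share of a ground-truth class that the map labels differently.
    omission: List[Optional[float]] = [
        1 - matrix[j][j] / col_totals[j] if col_totals[j] else None
        for j in range(n)
    ]
    # Agreement expected by chance, estimated from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (overall - expected) / (1 - expected) if expected < 1 else None
    return {
        "overall_accuracy": overall,
        "commission": commission,
        "omission": omission,
        "kappa": kappa,
    }


# Hypothetical two-class example (counts are invented for illustration).
example = [[40, 10],
           [5, 45]]
result = accuracy_measures(example)
print(result["overall_accuracy"], result["kappa"])
```

Classes with an empty row or column get `None` rather than a misleading zero, which matters for OSM derived maps where some classes may be absent from the area of interest.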

This paper acknowledges that previously published quality indicators (e.g. those based on the Kappa index distributed by the producers) can be wrong, and proposes a tool that allows users to create their own confusion matrices, calculate the newly proposed derived indices for the area they are interested in, and publish these indices linked to the QualityML definitions.
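The text does not list which alternative indices the tool computes. As one possible example, the sketch below derives the quantity disagreement and allocation disagreement of Pontius and Millones (listed in the references) from a confusion matrix. It assumes the matrix counts can be used directly as proportions of the study area; the function name and the example counts are hypothetical.

```python
from typing import List, Tuple


def quantity_allocation_disagreement(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (quantity disagreement, allocation disagreement) as proportions."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    # Convert counts to proportions of all compared pixels or points.
    p = [[cell / total for cell in row] for row in matrix]
    row_p = [sum(p[g]) for g in range(n)]                        # share mapped as g
    col_p = [sum(p[i][g] for i in range(n)) for g in range(n)]   # share that is g

    # Quantity disagreement: mismatch in the amount of each class.
    quantity = 0.5 * sum(abs(row_p[g] - col_p[g]) for g in range(n))
    # Allocation disagreement: mismatch in where each class is placed.
    allocation = 0.5 * sum(
        2 * min(row_p[g] - p[g][g], col_p[g] - p[g][g]) for g in range(n)
    )
    return quantity, allocation


# Hypothetical two-class example (counts are invented for illustration).
q, a = quantity_allocation_disagreement([[40, 10], [5, 45]])
print(q, a, q + a)
```

The two components add up to the total disagreement, that is, one minus the overall accuracy, which gives a quick consistency check for any implementation.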


Creating land-cover maps from OSM

The democratization of high-quality location determination, thanks to low-cost GPS hardware now integrated in smartphones, has made possible all sorts of crowdsourcing, Volunteered Geographic Information (VGI) and citizen science projects. The contributor can therefore focus on providing good quality thematic attributes for the positions that the GPS records. One popular project that has been around since 2004 is OpenStreetMap (OSM). OSM’s aim is to create a set of map data that is free to use, editable, and licensed under an open scheme, the Open Database License (ODbL) [12]. OSM provides an open source alternative to topographic maps in a single project that covers the whole world. Each feature must have at least one attribute (tag) describing it. A tag consists of a key-value pair (e.g. landuse = forest), where the key describes a general topic or type of attribute (e.g. landuse) and the value gives the specific value for that key (e.g. forest). These tags are the basis for the first OSM derived land-cover product. Since OSM tags do not directly correspond to LULC classes, there are some gaps in the resulting product that can be covered by combining the OSM data with other data sources such as free remote sensing data [13]. The product is available online. A second team is developing OSM2LULC as part of the open source GeoData Algorithms for Spatial Problems (GASP) Python package. The method converts line-based OSM features into polygons and associates OSM features with a LULC class using decision rules [14]. The final product is also available online.
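To make the idea of mapping key-value tags to LULC classes concrete, here is a minimal sketch of a rule table. The rules are invented for illustration and are not the rule set of either product; only the tag keys and values are ordinary OSM tags, and the class names are CORINE level 1 names.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical decision rules: (tag key, tag value, LULC class).
# The first matching rule wins.
RULES: List[Tuple[str, str, str]] = [
    ("landuse", "forest", "Forest and semi natural areas"),
    ("landuse", "residential", "Artificial surfaces"),
    ("landuse", "farmland", "Agricultural areas"),
    ("natural", "water", "Water bodies"),
]


def classify(tags: Dict[str, str]) -> Optional[str]:
    """Return the LULC class for an OSM feature, or None if no rule applies."""
    for key, value, lulc_class in RULES:
        if tags.get(key) == value:
            return lulc_class
    # Features whose tags match no rule stay unclassified (a gap in the map).
    return None


print(classify({"landuse": "forest", "name": "Example wood"}))
print(classify({"amenity": "bench"}))
```

Returning `None` for unmatched features mirrors the gaps mentioned above: tags that do not correspond to any LULC class leave the area without a class.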



Some efforts have been made to propose solutions applicable to citizen science datasets that reuse ISO 19157 and reference datasets to calculate indicators covering different aspects of data quality, such as the architecture proposed in the COBWeb project [15]. In some cases these processes are offered as chainable Web Processing Services (WPS) [16], but setting up a WPS and using it from a WPS client is beyond the capabilities of most users. In addition, when these estimations are calculated, the information is not distributed as part of the metadata accompanying the data and may be lost in the process of data sharing. In the Ground Truth 2.0 project, our proposal was to provide a web map client that enables users to assess the quality of different products by themselves and to choose which product best fits their purpose.


A technical approach in the client side

In this paper, we propose a strategy for map browsers in which raster layers are visualized through Web Map Service (WMS) requests that, instead of returning static images in PNG or JPEG format for each layer, use AJAX to retrieve binary arrays of values with a specific number of bytes (for example, 16-bit short integers). The proposed solution empowers the client, which can store the actual values of each band in memory and then use JavaScript code to operate on the data in many ways. First, the browser can build personalized views, enhance contrast, present histograms, etc. on the client side. Second, the user can apply spatial filters, generate animations and graphs of a time series, and perform complex calculations using several bands from the different available datasets [17]. Vector layers can be requested through Web Feature Service (WFS) or Sensor Observation Service (SOS) requests that return a GeoJSON file. In Ground Truth 2.0, this approach was extended to calculate some data quality indicators.
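The decoding step on the client can be sketched as follows. The browser implementation uses JavaScript; this Python sketch only illustrates the principle of turning a raw response body into a grid of pixel values. It assumes little-endian signed 16-bit integers in row-major order and a width and height known from the request; none of these details are stated in the text, and the payload is simulated rather than fetched from a server.

```python
import struct
from typing import List


def decode_int16_grid(payload: bytes, width: int, height: int) -> List[List[int]]:
    """Decode a raw array of 16-bit integers into rows of pixel values."""
    expected_bytes = width * height * 2  # two bytes per 16-bit value
    if len(payload) != expected_bytes:
        raise ValueError(
            f"expected {expected_bytes} bytes, received {len(payload)}"
        )
    # "<" = little-endian (assumed), "h" = signed 16-bit integer.
    values = struct.unpack(f"<{width * height}h", payload)
    return [list(values[r * width:(r + 1) * width]) for r in range(height)]


# Simulated response body standing in for the bytes a server would return.
payload = struct.pack("<6h", 1, 2, 3, 4, 5, -1)
grid = decode_int16_grid(payload, width=3, height=2)
print(grid)
```

Once the values are in memory as numbers rather than as colored pixels, styling, histograms and the quality calculations described below can all be done without another request.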

In the case of two raster datasets, one whose quality is to be assessed and the other considered the reference data, both are retrieved as binary arrays. The process of creating a confusion matrix starts by combining the two layers into a single composite layer whose pixels contain classes that correspond to all possible combinations of the legends of the two maps. The second step consists of creating a histogram that counts the number of pixels in each combined class of the composite layer. This one-dimensional histogram is then rearranged into a two-dimensional matrix that has the classes of the first layer as rows and the classes of the second layer as columns, resulting in a confusion matrix.
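The three steps above (composite layer, histogram, rearrangement) can be sketched as follows. The sketch assumes both layers are flat lists of class indices from 0 to n_classes - 1 on the same grid, and that unclassified pixels carry a nodata value of -1; the function name and these conventions are assumptions made for illustration.

```python
from typing import List

NODATA = -1  # assumed marker for unclassified pixels


def confusion_from_rasters(
    first: List[int], second: List[int], n_classes: int
) -> List[List[int]]:
    """Build a confusion matrix with the first layer as rows, the second as columns."""
    if len(first) != len(second):
        raise ValueError("both layers must have the same number of pixels")

    # Step 1: composite layer, one combined class per pair of classes.
    composite = [
        a * n_classes + b
        for a, b in zip(first, second)
        if a != NODATA and b != NODATA
    ]

    # Step 2: one-dimensional histogram of the combined classes.
    histogram = [0] * (n_classes * n_classes)
    for combined in composite:
        histogram[combined] += 1

    # Step 3: rearrange the histogram into a two-dimensional matrix.
    return [
        histogram[row * n_classes:(row + 1) * n_classes]
        for row in range(n_classes)
    ]


# Hypothetical six-pixel layers with three classes.
map_layer = [0, 0, 1, 1, 2, NODATA]
reference = [0, 1, 1, 1, 2, 0]
matrix = confusion_from_rasters(map_layer, reference, n_classes=3)
print(matrix)
```

Encoding each pair as `a * n_classes + b` makes the rearrangement trivial, because row `a` of the matrix is a contiguous slice of the histogram.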

In the case of one raster dataset and one vector dataset composed of a list of ground truth point measurements, the confusion matrix is created by determining the value of the raster at the position of each ground truth point and tabulating these values against the values in the vector file in the form of a matrix.
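A minimal sketch of the raster and point case follows. It assumes a north-up raster described by its top-left corner (x_min, y_max) and a square pixel size, points given as (x, y, class) tuples in the same coordinate system, and class indices from 0 to n_classes - 1; these conventions and the function name are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def confusion_from_points(
    raster: List[List[int]],
    x_min: float,
    y_max: float,
    pixel_size: float,
    points: Sequence[Tuple[float, float, int]],
    n_classes: int,
) -> List[List[int]]:
    """Rows are raster (map) classes, columns are ground-truth point classes."""
    height, width = len(raster), len(raster[0])
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for x, y, truth in points:
        # Convert map coordinates to pixel indices (row 0 is the top row).
        col = math.floor((x - x_min) / pixel_size)
        row = math.floor((y_max - y) / pixel_size)
        if 0 <= row < height and 0 <= col < width:
            matrix[raster[row][col]][truth] += 1
        # Points outside the raster extent are ignored.
    return matrix


# Hypothetical 2 x 2 raster and four ground truth points.
raster = [[0, 1],
          [1, 1]]
points = [(0.5, 1.5, 0), (1.5, 1.5, 0), (0.5, 0.5, 1), (5.0, 5.0, 1)]
point_matrix = confusion_from_points(raster, 0.0, 2.0, 1.0, points, n_classes=2)
print(point_matrix)
```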

From this confusion matrix, other derived indicators can be computed. They are presented as quality calculations in the QualityML format.


The Llobregat delta area in three different maps

To demonstrate the feasibility of this methodology, we selected an area of about 433 km² in the delta of the Llobregat river in Catalonia (north-east of Spain). This area presents a wide variety of classes, and our research group had recently produced a LULC map from remote sensing, giving us a dataset to compare with. We requested the OSMlanduse and the OSM2LULC products from the Heidelberg and Coimbra teams respectively. Both OSM derived products used a CORINE Land Cover level 2 legend, but our map had a legend based on other legacy maps. The first step was therefore to harmonize the legends of the three maps by reclassifying the remote sensing based map to the nomenclature used by the other two LULC products.
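Legend harmonization of this kind is essentially a lookup from one set of class codes to another. The sketch below reclassifies hypothetical legacy codes to CORINE level 2 classes and then aggregates them to level 1. The legacy codes and the lookup table are invented for illustration; the two-digit encoding of CORINE classes (for example 11 for Urban fabric, 31 for Forests), in which the first digit is the level 1 class, is a common convention that is assumed here rather than taken from the text.

```python
from typing import Dict, List

NODATA = 0  # assumed code for pixels whose legacy class has no equivalent

# Hypothetical lookup: legacy class code -> CORINE level 2 code.
LEGACY_TO_CORINE_L2: Dict[int, int] = {
    1: 31,  # legacy "forest"       -> 3.1 Forests
    2: 21,  # legacy "crops"        -> 2.1 Arable land
    3: 11,  # legacy "urban"        -> 1.1 Urban fabric
    4: 12,  # legacy "industrial"   -> 1.2 Industrial, commercial and transport units
    5: 51,  # legacy "inland water" -> 5.1 Inland waters
}


def reclassify(pixels: List[int], lookup: Dict[int, int]) -> List[int]:
    """Translate every pixel to the target legend; unknown codes become NODATA."""
    return [lookup.get(value, NODATA) for value in pixels]


def corine_level1(level2_code: int) -> int:
    """Aggregate a two-digit level 2 code to its level 1 class (first digit)."""
    return level2_code // 10


legacy_pixels = [1, 3, 4, 9]
level2 = reclassify(legacy_pixels, LEGACY_TO_CORINE_L2)
level1 = [corine_level1(code) for code in level2]
print(level2, level1)
```

Keeping the level 2 codes and deriving level 1 on demand allows the same data to be compared at either level of the legend.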

Then, we included the three maps in the MiraMon WMS server and in the MiraMon web map browser. The map is rendered dynamically on the client side, so the same map can be presented in more than one style. This feature was used to present the data with both the CORINE Land Cover level 1 and level 2 legends. Fig. 1 shows the three LULC maps represented with the CORINE Land Cover level 1 legend. The first thing to notice is that OSM is an object-based map that was not designed to cover space continuously, which leaves many areas unclassified. This effect is absent in the remote sensing based LULC map, which provides comprehensive coverage of the selected area. Second, visual differences between the maps, which will result in off-diagonal values in the confusion matrices, are easily identified.

Fig 1.

The three land-use land-cover maps to compare. a) OSM2LULC b) OSMlanduse c) remote sensing map d) legend applied to the 3 maps.


The process of creating a confusion matrix starts by requesting the combination of both maps in a single layer. In Fig. 2, the land-use map based on OSM2LULC (Coimbra version) is compared with the remote sensing based map (CREAF-RS version). The result of the combination is shown in Fig. 3. In this example we only use the first level of the CORINE legend and although the number of combinations is 25, there are only 5 colors present, corresponding to the classes that are the same in both maps.

Fig 2.

Request for a layer combination of two land-use maps in the MiraMon GIS software.


Fig 3.

Layer combination of both land-use maps in the MiraMon GIS software.


Once the combination is done, we can request a confusion matrix (see Fig. 4). The diagonal values of the matrix (represented in green) correspond to the areas that have the same class in both maps. The off-diagonal values are the areas that have different classes in the two maps. We can also see which classes are most similar (artificial surfaces, and forest and semi natural areas) as well as the Kappa coefficient, which is 0.81 (the closer a Kappa coefficient is to 1, the better the agreement).

Fig 4.

Request for the confusion matrix result in the MiraMon GIS software.


Preliminary analysis of the data collected shows that, in general, the land-use maps provided by both teams have a reasonably good overall accuracy of about 80%. Some classes present more confusion than others. For example, the Urban fabric class is frequently confused with Industrial, commercial and transport units. We have also observed that validating land-use maps against land-cover classes sometimes detects discrepancies caused by an intrinsic difference: human interpretation of classes tends to emphasize land-use aspects, while remote sensing classes mainly show land cover. For example, a big park in the city is identified as artificial in the OSM land-use classes, while it is seen as a forest area from remote sensing due to its green land cover (see Fig. 5). Both interpretations can be considered correct.

Fig 5.

Zoom to an area of discrepancies due to different interpretation of land use and land cover.



The Llobregat delta area compared to ground truth data

To compare the maps with ground truth data, we used a web tool created by IIASA called LACO-Wiki. LACO-Wiki encapsulates the process of accuracy assessment and validation in a website that presents four simple steps: uploading a land-cover map (see Fig. 6), creating a sample from the map (see Fig. 7), interpreting the sample with very high resolution imagery (see Fig. 8) and generating a report with accuracy measures [18]. Steps 1, 2 and 4 are done by the owner of the campaign, but step 3 is done by volunteers. The graphical user interface is designed so that responding to each validation is easy, clear and simple. The interface shows only one point at a time on top of the selected imagery and reports the class that has been associated with it. The volunteer judges whether the classification is correct or incorrect and, if it is incorrect, can provide an alternative category (see Fig. 8). After the answer is provided, a new point is presented. With some practice, a single point can be done in half a minute and a set of 100 points can be processed in 30 minutes.

Fig 6.

Adding the OSMlanduse dataset to LACO-Wiki.


Fig 7.

Procedure for creating validation samples. Validation samples are created using random, stratified or systematic sampling.


Fig 8.

A point that is classified as Urban fabric is interpreted as incorrect and class Artificial, non-agricultural vegetated areas is proposed instead.


As a result, step 4 provides a point cloud containing the category assigned in each land-cover map and the actual class that volunteers have interpreted from the imagery that LACO-Wiki offers as a background (see Fig. 9). In addition, LACO-Wiki provides an Excel file with the confusion matrix generated from the points (see Fig. 10).

Fig 9.

Result of the Coimbra validation campaign as a map in the MiraMon GIS software.


Fig 10.

Result of the Coimbra validation campaign as a confusion matrix as provided by LACO-Wiki.




Conclusions

The metadata that producers provide contains little information about data quality. Most of the time, producers prefer to release comprehensive data quality reports that require deep knowledge of the characteristics of the data and of the production process. These reports are very specific and almost impossible to compare. The approach based on quality indicators proposed by ISO 19157 offers a better way forward. In view of this situation, we should provide users with the right tools to compare datasets. One possible approach is to generate Geospatial User Feedback based on the experiences of users and share it on a platform. This paper proposes an approach that enables users to run some of the ISO 19157 quality measures by themselves and to create their own indicators in a web map browser while they are inspecting the data. The spatial scope of the indicators is the visible area, so the results are adapted to the user's area of interest.

The implementation is possible by combining some functionalities added in HTML5 such as AJAX, binary arrays and canvas. Using a WMS service in a new way, it is possible to transmit arrays of real values at screen resolution to the client, enabling the client to make some analytical operations, in particular, to determine quality indicators. The paper focuses on the implementation of the confusion matrix and the derived aggregated indicators and applies them to land-cover maps produced by transformation of OSM data and from remote sensing.

The comparison between OSM derived land-cover products and remote sensing derived land cover generally shows a good match. Most discrepancies are related to the different nature of the data capture procedures. OSM based maps focus more on the interpretation of land use, while remote sensing is only sensitive to land-cover characteristics. Urban uses are the most affected by these discrepancies. The results suggest that a combination of OSM and remote sensing should reduce the uncertainty of an integrated product, as already suggested by other authors.

Calculated quality indicators are expressed in QualityML, which characterizes the dimension of the quality indicator, the individual uncertainties or errors measured, and the aggregation statistics or metrics. In a future development, the formal QualityML description obtained by a user will be shared with other users of the same map browser using the Geospatial User Feedback standard implemented in the NiMMbus system. The GUF standard already considers the possibility of sharing quality assessment results in the form of quality reports.

The results described here apply to confusion matrices and thematic maps, but the approach is of general use. We will continue adding quality measures to the ones already implemented in the web map browser. In particular, some quality measures specific to citizen science, which take advantage of the redundancy created by different users, should be useful. The open source code of the MiraMon web map browser presented here is available online.


Acknowledgements

We thank the two providers of the OSM land-cover maps, Michael Schultz (University of Heidelberg) and Cidália Fonte and Joaquim Patriarca (University of Coimbra), for the provision of the OSM land-use land-cover maps, and those responsible for LACO-Wiki for their support with the validation website. This work has been done under the H2020 Ground Truth 2.0, WeObserve and NextGEOSS projects, which have received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreements No 689744, No 776740 and No 730329 respectively.



References

[1] Hariharan, R., Shmueli-Scheuer, M., Li, C. and Mehrotra, S., “Quality-driven approximate methods for integrating GIS data,” in Proceedings of the 13th annual ACM international workshop on Geographic information systems, Association for Computing Machinery, 97–104 (2005).

[2] Devillers, R., Bédard, Y. and Jeansoulin, R., “Multidimensional Management of Geospatial Data Quality Information for its Dynamic Use Within GIS,” Photogrammetric Engineering & Remote Sensing, 81 (2), 205–215 (2005).

[3] Devillers, R., Gervais, M., Bédard, Y. and Jeansoulin, R., “Spatial data quality: from metadata to quality indicators and contextual end-user manual,” OEEPE/ISPRS Joint Workshop on Spatial Data Quality Management, 45–55 (2002).

[4] Moroni, D. F., Ramapriyan, H., Peng, G., Hobbs, J., Goldstein, J., Downs, R., Wolfe, R., Shie, C.-L., Merchant, C. J., Bourassa, M., Matthews, J. L., Cornillon, P., Bastin, L., Kehoe, K., Smith, B., Privette, J. L., Subramanian, A. C., Brown, O. and Ivanova, I., “Understanding the Various Perspectives of Earth Science Observational Data Uncertainty,” report, ESIP (2019).

[5] Goodchild, M. F., “Beyond metadata: towards user-centric description of data quality,” presented at 5th International Symposium for Spatial Data Quality (ISSDQ 2007) (2007).

[6] Masó, J. and Bastin, L., “OGC Geospatial User Feedback Standard: Conceptual Model, Ver. 1.0,” OGC 15-097r1 (2016).

[7] Hunter, J., Alabri, A. and Ingen, C. van, “Assessing the quality and trustworthiness of citizen science data,” Concurrency and Computation: Practice and Experience, 25 (4), 454–466 (2013).

[8] Alabri, A. and Hunter, J., “Enhancing the Quality and Trust of Citizen Science Data,” in 2010 IEEE Sixth International Conference on e-Science, 81–88 (2010).

[9] Strahler, A., Boschetti, L., Foody, G., Friedl, M., Hansen, M., Herold, M., Mayaux, P., Morisette, J., Stehman, S. and Woodcock, C., “Global Land Cover Validation: Recommendations for Evaluation and Accuracy Assessment of Global Land Cover Maps,” (2008).

[10] Pontius Jr, R. G. and Millones, M., “Death to Kappa: birth of quantity disagreement and allocation disagreement for accuracy assessment,” International Journal of Remote Sensing, 32 (15), 4407–4429 (2011).

[11] Salk, C., Fritz, S., See, L., Dresel, C. and McCallum, I., “An Exploration of Some Pitfalls of Thematic Map Assessment Using the New Map Tools Resource,” Remote Sensing, 10 (3), 376 (2018).

[12] Haklay, M. and Weber, P., “OpenStreetMap: User-Generated Street Maps,” IEEE Pervasive Computing, 7 (4), 12–18 (2008).

[13] Schultz, M., Voss, J., Auer, M., Carter, S. and Zipf, A., “Open land cover from OpenStreetMap and remote sensing,” International Journal of Applied Earth Observation and Geoinformation, 63, 206–213 (2017).

[14] Patriarca, J., Fonte, C. C., Estima, J., de Almeida, J.-P. and Cardoso, A., “Automatic conversion of OSM data into LULC maps: comparing FOSS4G based approaches towards an enhanced performance,” Open geospatial data, softw. stand., 4 (1), 11 (2019).

[15] Leibovici, D. G., Williams, J., Rosser, J. F., Hodges, C., Chapman, C., Higgins, C. and Jackson, M. J., “Earth Observation for Citizen Science Validation, or Citizen Science for Earth Observation Validation? The Role of Quality Assurance of Volunteered Observations,” Data, 2 (4), 35 (2017).

[16] Meek, S., Jackson, M. and Leibovici, D. G., “A BPMN solution for chaining OGC services to quality assure location-based crowdsourced data,” Computers & Geosciences, 87, 76–83 (2016).

[17] Masó, J., Zabala Torres, A., Serral, I. and Pons, X., “Remote Sensing Analytical Geospatial Operations Directly in the Web Browser,” ISPRS Archives, 624 403–410 (2018).

[18] See, L., Laso Bayas, J. C., Schepaschenko, D., Perger, C., Dresel, C., Maus, V., Salk, C., Weichselbaum, J., Lesiv, M., McCallum, I., Moorthy, I. and Fritz, S., “LACO-Wiki: A New Online Land Cover Validation Tool Demonstrated Using GlobeLand30 for Kenya,” Remote Sensing, 9 (7), 754 (2017).
© (2020) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Joan Masó, Núria Julia, Alaitz Zabala, Ester Prat, Johannes van der Kwast, and Cristina Domingo-Marimon "Assess citizen science based land cover maps with remote sensing products: the Ground Truth 2.0 data quality tool", Proc. SPIE 11524, Eighth International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2020), 115241M (26 August 2020);
