Statistics Explained

Merging statistics and geospatial information, 2018 projects - Norway


Pilots for land use/land cover change and crop identification tested on a local and regional level, as well as the use of new statistics for planning and analysis; 2018 project; final report 22 December 2020



This article forms part of Eurostat’s statistical report on the Integration of statistical and geospatial information.


Problem

  1. Ordinary map products record changes in land use/cover relatively late.
  2. A better understanding of how agriculture affects local ecosystems can come from accurate information about where various crops are grown. A particular interest is the need for pollination for certain crops.

Objectives

Action 1

  • Improve land use maps: see if planned developments that are underway can be identified before they are mapped in ordinary map products, which may take several years.
  • Develop land use change statistics: see if the construction year can be identified for the mapped objects by combining satellite images and map objects and observing changes in the ‘green share’ for individual polygons.
  • Estimate changes in the green share: see how new developments in urban areas change green structures over time.

Action 2

  • Apply an existing crop type recognition system with radar satellite data (developed at the University of Darmstadt) to identify which parcels of land are used for specific crops.
  • Georeference existing data on land used for specific crops at the most detailed level possible in order to validate results and identify possible adjustments to the model.

Action 3

Look at new land use statistics for planning and for analysis in the context of the system of environmental-economic accounting – experimental ecosystem accounting (SEEA EEA).

Method

Action 1

Statistics Norway annually produces a detailed land use map. Based on maps for two or more years, it is possible to identify newly developed areas. When combining the land use map with municipal plans, areas that are planned for development (but not yet developed) can also be identified. In this project, these map data have been combined with information from satellite images, interpreted to show vegetation and built-up areas.

Expert staff from the Norwegian Institute for Nature Research (NINA) trained staff at Statistics Norway in the techniques needed for Statistics Norway to be able to perform satellite image analysis independently. This involved investigating how factors such as weather, clouds, training areas and number of images affect the results, performing quality tests and learning how to interpret the test results.

A variety of sources were used.

  • Satellite images are from Sentinel-2, which records 13 spectral bands. The analysis is based on the normalised difference vegetation index (NDVI) calculated from spectral bands 4 and 8. These bands for vegetation monitoring have a resolution of 10 m. The raw images are in square tiles with a side length of 100 km, of which 90 cover Norway; the project focused on tile 32 VNM, which covers built-up areas (5 % of the area; including Oslo), forests (72 %), agricultural areas (8 %) and lakes and rivers (7 %). Older images (from 2015 to 2017) are Level-1C orthoimages (top of atmosphere). These have been converted to Level-2A images (bottom of atmosphere), which is also the level used for images from 2018 to 2020. Images at various dates have been used, with two dates in 2015, four dates in each year from 2017 to 2019 and six dates in 2020. All of these dates were in the period from 1 June to 19 September. In addition to the main images, scene classification maps with a 60 m resolution, in which clouds can be identified, have been used.
  • The land resource map uses a classification of the Norwegian Institute for Bioeconomics (NIBIO): fully cultivated land, surface cultivated land, infield pasture, forest, bog, open land, water, snow/glacier, built-up area, transport and not mapped. It largely corresponds to a scale of 1 : 5 000. The map undergoes continuous updating in the municipalities as well as periodic national updating. Each municipality goes through the periodic update once every 4–5 years.
  • Statistics Norway’s land use map is an annual map of built-up areas, generally with a scale of 1 : 5 000. A wide variety of sources are used, giving priority to better quality sources. Within this project, this land use map was used to find newly built-up areas, areas that were planned to be built up (but not yet developed) and areas that had not changed in recent years. Annual versions of the map from 2016 to 2020 were used.
  • Statistics Norway’s buildings map is obtained from the cadastre. Combined with other sources, this map provides building outlines; around 2 % are estimated as a buffer around a point. This source should provide dates for building permits and when a building has been taken into use, although some are missing. Quality is better from 1984 (when the register was created) and better still from 2000.
  • Municipalities have area zoning plans which are downloadable. The data used in this project include municipal plans for 88 % of the land area within tile 32 VNM.
  • Aerial photos have been used in this project for quality checking. Norgeibilder.no provides an archive of georeferenced aerial photographs. Most areas are photographed at least once every five years, with urban areas photographed annually in recent years. Images for map construction are normally taken after snow melting and before trees are in leaf; other photos are taken in the summer.
A map of Norway showing counties and Sentinel-2 tiles (100 kilometre grid) based on satellite data.
Figure 1: Sentinel-2 tiles, Norway

From the satellite images, an NDVI value was assigned to each pixel. Data were then combined into classes using training datasets. This was done twice, once by Statistics Norway and once by NINA.
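The NDVI step can be sketched as follows. This is a minimal illustration, assuming the standard definition NDVI = (NIR - red) / (NIR + red), with Sentinel-2 band 8 as near-infrared and band 4 as red. The function names and the treatment of pixels with no signal are illustrative assumptions, not details taken from the project.

```python
def ndvi(red, nir):
    """Return the NDVI of one pixel from red (band 4) and near-infrared (band 8) reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band gets NDVI 0.
        return 0.0
    return (nir - red) / total


def ndvi_grid(red_band, nir_band):
    """Apply ndvi() cell by cell to two equally sized grids (lists of rows)."""
    return [
        [ndvi(red, nir) for red, nir in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]
```

Dense vegetation reflects strongly in the near-infrared and weakly in red, so its NDVI approaches 1, whereas built-up surfaces and water give values near or below 0.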

Quality control was done using random points from polygons with a known cover; selected points were not part of the set used for training. For example, 84 % of built-up land had been correctly identified from the satellite images, whereas the rest had been identified as agricultural or vegetation; equally, 6 % of agricultural land had been interpreted as built-up land. A further analysis indicated that large areas of forest and built-up land were correctly interpreted.

  • The errors were generally in so-called patchwork areas within urban areas with small built-up and vegetated areas. An analysis focusing on the aggregate results for these areas can identify discrepancies between years.
  • By focusing on properties (rather than individual buildings), the date of the last major development can be identified; changes in land cover that appear after that date can therefore be treated as a sign of possible interpretation problems.
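The point-based quality control described above amounts to tabulating, for each known cover type, how the control points were interpreted. A minimal sketch, assuming each control point is available as a pair of labels (the function name and input layout are hypothetical):

```python
from collections import Counter, defaultdict


def interpretation_shares(points):
    """Share of control points per (known cover, interpreted cover) pair.

    points: iterable of (known_cover, interpreted_cover) tuples, one per
    random control point that was not used for training.
    """
    counts = defaultdict(Counter)
    for known, interpreted in points:
        counts[known][interpreted] += 1
    shares = {}
    for known, row in counts.items():
        total = sum(row.values())
        shares[known] = {cover: n / total for cover, n in row.items()}
    return shares
```

With this layout, the value for the pair (built-up, built-up) corresponds to the kind of figure quoted above for correctly identified built-up land.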

Another check was a comparison between the interpretation of the satellite images and aerial photographs, with objects compared based on a combination of criteria and random selection. Differences may be due in part to timing (different stages of the growing season) and to artificial green surfaces (such as artificial sports pitches).

Information from satellite images was combined with map objects. In this way, the share of different types of land cover can be calculated for individual land use objects. These shares can then be analysed over time.
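As a rough sketch of this combination, assume that the interpreted class of every pixel falling inside a land use object is already known for each year. The class names counted as green and the data layout are assumptions made for illustration.

```python
GREEN_CLASSES = frozenset({"trees", "grass"})  # assumption: classes counted as green


def green_share(pixel_classes):
    """Fraction of a polygon's pixels whose interpreted class is green."""
    if not pixel_classes:
        return None  # no pixels fall inside the polygon
    green = sum(1 for cover in pixel_classes if cover in GREEN_CLASSES)
    return green / len(pixel_classes)


def green_share_change(pixels_by_year, first_year, last_year):
    """Change in green share between two years (negative means greyer)."""
    first = green_share(pixels_by_year[first_year])
    last = green_share(pixels_by_year[last_year])
    if first is None or last is None:
        return None
    return last - first
```

A polygon whose green share falls between two years has become greyer; one whose share rises has become greener.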

A stacked bar chart showing forest productivity for Tile 32 V N M and for Oslo/Akershus.
Figure 2: Forest within study areas, by forest productivity. Tile 32VNM and Oslo/Akershus

Weather was shown to have an impact on classification. A comparison of the 2015 and 2017 results, which relied only on one or two images each year, shows that part of the differences in classifications reflects weather conditions (such as rainfall) in the period leading up to the date of the images: the two weeks before the 2015 image had been dry and relatively warm compared with somewhat milder and considerably wetter weather before the 2017 image. The interpretation of the 2017 images had a notably higher share for trees (among areas that were built-up or unchanged between 2016 and 2019).

Action 2

The study area is the former Akershus county and the neighbouring municipalities of Lier, Hole, Jevnaker, Lunner, Oslo and Østre Toten. The neighbouring municipalities provide more areas where fruit/berries and other plants are grown that benefit from pollination.

A variety of sources were used.

  • Radar satellite data from Sentinel-1 are compiled at two-day intervals over southern Norway. Radar signals pass through clouds and collect information about the texture of the surface. The Sentinel-1 signals are sent and received in horizontal and vertical directions, giving an effect known as dual polarisation, which is useful for mapping vegetation; VV and VH polarisations have been used. In total, 382 images have been used for the period from 1 September 2018 to 15 October 2019. Each pixel covers an area of 10 m x 10 m. Images have been produced from six different orbits.
  • The land resource map uses a classification of the NIBIO: fully cultivated land, surface cultivated land, infield pasture, forest, bog, open land, water, snow/glacier, built-up area, transport and not mapped. It largely corresponds to a scale of 1 : 5 000. In this study, only fully cultivated land areas have been used to delimit the area for the analysis of the satellite data and – in combination with property boundaries – to find areas for which applications have been made for production subsidies.
  • The cadastre of the Norwegian Mapping Authority, updated by municipalities, identifies all properties and parcels of land. Within this study, the cadastre has been used to link subsidy applications to properties.
  • Statistics Norway’s land use map is an annual map of built-up areas, generally with a scale of 1 : 5 000. A wide variety of sources are used, giving priority to better quality sources. Within this project, this land use map was used to remove fully cultivated land areas that have been built-up since the most recent update of the land resource map.
  • The agricultural register of the Norwegian Agricultural Agency contains information about agricultural properties, persons and holdings, and the connection between these units. In this study, the 2015 and 2017 editions of agricultural properties have been used.
  • Applications for governmental subsidies concern holdings linked to an agricultural property in the agricultural register. The application includes information on all of the holding’s agricultural area, including any that is temporarily out of use and any rented land – see below. The information concerns the cadastral number of properties and the size and type of land (whether the area is fully cultivated, surface cultivated or infield pasture). In addition, the applications contain administrative information such as the name of the holding, organisation number and address. The holding also states which crops are grown on the areas for the holding as a whole (not for individual properties). As it is common for a holding to own or rent land from several properties and for a property to be rented out to several holdings, the subsidies register has many-to-many relationships between holdings and properties. This means that the agricultural land belonging to a property can be linked to several different holdings, but without specifying which part of the property the different holdings operate. This limits the ability to locate crop types correctly at parcel level.
  • As training data in the first round of the analysis, 139 georeferenced photos of different crops were used. These photos were taken during the 2019 growing season on a few days in early July and late August/early September. The photos were mainly taken in the municipalities of Østre Toten, Eidsvoll, Nes, Frogn and Ås. Most types of agricultural crops grown in the study area were photographed, but with a majority of the observations for the most common cereals, as well as meadows and potatoes. The crops were identified as accurately as possible but in some cases were simply marked as vegetables or as wheat, with no more detail.
  • To obtain more observations for the training datasets (in later stages of the analysis), information was also obtained from the aerial photo archive of Norgeibilder.no. Aerial photos from this archive were used in combination with fully cultivated land area information from the land resource map and applications for subsidies to identify plots where fruit and berries are grown. The aerial photos in the study area were taken in the period 2016 to 2019.
  • To obtain more observations for the training datasets (in later stages of the analysis), in some cases information has been obtained from Google Street View for recordings in summer/autumn 2019. The recordings were used to double-check information about the type of crops from the production subsidy applications.
  • Ideally, observed minimum and maximum daily temperatures for a detailed grid would be used, but the Norwegian Meteorological Institute produces such datasets with a delay of about a year. Instead, maps with daily weather forecasts were used.

In terms of crop type recognition, agricultural areas experience significant changes within short time intervals due to the phenology of plants. Phenology refers to the various stages of plant development, such as germination, leaf development and flowering. The method used to identify different agricultural crops is based on in-depth knowledge of the phenology of the various agricultural crops, with six different phenological stages identified for each crop. The nature of soil before germination and after harvest can also help to provide a unique identification of a crop. Radar satellite images were used to identify phenological stages and sequence patterns, which were then used to estimate the probability of different crops on the ground.

The first round of interpretation was based on just satellite images, without knowledge of land or property boundaries. The interpretation of the images divided large agricultural areas into different plots. While the classification was not necessarily accurate, the boundaries between the plots matched very well with aerial photographs.

Georeferencing applications for subsidies can in part be done at the plot level and in part at the holding level; aggregated data for municipalities and the total study area can be used for verifying results.

  • If a holding grows only one crop, and does so on properties that it does not share with other holdings, it is possible to provide the correct location of the crop at the plot level. This information can be used to help the algorithms to recognise different crop types better.
  • If a holding cultivates several types of crops, these cannot be georeferenced at the plot level. If the holding does not share land with other holdings, it is possible to compare the land used for various crops for the holding as a whole. A small part of this dataset was used to find exact locations for certain crops, which in turn can be used to improve crop identification. However, most of the dataset was used for validating results.
  • Since crop types are given for holdings within the subsidy applications register, aggregation to the municipal level from data for holdings introduces an error if holdings cultivate land across municipality borders. For five municipalities (Lier, Hole, Lunner, Hurdal and Østre Toten) the proportion of cultivated land in each municipality used by holdings in the same municipality was 95 % or higher.
  • An analysis similar to that for municipalities was done for the whole study area. Holdings located within the study area received subsidies for 94 % of the study area’s total agricultural area. An estimated 5 % of the area is used for meadows. Only a very small part of the study area (8 km²) is cultivated by holdings outside of the study area, while a similarly small area (11 km²) is located on properties outside of the study area but cultivated by holdings within the study area.
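The rules in the list above can be expressed as a small decision function. This is a sketch under stated assumptions: the subsidy applications are reduced to (holding, property) links, and the identifiers, function names and level labels are hypothetical. The case of a holding that shares land with others is not spelled out for plot-level use in the text, so it is treated here as usable only in aggregated comparisons.

```python
from collections import defaultdict


def shared_properties(links):
    """Properties linked to more than one holding.

    links: iterable of (holding_id, property_id) pairs from the applications.
    """
    holdings_per_property = defaultdict(set)
    for holding, prop in links:
        holdings_per_property[prop].add(holding)
    return {prop for prop, holdings in holdings_per_property.items() if len(holdings) > 1}


def georeferencing_level(holding, crops, links):
    """Most detailed level at which a holding's crops can be located."""
    shared = shared_properties(links)
    own = {prop for h, prop in links if h == holding}
    if own & shared:
        return "aggregate"  # land shared with other holdings: municipality or study area only
    if len(crops) == 1:
        return "plot"  # single crop on unshared land: every plot carries that crop
    return "holding"  # several crops on unshared land: compare totals per holding
```

Holdings returned as "plot" are candidates for training data, whereas "holding" and "aggregate" cases mainly serve validation.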

The results of the satellite data analysis are 10 m x 10 m pixels. Those that lie on the border between properties are cut along the boundaries, so that parts go to different properties. The area of the various crops can then be aggregated to various geographical levels and compared against information from the applications for subsidies. For each crop, the percentage accuracy of interpretation (compared with the subsidy applications data) was calculated for:

  • the area within which a holding’s total area of different crops is known, even though it is not known which plots are used for which crops;
  • individual municipalities which have a small degree of ownership/tenancy spreading across municipal boundaries;
  • the entire study area.
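Cutting pixels along property boundaries and summing crop areas can be illustrated as follows. As a simplifying assumption, properties are modelled here as axis-aligned rectangles; real cadastral parcels are arbitrary polygons and would need a GIS library. All names are hypothetical.

```python
from collections import defaultdict


def overlap_area(a, b):
    """Overlap of two rectangles given as (xmin, ymin, xmax, ymax), in square metres."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def crop_area_by_property(pixels, properties, size=10):
    """Sum crop area per (property, crop), splitting pixels that straddle a boundary.

    pixels: iterable of (x, y, crop), with (x, y) the lower-left corner in metres.
    properties: {property_id: (xmin, ymin, xmax, ymax)} (rectangles, an assumption).
    """
    areas = defaultdict(float)
    for x, y, crop in pixels:
        pixel_rect = (x, y, x + size, y + size)
        for prop, rect in properties.items():
            area = overlap_area(pixel_rect, rect)
            if area > 0:
                areas[(prop, crop)] += area
    return dict(areas)
```

The resulting areas can then be summed per holding, municipality or study area before comparison with the subsidy applications.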

The analysis was performed in three rounds.

  • Initially, 139 geotagged photos of different types of crops were used as training data to aid the algorithms to recognise various crops. The photos were attached to points.
  • In the second round, the training dataset consisted of plots with unambiguously mapped agricultural crops. These areas covered about 100 km² and were evenly distributed in the study area. Common crop types such as meadows and barley were overrepresented. Crops with a low distribution (such as peas, strawberries and oilseed) could not be georeferenced this way. Some areas were also included from the dataset where crops are located for holdings (rather than plots), and these were manually georeferenced.
  • The third round had the largest and most varied training dataset, consisting of 113 km² of uniquely located areas, using the same data as the first two rounds as well as extra data specifically for the least common crops.

Action 3

The purpose of the SEEA EEA is to develop a spatial and ecological basis for assessments of ecosystem services, based on spatially explicit extent accounts (area/quantity of ecosystems) and biophysical condition accounts (ecological state / quality of ecosystems).

A diagram showing the structure of the system of environmental-economic accounting – experimental ecosystem accounting for extent accounts and condition accounts, as a basis to model capacity and the use of ecosystem services.
Figure 3: Structure of SEEA EEA: extent accounts and condition accounts, as basis to model capacity and use of ecosystem services
Adapted from Maes et al. (2018)

Land use statistics are detailed, providing information about types of urban developments and built-up areas. However, other than clearly delineated green areas (such as parks and sports areas), green elements are not always visible as ‘urban green’ areas are classified by use as developed land. Within the urban area, developed land is not only built-up land (with buildings and infrastructure), but also lawns and so on. By combining data sources, it is possible to estimate the extent of urban green areas (such as trees and lawns) as land cover on developed land.

Sentinel-2 data and the land use/cover data (see Action 1 for more information on these sources) provide different statistics on the share of urban green. Areas which are known to be developed, but for which the land cover is unknown, may include green areas. Data for Oslo were analysed as an example for identifying urban green areas.

  • The land use/cover statistics differentiate between buildings (close to 45 % of the total) and infrastructure (mostly roads, about 25 %). Of the non-built-up land in developed areas, a substantial part has tree cover, but the share was not known from this source.
  • Sentinel-2 overestimates water, due to misclassification of building shadows as water, but probably has better estimates for grassland and trees.
  • Combining the two sources draws on the strengths of both.
A stacked bar chart comparing the information content of a map for land use/land cover and a map based on Sentinel-2 satellite data. Data are shown for the old city centre, Karl Johan Street, Oslo.
Figure 4: Old city centre, Karl Johan Street, Oslo. Comparing the information content of the map of land use/land cover and the map of Sentinel-2 satellite data

Ås municipality was selected as the focus to identify the usefulness of this kind of data for municipal planning. This municipality covers 103 km² and has 27 088 inhabitants. The planning challenges are special and interesting because Ås is an expanding university town with the opportunities for growth, innovation and new local character that this entails; it is located in a very rich agricultural landscape.

  • A 1 km² area of Ås centre was selected for comparison with other urban sites in the Oslo region and maps produced based on a variety of sources (including land use/cover and satellite data, among others).
  • Most of this type of information was already available to the municipal authorities from public maps, registers and their own databases, but these analyses were received positively. The greatest interest was to have a map for the whole municipality.
  • Public green space and recreation areas are important for public health and are also an important topic in the municipal plan.

Based on the work in Action 2, a map of agricultural production in Ås was developed at a detailed level. This identified several types of agricultural production that need pollination.

  • The distribution of insect pollinators was estimated by modelling the habitat suitability for bees across the Ås municipality, using habitat suitability scores specific to southern Norway’s ecosystems and bee communities. Suitability concerned the availability of floral resources and nesting sites and attempted to capture variation in the temporal availability of floral resources.
  • Crops whose production is demonstrably sensitive to receiving cross pollination from insects were identified. For the Ås municipality, these included strawberries, cabbage, potatoes, onions/leeks, rapeseed/canola, fruits/berries, vegetables and carrots.
  • Habitat suitability scores were used to determine whether areas with high pollinator suitability could be found within a 500 m buffer of the crops’ locations.
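The buffer check in the last step can be sketched as a simple distance test. As simplifying assumptions, each crop parcel is represented by a single point and the habitat map by scored cell centres, with coordinates in metres; the score threshold for "high" suitability is a hypothetical parameter that the caller must choose.

```python
import math


def has_habitat_within(crop_xy, habitat_cells, min_score, radius_m=500.0):
    """True if any habitat cell scoring at least min_score lies within radius_m of the crop.

    crop_xy: (x, y) in metres; habitat_cells: iterable of (x, y, score).
    """
    return any(
        score >= min_score and math.dist(crop_xy, (x, y)) <= radius_m
        for x, y, score in habitat_cells
    )
```

A parcel for which this returns False would have no highly suitable pollinator habitat inside the 500 m buffer.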

Results

Action 1

A combination of the interpreted satellite images, Statistics Norway’s land use map and municipal zoning plans made it possible to identify areas that were not built-up, with a distinction between those where further development is unlikely (for example, due to size or shape) and those where further development appears less constrained. The latter areas covered 194 km² within the study area. A comparison of the balance of the green and grey (not green) shares between 2017 and 2019 for these areas identified those areas which had become greyer/greener. About 5 % had become greyer and 2 % greener. Using the zoning plans, these areas can be classified by purpose, such as residential, industrial/commercial, holiday homes and so on.

Within the study area, there were 13.5 km² that appeared to be newly developed during 2018 and 2019. Buildings accounted for 37 %, roads for 10 % and the rest concerned other built-up areas (such as sports facilities and industrial areas). For these areas, an analysis of the balance of the green and grey (not green) shares was conducted using satellite images for 2017 and 2020.

  • For areas with buildings, about 40 % of the area became greyer. A similar analysis was done using information on the year of construction. In those cases where there were new construction works but the green share had not fallen greatly, one explanation could be that vegetation had already been removed at an earlier date (by 2017) in preparation for the construction works.
  • For roads, information on the year of construction is not available, and so aerial photographs were used to confirm whether roads declared as new on maps were in fact new or just newly registered. From this, roads were assigned a construction year of 2019 or 2018 (for new roads) or earlier (for newly registered). An analysis of the change in the balance of the green and grey (not green) shares shows that the areas for recently constructed roads became significantly greyer between 2017 and 2020, whereas older roads that were only recently registered became significantly greener.
  • About half of the 7 km² of other developments concerned industry and quarrying. A large proportion had little change in their green share, although it should be noted that these developments tend to have longer delays before being recorded. No systematic checks were done for these, but some specific changes were investigated.

The final part of this action concerned newly built-up objects, in other words those where the change in the green share between 2015 and 2017 was more than 20 %. The study area was restricted to Oslo and Akershus.

  • Most of the newly developed area, about 2.4 km², was previously forest, with most of the rest of the 4.2 km² total split between agricultural area and open firm ground. An analysis was also made of the proportion of trees, grass, agricultural area, built-up areas and water within each of these areas.
  • The largest part of the newly developed area, about 1.0 km², was used for detached individual houses. Other (not detached) single-family homes covered 0.4 km²; multi-dwelling buildings occupied almost as much, as did roads. Commercial buildings (such as offices, businesses and industry) were found on about 0.7 km², while new sports facilities did not cover more than 0.1 km². The remaining newly developed area was other buildings (0.5 km²) or unclassified (0.8 km²; without buildings or other objects to classify the use, possibly indicating preliminary works prior to construction).
  • A utilisation rate was calculated as the ratio of the area of buildings to the built-up area (excluding roads) within each property. This was considered to be too low compared with data for areas that had been built previously, implying that some building work may not have been completed.
  • The time for green areas to reach a stable level after development was compiled from an analysis of the green share combined with information on the age of the main building on each property. For each land use class of buildings (such as detached individual houses and multi-dwelling buildings) and for a typical range of utilisation rates, the development in the green share over time in years (five-year averages) was estimated, using the average green share observed in satellite images in 2015 and 2017. The green shares were relatively stable for properties whose main building was at least 25 years old.
  • The proportion of trees, grass, agricultural and built-up areas within each of the land use classes was calculated using averages from the 2015 and 2017 satellite images. For land use categories of housing, only those areas where the main building was built before 1990 were considered. This distribution indicates the proportion of these types of cover that can be expected in the future for new developments for each of the land use classes.
    • Sports facilities and areas with detached houses appear to have the greenest areas: about 60 % of the area is covered by vegetation.
    • Built-up areas for industry, commerce and services have the highest proportion of built-up area, around 90 %.
  • From all of the above information, it is possible to calculate the expected vegetation cover 30 years after development and compare this with the known vegetation cover before developments. Concerning areas developed in 2016 and 2017, the area covered by trees could be expected to fall from 67 % to 28 %, that of grass from 12 % to 11 %, that of the cultivated area from 17 % to 0 %; the built-up area’s share could be expected to rise from 4 % to 60 %.
A line chart showing, for developments during 2016 and 2017, the expected distribution of green surfaces and built-up areas in these newly developed areas over the next 30 years.
Figure 5: Distribution of green surfaces and built-up areas in newly developed areas over the next 30 years for developments during 2016 and 2017; expected development based on observations in older developments
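One way to arrive at such an expected distribution is an area-weighted average of the cover shares observed in older developments. The source does not spell out the exact calculation, so the weighting below is an assumption, and the function and argument names are hypothetical.

```python
def expected_cover(area_by_class, mature_cover_by_class):
    """Area-weighted expected land cover shares for a set of new developments.

    area_by_class: {land_use_class: newly developed area}
    mature_cover_by_class: {land_use_class: {cover_type: share in older developments}}
    """
    total_area = sum(area_by_class.values())
    expected = {}
    for land_use, area in area_by_class.items():
        for cover, share in mature_cover_by_class[land_use].items():
            expected[cover] = expected.get(cover, 0.0) + share * area / total_area
    return expected
```

The result can be set against the cover observed before development to show the expected loss of trees, grass and cultivated area.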

Action 2

Accuracy increased sharply when moving from using georeferenced photos (the first round of analysis), to using training areas taken from applications for subsidies. In the first attempt, 61 % of the area was identified with the correct type of crop, while in both the second and third rounds a share of 82 % was recorded.

When using the validation dataset for comparison, meadows had the highest accuracy with 95 %. Oats were classified correctly for 90 % of the area, barley for 84 % and winter wheat for 82 %. Rye and spring wheat had a much lower accuracy, 48 % and 29 %, respectively. These shares represent the correct identification of crops. In addition, there are areas that were classified as one crop but were in fact another crop: these result in an overestimation for one crop and an underestimation for another.

  • For meadows, the underestimation was particularly small, but the overestimation was relatively large.
  • For oats, underestimation and overestimation were similar and cancel out to a large extent.
  • By contrast, for spring wheat the underestimation was considerably larger than the overestimation.
A stacked bar chart showing the area of various crops that are grown on a large scale that were underestimated, correctly classified and overestimated. Data are shown for barley, oats, spring wheat, winter wheat, rye and meadows.
Figure 6: Underestimation, correctly classified and overestimation for crops that are grown on a large scale (1) (decares)
(1) Fully cultivated areas for which no subsidy has been applied for are included in the area for meadows.
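The three quantities shown in the figure can be derived from a comparison of reference and classified crops area by area. A minimal sketch, assuming the comparison is available as (reference crop, classified crop, area) records; this layout and the function names are hypothetical.

```python
from collections import defaultdict


def crop_area_summary(records):
    """Correct, underestimated and overestimated area per crop.

    records: iterable of (reference_crop, classified_crop, area) tuples, where
    reference_crop comes from the validation data.
    """
    summary = defaultdict(lambda: {"correct": 0.0, "under": 0.0, "over": 0.0})
    for reference, classified, area in records:
        if reference == classified:
            summary[reference]["correct"] += area
        else:
            summary[reference]["under"] += area  # reference crop was missed here
            summary[classified]["over"] += area  # classified crop was wrongly assigned here
    return dict(summary)


def accuracy(entry):
    """Share of a crop's reference area that was classified correctly."""
    reference_area = entry["correct"] + entry["under"]
    return entry["correct"] / reference_area if reference_area else None
```

When under- and overestimation for a crop are of similar size, its total estimated area stays close to the reference area even though individual plots are misclassified.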

The results for crops grown to a minor extent are slightly uncertain due to a small comparative basis at the level of holdings. Oilseeds have a high degree of accuracy (83 %). For the rest of the crops, the accuracy was much lower than found for cereals and meadows, with correctly classified shares of less than 60 %, falling to 35 % for strawberries and 17 % for fruits and berries. Given their small area, misclassification on some plots can have a major impact. Generally, the sample covers too small an area to draw conclusions about true accuracy.

A stacked bar chart showing the area of various crops that are grown to a lesser extent that were underestimated, correctly classified and overestimated. Data are shown for potatoes, vegetables, peas, canola, strawberries, and fruits and berries.
Figure 7: Underestimation, correctly classified and overestimation; crops grown to a lesser extent (decares)

Within the fully cultivated area, comparing, for each crop, the area for which subsidies were applied for with the area identified from the satellite data is particularly important for crops grown on a small scale, as the validation dataset for these was small.

  • The area for the diverse category of vegetables was greatly underestimated, as the satellite image interpretation identified only two thirds of the area for which subsidies were applied. This may be due to differences in topography, temperatures, sowing time and so on, but it may also be due to a varied phenology within this diverse category.
  • For strawberries, fruit and other berries, there was a good level of consistency in the results between the areas from subsidy applications and satellite data; this was also the case for oilseeds and potatoes.

Action 3

Efforts to establish land accounts are strongly supported locally, regionally and nationally in Norway, and international work in this field is important in order to give advice on standard approaches for possible development of ecosystem accounting. Such accounting is of high interest for Norwegian policymakers and interest groups.

The analysis of pollinators shows that pollinator-dependent crops grown in the Ås municipality are within reach of wild bee pollinators. All parcels of all crop types were within 500 m of areas containing optimal or near-optimal habitat, indicating that the landscape surrounding pollinator-dependent crops is likely capable of supporting populations of wild bee species outside of the period when crop plants are flowering.
