Statistics Explained

Merging statistics and geospatial information, 2020 projects - Sweden


Road map for improved geospatial quality in the Swedish business register; 2020 project; final report 23 December 2022


This article forms part of Eurostat’s statistical report on the Integration of statistical and geospatial information.

Problem

Growing demand for data from different user groups, combined with advances in geospatial processing methods and technical infrastructure, has led to increased use of geocoded data from the business register system for integrated analyses (such as accessibility/proximity studies or environmental accounts). Insufficient geospatial quality is an obstacle to harnessing the full potential of the information.

Objectives

The general objective of the project is to improve geospatial quality in Statistics Sweden’s business register system. The result of the project should be a road map for direct implementation, providing:

  • a clear and comprehensive road map describing the inflow and outflow of address information to the business register system;
  • a full description of the current processes for altering, improving and correcting address information from the business register system;
  • a critical review and identification of weak spots, gaps and methodological drawbacks;
  • a comprehensive list of solutions for improved quality;
  • a set of quality objectives for the geocoding quality of the business register system, together with quality indicators to measure and monitor progress.

The project focuses on improving processes within the statistics office and in other government agencies that handle address data for local units.

Method

The project started with a phase of exploration, description and evaluation. Its aims were to build a common understanding among all parties; to collect information within Statistics Sweden and from external partners (such as the Swedish Companies Registration Office (Bolagsverket), the Swedish Mapping Agency (Lantmäteriet) and the Swedish Tax Administration (Skatteverket)); to identify existing issues; to document current processes; and to evaluate obstacles to data flow that lead to poor address quality, hindrances among external data providers, and issues related to the correction of addresses.

The next phase was to develop a list of long-term solutions.

  • Correct address data should be collected at the source, and any corrections to collected address data should also be made at the source. The sources are the three main suppliers of local unit address information: Statistics Sweden, the Swedish Companies Registration Office and the Swedish Tax Administration.
  • Location addresses, and not postal addresses, should be collected. The Swedish Companies Registration Office and the Swedish Tax Administration generally provide postal addresses, which are not suitable for geocoding the true location of businesses. A shift in focus to also collect location addresses would probably improve address quality greatly, and Statistics Sweden would no longer need to transform postal addresses into location addresses.
  • Address information should only be accepted during data collection if business owners register their business address in the form of an official location address from the Swedish Mapping Agency’s authoritative official national address register. None of the three government authorities which are major address data providers for the business register currently verifies that respondents register their address according to this register. However, legal and practical issues exist.
  • The newly established data domain on business data (grunddatadomän företag) must meet Statistics Sweden’s quality requirements on local unit address data. The Swedish Companies Registration Office, the Swedish Mapping Agency, the Swedish Tax Administration and Statistics Sweden are all involved in the data domain, under the leadership of the Swedish Companies Registration Office. Current data collections have different aims and follow different standards: the Swedish Companies Registration Office and the Swedish Tax Administration focus on the addresses of main offices, while Statistics Sweden collects the addresses of all local units of businesses. A specification on business address data, to be used by all government agencies handling business data, has been proposed; this would enhance standardisation and digitalisation.
  • Statistics Sweden needs continued close collaboration with the municipalities responsible for addresses, the Swedish Standards Institute technical committee and the Swedish Mapping Agency. In practical terms, this involves supplying municipalities with lists of faulty business addresses. Some faults occur because a business has no location address; others occur for large local units such as hospitals, educational institutions or industrial/business parks.
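
The third recommendation above (accepting an address only if it corresponds to an official location address) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the register extract, the example addresses and the helper names normalise and is_official_address are hypothetical, and the sketch does not model the Swedish Mapping Agency's actual register or any access interface to it.

```python
import re
import unicodedata


def normalise(address: str) -> str:
    """Build a comparison key: NFC-normalised, case-folded,
    punctuation replaced by spaces, whitespace collapsed."""
    text = unicodedata.normalize("NFC", address).casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # punctuation -> space
    return " ".join(text.split())


# Hypothetical extract of an authoritative address register.
# The addresses are invented placeholders, not real register entries.
OFFICIAL_ADDRESSES = {
    normalise("Storgatan 1, 111 22 Exempelstad"),
    normalise("Industrivägen 5, 333 44 Exempelby"),
}


def is_official_address(address: str) -> bool:
    """Accept an address only if its normalised form is in the register."""
    return normalise(address) in OFFICIAL_ADDRESSES


# A location address that differs only in case and spacing is accepted,
# while a postal (box) address is rejected.
accepted = is_official_address("storgatan 1,  111 22 EXEMPELSTAD")
rejected = is_official_address("Box 123, 111 22 Exempelstad")
```

A check of this kind would sit at the point of data collection, so that a respondent is asked to correct the address before it enters any register. The legal and practical issues noted above are outside the scope of the sketch.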

A list of short-term solutions was also developed.

  • An efficient, automated geocoding service built on data matching (using machine learning) should be established at Statistics Sweden to replace the current semi-automated correction process. The new service should be used for all address data coming into Statistics Sweden, so that geocoding takes place at the main repository rather than downstream.
  • A process should be established so that addresses corrected late in the data flow chain flow back to earlier stages, where they can be integrated and used.
  • Municipalities, private businesses and other external users of the business register’s data currently perform their own corrections of faulty local unit address information; a channel should be developed through which corrected local unit address information can be supplied.
  • An address quality specialist should be reintroduced at Statistics Sweden to oversee the address quality standards across the office.
  • The correction of addresses should focus on local units in the business register with a high number of employees.
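
The idea of geocoding by data matching can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the service developed at Statistics Sweden: it uses plain string similarity from the standard library (difflib) rather than machine learning, and the reference register, its coordinates, the function name geocode and the similarity cutoff of 0.85 are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical reference register: normalised location address -> (x, y).
# Addresses and coordinates are invented placeholders.
REFERENCE = {
    "storgatan 1 exempelstad": (100.0, 200.0),
    "industrivägen 5 exempelby": (300.0, 400.0),
}


def geocode(address: str, cutoff: float = 0.85):
    """Return (coordinates, match_type) for an incoming address.

    An exact match on the normalised key is tried first; otherwise the
    closest reference address is used if its similarity reaches `cutoff`.
    """
    key = " ".join(address.casefold().replace(",", " ").split())
    if key in REFERENCE:
        return REFERENCE[key], "exact"
    candidates = get_close_matches(key, list(REFERENCE), n=1, cutoff=cutoff)
    if candidates:
        return REFERENCE[candidates[0]], "fuzzy"
    return None, "unmatched"


exact_result = geocode("Storgatan 1, Exempelstad")
fuzzy_result = geocode("Storgatn 1 Exempelstad")   # misspelt street name
no_result = geocode("Box 123")                     # postal address, no match
```

Recording the match type alongside the coordinates keeps fuzzy matches traceable, so that they can be reviewed or fed back to earlier stages of the data flow.
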

A table showing the number and share of non-geocoded local units, with columns for the number of employees per local unit, the number of local units, the number of non-geocoded local units, and the share of non-geocoded local units in the total number of local units. Data are in numbers and percent.
Table 1: Number and share of non-geocoded local units
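
An indicator of the kind shown in Table 1 could be computed along the following lines. This is a minimal Python sketch: the input records and the size-class boundaries are invented for illustration and do not reproduce the figures or the classes used in the report, and the function names size_class and non_geocoded_share are hypothetical.

```python
from collections import defaultdict

# Hypothetical local units: (number of employees, has coordinates?).
local_units = [
    (0, True), (3, False), (7, True),
    (12, True), (60, False), (250, True),
]


def size_class(employees: int) -> str:
    """Assign an illustrative size class (boundaries are assumptions)."""
    if employees < 10:
        return "0-9"
    if employees < 50:
        return "10-49"
    return "50+"


def non_geocoded_share(units):
    """Per size class: (local units, non-geocoded units, share in %)."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for employees, geocoded in units:
        cls = size_class(employees)
        totals[cls] += 1
        if not geocoded:
            missing[cls] += 1
    return {
        cls: (totals[cls], missing[cls], 100.0 * missing[cls] / totals[cls])
        for cls in totals
    }


indicator = non_geocoded_share(local_units)
```

Computed at regular intervals, such an indicator would allow progress towards a quality objective to be monitored over time.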


A diagram showing an example of a typical missing location address.
Figure 1: Typical missing location address

Results

The road map was developed and presented in the form of recommendations under long-term and short-term solutions.

Work on some of the short-term solutions started during the project: (a) establishing an efficient, automated geocoding service for incoming address data at Statistics Sweden, and (b) focusing on correcting or adding addresses for local units with a high number of employees.
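
Prioritising corrections by number of employees can be expressed as a simple ordering of the non-geocoded local units. The Python sketch below is an illustration only: the LocalUnit record layout, the example units and the function name correction_queue are hypothetical and are not taken from the project report.

```python
from typing import NamedTuple, Optional, Tuple


class LocalUnit(NamedTuple):
    unit_id: str
    employees: int
    coordinates: Optional[Tuple[float, float]]  # None if not geocoded


def correction_queue(units, limit=None):
    """List non-geocoded units, largest number of employees first."""
    pending = [u for u in units if u.coordinates is None]
    pending.sort(key=lambda u: u.employees, reverse=True)
    return pending if limit is None else pending[:limit]


# Invented example units.
units = [
    LocalUnit("A", 12, None),
    LocalUnit("B", 300, (100.0, 200.0)),
    LocalUnit("C", 850, None),
    LocalUnit("D", 2, None),
]

queue_ids = [u.unit_id for u in correction_queue(units)]
top_one = [u.unit_id for u in correction_queue(units, limit=1)]
```

Ordering the queue this way means that each manual correction covers as many employees as possible, which matters for analyses weighted by employment.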
