Data Matching

Matching of personal data (i.e., combining, comparing or matching personal data obtained from multiple sources) is a sensitive activity covered by various provisions in both the UK GDPR and the Data Protection Act (DPA) 2018.

Automated processing of personal data to establish a person’s location falls within the definition of profiling under GDPR Article 4(4). The ICO therefore regards data matching as being likely to result in high risk and requires completion of a Data Protection Impact Assessment (DPIA) for data matching activities. Public sector data matching is covered by the Cabinet Office’s Code of Data Matching Practice, which requires that: “the data that is adequate, relevant and limited to what is necessary to undertake the matching exercise, to enable individuals to be identified accurately.”

To reduce the potential risks and develop matching algorithms that achieve accurate identification of individuals, the @Lead Organisation will need a good understanding of the contents of shared datasets, i.e., the data quality and provenance.

Matching `Residence` data

Given the ambiguities that can occur in matching four-line addresses and the fact that postcodes can cover tens of properties, wherever practicable it is recommended that property address matching is based on UPRNs.
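As an illustration of why a UPRN key is easier to match on than a free-text address, the following is a minimal sketch rather than a prescribed implementation. It assumes each record is a dictionary with a `uprn` field (a hypothetical field name) and that UPRNs may arrive as integers or as zero-padded strings of up to 12 digits; the function names are invented for this example.

```python
def normalise_uprn(value):
    """Return the UPRN as a digit string without leading zeros, or None if unusable."""
    text = str(value).strip()
    # Assumption: a valid UPRN is purely numeric and at most 12 digits long.
    if not (text.isascii() and text.isdigit()) or len(text) > 12:
        return None
    return text.lstrip("0") or None


def match_on_uprn(left, right):
    """Pair every record in `left` with every record in `right` sharing its UPRN."""
    index = {}
    for rec in right:
        key = normalise_uprn(rec.get("uprn", ""))
        if key is not None:
            index.setdefault(key, []).append(rec)

    pairs = []
    for rec in left:
        key = normalise_uprn(rec.get("uprn", ""))
        if key is None:
            continue  # records without a usable UPRN need a different route
        for other in index.get(key, []):
            pairs.append((rec, other))
    return pairs
```

Records that lack a usable UPRN are skipped here rather than guessed at; in practice they would need a separate address-matching or manual review step.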

There is a good case to include UPRNs in back-office datasets that may contain @Vulnerability Attributes so that they can be matched in a @Risk Index.

Matching `Person` data

The matching of Person is complicated by the manner in which names are used and stored. There can be significant differences between the names in some official records, e.g., birth certificates, passports, etc., and the names people are called or use on a day-to-day basis. For example, a person with the forename Elizabeth may be known as, or wish to be called, Liz, Lizzie, Lizzy, Betty, Beth, etc., or something completely different, and names can have alternative spellings, e.g., Elisabeth.
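One way to allow for such variants is to map known diminutives and spellings to a single canonical forename before comparing. The sketch below is illustrative only: the `FORENAME_VARIANTS` table and both function names are hypothetical, and the table contains just the examples mentioned above, so a real lookup would need careful curation.

```python
import unicodedata

# Hypothetical, deliberately tiny lookup built from the examples in the text.
FORENAME_VARIANTS = {
    "liz": "elizabeth",
    "lizzie": "elizabeth",
    "lizzy": "elizabeth",
    "betty": "elizabeth",
    "beth": "elizabeth",
    "elisabeth": "elizabeth",
}


def canonical_forename(name):
    """Fold case and accents, then map known variants to one canonical form."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    key = without_accents.strip().casefold()
    return FORENAME_VARIANTS.get(key, key)


def forenames_may_match(a, b):
    """True when two forenames share a canonical form; a candidate, not a confirmed match."""
    return canonical_forename(a) == canonical_forename(b)
```

A shared canonical form only suggests that two records could refer to the same person. Because people may use a name that is completely different from their official one, a lookup like this cannot find every true match and should not be treated as confirmation on its own.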

A variety of methods exist for matching the names of individuals, with varying levels of accuracy and therefore risk of incorrect matches.

Government guidance on quality assessment of data linkage has been published, and advice on NHS number matching can be more generally applied.

It is recommended that data matching is based initially on Residence to identify potentially related personal records, and that the matching is refined based on Person by using a combination of title, surname, forename and date of birth. Where the address and these four fields are exact matches, the likelihood of errors is very low. Where ambiguity remains across datasets for a @Residence, manual intervention to review the mismatch may be necessary before commencing further processing of the records.
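The two-stage approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a recommended algorithm: records are dictionaries with hypothetical field names (`uprn`, `title`, `surname`, `forename`, `date_of_birth`), the residence key is assumed to be a UPRN that has already been normalised, and the `review_threshold` of three agreeing fields is an arbitrary value chosen for the example.

```python
from collections import defaultdict

PERSON_FIELDS = ("title", "surname", "forename", "date_of_birth")


def _norm(value):
    """Collapse whitespace and fold case so trivial differences do not block a match."""
    return " ".join(str(value or "").split()).casefold()


def count_agreeing(a, b):
    """Count the person fields that are populated in both records and equal."""
    return sum(
        1
        for field in PERSON_FIELDS
        if _norm(a.get(field)) != "" and _norm(a.get(field)) == _norm(b.get(field))
    )


def match_people(left, right, review_threshold=3):
    """Stage 1: group by residence. Stage 2: compare the four person fields."""
    by_residence = defaultdict(list)
    for rec in right:
        if rec.get("uprn"):
            by_residence[rec["uprn"]].append(rec)

    results = []
    for rec in left:
        for other in by_residence.get(rec.get("uprn"), []):
            agreeing = count_agreeing(rec, other)
            if agreeing == len(PERSON_FIELDS):
                results.append(("match", rec, other))
            elif agreeing >= review_threshold:
                # Ambiguous at this residence: hold for manual review.
                results.append(("review", rec, other))
    return results
```

Pairs labelled `review` correspond to the ambiguous cases that may need manual intervention before further processing; pairs below the threshold are simply not reported.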
