Data quality can be assessed in terms of six dimensions:
Completeness – is the data set complete? Gaps in the data may result in vulnerable individuals or households being missed and excluded.
Uniqueness – is there duplication in the data? This concerns duplication of records, i.e., whether there is more than one entry for a particular entity or measurement.
Consistency – to what degree do values in the data set contradict other values representing the same individual or household? Inconsistent data may compromise risk stratification by giving a false picture of potential vulnerabilities.
Timeliness – the degree to which the data accurately reflects the period it represents and is up to date. Out-of-date data undermines the effectiveness of decision-making and may result in breaches of data protection legislation.
Validity – the degree to which the data is within the expected range and format. For example, a record showing a person with a negative age, or a date in a month with an incorrect number of days.
Accuracy – to what degree does the data match reality? For example, biases or errors may mean the data is not representative of the phenomena or entities it describes. Accuracy can be influenced by the method of collection, the choices available when data items are selected from lists, or differences in how terms are interpreted.
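Several of these dimensions lend themselves to simple automated checks. The sketch below illustrates completeness, uniqueness, and validity checks over a small set of records; the field names, sample values, and valid age range are illustrative assumptions, not part of any standard.

```python
from collections import Counter

# Hypothetical records: field names and values are illustrative only.
records = [
    {"id": 1, "name": "A", "age": 34},
    {"id": 2, "name": "B", "age": -5},   # validity issue: negative age
    {"id": 2, "name": "B", "age": 30},   # uniqueness issue: duplicate id
    {"id": 3, "name": None, "age": 61},  # completeness issue: missing name
]

def completeness(records, fields):
    """Fraction of records with no missing values in the given fields."""
    complete = sum(all(r.get(f) is not None for f in fields) for r in records)
    return complete / len(records)

def duplicate_ids(records, key="id"):
    """Identifier values that appear more than once (uniqueness check)."""
    counts = Counter(r[key] for r in records)
    return [k for k, n in counts.items() if n > 1]

def invalid_ages(records, low=0, high=120):
    """Records whose age falls outside the expected range (validity check)."""
    return [r for r in records if not (low <= r["age"] <= high)]

print(completeness(records, ["id", "name", "age"]))  # 0.75
print(duplicate_ids(records))                        # [2]
print(invalid_ages(records))                         # the negative-age record
```

Consistency, timeliness, and accuracy are harder to automate in this way, since they typically require comparison against other data sets or against ground truth.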
The above dimensions are a good starting point for managing and improving data quality. They are also important indicators of the provenance of the data, allowing the recipient of shared data to assess whether a data set is fit for purpose.
Recipients should understand the data quality of the data sets they receive and determine whether they are fit for purpose. Providers should include data quality statements about the data they provide.