Data is the new oil of the digital economy. It is crucial to the everyday operations of all enterprises. However, data can often be of very low quality. If used for business operations, such poor data will produce results that alienate customers and clients, resulting in huge maintenance expenses and reduced business efficiency, effectiveness, and profitability.
The only way to address data quality is to adhere to the dimensions of data quality. Each dimension represents a specific characteristic or attribute of the data that is relevant to its users. A data quality dimension is a characteristic for classifying information and data requirements; it offers a way of measuring and managing the quality of data and information.
In its data quality model in ISO/IEC 25012:2008(E), the International Organization for Standardization defines a rather technical set of data quality characteristics as part of its standards for software quality, distinguishing between two overarching data quality categories: inherent and system-dependent data quality.
Inherent characteristics describe the intrinsic potential of the data itself to meet stated requirements, whereas system-dependent characteristics depend on the IT system used to manage the data. This post presents ten frequently used data quality dimensions, which may apply at different data levels.
1. Accessibility or coverage
Accessibility measures the availability of required data records: the extent to which the data are available and the ease with which users can access them. Coverage refers to the breadth, depth, and availability of the data, including data that exists but is not made available by a data provider. The ISO standard ISO/IEC 25012:2008(E) assesses data accessibility by considering the data's intended use and the absence of barriers, thereby also promoting accessibility for people with disabilities. No quantitative measure is recommended for accessibility; instead, this dimension should be assessed qualitatively or by a grade.
2. Accuracy

Accuracy measures the veracity of data with respect to its authoritative source. It can be validated against defined business rules and measured against original documents or authoritative sources. Accuracy describes the degree of agreement between data and the real-world objects they represent: how error-free and reliable the data are and how closely they map to (or approach) the true values of the items. To calculate accuracy, divide the number of accurate items or records by the total number of items or records. A local population register, for example, contains 932,904 phone numbers, of which 813,942 have been confirmed, resulting in an accuracy of 87.25 percent (813,942/932,904 * 100).
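As a minimal sketch, the accuracy calculation above can be expressed in Python (the function name and signature are illustrative, not from any standard library):

```python
def accuracy(confirmed: int, total: int) -> float:
    """Accuracy as the share of records confirmed against an
    authoritative source, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return confirmed / total * 100

# Figures from the population-register example above.
print(round(accuracy(813_942, 932_904), 2))  # 87.25
```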
3. Completeness

One of the most commonly used data quality dimensions is completeness, which measures the presence of data. In a population of data records, completeness measures the presence of required data attributes: the extent to which all expected and required data (records, attributes) are present. It also reflects the degree of resolution required for the intended use. Dividing the available items or records by the expected total yields a fraction, which you multiply by 100 to obtain a percentage. For example, a local population register contains 932,904 people, but only 930,611 have a birth date. This yields a completeness of 99.75 percent (930,611/932,904 * 100).
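A sketch of the completeness check, assuming records are dictionaries and a missing attribute is stored as None (the record layout is a hypothetical example):

```python
def completeness(records: list[dict], attribute: str) -> float:
    """Percentage of records in which the given attribute is present."""
    if not records:
        raise ValueError("no records to assess")
    present = sum(1 for r in records if r.get(attribute) is not None)
    return present / len(records) * 100

# Illustrative mini-register: three of four people have a birth date.
people = [
    {"name": "A", "birth_date": "1990-01-01"},
    {"name": "B", "birth_date": None},
    {"name": "C", "birth_date": "1985-06-30"},
    {"name": "D", "birth_date": "2001-12-12"},
]
print(completeness(people, "birth_date"))  # 75.0
```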
4. Consistency

Consistency measures compliance with required formats, values, or definitions. It ensures that the data values, formats, and definitions in one population agree with those in another. Consistency, also known as consistent representation, refers to how free the data are of internal contradictions, how well they follow a set of rules, and whether they are presented in the same format as previous data. Consistency can be represented as the proportion of items or records found to be consistent. For example, assume that in a population register the date of birth should be stored in the "YYYY-MM-DD" (year-month-day) format. The date of birth was stored inverted as "DD-MM-YYYY" in 61,196 of 930,611 total instances, resulting in a consistency of 93.42 percent ((930,611 - 61,196)/930,611 * 100).
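The format rule in this example can be sketched as a regular-expression check (a simplified illustration that tests only the YYYY-MM-DD pattern, not whether the date itself is valid):

```python
import re

# Expected format: four digits, dash, two digits, dash, two digits.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

def consistency(values: list[str]) -> float:
    """Percentage of values matching the expected YYYY-MM-DD format."""
    if not values:
        raise ValueError("no values to assess")
    ok = sum(1 for v in values if ISO_DATE.fullmatch(v))
    return ok / len(values) * 100

# One of four dates below is stored inverted as DD-MM-YYYY.
dates = ["1990-01-01", "31-12-1985", "2001-12-12", "2010-05-05"]
print(consistency(dates))  # 75.0
```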
5. Currency or Currentness
Currency, or currentness, is the degree to which the data are sufficiently or reasonably up to date for the intended task. This dimension can be evaluated qualitatively: a dataset of bird observations from the summer of 1969, for example, is insufficient to forecast bird populations in 2022. Alternatively, the percentage of current records in a population register can be calculated by dividing the number of recently validated entries (764,111) by the total population (932,904), yielding 81.91 percent (764,111/932,904 * 100).
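One way to sketch the quantitative variant is to count records whose last validation falls on or after a cutoff date (the record layout and the "last_validated" field are hypothetical):

```python
from datetime import date

def currency(records: list[dict], cutoff: date) -> float:
    """Percentage of records validated on or after the cutoff date."""
    if not records:
        raise ValueError("no records to assess")
    current = sum(1 for r in records if r["last_validated"] >= cutoff)
    return current / len(records) * 100

# Three of four illustrative entries were validated after 2021-01-01.
entries = [
    {"id": 1, "last_validated": date(2022, 3, 1)},
    {"id": 2, "last_validated": date(2019, 7, 15)},
    {"id": 3, "last_validated": date(2022, 1, 20)},
    {"id": 4, "last_validated": date(2021, 11, 5)},
]
print(currency(entries, date(2021, 1, 1)))  # 75.0
```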
6. Relevancy or Relevance
Relevancy, or relevance, as a dimension maps the degree to which the data meet the expectations and requirements of the user [43,45]. Like currency, relevancy may be evaluated qualitatively, at the discretion of and with respect to the user's requirements, e.g., using a scorecard.
7. Reliability

In some cases, reliability is used interchangeably with accuracy; others, however, define the dimension as the degree to which an initial data value matches a subsequent data value. As a result, the quantification is identical to that of accuracy.
8. Timeliness

Timeliness describes whether the data's age is suitable for the intended use, or the time difference between a real-world event and the moment the data are captured or verified. Timeliness assesses how accurately the content reflects current market and business conditions and whether the data are functionally available when needed. It can be measured as a duration, such as the time between data collection and data entry. For example, if employees of a population register enter addresses into a database that were collected nine days earlier, the timeliness of the data is nine days.
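Measured as a duration, timeliness reduces to a simple date difference; a minimal sketch of the nine-day example (function name is illustrative):

```python
from datetime import date

def timeliness_days(collected: date, entered: date) -> int:
    """Timeliness as the delay, in days, between data collection and entry."""
    if entered < collected:
        raise ValueError("entry cannot precede collection")
    return (entered - collected).days

# Addresses collected on March 1 and entered on March 10: nine days.
print(timeliness_days(date(2022, 3, 1), date(2022, 3, 10)))  # 9
```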
9. Validity

Validity is defined as the degree to which the data agree with established rules, and it is quantified in the same way as accuracy. It assesses how well the data conform to internal, external, and industry-wide standards.
10. Uniqueness

Uniqueness is the degree to which no record or attribute is recorded more than once. It refers to the singularity of records and attributes; the goal is a single (unique) recording of each real-world entity.
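Under one common interpretation (an assumption, since the text gives no formula), uniqueness can be quantified as the share of values that occur exactly once:

```python
from collections import Counter

def uniqueness(values: list) -> float:
    """Percentage of values that occur exactly once in the population."""
    if not values:
        raise ValueError("no values to assess")
    counts = Counter(values)
    unique = sum(1 for v in values if counts[v] == 1)
    return unique / len(values) * 100

# One illustrative phone number appears twice, so only two of four
# entries are unique.
phones = ["555-0001", "555-0002", "555-0001", "555-0003"]
print(uniqueness(phones))  # 50.0
```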