This topic may seem like a pretty old concept, yet it remains a vital one in the age of Big Data, Mobile BI and Hadoop! According to the FIMA 2012 benchmark report, Data Quality (DQ) remains the top priority in data management strategy.
‘What gets measured improves!’ Yet a Data Quality (DQ) initiative is often a reactive strategy rather than a proactive one. Consider the impact bad data could have in a financial reporting scenario: a tarnished brand and loss of investor confidence.
But are business users aware of DQ issues? A research report by The Data Warehousing Institute suggested that more than 80% of the business managers surveyed believed that their business data was fine, but only half of their technical counterparts agreed! Having recognized this disparity, it is a good idea to map the dimensions of data quality to the business problems that arise when that quality is lacking.
Data Quality Dimensions – IT Perspective
- Data Accuracy – the degree to which data reflects the real world
- Data Completeness – inclusion of all relevant attributes of data
- Data Consistency – uniformity of data across the enterprise
- Data Timeliness – Is the data up-to-date?
- Data Auditability – Is the data reliable?
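Some of these dimensions can be turned into simple, measurable scores. The following is only a minimal sketch: the records, the field names (`email`, `country`, `updated`), the agreed country codes and the 365-day freshness threshold are all made up for illustration and would be defined jointly by business and IT in practice.

```python
from datetime import date

# Hypothetical customer records; the field names are illustrative only.
records = [
    {"id": 1, "email": "a@example.com", "country": "US", "updated": date(2012, 11, 1)},
    {"id": 2, "email": None, "country": "usa", "updated": date(2010, 3, 15)},
    {"id": 3, "email": "c@example.com", "country": "US", "updated": date(2012, 9, 20)},
    {"id": 4, "email": "", "country": "IN", "updated": date(2012, 12, 5)},
]

def completeness(rows, field):
    """Share of rows where `field` is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

def timeliness(rows, field, as_of, max_age_days):
    """Share of rows whose `field` date is at most `max_age_days` before `as_of`."""
    if not rows:
        return 0.0
    fresh = sum(1 for r in rows if (as_of - r[field]).days <= max_age_days)
    return fresh / len(rows)

def consistency(rows, field, allowed):
    """Share of rows whose `field` uses one of the agreed codes."""
    if not rows:
        return 0.0
    ok = sum(1 for r in rows if r[field] in allowed)
    return ok / len(rows)

scores = {
    "completeness(email)": completeness(records, "email"),
    "timeliness(updated)": timeliness(records, "updated", date(2012, 12, 31), 365),
    "consistency(country)": consistency(records, "country", {"US", "IN"}),
}
for name, value in scores.items():
    print(f"{name}: {value:.0%}")
```

On this sample data the scores come out at 50%, 75% and 75%. Accuracy and auditability are harder to score this way, since they need a trusted reference source or lineage information to compare against.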
Business Problems – Due to Lack of Data Quality
Department/End-Users | Business Challenges | Data Quality Dimension* |
Human Resources | The actual employee performance as reviewed by the manager is not in sync with the HR database; inaccurate employee classification based on government classification groups – minorities, differently abled | Data consistency, accuracy |
Marketing | Print and mailing costs associated with sending duplicate copies of promotional messages to the same customer/prospect, or sending them to the wrong address/email | Data timeliness |
Customer Service | Extra call support minutes due to incomplete customer data and poorly defined metadata for the knowledge base | Data completeness |
Sales | Lost sales due to lack of proper customer purchase/contact information, which prevents the organization from performing behavioral analytics | Data consistency, timeliness |
‘C’ Level | Reports that drive top management decision making are not in sync with the actual operational data, making it hard to get a 360° view of the enterprise | Data consistency |
Cross Functional | Sales and financial reports are not in sync with each other – typically due to data silos | Data consistency, auditability |
Procurement | Procurement levels of commodities differ from production requirements, resulting in excess/insufficient inventory | Data consistency, accuracy |
Sales Channel | The same product is represented differently across ecommerce sites, kiosks and stores, and the product names/codes in these channels differ from those in the warehouse system. This results in delays or wrong items being shipped to the customer | Data consistency, accuracy |
*Just one perspective; other dimensions could be causing these issues too.
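To make one row of the table concrete, take the Marketing problem of mailing the same customer twice. The sketch below shows how duplicates could be flagged with a crude matching key. The record layout (`name`, `email`, `postcode`) and the matching rule are assumptions for illustration only; real matching in an MDM tool is usually fuzzier and more careful.

```python
from collections import defaultdict

def match_key(record):
    """Build a crude matching key: normalized email, or name + postcode if email is missing."""
    email = (record.get("email") or "").strip().lower()
    if email:
        return ("email", email)
    # Collapse repeated whitespace and ignore case in the name.
    name = " ".join((record.get("name") or "").lower().split())
    postcode = (record.get("postcode") or "").replace(" ", "").upper()
    return ("name_postcode", name, postcode)

def find_duplicates(records):
    """Return lists of record ids that share the same matching key."""
    groups = defaultdict(list)
    for r in records:
        groups[match_key(r)].append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical mailing list with two pairs of duplicates.
mailing_list = [
    {"id": 1, "name": "Jane Doe", "email": "Jane.Doe@example.com", "postcode": "10001"},
    {"id": 2, "name": "jane  doe", "email": " jane.doe@example.com ", "postcode": "10001"},
    {"id": 3, "name": "John Roe", "email": None, "postcode": "SW1A 1AA"},
    {"id": 4, "name": "john roe", "email": "", "postcode": "sw1a1aa"},
    {"id": 5, "name": "Ann Lee", "email": "ann@example.com", "postcode": "94105"},
]

print(find_duplicates(mailing_list))  # [[1, 2], [3, 4]]
```

Each flagged group is a candidate for merging before the next campaign goes out, which is where the print and mailing savings would come from.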
As is evident, data is not just an IT issue but a business issue too, and it requires a ‘Collaborative Data Management’ approach (including business and IT) to ensure quality data. The solution is multifold, spanning the planning, execution and sustaining of a data quality strategy. Aspects such as data profiling, MDM and data governance are vital guards that help to analyze data, get first-hand information on its quality and maintain that quality on an ongoing basis.
Collaborative Data Management – Approach
Key steps in Collaborative Data Management would be to:
- Define and measure metrics for data with business team
- Assess existing data for the metrics – carry out a profiling exercise with IT team
- Implement data quality measures as a joint team
- Enforce a data quality firewall (MDM) to ensure correct data enters the information ecosystem as a governance process
- Institute Data Governance and Stewardship programs to make data quality a routine and stable practice at a strategic level
This approach helps keep the data ecosystem within a company clean, as it involves business and IT users from each department at every level of the hierarchy.
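The ‘data quality firewall’ step above could look something like the following in its simplest form: a set of agreed rules that every incoming record must pass before it is accepted, with failures set aside for a data steward to review. This is a rough sketch, not how any particular MDM product works; the rules, field names and country codes are hypothetical.

```python
import re

# Deliberately loose email pattern; illustrative, not a full validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
ALLOWED_COUNTRIES = {"US", "IN", "GB"}  # assumed agreed codes

# Each rule is (message, check); the check returns True when the record passes.
RULES = [
    ("customer_id is required", lambda r: bool(r.get("customer_id"))),
    ("email must look valid", lambda r: bool(EMAIL_RE.match(r.get("email") or ""))),
    ("country must use agreed code", lambda r: r.get("country") in ALLOWED_COUNTRIES),
]

def validate(record):
    """Return the messages of all rules the record fails."""
    return [msg for msg, check in RULES if not check(record)]

def firewall(incoming):
    """Split incoming records into accepted ones and quarantined (record, errors) pairs."""
    accepted, quarantined = [], []
    for record in incoming:
        errors = validate(record)
        if errors:
            quarantined.append((record, errors))
        else:
            accepted.append(record)
    return accepted, quarantined

incoming = [
    {"customer_id": "C001", "email": "jane@example.com", "country": "US"},
    {"customer_id": "", "email": "not-an-email", "country": "usa"},
]
accepted, quarantined = firewall(incoming)
print(len(accepted), "accepted;", len(quarantined), "quarantined")
for record, errors in quarantined:
    print(record, "->", errors)
```

Keeping the rules in one list means the business team can review and extend them without touching the gate logic, which fits the joint ownership the approach calls for.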
Thanks for reading, would appreciate your thoughts.