Business intelligence is only as good as the underlying data
The amount, variety, and complexity of data in analytical data platforms have grown dramatically over the past several years. Advances in the automation of analytics with reporting, machine learning, and artificial intelligence have led to fully automated data pipelines. With these advances, however, it has become harder to ensure that the data used for business intelligence comes from the correct sources and is not corrupted along the way. When data is improperly sourced or corrupted, the business decisions built on it will be faulty.
Practical approach to data governance
While other companies focus on organizational process and governance, we concentrate on a technical approach to data governance. In our experience, organizational controls frequently fail due to a lack of supporting culture, insufficient attention, the demands of overly complex cross-departmental orchestration, increased manual effort, and plain human error. We therefore take a practical approach to the problem and use targeted automation and machine learning to ensure data correctness.
Common use cases
Data catalog and glossary
Use case: Find data location by description.
Example: A data analyst needs to discover where customer addresses are stored, or to find out which attributes a customer record has.
Solution:
1. Provide a self-service portal to users.
2. Enforce a column and dataset naming convention.
3. Augment columns with searchable descriptions (see the sketch below).
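To make the idea concrete, here is a minimal sketch of a description-based lookup, assuming a simple in-memory catalog; the CatalogEntry structure and search_catalog function are illustrative names, not part of any specific product.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    dataset: str        # dataset name following the naming convention
    column: str         # column name following the naming convention
    description: str    # searchable, human-readable description

# Hypothetical catalog entries for illustration only.
CATALOG = [
    CatalogEntry("crm.customer_profile", "cust_addr_line_1", "Customer mailing address, line 1"),
    CatalogEntry("crm.customer_profile", "cust_email", "Primary customer e-mail address"),
    CatalogEntry("billing.invoice", "bill_to_addr", "Billing address copied at invoice time"),
]

def search_catalog(query: str) -> list[CatalogEntry]:
    """Return entries whose description or column name matches the query."""
    q = query.lower()
    return [e for e in CATALOG if q in e.description.lower() or q in e.column.lower()]

# An analyst looking for customer addresses:
for entry in search_catalog("address"):
    print(f"{entry.dataset}.{entry.column}: {entry.description}")
```

In a real deployment the catalog would be backed by a metadata store and exposed through the self-service portal, but the search-by-description principle stays the same.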
Data lineage
Use case: Trace data origins.
Example: A data analyst discovers a broken dataset and needs to find where the data originally came from.
Solution:
1. Provide a self-service portal to users.
2. Implement tooling that collects data modification logs.
3. Ensure that the tooling is integrated with every technology used to implement the data pipelines (see the sketch below).
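A minimal sketch of the idea, assuming each pipeline step reports which datasets it read and wrote; LINEAGE_LOG, record_step, and trace_origins are illustrative names rather than any particular tool's API.

```python
from collections import defaultdict

# Lineage graph: target dataset -> datasets it was derived from.
LINEAGE_LOG: dict[str, set[str]] = defaultdict(set)

def record_step(target: str, sources: list[str]) -> None:
    """Called by pipeline tooling whenever a dataset is written."""
    LINEAGE_LOG[target].update(sources)

def trace_origins(dataset: str) -> set[str]:
    """Walk the lineage graph upstream to find the original sources."""
    origins, stack = set(), [dataset]
    while stack:
        current = stack.pop()
        parents = LINEAGE_LOG.get(current)
        if not parents:
            origins.add(current)       # no recorded parents: an original source
        else:
            stack.extend(parents)
    return origins

# Example: a report built from intermediate tables.
record_step("mart.exec_report", ["staging.sales_clean"])
record_step("staging.sales_clean", ["raw.pos_export", "raw.crm_extract"])
print(trace_origins("mart.exec_report"))   # {'raw.pos_export', 'raw.crm_extract'}
```

A production lineage tool would persist this graph and expose it through the self-service portal, but tracing a broken dataset back to its original sources works the same way.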
Data quality
Use case: Detect data corruption and prevent bad data from propagating.
Example: A data source format changes unexpectedly, contaminating data in the system and spoiling executive reports.
Solution:
1. Apply statistics and machine learning to detect data corruption.
2. Alert the support team when issues are detected.
3. Block the propagation of corrupted data in real time (see the sketch below).
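As a minimal sketch, the example below uses a simple three-sigma rule on daily row counts as a stand-in for richer statistical and machine learning checks; check_batch and send_alert are hypothetical helper names.

```python
import statistics

def check_batch(history: list[int], new_count: int, threshold: float = 3.0) -> bool:
    """Return True if the new batch looks consistent with recent history."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history) or 1.0   # avoid division by zero
    z = abs(new_count - mean) / stdev
    return z <= threshold

def send_alert(message: str) -> None:
    """Placeholder for paging the support team (e-mail, chat, ticket, ...)."""
    print(f"ALERT: {message}")

history = [10_120, 9_980, 10_305, 10_050, 9_870]   # recent daily row counts
new_count = 2_450                                   # today's suspiciously small load

if check_batch(history, new_count):
    pass  # publish the batch downstream
else:
    send_alert(f"Row count {new_count} deviates from recent history; batch held.")
    # The batch is not published, so corrupted data never reaches downstream reports.
```

Holding a batch until it passes the check is what stops corrupted data from contaminating executive reports.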
Key features
Self-service data catalog
Dataset profile
Lineage dashboard
Data glossary portal
Data quality enforcement
Quick alert system
Enterprise-wide scale
Machine learning
How it works

Technology stack
Engagement model
We value a hands-on approach, which usually starts with a deep technical analysis of the client's current data platform. To accomplish this, a hands-on architect or principal engineer joins the team and performs an assessment of the architecture.
The outcome of the architecture assessment phase is a documented target state and a detailed implementation plan, with goals and estimates of the effort needed to reach them. The implementation phase then covers implementing the required aspects of data governance and data quality on the client's platform.