Business intelligence is only as good as the underlying data

The amount, variety and complexity of data in analytical data platforms have grown rapidly over the past several years. Advances in automating analytics through reporting, machine learning and artificial intelligence have led to fully automated data pipelines. With these advances, however, it has become harder to ensure that the data used for business intelligence comes from the correct sources and is not corrupted along the way. When data is improperly sourced or corrupted, the business decisions based on it will be faulty.

Practical approach to data governance

While other companies focus on organizational process and governance, we concentrate on a technical approach to data governance. In our experience, organizational controls frequently fail because of a lack of culture, insufficient attention, overly complex cross-departmental orchestration, growing manual effort and plain human error. We therefore take a practical approach to the problem and use targeted automation and machine learning to ensure data correctness.

Common use cases

Data catalog and glossary

Use case: Find data location by description.

Example: A data analyst needs to discover where a customer address is stored, or find what attributes the customer has.

Solution:

1. Provide a self-service portal to users.

2. Enforce a column and dataset naming convention.

3. Augment columns with searchable descriptions.
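
As an illustration of steps 2 and 3, the following is a minimal sketch of a catalog that checks a naming convention and searches column descriptions. The `ColumnEntry` type, the snake_case convention, the sample entries and the function names are all assumptions made for this example, not part of any specific product.

```python
import re
from dataclasses import dataclass

# Hypothetical convention: lowercase snake_case for dataset and column names.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass(frozen=True)
class ColumnEntry:
    dataset: str
    column: str
    description: str


def follows_convention(name: str) -> bool:
    """Return True if the name matches the assumed snake_case convention."""
    return NAME_PATTERN.fullmatch(name) is not None


def search_catalog(entries, query):
    """Return entries whose names or description contain every query term."""
    terms = query.lower().split()
    hits = []
    for entry in entries:
        text = f"{entry.dataset} {entry.column} {entry.description}".lower()
        if all(term in text for term in terms):
            hits.append(entry)
    return hits


# Illustrative catalog content (invented for this sketch).
CATALOG = [
    ColumnEntry("crm_customer", "postal_address", "Customer mailing address, one line per street"),
    ColumnEntry("crm_customer", "email", "Primary contact email of the customer"),
    ColumnEntry("sales_order", "order_total", "Order amount in USD, tax included"),
]

for hit in search_catalog(CATALOG, "customer address"):
    print(f"{hit.dataset}.{hit.column}: {hit.description}")
```

A production catalog would typically use a search index rather than substring matching, but the principle is the same: descriptions only help if they are present and searchable.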

Data lineage

Use case: Trace data origins.

Example: A data analyst discovers a broken dataset and needs to find where the data originally came from.

Solution:

1. Provide a self-service portal to users.

2. Implement tooling that collects data modification logs.

3. Ensure that tooling is connected with all data pipeline implementation technologies.
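
To make steps 2 and 3 concrete, here is a small sketch of how collected modification logs could be turned into a lineage graph and walked upstream. The log format, job names and dataset names are hypothetical; a real deployment would collect these records from each pipeline technology in use.

```python
from collections import deque

# Hypothetical modification log: each record states which job wrote
# which dataset, and from which inputs.
MODIFICATION_LOG = [
    {"job": "ingest_crm", "inputs": ["crm_export"], "output": "raw_customers"},
    {"job": "ingest_orders", "inputs": ["orders_api"], "output": "raw_orders"},
    {"job": "clean_customers", "inputs": ["raw_customers"], "output": "customers"},
    {"job": "build_report", "inputs": ["customers", "raw_orders"], "output": "revenue_report"},
]


def build_parents(log):
    """Map each dataset to the set of datasets it was directly built from."""
    parents = {}
    for record in log:
        parents.setdefault(record["output"], set()).update(record["inputs"])
    return parents


def trace_upstream(dataset, parents):
    """Return every dataset that feeds into `dataset`, directly or indirectly."""
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def find_origins(dataset, parents):
    """Return upstream datasets that no logged job produced (the original sources)."""
    return {d for d in trace_upstream(dataset, parents) if d not in parents}


parents = build_parents(MODIFICATION_LOG)
print(sorted(find_origins("revenue_report", parents)))
```

If a pipeline technology does not report its writes, its datasets appear as origins in this graph even though they are not, which is why step 3 matters.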

Data quality

Use case: Detect data corruption and prevent bad data from propagating.

Example: A data source format changes unexpectedly, contaminating data in the system and spoiling executive reports.

Solution:

1. Apply statistical checks and machine learning to detect data corruption.

2. Alert the support team when issues are found.

3. Prevent the propagation of corrupted data in real time.
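
The three steps can be sketched as a simple quality gate. This example uses a plain z-score check as a stand-in for the statistical and machine learning methods mentioned above; the metric names, the threshold of 3.0 and the `alert` callback are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, value, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


def quality_gate(batch_metrics, metric_history, alert, threshold=3.0):
    """Return True if the batch may propagate; otherwise alert and return False."""
    failed = [
        name
        for name, value in batch_metrics.items()
        if is_anomalous(metric_history.get(name, []), value, threshold)
    ]
    if failed:
        alert(f"Batch quarantined; anomalous metrics: {', '.join(sorted(failed))}")
        return False
    return True


# Illustrative metric history for one dataset (invented values).
HISTORY = {
    "row_count": [1000, 1020, 990, 1010, 1005],
    "null_rate": [0.010, 0.012, 0.011, 0.009, 0.010],
}

# A source format change often shows up as a sudden jump in the null rate.
if not quality_gate({"row_count": 1008, "null_rate": 0.45}, HISTORY, alert=print):
    print("Downstream publish skipped.")
```

The gate runs before the batch is published, so a failed check stops the data at the point of detection instead of after it has reached reports.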

Key features

Self-service data catalog

Easily find any data in the platform and check its current quality status.

Dataset profile

Provide deep insight for each dataset, such as schema, change log, metrics and more.

Lineage dashboard

Show where the data came from, and what other datasets were generated from it.

Data glossary portal

Provide a knowledge base for datasets and a transparent nomenclature for data rules and policies.

Data quality enforcement

Detect data corruption and prevent it from spreading.

Quick alert system

If there is corruption, the support team is notified quickly.

Enterprise-wide scale

Extend beyond the data lake and thoroughly cover all source-of-record systems.

Machine learning

Implement anomaly detection and automate dataset metrics analytics with ML techniques.

How it works

Engagement model

We value a hands-on approach, which usually starts with a deep technical analysis of the client's current data platform. To accomplish this, a hands-on architect or principal engineer joins the team and assesses the architecture. The outcome of the architecture assessment phase is a documented target state and a detailed implementation plan with goals and estimates of the effort necessary to reach them. The implementation phase also includes implementing the required aspects of data governance and data quality on the client platform.

Read more

How to achieve in-stream data deduplication for real-time bidding: a case study
In this case study, we share our experience delivering deduplicated data during in-stream processing for a large-scale real-time bidding (RTB) platform.
Data quality monitoring made easy
Controlling data quality through data monitoring is both affordable and relatively simple to perform. Improve the overall health of your business with data monitoring, data inspection and data cleansing.
