Certified Datasets as Drivers for Data Democratization

We started our journey at Infostrux in November 2020, two months before we officially launched the business in January. We came up with many theories and hypotheses about the market and customer needs, and we felt strongly that data engineering is a challenging problem for many businesses. We correctly predicted there would be great interest in such expertise. However, we didn’t anticipate what we have learned in the four months since January.

The punchline is that customers want to make sense of their data fast! The engineering effort to bring the data together under one platform is important, and the expertise to do that reliably is difficult to find. However, many technologies exist to make that work more manageable, and patterns and best practices have been developed to automate the process using data lakes and data pipelines. In the cloud, customers have grown used to the advantages of virtually infinite scalability and performance, and they expect the same with data. Spending months building the infrastructure for ingesting data from multiple sources, writing and deploying ETL scripts, and configuring data lakes and warehouses is not what they want to see. They want to move to value quickly, but one big hurdle stands in the way.

The next step, integrating and modelling the data, is a real challenge with no easy shortcuts. It requires a thorough understanding of the data from the various sources used by the organization, along with a deeper understanding of the business operations and how multiple departments and stakeholders use data. It is a collaborative effort that requires the coordination of various groups of people. Effective communication, use of precise language, and the ability to cut through organizational divides are some of the softer skills needed to bridge the gap between a random pile of data that is seemingly clean, but that only a few can effectively use, and carefully curated datasets. Such datasets democratize access to information by lowering the barrier for anyone with decent knowledge of SQL or a common BI and analytical tool to start gaining insights from that information.

At Infostrux, we come to data from the bottom up. We like solving the plumbing problems that typically plague BI and analytics projects. We want to remove undifferentiated problems for our customers by letting automation do the heavy lifting. We realize we’re in a unique position to work with a variety of data sources that many businesses commonly use. This allows us to build an intimate knowledge of the data in those sources and to understand how most customers use them. What we didn’t quite expect is that virtually every customer we talk to would ask for help with integrating and modelling their data, and with giving different parts of the organization access to reliable datasets they can work with directly.

By accelerating the process of ingesting data from multiple sources and bringing it all together under one platform (in our case, Snowflake), we’re giving many of our customers the opportunity, for the very first time, to access all of their data at once. This is driving the appetite for making sense of that data right where it is hosted, so it can be shared directly with analysts to run their own processing and generate insights without the data models being locked inside reports and dashboards. Curating reliable datasets that anyone can work with unlocks a lot of value and creates many data power users within the organization. Data models are becoming the interface through which data is democratized and through which investments in new capabilities like advanced analytics or data science are enabled.

As a business, we focus on solving problems. Having access to reliable, curated, and trusted datasets is a common problem holding organizations back from moving towards becoming data-driven. We’re happy to be able to help remove that problem for our customers.

Goran Kimovski 
