Covid-19 changed everything: where we work, how we communicate, how we travel and shop, what we buy, and how much of it. Many companies are struggling to cope with these changes. Suddenly, well-known operating models and strategies no longer work.
One thing, however, didn’t change—the importance of using up-to-date, trustworthy data to support strategic and operational decisions. In light of all the market changes, the challenge today is how to build data strategies and deploy modern data solutions on a smaller budget.
When working with our clients on several different data projects, we’ve noticed that they often suffer from:
- lengthy development
- an overloaded IT department
- low-quality data
- ungoverned shadow IT
As a consequence of these issues, it’s difficult to access up-to-date, relevant data, making it even more difficult to make timely and accurate business decisions.
Working together with our clients to overcome these very issues, we’ve created a method that enables us to efficiently deliver data solutions in line with our approach to DataOps.
DataOps—not just another buzzword
What does DataOps mean to us? Well, we see it as an overall approach to data analytics that enables you to make precise, informed decisions across your entire organisation. It combines three key dimensions: processes, tools, and people.
To be able to use data in your day-to-day operations—whether that concerns daily product ordering, your sales strategy for the next quarter, or HR matters—you must be able to trust it.
Even slight inconsistencies in data points can influence the overall outcome and lead to false conclusions. Therefore, a strong emphasis on data quality processes during development and automatic data monitoring and alerting after the solution goes live is crucial. If you have trust in your data management, you’ll be able to automate a significant portion of your decision-making processes—leaving the bulk of the work to the algorithms and freeing up your employees to focus on more strategic tasks. For instance, data automation and AI can be applied to streamline supply chain management, online marketing, fraud detection, and much more.
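As an illustration, the kind of automated data monitoring and alerting described above can be sketched as a simple batch check. The field names, thresholds, and inline records below are assumptions made for the example, not any particular tool’s schema:

```python
from datetime import date, timedelta

# Hypothetical quality rules; both thresholds are assumptions for this sketch.
MAX_NULL_RATE = 0.05               # alert if more than 5% of a column is missing
MAX_STALENESS = timedelta(days=1)  # alert if the latest load is older than a day

def quality_alerts(rows, today, required_fields=("order_id", "amount")):
    """Return a list of alert messages for a batch of loaded records."""
    alerts = []
    if not rows:
        return ["no rows loaded"]
    # Freshness check: compare the newest load date against today.
    latest = max(r["load_date"] for r in rows)
    if today - latest > MAX_STALENESS:
        alerts.append(f"stale data: last load on {latest}")
    # Completeness check: null rate per required column.
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) is None)
        rate = nulls / len(rows)
        if rate > MAX_NULL_RATE:
            alerts.append(f"{field}: {rate:.0%} missing")
    return alerts

# Inline stand-ins for records loaded from a source system.
rows = [
    {"order_id": 1, "amount": 9.5, "load_date": date(2021, 3, 1)},
    {"order_id": 2, "amount": None, "load_date": date(2021, 3, 1)},
]
alerts = quality_alerts(rows, today=date(2021, 3, 3))
```

In a live solution, checks like these would run on a schedule and route their alerts to a notification channel rather than being inspected by hand.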
The new GDPR restrictions can seem so intimidating that clients prefer to keep their data in silos, missing out on the benefits of integrating it with other sources. To securely integrate your data with various data sources—and reap the resulting benefits—the key is a good understanding of available solutions and security limitations, plus a proper set of process rules that can be applied across your entire organisation. Many clients don’t realise how much can be achieved with their data without coming anywhere near stored personal information.
Last but not least, the process of delivering data projects might be considered a challenge. Building a corporate data warehouse might seem like a years-long project. However, with the right approach to iterative delivery and agile principles, a BI team can deliver an MVP in just a few weeks, enabling you to take advantage of your data as soon as possible.
There’s a reason why ‘DataOps’ sounds similar to ‘DevOps’—DevOps is a crucial part of DataOps. Automated CI/CD (Continuous Integration/Continuous Delivery) processes are often neglected by BI providers. It’s a shame, because such tools significantly shorten the development cycle in data projects and help govern the quality of delivery.
While we’re on the topic of CI/CD, it’s worth mentioning test automation. Running automated regression tests that compare newly developed versions of your reports against their production versions can greatly accelerate the testing process. Doing so will enable you to check hundreds of reports every time your BI tool is upgraded to a newer version. The same efficiency gains can be achieved by running automated tests when migrating a system to a new database engine.
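To make the idea concrete, here is a minimal sketch of such a regression check, assuming each report version’s numeric results have already been exported as dictionaries; in practice they would come from the BI tool’s API or exported files:

```python
TOLERANCE = 1e-6  # allow for floating-point noise between engine versions

def diff_reports(prod, candidate):
    """Compare two {report: {measure: value}} snapshots; list discrepancies."""
    issues = []
    for report, measures in prod.items():
        new = candidate.get(report)
        if new is None:
            issues.append(f"{report}: missing in candidate")
            continue
        for measure, value in measures.items():
            if measure not in new:
                issues.append(f"{report}/{measure}: missing measure")
            elif abs(new[measure] - value) > TOLERANCE:
                issues.append(f"{report}/{measure}: {value} -> {new[measure]}")
    return issues

# Production snapshot vs. the newly developed version of the same report.
prod = {"sales_daily": {"revenue": 1250.0, "orders": 42}}
candidate = {"sales_daily": {"revenue": 1250.0, "orders": 41}}
issues = diff_reports(prod, candidate)
```

Run across hundreds of exported reports in a CI pipeline, a check like this turns a tool upgrade from weeks of manual clicking into a single automated job.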
Data reconciliation is crucial for every data project. End-to-end testing—from the source system to a data point shown on the dashboard—helps to build trust in the data and assures users that it’s reliable.
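A reconciliation check of this kind can be as simple as comparing per-day row counts and totals on both ends of the pipeline. The hard-coded result sets below are stand-ins for the real source-system and warehouse queries:

```python
def reconcile(source_totals, warehouse_totals):
    """Compare per-day (row count, amount sum) pairs; return mismatched days."""
    mismatches = []
    for day in sorted(set(source_totals) | set(warehouse_totals)):
        src = source_totals.get(day)
        wh = warehouse_totals.get(day)
        if src != wh:
            mismatches.append((day, src, wh))
    return mismatches

# Each value is (row count, sum of amounts) for one business day.
source_totals = {"2021-03-01": (120, 5400.0), "2021-03-02": (98, 4710.5)}
warehouse_totals = {"2021-03-01": (120, 5400.0), "2021-03-02": (97, 4660.5)}
mismatches = reconcile(source_totals, warehouse_totals)
```

A daily report of zero mismatches, published where end users can see it, goes a long way towards building the trust in dashboards that the paragraph above describes.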
Machine learning models
The same rules apply when we employ Machine Learning (ML) models in our data solutions—taking advantage of automation can help produce better outcomes.
For a Data Scientist, it may be easier to develop prototypes in Jupyter Notebooks, but a mature approach to the development of such solutions cannot be based purely on this. What is needed is a balance between experimentation and operationalisation; a balance that iteratively produces an increasingly automated environment. An automatically deployable solution can help you benefit from frequent and clear experimentation, effective tests, and a painless go-live.
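One way to move beyond notebook-only development is to split the workflow into plain, callable stages that run equally well from a notebook, a scheduler, or a CI job. The sketch below assumes that structure; the one-variable least-squares model and the inline data are placeholders for a real algorithm and training set:

```python
def load_data():
    # Stand-in for reading features from the warehouse or feature store.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.1, 3.9, 6.2, 7.8]
    return xs, ys

def train(xs, ys):
    # Ordinary least squares for y = a*x + b, as a placeholder model.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b = my - a * mx
    return a, b

def evaluate(model, xs, ys):
    # Mean absolute error, used as a simple acceptance gate before go-live.
    a, b = model
    return sum(abs((a * x + b) - y) for x, y in zip(xs, ys)) / len(xs)

xs, ys = load_data()
model = train(xs, ys)
mae = evaluate(model, xs, ys)
```

Because each stage is an ordinary function, the same code the Data Scientist experiments with can be wrapped in automated tests and deployed without a rewrite.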
Proper training and monitoring
Building an enterprise data solution shouldn’t end with its rollout. Current BI solutions enable citizen development to bring data closer to the business; at the same time, however, this can cause the uncontrolled growth of unsupported reporting solutions.
Therefore, before empowering users to build their own dashboards, it’s essential to conduct training focused on the BI tool’s possibilities and the risks associated with incorrect usage. Nonetheless, even once users are trained, it’s still best to monitor their activity.
Providers will often give you a package of reports called ‘BI on BI’. These reports cover aspects such as user behaviour monitoring, report usage, traffic on data sources, etc. This kind of monitoring, combined with active and close cooperation between business stakeholders and IT, is crucial to ensure that your data system remains stable over time.
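As a sketch of what such monitoring might look like under the hood, the snippet below scans a hypothetical report-usage log and flags reports nobody has opened recently. The log format and the 30-day threshold are assumptions for the example, not any vendor’s actual schema:

```python
from datetime import date

STALE_AFTER_DAYS = 30  # assumed cut-off for declaring a report unused

def unused_reports(view_log, all_reports, today):
    """Return reports with no views in the last STALE_AFTER_DAYS days."""
    # Find the most recent view date per report.
    last_view = {}
    for entry in view_log:
        name, day = entry["report"], entry["date"]
        if name not in last_view or day > last_view[name]:
            last_view[name] = day
    # Flag reports never viewed, or not viewed within the threshold.
    stale = []
    for name in all_reports:
        seen = last_view.get(name)
        if seen is None or (today - seen).days > STALE_AFTER_DAYS:
            stale.append(name)
    return sorted(stale)

view_log = [
    {"report": "sales_daily", "date": date(2021, 3, 1)},
    {"report": "hr_headcount", "date": date(2021, 1, 5)},
]
stale = unused_reports(
    view_log, {"sales_daily", "hr_headcount", "orphaned"}, today=date(2021, 3, 3)
)
```

Reviewing a list like this with business stakeholders is a simple, regular way to prune abandoned citizen-developed reports before they clutter the platform.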
Data in the cloud
Many companies are deciding to move their data hubs to the cloud. Data volumes are growing, more and more data sources are being made available, and on-premises data centres are becoming a bottleneck. Servers are too slow to process real-time data, disk space is running out, and ordering new hardware takes months. The alternative is to leverage data services in the cloud.
Of course, migrating to the cloud is rarely just a matter of applying the ‘Lift and Shift’ approach to your existing data solutions. Doing so requires a deep understanding of available architectures and services, a prepared migration strategy, and the ability to forecast expected costs. As data normally comprises crucial information concerning customers, strategies, financial results, etc., its security becomes a top priority. Fortunately, cloud services follow the best security standards, which include data encryption, row-level security, geo-replication, automatic retention, and private networks.
A well-prepared and executed cloud migration, fully supported by test and data migration automation tools, will result in a fit-for-purpose data solution. Such a solution will be prepared for both usage and data flow peaks and will have a lower total cost of ownership.
However, not every tool makes sense everywhere and every time—this, of course, depends on the solution you need. For instance, setting up a CI/CD process for a single report that will only be used for a few weeks would be overkill. Therefore, we recommend adapting the technologies used to the project’s scale and requirements.
Leading BI tools put a strong emphasis on citizen development. While their learning curve is gentle, they still provide comprehensive tools for data management, data preparation, and modelling.
However, when many individuals within an organisation use self-developed dashboards and reports, maintaining optimal governance and quality can be challenging. Therefore, it’s crucial to have experienced business stakeholders with analytical skills, who are proficient at using a particular BI tool (but who can also count on the support of the IT department, if needed).
The right rollout of a BI solution usually includes citizen development training, but it should also include a set of rules concerning data governance and how to cooperate with IT. Choosing where data will be processed and who will be responsible for data modelling will be a case-by-case decision; however, these issues have to be thought through at the beginning of the process. Doing so at the start of a project will help you prevent your BI tool from becoming cluttered and non-functional.
Data can be one of your most precious assets, therefore, it should always be processed and presented in line with the highest standards of quality.
Applying a DataOps approach to your processes, tools, and employee competencies can help you to achieve those standards.
Moreover, with DataOps, you’ll gain access to crucial information quickly and efficiently, allowing you to use your data right when you need it, and before it becomes outdated.