Will active metadata eat the orchestrator?
Leave orchestration to the engineers, and let the business feast on its metadata
This is essay #3 in the Symposium on "Is the Orchestrator Dead or Alive?" You can read more posts from Louise on Medium.
The world of data management is evolving rapidly, and one of the latest trends gaining traction is active metadata.
Active metadata refers to a dynamic approach to managing and using metadata: it flows across the different parts of the data stack and reaches people in the tools they are already using. Active metadata is continuously updated, processed, and integrated into various tools and workflows. This notion is often contrasted with static metadata, the traditional practice of storing metadata in a static data catalog.
Now, I’ve heard the rumor that active metadata is a play by data catalogs to replace the orchestrator.
Where is this coming from?
Well, a data orchestrator is a software solution or platform responsible for automating and managing the flow of data across different systems, applications, and storage locations. Sounds similar to the description of active metadata? Indeed. Is active metadata trying to steal the spotlight?
I think not. Why is that? Active metadata has better things to do. It’s hungry for something else.
In this piece, I’ll explore the extent to which active metadata could eat the orchestrator. It turns out that active metadata can eat half of the orchestrator, but it cannot eat it entirely. More importantly, active metadata should not want to eat the orchestrator. In fact, it’s hungry for business users.
What Exactly is Active Metadata Eating?
Data orchestration involves managing complex workflows and ensuring that the right data is available, transformed, and integrated as needed.
This process can be broken down into two interconnected components: the “triggering mechanism” and the “optimization intelligence.”
The triggering mechanism is in charge of starting data orchestration workflows based on factors like time, events, or changes in the data.
The optimization intelligence, on the other hand, is responsible for optimizing the triggering mechanism. This includes deciding which workflows to execute, the order of execution, and how to transform and route the data by analyzing the metadata.
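To make the split concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular orchestrator is built: the `Workflow` class, the `triggered` and `plan` functions, and the workflow names are all hypothetical. The triggering mechanism selects which workflows should start for a given signal, and a greatly simplified stand-in for the optimization intelligence orders them so that dependencies run first.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Workflow:
    """Hypothetical workflow description, used only for this sketch."""
    name: str
    depends_on: list[str] = field(default_factory=list)
    trigger: str = "time"  # one of "time", "event", "data_change"


def triggered(workflows, signal):
    """Triggering mechanism: select the workflows that react to this signal."""
    return {w.name for w in workflows if w.trigger == signal}


def plan(workflows, to_run):
    """Optimization intelligence (greatly simplified): order the selected
    workflows so that each one runs after the workflows it depends on."""
    graph = {
        w.name: [d for d in w.depends_on if d in to_run]
        for w in workflows
        if w.name in to_run
    }
    return list(TopologicalSorter(graph).static_order())


workflows = [
    Workflow("load_orders", trigger="data_change"),
    Workflow("clean_orders", depends_on=["load_orders"], trigger="data_change"),
    Workflow("daily_report", depends_on=["clean_orders"], trigger="time"),
]

selected = triggered(workflows, "data_change")
order = plan(workflows, selected)
print(order)
```

A real orchestrator does far more in the planning step (retries, resource allocation, routing, transformation logic), which is exactly the part the rest of this section argues is hard to replace.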
Active metadata, with its real-time updates and integration into various tools and workflows, has the potential to replace the triggering mechanism aspect of data orchestrators.
It can automate the initiation of workflows, provide real-time updates for the efficient triggering of data processes, and create context-aware workflows that dynamically adapt to changes in the data landscape.
By replacing the triggering mechanism component of data orchestrators, active metadata can offer enhanced automation, better integration with existing tools, and more efficient collaboration between teams. Isn’t this lovely?
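As a rough sketch of what metadata-driven triggering could look like, the Python snippet below routes metadata change events to the workflows that subscribed to them. Everything here is an assumption made for illustration: `MetadataEvent`, `MetadataTriggerBus`, the event kinds, and the workflow names do not come from any specific catalog or orchestrator.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEvent:
    """Hypothetical metadata change notification."""
    asset: str
    kind: str  # e.g. "freshness_updated" or "schema_changed"


class MetadataTriggerBus:
    """Hypothetical registry mapping metadata events to workflow names."""

    def __init__(self):
        self._subscriptions = defaultdict(list)

    def subscribe(self, asset, kind, workflow):
        """Register a workflow to start when this kind of event hits the asset."""
        self._subscriptions[(asset, kind)].append(workflow)

    def publish(self, event):
        """Return the workflows that should be started for this event."""
        return list(self._subscriptions.get((event.asset, event.kind), []))


bus = MetadataTriggerBus()
bus.subscribe("orders", "freshness_updated", "refresh_revenue_dashboard")
bus.subscribe("orders", "schema_changed", "revalidate_downstream_models")

to_start = bus.publish(MetadataEvent("orders", "freshness_updated"))
print(to_start)
```

Note that the bus only decides when something should start. It says nothing about ordering, transformation, or routing, which is where the orchestrator comes back in.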
However, active metadata cannot completely replace the second component of the orchestrator: optimization intelligence.
This is because data orchestrators still play a vital role in handling complex data processing, transformation, and integration tasks that require their expertise and capabilities.
Optimization intelligence is essential in determining how to process and move data between different systems and applications, ensuring that the right data is available in the right format at the right time.
Active metadata, while powerful, cannot fully replicate the sophisticated decision-making and processing abilities of the orchestrator.
Optimization intelligence remains an integral part of the orchestrator's role, as it handles the more complex and nuanced tasks of data processing, transformation, and integration.
This means that while active metadata can streamline certain aspects of data orchestration, the orchestrator continues to be crucial in managing the overall data workflows and transformations.
So, active metadata cannot replace the orchestrator. But the real question is: should it want to? The answer is no.
Active Metadata is hungry for something else
Active metadata is a technology that can tackle different use cases. And we do not think its true value lies in replacing the orchestrator.
At Castor, we believe that active metadata should pursue a much more rewarding goal: empowering business users. And it can do so by providing them with data context right where they are working.
Active metadata shouldn’t seek to eat the orchestrator. Instead, it should strive to become a personal data assistant that enhances collaboration, decision-making, and overall data literacy within the organization.
Active metadata can help business users make more informed decisions by offering necessary data context within their existing tools and workflows.
This context includes data definitions, lineage, relationships, and quality indicators, enabling users to better understand the data and make decisions based on accurate and up-to-date information.
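For illustration, the kind of context described above could be carried in a small record like the one sketched below and rendered as a short card inside whatever tool the business user is working in. The `AssetContext` class, its fields, and the example values are hypothetical and do not reflect any specific product's data model.

```python
from dataclasses import dataclass, field


@dataclass
class AssetContext:
    """Hypothetical bundle of data context for one asset."""
    name: str
    definition: str
    upstream: list[str] = field(default_factory=list)            # lineage
    quality_checks: dict[str, bool] = field(default_factory=dict)  # check -> passed?

    def summary(self) -> str:
        """Render a one-line card suitable for a tooltip or chat message."""
        passed = sum(self.quality_checks.values())
        total = len(self.quality_checks)
        lineage = ", ".join(self.upstream) or "no recorded upstream sources"
        return (
            f"{self.name}: {self.definition} "
            f"Built from: {lineage}. "
            f"Quality checks passed: {passed}/{total}."
        )


ctx = AssetContext(
    name="monthly_revenue",
    definition="Sum of paid invoices per calendar month.",
    upstream=["invoices", "payments"],
    quality_checks={"not_null": True, "fresh_within_24h": False},
)

card = ctx.summary()
print(card)
```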
Furthermore, active metadata can foster better collaboration and communication across the organization by integrating metadata into the tools and platforms that stakeholders already use as part of their workflow.
This seamless integration can bridge the gap between technical and non-technical team members, creating a more efficient and cohesive data-driven culture where everyone has access to the information they need to work effectively.
Finally, active metadata can play a crucial role in supporting data governance and compliance. By automating tasks such as data lineage tracking and data quality monitoring, active metadata can help organizations maintain control over their data and reduce the risk of non-compliance with regulatory requirements.
For these reasons, we maintain that the true value of active metadata lies in fostering innovation and growth within organizations.
By making it easier for employees to access, understand, and analyze data, active metadata empowers them to identify new opportunities, optimize processes, and develop innovative solutions to business challenges.
With a better understanding of data context and more efficient collaboration, organizations can leverage active metadata to drive innovation, stay competitive, and achieve their strategic objectives.
In summary, even though active metadata can potentially influence the triggering mechanism aspect of data orchestrators, its real value is in its ability to act as a personal data assistant that enhances data literacy, collaboration, and decision-making within the organization.
By focusing on these more valuable goals, active metadata can bring tangible business value by empowering users to make the most of their data assets. We think the end game of active metadata is far more powerful than simply replacing orchestration, as it helps organizations unlock the full potential of their data and drive meaningful results.
Conclusion
In the ever-evolving landscape of data management, active metadata has emerged as a powerful force, offering new possibilities for streamlining data processes and empowering business users. While it may appear that active metadata has the potential to replace certain aspects of the orchestrator, particularly the triggering mechanism component, it should be put at the service of a different use case.
Active metadata's ultimate goal should be to serve as a personal data assistant, enhancing data literacy, collaboration, and decision-making across the organization. By focusing on these valuable objectives, active metadata not only complements the role of the orchestrator but also brings tangible business value by enabling users to maximize the potential of their data assets.
By embracing the possibilities offered by active metadata and leveraging its unique strengths, businesses can stay ahead of the curve, foster a data-driven culture, and achieve their strategic objectives in an increasingly competitive landscape.
So, while active metadata might have an appetite for some aspects of data orchestration, its true hunger lies in empowering business users to make better use of data.
Great post @louise! A natural question for me is -- what do you think about the reverse formulation -- that the orchestrator could eat "active metadata"? Dagster, dbt Cloud, and Airflow, to name a few examples, all collect a lot of "asset / model / dataset" metadata and make it available via APIs. If they began exposing business-user-friendly interfaces, do you think they would be competitive with active metadata tools like Castor and Atlan?