Snowflake invests in Metaplane to solve data quality issues plaguing AI development
Today, Snowflake announced an investment in Metaplane, a Boston-based startup helping enterprises identify and rectify data quality issues with an end-to-end AI-powered platform.
While the amount invested remains undisclosed, Snowflake says that the backing will result in tighter integration between Metaplane’s data observability offering and the Snowflake data cloud, allowing users of the latter to keep a closer eye on the information powering their downstream projects, including AI applications.
Metaplane, which takes on heavily funded players like Monte Carlo and Acceldata, will also launch a native app for the data platform, Snowflake has confirmed. Notably, the move marks Snowflake’s fifth investment of the year and the second one in the observability domain. Back in March, the company backed Observe, which analyzes telemetry from enterprise applications and provides users with relevant context to quickly identify and resolve incidents.
Today, data is the driving force of modern business applications, including RAG-based AI chatbots, but most organizations are struggling to keep their data quality affairs in order. There’s just so much information, spread across siloed systems, databases and applications, that teams are finding it difficult to keep tabs on everything to identify issues and abnormalities. The complex pipelines can sometimes leave teams with hundreds or even thousands of sources to wrangle.
Founded by MIT graduate Kevin Hu, former HubSpot engineer Peter Casinelli and ex-Appcues developer Guru Mahendran, Metaplane solves this problem by applying AI at different layers of the data stack, right from ingestion to consumption.
The platform integrates with tools across the data stack, such as Fivetran, Snowflake, BigQuery, dbt, Airflow and Tableau, and trains a machine learning (ML) model on the entire data profile, covering historical metadata, lineage and logs. Once the training is complete, it automatically starts flagging data anomalies (even schema changes) according to the monitors set up by the user.
Hu previously told VentureBeat that these monitors can be set up in about 15 minutes to keep an eye on data quality metrics such as freshness, row count, uniqueness and nullness. Alerts, meanwhile, go directly to the relevant data teams on their preferred channels.
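To make those four metrics concrete, here is a minimal Python sketch of how they could be computed for one column of a table and compared against fixed thresholds. It is an illustration only, not Metaplane's implementation: the function names `profile_column` and `find_violations`, the row format and the thresholds are all hypothetical, and the fixed limits stand in for the learned, ML-based anomaly detection described above.

```python
from datetime import datetime, timedelta, timezone


def profile_column(rows, column, timestamp_column, now):
    """Compute four simple data quality metrics for one column.

    `rows` is a list of dicts (a hypothetical stand-in for a table).
    """
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    timestamps = [
        row[timestamp_column]
        for row in rows
        if row.get(timestamp_column) is not None
    ]
    return {
        # Row count: how many records the table holds.
        "row_count": len(rows),
        # Nullness: fraction of rows where the column is missing.
        "nullness": (len(values) - len(non_null)) / len(values) if values else 0.0,
        # Uniqueness: distinct non-null values over all non-null values.
        "uniqueness": len(set(non_null)) / len(non_null) if non_null else 1.0,
        # Freshness: time elapsed since the most recent update.
        "freshness": (now - max(timestamps)) if timestamps else None,
    }


def find_violations(metrics, max_staleness, max_null_fraction):
    """Return a message for each metric that breaches its fixed threshold."""
    alerts = []
    if metrics["freshness"] is not None and metrics["freshness"] > max_staleness:
        alerts.append(f"stale data: last update {metrics['freshness']} ago")
    if metrics["nullness"] > max_null_fraction:
        alerts.append(f"too many nulls: {metrics['nullness']:.0%}")
    return alerts


if __name__ == "__main__":
    now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    sample = [
        {"email": "a@example.com", "updated_at": now - timedelta(hours=5)},
        {"email": None, "updated_at": now - timedelta(hours=2)},
    ]
    metrics = profile_column(sample, "email", "updated_at", now)
    for alert in find_violations(metrics, timedelta(hours=1), 0.1):
        print(alert)
```

In a production tool, the thresholds would more plausibly be derived from each metric's history than hard-coded, which is where a trained model comes in.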
Now, with the investment from Snowflake, Metaplane will deepen its integration with the Snowflake data cloud, covering even more telemetry and metadata on the platform. This, Snowflake said, will include entire data pipelines as well as app capabilities such as Snowpark, Snowpark Container Services, Snowflake Native Apps and Streamlit.
The work will eventually enable Snowflake customers to closely monitor the quality of their data assets as they move through the different stages of the pipeline to power downstream applications. If anything breaks across applications or data, Metaplane will notify users of the issue, along with its root cause and the most appropriate way to address it.
While it remains to be seen exactly when this deeper integration will go live, Snowflake says its partnership with Metaplane will also see the startup launch a native app version of its platform on the data cloud. This will allow users to deploy and manage Metaplane directly within their Snowflake instance, without having to connect Snowflake, as is the case with other data tools.
“That opens the door to even richer experiences and will allow customers to take full advantage of Metaplane without having to move or copy their data outside of the secure, governed environment of their Snowflake account,” Ashwin Kamath and Harsha Kapre, who both handle product management at Snowflake, wrote in a joint blog post.
Snowflake’s racing to embrace the age of AI
Ever since Sridhar Ramaswamy took over as CEO, Snowflake has aggressively pursued AI at its intersection with data, aiming to better compete with Databricks, which has been focused on AI from its early stages.
Last year, at its Snowday event, the company launched Cortex, a fully managed service to build gen AI apps with information stored in the data cloud. Over the subsequent months, it roped in several open-source AI vendors, including Mistral and Reka, to offer their models on Cortex and help teams build apps targeting different use cases. The company even trained Arctic, its own large language model (LLM) optimized for complex enterprise workloads such as SQL generation, code generation and instruction following, and launched a copilot experience to help users understand and explore their data.
Before Metaplane, the company invested in four other companies to bolster its data and AI efforts. These were Coda, Coalesce, Observe and Landing AI.