Here’s a statistic that should unsettle anyone who has poured money into data infrastructure: according to an SAP survey, 72% of business leaders don’t trust their own data. Decades and billions of dollars into the enterprise data era, the industry’s long-standing promise of a “single version of truth” has never fully materialized. But that’s precisely the problem Daniel Yu is trying to solve.

Yu spent three years leading product marketing for Microsoft’s Azure data and AI portfolio before joining SAP in 2021. He now serves as Chief Marketing Officer for SAP Data and AI and Senior Vice President of Solution Management and Product Marketing, where he oversees Business Data Cloud.

His argument is that companies have been solving the wrong problem, chasing database consolidation when they should be pursuing something more achievable: creating shared context and consolidating the “definition of truth,” not the data itself.

In conversations at SAP Connect and in the lead-up to TechEd, Yu laid out SAP’s vision for SAP Business Data Cloud (BDC) as the connective layer across a fragmented data landscape — one that now extends to Snowflake, Databricks, and Google, with more partnerships on the way.

Q: What differentiates Business Data Cloud from other data solutions, including previous SAP offerings?

My background has been in databases for the last 20 years. Data management is a topic that every company has invested in. We had a very different point of view on which problem to solve. We didn’t just go and say, “Hey, I have a better database.” That’s not the BDC pitch. What we want to provide is better data, full stop.

A data strategy is not a database strategy. When I ask CIOs what their data strategy is, sometimes I hear, “I have a data lake,” or a warehouse, or a lake house. That’s not your data strategy. Your data strategy should be: can my business understand what I can provide and make better decisions?

That whole mantra of “I have a lot of data, let me put it into one location so that I have a single version of truth” — I’ve never seen one work. What we are trying to do with BDC is provide a way you can consolidate the definition of truth, which means we care much more about the context of data, the metadata, the knowledge graph that comes out of that, whether data is from SAP or not.

Q: For organizations with disjointed, siloed data estates, what are the next steps to start moving toward BDC?

We’ve actually had tremendous growth since BDC launched. In fact, it’s one of the fastest-growing products ever in the history of SAP. On the question of how to get started, the bad news is you’re never going to be done with your data strategy. It’s always ongoing.

I just came back from Europe, and I did a roundtable with different tech leaders, and everybody said the same thing. The best advice I can give to my peers is to start somewhere and focus on one use case. Get that done, and then do the other one.

A lot of the customers that start with BDC are starting by migrating their SAP Datasphere plus SAP Analytics Cloud (SAC) to BDC, or most likely, moving their SAP Business Warehouse (BW)—which already has tons of logic in place—into BDC. This is the quickest path for CIOs to really showcase what I call “AI-ready data.” AI-ready data is not rows and columns; it’s rows and columns with the semantics and metadata that give the context.
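
As a rough illustration of that distinction, the sketch below shows a plain table next to the same table wrapped with the semantics and metadata that give it business context. The structure and field names are illustrative assumptions, not an actual BDC data product schema.

```python
# Illustrative only: the same rows, first as bare data, then wrapped with
# the semantics and metadata that make them "AI-ready". Field names are
# hypothetical and not an SAP BDC schema.

raw_rows = [
    {"cust_id": 1001, "amt": 2500.0, "cur": "EUR", "dt": "2025-01-15"},
    {"cust_id": 1002, "amt": 1800.0, "cur": "USD", "dt": "2025-01-16"},
]

ai_ready_data_product = {
    "name": "sales_orders",
    "description": "Confirmed sales orders from the order-to-cash process",
    "source_system": "ERP",                # where the rows originate
    "semantics": {                         # business meaning per column
        "cust_id": "Customer number, joinable to the customer master",
        "amt": "Net order value in document currency",
        "cur": "ISO 4217 currency code of the order",
        "dt": "Order creation date (ISO 8601)",
    },
    "rows": raw_rows,
}
```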

Q: Can you share SAP’s progress in developing data products?

From a pure numbers standpoint, we have hundreds of data products. But I don’t want to give you the impression that more data products are better, because there might be one data product—say, customer signals—that contains so much richness. That could be the most valuable thing you can do for the finance team, the marketing team, and the supply chain team. So it’s not quantity only, but quality.

There are different types of data products. There are what we call primary data products, which are basically processed data: the columns plus the metadata coming from the application. You can also derive data products from those. If I combine sales and marketing data, I can see pipeline data, for example, or the 360-degree view of the customer journey.
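
To make the idea of a derived data product concrete, here is a minimal sketch that joins hypothetical sales and marketing tables into a simple pipeline view. The column names and join logic are assumptions made for the example, not SAP-defined data products.

```python
import pandas as pd

# Hypothetical primary data products; column names are illustrative.
marketing = pd.DataFrame({
    "lead_id": [1, 2, 3],
    "customer": ["Acme", "Globex", "Initech"],
    "campaign": ["Q1-webinar", "Q1-webinar", "Q2-event"],
})
sales = pd.DataFrame({
    "lead_id": [1, 3],
    "opportunity_value": [50_000, 120_000],
    "stage": ["negotiation", "proposal"],
})

# Derived data product: a pipeline view linking marketing touches to the
# sales opportunities they produced (leads with no opportunity stay visible).
pipeline = marketing.merge(sales, on="lead_id", how="left")
print(pipeline)
```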

The data products unlock our agent potential. Agents running on their own, without the context of the value chains, can be very dangerous, or at best very unproductive. If we had not announced BDC early in the year, the agent story would not have been as powerful. This is the flywheel effect, the symbiotic effect of good data feeding agents, creating more processes, feeding applications, and a continuous virtuous cycle.

Q: What was announced at TechEd around data products?

We announced something called SAP BDC Data Product Studio. The Data Product Studio allows you to create data products more easily, combine data products from SAP and non-SAP sources, and create data products from BW, which today is very manual. Companies have maybe thousands of information providers. Rebuilding that will be really hard.

The Data Product Studio will allow you to create data products, enrich data coming from BW within BDC into a catalog, combine data products from SAP systems, and bring in data products from Databricks and Snowflake. All of that will be shared in a single catalog.

Think of it as almost like a factory of data products. It’s a very easy GUI-based tool. There’s no scripting needed. You can drag and drop and combine data products into new data products. We believe that this will save a lot of time and create efficiencies for data teams.

Q: What intelligent applications are available now?

We had already previewed Cloud ERP Intelligence Private, Finance Intelligence, and People Intelligence, which went GA recently. In addition, we have Spend Intelligence, Supply Chain Intelligence, Revenue Intelligence, as well as one vertical: Consumer Products Intelligence. This is for retailers who want to do inventory planning and understand supply and demand.

We don’t expect to create 10,000 intelligent applications. That’s not the point, because there will be agents that do a lot of the work in the background. But we expect a lot of partners to build intelligent applications on top. We announced a bunch of them early in the year with Adobe, McKinsey, and others.

Q: You’ve talked about data products and intelligent applications as the building blocks. BDC Connect has expanded to include Google and Databricks. Why is that significant, and what’s next?

The industry is moving away from this monolithic view of the data—physical data going into a data warehouse or data lake—toward a virtualized view of the data. We call this data fabric. Most organizations around the world, in 10 years, will adopt some type of data fabric pattern. This is not just my point of view; this is Gartner.

Data fabric means you can actually have multiple data systems—BigQuery, BDC, BW, Salesforce, Oracle—but you don’t need to try to move the data around. That’s petabytes of data. Instead, move the definitions and the data products around. That’s the rich value you need to unlock for AI or human thinkers.
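
One way to picture the data fabric idea is that what travels is a reference carrying the definition, while the rows stay in the source system. The sketch below is purely conceptual; the fields and the resolver are invented for illustration.

```python
# Conceptual sketch: a data-product reference carries the definition and
# points at the source system; the underlying rows are read in place
# rather than copied. All names and fields are invented.

data_product_ref = {
    "name": "customer_360",
    "definition": "One row per customer with master data and open orders",
    "source": {"system": "BigQuery", "dataset": "crm", "table": "customer_360"},
    "access": "federated query",  # read where the data lives, no bulk copy
}

def resolve(ref: dict) -> str:
    """Pretend resolver: returns a query target instead of moving data."""
    src = ref["source"]
    return f'{src["system"]}:{src["dataset"]}.{src["table"]}'

print(resolve(data_product_ref))  # -> BigQuery:crm.customer_360
```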

When we talk to customers, they say, “That’s great, but there’s something missing. You have 80% or 60% of our data landscape.” We have announced a solution extension (SolEx) partnership with Snowflake and SAP BDC Connect for Snowflake as well. Even though marketing makes it seem like vendors are fighting each other, the truth is a lot of our customers have all of them — Databricks, SAP, and Snowflake.

We now have a complete set of compute: from the BW world, from the semantic layer of BDC, plus Databricks as an OEM. I think there’s no other company that offers both Databricks as a first-party OEM and Snowflake.

Q: What about AWS and Azure?

We just announced SAP Business Data Cloud (SAP BDC) Connect for Microsoft Fabric.

There’s a huge theme here: we are radically simplifying the data landscape for customers. We don’t say, “Please move your data to my environment,” which has been the de facto narrative from every vendor for the past 30 years. It doesn’t work because customers have multiple environments, multiple clouds. Bringing the ecosystems together is welcome news for a lot of customers.

Q: What are the major HANA Cloud announcements that came out of TechEd?

For HANA Cloud, there were three major announcements. One is MCP support for HANA Cloud. Now, if you have an MCP server, you can use it directly against HANA Cloud for your language models. That’s responding to one of the top requests from customers.

Second, we published a paper with Stanford on GRT, Generative Relational Transformers. It’s the same concept as GPT, but rather than working on text, it works against relational tables. You’re able to do what we call Tabular AI, meaning rows and columns. We now have that capability within HANA Cloud. You can ask questions against a relational database, rather than asking questions based on documents. In the corporate world, you need something much more deterministic than guesses based on emails.
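
GRT itself runs inside HANA Cloud, so the sketch below only illustrates the surrounding pattern: a natural-language question is translated into SQL (here a hard-coded placeholder standing in for the model) and executed against a relational table, returning an exact answer rather than a document-based guess. It assumes the standard hdbcli Python driver and an invented SALES_ORDERS table; the connection details are placeholders.

```python
from hdbcli import dbapi  # SAP HANA Python client (pip install hdbcli)

def question_to_sql(question: str) -> str:
    # Placeholder for whatever model turns the question into SQL;
    # the GRT/Tabular AI capability described above is not reproduced here.
    return (
        "SELECT SUM(NET_VALUE) FROM SALES_ORDERS "
        "WHERE ORDER_DATE >= ADD_DAYS(CURRENT_DATE, -90)"
    )

# Hypothetical connection details and table.
conn = dbapi.connect(address="my-hana-host", port=443,
                     user="DEMO", password="***")
cursor = conn.cursor()
cursor.execute(question_to_sql("What was revenue over the last 90 days?"))
print(cursor.fetchone()[0])  # an exact figure from the relational table
cursor.close()
conn.close()
```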

Third, our Knowledge Graph now supports different custom columns and creates ontologies easily on the fly, without having developers design every single node and column. It’s a productivity improvement that will accelerate the development of agents on top of HANA Cloud.

Q: How does Tabular AI relate to the zero-copy architecture you’ve built with BDC Connect?

I think it’s orthogonal. We support zero-copy. Tabular AI is slightly more nuanced, because ultimately the success of your AI and data projects is based on how much you trust the data.

If I ask ChatGPT to assess a marketing campaign, it’s okay to be in the ballpark. When you want to understand your supply chain or are doing a financial close, where every cent matters, it’s not okay to be in the ballpark. You need the exact thing. That’s why relational tabular AI is important for business; it allows the language model to be used against a real database with foreign keys and primary keys. It understands duplicates and the uniqueness of records.
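
A toy example of why that key awareness matters for the ‘every cent’ cases: a duplicated record quietly inflates a financial total until uniqueness on the primary key is enforced. The data is invented.

```python
import pandas as pd

# Toy invoice table with an accidental duplicate of invoice 4711.
invoices = pd.DataFrame({
    "invoice_id": [4711, 4711, 4712],   # primary key of the table
    "customer_id": [7, 7, 9],           # foreign key to the customer master
    "amount": [1000.00, 1000.00, 250.50],
})

print(invoices["amount"].sum())                          # 2250.50: inflated
deduped = invoices.drop_duplicates(subset="invoice_id")  # enforce the key
print(deduped["amount"].sum())                           # 1250.50: exact
```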

Q: So the long-term vision is BDC as the overarching umbrella, with HANA Cloud and these other pieces underneath it?

Our vision is to provide the broadest and deepest set of compute options within BDC, no matter what environment you’re familiar with. We will eventually bring HANA Cloud into a BDC environment so you can manage those clusters easily.

One of the biggest problems is that the cost for these data platforms is not insignificant. The cost of AI projects and data projects might run number one and number two, after security and governance. In the future, if you have compute power within Databricks, Snowflake, BW, SAC, or Datasphere, you can choose when and how you use it. Then you could say, “If I need zero compute, why run a 24/7 data warehouse in Snowflake? Maybe I can run it in six minutes in BDC.” And I know the cost for each under a single contract, so I can optimize.

That’s our vision: to provide choices for customers under the BDC umbrella, but for IT organizations to make the right choices based on business goals, not technology goals. Technology should be like electricity. You shouldn’t care which provider you use. You just want to connect.

Q: What are the prerequisites for customers to get value from these announcements?

The number one thing I recommend for CIOs or data leaders — it’s very hard to optimize things that you can’t see. If you can’t see where your data is, it’s very hard to optimize or leverage AI. The number one project you probably want to think of is creating a single catalog. A single catalog gives you an inventory of all your data products. It lets you understand lineage: who created it, when, and how it’s been used. You can see which data sets are the most valuable and which can feed into an AI model.
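
As a rough sketch of what a single catalog entry might track (the fields are assumptions, not a BDC or any vendor’s schema), a minimal record covering ownership, lineage, and usage could look like this:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CatalogEntry:
    """Minimal, hypothetical business-catalog record with lineage."""
    name: str
    description: str
    owner: str                  # who created it
    created_at: str             # when (ISO 8601 date)
    source_systems: List[str]   # where the data comes from
    consumers: List[str] = field(default_factory=list)  # how it has been used

entry = CatalogEntry(
    name="customer_signals",
    description="Combined service, sales, and web signals per customer",
    owner="data-platform-team",
    created_at="2025-06-01",
    source_systems=["CRM", "ERP", "Web analytics"],
    consumers=["churn model", "finance dashboard"],
)
```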

I don’t mean a physical catalog that you have to put into one product. Databricks has a catalog, Snowflake has a catalog, we have a catalog. What I mean is you want a single business catalog. A business catalog is different than a technical catalog. A technical catalog gives you the source, author, and data products. That’s not enough.

If you recall our partnership with Collibra, they actually provide a business catalog on top of multiple physical catalogs. They can transform that into language that customers can understand. And Collibra can do governance. You can put rules on top of that. Striving to have a business catalog with governance rules is the same thing you have to do for AI projects anyway.

Q: For customers who aren’t yet building data products or intelligent applications, what are some practical ways to start using AI?

My recommendation would be threefold.

First: use AI to detect patterns. It could be as simple as putting everything into a historical view — not a chart, but historical trends. Human beings detect patterns very quickly, whether the curve is going up or down. Providing data in a way that lets AI analyze historical patterns would be super helpful.

Second: use AI to analyze outliers. You don’t have to run mega crazy projects. Companies can learn a lot by running an outlier agent to see what we perhaps haven’t thought about. Maybe it’s a new opportunity. Maybe it’s revenue leakage we’re not detecting.

Third: put natural language processing on top of the data you already have. That’s the easiest one, because it just unlocks who does what with data. Sometimes technical skills get in the way of finding the answer. Just give people the capability of asking questions on top of data, and it unlocks so much.
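
As a minimal sketch of the second recommendation, the snippet below flags outliers in an invented daily revenue series using a simple z-score check; a real outlier agent would of course be richer, but the principle is the same.

```python
import pandas as pd

# Invented daily revenue series; the last value is an outlier worth a look.
revenue = pd.Series([102, 98, 105, 101, 99, 97, 104, 100, 55],
                    index=pd.date_range("2025-03-01", periods=9))

# Flag days more than two standard deviations away from the mean.
z = (revenue - revenue.mean()) / revenue.std()
print(revenue[z.abs() > 2])   # -> only the 55 on 2025-03-09 is flagged
```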

Some focus on those first three use cases would unlock tremendous value for CIOs and their teams. These are three use cases out of hundreds you might have. But start there, and that will lead you to more questions or things to think about. The old model was, “We’ll just create a pipeline for you, a cube, and a dashboard,” which nobody used. Because by the time I had the question and you built me a dashboard, I’d forgotten about the original question and already had five more.

