Microsoft Opens a New Front in the Fight Over Data for AI Agents
The Takeaway
- Microsoft blocked partners including Databricks from connecting their data management tools to its popular Power BI product.
- Microsoft said it acted over concerns about reliability, while others see it as part of a fight for control of “semantic layer” tools.
- The semantic layer is increasingly vital for making AI agents more accurate and cheaper to run.
Microsoft has opened a new front in the AI data wars, blocking partners from connecting their data management tools to a popular product as it tries to defend its business software strength in the era of AI agents.
The friction centers on Power BI, a Microsoft product nearly all Fortune 500 firms use to analyze data about their operations in charts and other visual formats. Databricks, a longtime Microsoft partner that sells tools for managing data and building AI applications, in early March began testing a new feature that makes it easier for its customers to connect information on its platform to such visualization tools.
Weeks into the testing, Microsoft abruptly blocked the feature, causing the reports customers had built with it to immediately stop working, said two Databricks salespeople and consultants who work with customers.
Although that was not its stated purpose, the feature effectively made it easier for Power BI customers to manage their data and build AI agents in Databricks rather than in Fabric, Microsoft’s competing data management offering. The feature saw widespread adoption, the Databricks salespeople said.
A Microsoft spokesperson said the company’s move was driven by concerns over accuracy and reliability issues stemming from the new feature, not by a change in competitive strategy. Many in the industry saw it as an effort to box out Databricks, Snowflake and others from a data management tool known as a semantic layer, which customers use to standardize definitions for business metrics such as revenue and customers. Semantic layers, which have been around for decades, are now emerging as a way to make AI agents more accurate and cheaper to run.
“Breaking the Databricks connector tells you exactly where Microsoft thinks the next platform war will be fought,” said Shehab Amin, CEO and co-founder of LakeSail, a startup that speeds up data processing for AI agents. “Microsoft’s playbook for the past 30 years has been to control products where customers start their work and then steer them to buy everything else from Microsoft.”
Microsoft’s decision has ramifications for other enterprise tech companies. Database provider Snowflake has thousands of customers that want to use its semantic layer with Power BI, Josh Klahr, the company’s head of product management for AI-powered business information, said via email.
Microsoft and other established business software companies are under increasing threat from AI agents’ ability to automate functions such as making the kinds of reports Power BI produces. They worry that customers might use AI agents to pull their data out of enterprise apps like Power BI and analyze it with AI from other providers.
Fears of such a shift have driven Microsoft’s stock down nearly 25% from its all-time high last year. The company has responded to the threat by adding more AI agent features directly into applications such as Microsoft 365 that may curb customers’ need to use competing agents. It has also previewed a feature called Work IQ that will let customers access data from their Microsoft applications and use it in other apps, such as third-party agents, but it hasn’t said how much it will charge customers for such data usage.
Why Data Labels Are Critical for Agents
AI agents need clear, well-labeled data to function effectively. The value of such data is why some business software providers are limiting access for other companies’ agents to the customer data they host or are setting up tollbooths to collect revenue on that access.
Semantic layers are part of a class of tools for creating context-rich data. These layers could hold the key to developing agents that can accurately handle multistep tasks such as routing invoices and processing new employees. Other such tools include knowledge graphs and ontologies, a philosophical term Palantir has helped popularize in the context of AI work.
Semantic layers help companies deal with the ambiguity of raw information. Sales data, for example, can include gross revenue, net revenue or invoiced revenue. By establishing standard definitions, semantic layers prevent confusion that could arise from different departments like finance and marketing having their own metrics.
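To make the idea concrete, here is a minimal sketch in Python of what a semantic layer does at its simplest: it keeps one shared definition per metric so that every department, or every agent, gets the same answer. The class names, field names and sample rows are hypothetical and invented for illustration. They do not reflect how Power BI, Fabric, Databricks or Snowflake implement their products.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical invoice rows; the field names are assumptions for illustration.
Row = Dict[str, float]


@dataclass(frozen=True)
class Metric:
    """One agreed-upon business definition, stored in a single place."""
    name: str
    description: str
    compute: Callable[[List[Row]], float]


class SemanticLayer:
    """A toy registry that maps metric names to shared definitions."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def define(self, metric: Metric) -> None:
        # Refuse duplicates so two teams cannot register rival definitions.
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def query(self, name: str, rows: List[Row]) -> float:
        # An ambiguous request such as "revenue" fails instead of guessing.
        if name not in self._metrics:
            known = ", ".join(sorted(self._metrics))
            raise KeyError(f"unknown metric {name!r}; defined metrics: {known}")
        return self._metrics[name].compute(rows)


layer = SemanticLayer()
layer.define(Metric(
    name="gross_revenue",
    description="Sum of invoiced amounts before refunds",
    compute=lambda rows: sum(r["amount"] for r in rows),
))
layer.define(Metric(
    name="net_revenue",
    description="Invoiced amounts minus refunds",
    compute=lambda rows: sum(r["amount"] - r["refund"] for r in rows),
))

rows = [
    {"amount": 100.0, "refund": 10.0},
    {"amount": 250.0, "refund": 0.0},
]

gross = layer.query("gross_revenue", rows)
net = layer.query("net_revenue", rows)
```

In this sketch, a finance report and an AI agent that both ask for net_revenue run the same calculation, while a request for plain "revenue" is rejected until someone decides which definition it means.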
At Google Cloud, some customers that use semantic layers to develop agents are seeing conversational accuracy rates of more than 90%, compared with 60% to 70% when they don’t use the layers, said Yasmeen Ahmad, a managing director for the Alphabet unit.
Companies typically build semantic layers using employees from different departments, and it can take months to complete the process. The layers are often highly customized, which makes them an effective way for software providers to retain customers.
“Semantic layers are the new battleground across the entire enterprise stack,” said John “JG” Chirapurath, a software industry veteran who is now president at DataPelago, which sells software to accelerate data processing for AI and analytics computing jobs. “Whoever owns the definition of revenue, customer and orders is positioned to capture AI value, because agents run on definitions, not raw data.”
Microsoft rivals are pushing to make it easier for customers to use semantic layers that aren’t tied to a specific software provider. Snowflake and Salesforce are leading a coalition of nearly 50 companies that aims to develop an industry standard for semantic layers. Other members of the group, known as the Open Semantic Interchange, include Amazon Web Services and Oracle. Microsoft is not a member.
Chris Webb, a Microsoft program manager for Power BI, announced on LinkedIn last month that Microsoft has no plans to let Power BI customers use semantic layers from other companies. Doing so, he wrote in a related blog post, “would be a huge amount of work with few benefits to customers” because other companies’ products behave differently and might cause malfunctions.
The Databricks feature saved its customers from having to separately configure their data to work with Power BI. Databricks designed the feature to work with multiple visualization software products, but the biggest adoption, according to the Databricks salespeople, came from customers wanting to connect it with Power BI. Power BI has more than 35 million monthly active users, the Microsoft spokesperson said. That put the feature in competition with Microsoft’s Fabric suite of database and AI products, which creates semantic layer data for Power BI and other products.
Microsoft is an investor in Databricks, and it works with Databricks and Snowflake to sell and promote their products through its Azure cloud service, even as it competes with them through Fabric. These arrangements are typical in enterprise software, but the battle for control of the semantic layer is upping the ante.
Both Microsoft and Databricks have played down the idea of any tension.
A Microsoft spokesperson said in an email that it made the change because the Databricks feature “introduced complexity and risk to data accuracy, so the change reflects a focus on reliability and long-term product integrity rather than a shift in partnership posture or competitive strategy.” The spokesperson declined to elaborate on the nature of the complexity and risk.
“We’re more committed than ever to our partnership with Microsoft and to helping joint customers accelerate innovation with data and AI,” a Databricks spokesperson said in an email. “That includes making our joint customers successful using Azure Databricks with PowerBI.”