A star is born: a new giant appears in Silicon Valley

“Hi, just wanted to say hi. Can I invest a little more?” Up-and-coming startup leaders are bombarded with texts like this these days. Big funds in particular are scrambling to get a piece of the tech pie. One founder, however, appears to have received more than his share of such pitches: Ali Ghodsi, CEO of Databricks. And he has said yes to many. On August 31, the company confirmed that, just six months after a $1 billion funding round, it had raised another $1.6 billion at a valuation of $38 billion, $10 billion more than after the previous round. Among those familiar with Silicon Valley, these numbers cement Databricks' status as the hottest company of the moment.

The software maker is likely to become known to a wider audience soon. Before the end of the year it is expected to stage the largest initial public offering (IPO) by a software firm, surpassing that of Snowflake, its main rival, in late 2020. Alternatively, some predict that Microsoft could buy it in the largest takeover in the history of the software industry. Whatever the outcome, the hype has substance. Databricks could become, in the age of artificial intelligence (AI), what Oracle and its databases once were in the world of mainstream corporate software: the dominant platform on which applications are built and run.

Databricks was founded in 2013 to commercialize Spark, open-source software that processes vast amounts of data from disparate sources to train the algorithms that become the engines of AI applications. The firm added features, including code that makes the system easier to program and to manage workflows, and offered the package as a cloud-based subscription service.

However, Databricks only really took off when it added another component called a “lakehouse”. It combines two types of database, a “data warehouse” and a “data lake”, hence the name. Historically the two have been kept separate because of technical limitations and because they serve different purposes. Data warehouses are full of well-defined corporate data, which allow a firm to analyze its past, for example how its sales have evolved, something known as “business intelligence” (BI). Data lakes are essentially a dumping ground for all sorts of data that can be used to predict a firm's future, including whether sales are likely to rise or fall. But this separation is increasingly inefficient and unnecessary, explains Max Schireson of Battery Ventures, an investor in Databricks. “Doing BI and AI in different systems today is kind of stupid,” he says.

Firms have been quick to adopt what Databricks has to offer, particularly established companies worried about being overtaken by an AI-based startup. Comcast, a US broadband provider, uses it to let customers select movies by voice; ABN Amro, a Dutch bank, to recommend services; and H&M, a retail chain, to optimize its supply chain.

Databricks now has more than 5,000 customers and annual subscription revenue of $600 million, up 75% year-over-year.