How to Measure an HTAP Data Platform for AI Applications

September 14, 2018

AI Is Mission Critical

Every company’s board is asking its executive team how it is using AI to digitally transform the business. AI has become a game changer. Data scientists are now a mainstay of the corporate analyst landscape, looking for that next actionable insight for the business. But injecting AI into business operations is hard. Transactional databases capture the operational data of the enterprise, but to make automated decisions, AI needs analytics. These AI systems are even more demanding of analytics than traditional decision support systems, requiring massive amounts of data to train models and real-time analytic results to drive real-time predictions.

To digitally transform the business, AI must be real-time. For AI to be real-time, we need real-time analytics.


Imagine a data science team has identified a better model for running product promotions based on what’s happening in the supply chain and distribution channels right now. Wouldn’t it be great to have a real-time look into what’s happening in the business and offer those promotions right when the transactions are happening?

  • What if I wanted to see the status of my supply chain and inventory in a region to determine if I want to offer a promotion today, right now, to drive greater sales?
  • What if my team had built a predictive model designed to identify when my sales were falling behind my supply chain in a particular district, with the goal of ensuring that I never run over on inventory and jack up warehouse storage and inventory carrying costs during the month?
  • What if I wanted to laser focus those promotions at specific channel partners based on their volume to date or hold off on pushing product to partners who themselves are potentially over on their inventories?

Data scientists and machine learning engineers are discovering opportunities like these in managing product availability, credit risk, fraud detection, product consolidation, and more.

But to take full advantage of predictive models the business needs to apply them in real-time, on operational data right when the transactions take place. To get the full benefits, the promotions can’t be built on a forecast or applied after the fact – they need to be based on what’s happening right now.

Big Data Helps Find the Insights—But Then What?

Big data, machine learning, and lots of clever developers building lots of clever technology have created the foundation to make AI a business reality. But applying it in real time is still the challenge—and it’s needed if AI is going to drive digital transformation. In the bifurcated world of OLTP databases and OLAP data stores, that’s going to continue to be a problem.

Having separate transactional and analytic data stores builds in an information latency—the time it takes for transactional data to reach the analytics and drive AI. This latency between separate data systems is exactly what makes injecting AI into business operations hard.

We can focus on cutting the latency by building a faster pipe with high-performance ETL, extracting the transactional data from our Oracle, Postgres, SQL Server, or the like and feeding it into an Amazon Redshift or Snowflake. But that only takes us so far—the latency is reduced, not eliminated. We’re still moving terabytes of data, which takes hours to complete. And this level of ETL involves significant systems engineering and ongoing care for only incremental gains, adding an expense that does little toward the actual goal of getting the business powered by AI.

And if you are operating on petabytes of data, as seen in IoT or web applications, you’ve hit the wall with traditional OLAP/OLTP architectures at another extreme—scale. In these cases, you have to duct-tape operational, scale-out NoSQL systems like Cassandra, HBase, Dynamo, or Redis to in-memory, scale-out analytics systems like Hive or Spark. But these brittle, loosely coupled architectures require scarce and expensive distributed systems engineers to pull the compute engines together and keep them running. Moreover, they give up the tried-and-true features of ACID transactions and full ANSI SQL that application developers have depended on for decades in relational databases.

There has to be a better way.


Enter HTAP

Splice Machine, along with others, is pushing the boundaries and blurring the lines between transactional and analytic data stores to make it easy to power AI applications.

The idea behind Hybrid Transactional/Analytical Processing—HTAP—which brings OLTP and OLAP together in a single data platform, is to eliminate the information latency and move AI directly into business operations. Clearly there are great benefits to getting analytic results in real time on databases of record: that’s how I’m going to help the user and the machine learning models deliver better insights in the moment of the transaction. And it sure would be easier to keep the analytics in the data lake in sync with the business through transactional updates on tables.

Clearly there is an opportunity and need to understand how HTAP architectures will perform and ultimately support AI for the business. What price/performance can be achieved with this approach? Will applications need the same capabilities from the data store or is there an opportunity for different application models? How much does the ability to bring the OLTP and OLAP stores into a single platform cut overall costs and free up time and resources to focus on the central challenge: getting AI inside mission-critical business operations?

Benchmarking HTAP

Benchmarks for OLTP and OLAP have been around for some time now—TPC-C just turned 26 years old this month. But they’ve become less about solutions—“will this database solve my problem, make my business successful, and keep my customers happy?”—and more about assessing a price/performance point. The reality is that in many cases the benchmarking has already been done and you pretty much get what you pay for. If benchmarks factored into your database decision at all, it would be pretty cut and dried: for what I can afford, will that be good enough?

But for HTAP we are at a different stage of discovery. We’re pushing the boundaries of how this technology will help us write applications, or even complete operational systems, in new ways. What does an HTAP data store need to deliver? How transactional does the analytics store need to be, or conversely, how much analytical processing is necessary over transactions? The big price/performance questions for HTAP revolve not just around the throughput for these simultaneous workloads, but around how we can leverage the combination to make our apps better.

Open questions like these highlight one of the most meaningful benefits of benchmarking for the industry and for us actual users: often it’s not just about who’s best (remember, that’s ultimately just price/performance); with this next generation of database technology readily available, it’s more about finding a meaningful metric that translates into value for the business.

If we’re going to have a meaningful dialog about what “best” means and what’s relevant in HTAP, benchmarks are an opportunity to help us frame the landscape. So yes, we could all benefit from yet another benchmark—a benchmark for HTAP—but more importantly, from the discussion that goes with it. This is what I hope will become a series of blog posts on the topic, one that takes us in some interesting directions.

To start off, a quick update on what’s been done so far. A few HTAP benchmarks have been proposed to date, mostly by leveraging what’s already been done in the separate worlds of OLTP and OLAP.1 2 If HTAP is going to do both, shouldn’t it do just as well at both together? Well, not necessarily, but it’s a good starting point, and one such benchmark is the CH-benCHmark, whose name gives it all away: the ‘C’ of TPC-C and the ‘H’ of TPC-H put together in one place. And as a starting point, it does make sense.

The ‘C’ side of the benchmark represents an order management application. Orders are entered, priced, and stock is checked. Shipments are delivered. Payments are made. And the ‘H’ side? It’s a set of roll-up queries representing a range of standard business reporting. TPC-H assumes this data will be extracted from a transactional database of record and measures how fast the analytics data store can turn that extract into meaningful results—the information latency.

But in CH-benCHmark the transactional and analytic workloads run together on a single integrated schema. The OLTP transactions and complex OLAP queries overlap on the same tables, forcing the underlying database engine to run analytic and transactional workloads simultaneously over the same data. It’s a good first cut.
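To make the shared-schema idea concrete, here is a minimal sketch in Python with SQLite, purely for illustration (this is not the actual CH-benCHmark schema or query set): a transactional write and an analytic roll-up hit the same table, with no extract step in between.

```python
import sqlite3

# A drastically simplified, hypothetical slice of a shared HTAP schema:
# the same order_line table serves both the OLTP new-order path and an
# OLAP roll-up. This illustrates the idea, not the real benchmark.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_line (
        ol_o_id INTEGER, ol_w_id INTEGER,
        ol_i_id INTEGER, ol_quantity INTEGER, ol_amount REAL
    )
""")

# Transactional side: a new order writes its order lines atomically.
with conn:
    conn.executemany(
        "INSERT INTO order_line VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 101, 5, 250.0), (1, 1, 102, 2, 80.0)],
    )

# Analytic side: a TPC-H-style roll-up over the very same table,
# with no ETL pipeline between the write and the read.
row = conn.execute(
    "SELECT SUM(ol_quantity), SUM(ol_amount) FROM order_line"
).fetchone()
print(row)  # (7, 330.0)
```

In a real HTAP engine the two sides run concurrently at scale under transactional isolation; the point here is only that one schema serves both workloads.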

A Framework for HTAP Benchmarks

To run CH-benCHmark, we are using a database benchmarking framework, OLTP-Bench3, a “modular, extensible and configurable OLTP benchmarking tool.” Don’t let the OLTP part fool you—it’s configurable enough that we can do a wide range of things with this framework, both transactional and analytical. The goal for the framework is to see how a range of different benchmarks run over different data stores—everything from traditional OLTP and OLAP engines to big data and NoSQL, and now to HTAP and Splice Machine! That includes the ability to combine complex and varying workloads. In particular, it can emulate some very real-life scenarios, including different mixes of users (multiple business units or true multi-tenancy) and changes in workload mix over time (time of day, day of the week). Built in Java, it connects through JDBC and is designed to easily plug in different DDL and query dialects to support differences between data stores. It can also accommodate databases with full ACID support as well as data stores with limited transactional support.
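As an illustration of the kind of configurability involved, the sketch below shows the core mechanism such a driver needs: choosing each worker’s next transaction from a weighted mix that can include both transactional and analytic classes. This is hypothetical Python, not OLTP-Bench’s actual API or configuration format, and the class names and weights are illustrative only.

```python
import random

# Hypothetical sketch of weighted workload mixing, in the spirit of
# OLTP-Bench's configurable transaction weights. Names and weights
# are illustrative, not the tool's real configuration.
def pick_transaction(mix, rng):
    """Choose the next transaction type according to configured weights."""
    types = list(mix)
    return rng.choices(types, weights=[mix[t] for t in types], k=1)[0]

# A TPC-C-like mix with an analytic query class added to the rotation.
mix = {"NewOrder": 45, "Payment": 43, "OrderStatus": 4,
       "Delivery": 4, "StockLevel": 2, "AnalyticRollup": 2}

rng = random.Random(42)
sample = [pick_transaction(mix, rng) for _ in range(10_000)]
share = sample.count("NewOrder") / len(sample)
print(round(share, 2))  # close to the configured 0.45
```

Changing the weights over a run is how a framework like this can model time-of-day or multi-tenant shifts in workload mix.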

While this framework may not provide the most highly tuned benchmark results—the kind that database vendors traditionally build for hitting the highest price/performance metric possible—this approach provides something more important as a framework for understanding how HTAP can support our applications and complex combined workloads. It offers a way for end users to model their mix of processes and applications to achieve the end goal: planning for getting AI into the business operations.

We’ve added support for Splice Machine to OLTP-Bench and our fork of the project can be found here.

In OLTP-Bench the client-side driver handles workers and generates workloads based on the configuration provided by the user. Statistics collected during a run are consolidated at the end for the results.

The Initial Results

CH-benCHmark scales the same way TPC-C does by increasing the number of warehouses in the data set. For our initial results we’re running with 1000 warehouses—an HTAP-1000. We have a thousand transactional worker threads executing in an “open” system model (meaning that as soon as they complete a task they immediately take on the next—no waiting). To see how the analytics load impacts the transactional throughput, and how well the system handles both workloads together, we increase the number of analytic workers, also in an open system model, and measure the results.
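The open system model described above can be sketched in a few lines; this is illustrative Python threading, where the stand-in task is not a real TPC-C transaction.

```python
import threading
import time

# Minimal sketch of an "open" system model: each worker begins its next
# task the instant the previous one completes, with no think time.
def worker(stop_event, results):
    done = 0
    while not stop_event.is_set():
        sum(range(1000))  # stand-in for one transaction
        done += 1
    results.append(done)

stop = threading.Event()
results = []
threads = [threading.Thread(target=worker, args=(stop, results))
           for _ in range(4)]
for t in threads:
    t.start()
time.sleep(0.5)  # measurement window
stop.set()
for t in threads:
    t.join()

# Completed tasks per second across all workers over the window.
throughput = sum(results) / 0.5
```

In the benchmark itself the same principle applies to both worker pools: hold the transactional workers constant, scale the analytic workers, and watch how combined throughput moves.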

Using OLTP-Bench to run CH-benCHmark under these conditions on a 4-node cluster, Splice Machine’s throughput is 11,588 tpmC with no analytics workers. With one analytics worker, the transactional throughput actually improves slightly. And when we increase to four analytics workers, the throughput drops to 10,772 tpmC—only a 7% reduction in tpmC throughput.

As a result, we have a solution that’s clearly ready for a combined HTAP workload, with the benefit of a single system that’s ready for operational AI!

We plan to publish more detailed results soon. But for now, the goal is to provide you, our community, with some representative numbers and an opportunity to join in the process.

Next Steps

So what’s next? There are a number of things we plan to do with OLTP-Bench, not the least of which is to continue expanding the results. We will be taking a look at other “big data” data stores to see how Splice Machine’s native, fully ACID-compliant store stacks up, and to help build a picture of the options in the HTAP/big data world for managing data across different workloads. Splice Machine, like many others in the big data ecosystem, can work with a variety of storage layers via external vs. managed tables, creating many different options for managing new data or combining in existing sources of large-scale data. In particular, we will next take a look at Hive to see how it stacks up to Splice Machine or can work in a combined environment.

Another area we plan to explore will be the transactional capabilities in candidate HTAP solutions. Do other HTAP databases provide complete support for ACID transactions? How will data stores that only support transactions on single tables support HTAP? How should we be thinking about the transactional capabilities in this HTAP world? Although this might not seem like the stuff of benchmarking, we believe this is important in understanding the options. Here too we will be taking a look at Hive LLAP and how it plays in an HTAP world.

We will also be taking a broader look at what an application on HTAP should look like. The existing proposals for HTAP benchmarking are derivatives of TPC-C and TPC-H, but does that represent how we should build applications on an HTAP architecture? It would be like benchmarking an electric bicycle (which comes to mind because here in the Bay Area everything with wheels is getting lithium-ion’ized) against a conventional bike, but insisting that for a straight comparison we test with the electric power turned off. In HTAP, and in the analytics we plan to inject into our transactions for AI, the transactional processing should be taking advantage of the analytics, and the analytics workload will be much more varied than what the TPC-H queries in CH-benCHmark measure.

And to leave you with a final thought, how will the worlds of data science and operational systems evolve when we can bring everything together under HTAP? Will we be able to think of our HTAP store as an environment for data discovery and machine learning? Can we combine more of our systems into one place, gaining greater leverage from our infrastructure costs and system engineers? Can we further cut the latency between information discovery, model training, and application of the predictive analytics? Ultimately these are the kinds of efficiencies and acceleration that our boards and executive teams are looking for from the application of AI.

Lastly, we want to empower you in our community to contribute and participate in this dialog as well. To that end we’ve made our fork of OLTP-Bench available here, our data sets available on S3, and a quick start here. Feel free to take a look, try out your own tests, experiment with workloads that model your applications, and, most of all, share your results.

Thanks and we look forward to continuing this discussion soon!

Contributed by Rod Butters, CTO and CMO at CXO Now


1 The mixed workload CH-benCHmark, Richard Cole, et al., DBTest ’11, Athens, Greece.
2 HTAPBench: Hybrid Transactional and Analytical Processing Benchmark, Fábio Coelho, João Paulo, Ricardo Vilaça, José Pereira, and Rui Oliveira, ICPE ’17, L’Aquila, Italy.
3 OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases, D. E. Difallah, A. Pavlo, C. Curino, and P. Cudré-Mauroux.