The startup, SnappyData, has developed an in-memory hybrid transactional and analytical database that brings together OLTP, OLAP and Apache Spark to ease the pain of customers who until now have had to custom-stitch streaming, transactional and analytical environments together.
SnappyData's open source platform provides a unified engine for real-time operational analytics, wrapping stream analytics and the twin data-processing workloads OLTP and OLAP into a single integrated cluster. Sudhir Menon, president of SnappyData and a co-founder along with Richard Lamb and Jags Ramnarayanan, says that the platform leverages approximate query processing (AQP), which uses machine learning techniques to anticipate the kinds of queries a user might ask and then creates data samples that can dramatically improve query performance.
"Not only can we run these queries very, very fast, we can also provide accurate error estimates," Menon says. "The samples are updated as the data changes. We are solving the performance problem so you can run more queries and get more insights, faster."
Menon says the solution fuses the big data computational muscle of Apache Spark with the in-memory design and "shared nothing" architecture of Pivotal GemFire, which eliminates single points of failure while delivering very high performance. Along with addressing performance, the solution reduces complexity and lowers the total cost of ownership by providing a concurrent, in-memory database built directly into Spark.
The SnappyData Platform is a single, unified, scale-out database cluster that ingests static data sets (e.g., from HDFS), acquires updatable reference data from enterprise databases and manages streams in memory, while permitting both continuous SQL analytics on the streams and interactive queries on the entire dataset (acquired from streams, HDFS or enterprise DBs). SnappyData achieves this through a deep integration of Apache Spark, as the computational framework, and GemFire, as the in-memory transactional store.
To keep response times interactive over large streaming or batch datasets, SnappyData's query engine combines state-of-the-art AQP techniques with a variety of data synopses.
Menon says that the basic idea behind AQP is that one can use statistical sampling techniques and probabilistic data structures to answer aggregate-class queries without needing to store or operate over the entire data set. The approach trades some query accuracy for quicker response times, allowing queries to run on large data sets while still returning meaningful and accurate error information.
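To make the sampling idea concrete, here is a minimal sketch in Python of how an average might be estimated from a uniform random sample, together with an error bound. It is not SnappyData's implementation: the function name `approximate_mean`, the synthetic data and the sample size are all assumptions chosen for illustration, and the error bound uses a simple normal approximation.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin). With z=1.96, estimate +/- margin is an
    approximate 95% confidence interval under the normal approximation.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sample without replacement
    estimate = statistics.fmean(sample)
    # Standard error of the sample mean (finite-population correction ignored).
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


# Synthetic stand-in for a large table column (assumed data, not real).
data_rng = random.Random(42)
population = [data_rng.uniform(0, 100) for _ in range(200_000)]

# Answer "SELECT AVG(x)" from a 1% sample instead of the full data set.
estimate, margin = approximate_mean(population, sample_size=2_000)
exact = statistics.fmean(population)
print(f"approx avg = {estimate:.2f} +/- {margin:.2f} (exact avg = {exact:.2f})")
```

The sketch reads only 2,000 of the 200,000 values, and the reported margin shrinks roughly with the square root of the sample size, which is the accuracy-for-speed trade-off described above.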
"SnappyData was created in response to the continued growth and demand for mixed workloads — and with it, the time-consuming and expensive task of stitching together multiple solutions," Leo Spiegel, senior vice president of corporate development and strategy at Pivotal, said in a statement today. "We are excited to support SnappyData's success."
Lamb, Ramnarayanan and Menon worked together at Pivotal for nearly a decade to build Pivotal GemFire.
"We built that product from zero customers to about 1,300 customers over a period of about nine years," Menon says.
Pivotal spun out the team as a standalone entity in January, and it now consists of more than 30 engineers based in Portland, Ore.
Menon says GE Digital is looking to the platform to help it address industrial IoT real-time data processing, which is key to GE's own Predix Industrial IoT platform. Hitachi is also working with SnappyData, drawn by the potential to offer real-time operational analytics while lowering total cost of ownership for its financial services customers in Asia. Menon says SnappyData is also currently conducting proofs of concept with a number of financial services customers interested in fraud detection/management and risk management.
"We aim to do for real-time analytics what we did for in-memory transactions with Pivotal GemFire," Lamb said today in a statement. "This investment helps us address what we believe is a real pain point for customers in a wide variety of markets."