MemSQL paves a smoother path to Spark for real-time analytics
Spark Streamliner integrates MemSQL's in-memory database with Apache Spark's in-memory data-processing framework to handle streaming data from real-time sources such as sensors, Internet-of-Things (IoT) devices, transactions, applications and logs.
Offering "one click" deployment of integrated Spark along with a Web-based interface, it allows users to create multiple data pipelines in minutes, perform custom transformations in real time and develop new analytics applications, MemSQL said.
When connected to a real-time data source such as Apache Kafka, Spark Streamliner supports thousands of concurrent users running real-time analytical queries. Data is streamed directly into MemSQL, so there is no need to extract, transform and load (ETL) it in batches. Instead, users can process data as it streams in, removing the delay that batch loading adds before data can be analyzed.
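To make the contrast with batch ETL concrete, here is a minimal sketch of the extract, transform and load-per-record pattern in plain Python. It does not use Streamliner's actual API: the `extract`, `transform` and `run_pipeline` functions are hypothetical names, a generator stands in for a Kafka topic, and an in-memory SQLite database stands in for MemSQL. The point is only that each record becomes queryable as soon as it arrives, rather than after a batch job finishes.

```python
import json
import sqlite3
from typing import Iterator, List, Tuple


def extract() -> Iterator[str]:
    """Hypothetical extractor: yields raw JSON messages one at a time,
    standing in for a real-time source such as a Kafka topic."""
    readings = [("s1", 20.5), ("s2", 21.0), ("s1", 22.5)]
    for sensor, temp_c in readings:
        yield json.dumps({"sensor": sensor, "temp_c": temp_c})


def transform(raw: str) -> Tuple[str, float]:
    """Hypothetical transformer: parses one message and converts
    the temperature from Celsius to Fahrenheit."""
    record = json.loads(raw)
    return record["sensor"], record["temp_c"] * 9 / 5 + 32


def run_pipeline(conn: sqlite3.Connection) -> List[int]:
    """Load each record as it arrives and query the table after every
    insert, recording how many rows are visible at that moment."""
    conn.execute("CREATE TABLE readings (sensor TEXT, temp_f REAL)")
    visible = []
    for raw in extract():
        conn.execute("INSERT INTO readings VALUES (?, ?)", transform(raw))
        (count,) = conn.execute("SELECT COUNT(*) FROM readings").fetchone()
        visible.append(count)
    return visible


# SQLite in memory is only a stand-in for the target database (assumption).
conn = sqlite3.connect(":memory:")
visible_counts = run_pipeline(conn)
print(visible_counts)  # [1, 2, 3]
```

The row count grows with every message, which is the property the streaming approach relies on: analytical queries see new data without waiting for a scheduled batch load.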
Featuring a simple SQL interface, Spark Streamliner can easily be connected to popular analytical tools, MemSQL said. Users can also share a single resource pool for multiple pipelines, effectively reducing their total hardware footprint.
A video demonstrates MemSQL Spark Streamliner in action. The open source tool and a library of example extractors and transformers are now available on GitHub.