Want to see how fast we can summarize a billion rows of data?
Don’t blink. It takes just six seconds.
Remember the good ol’ days, before Apache Spark 2.0, when
life was simpler and processing 173 GB of data took a whole 30 seconds? You
actually had time to sip your coffee before the bar charts rendered.
Those days are gone. In this post-Spark 2.0 world, using the
combined force of the Syncfusion Big Data Platform and the Dashboard Platform,
you can process and visualize that same amount of data within six seconds. Advances
in Apache Spark have increased Syncfusion’s data processing speed by a factor
of five to ten, generally speaking.
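For a rough sense of scale, the figures quoted in this post can be turned into a speedup factor and a throughput estimate. The sketch below is only back-of-the-envelope arithmetic: it assumes the 173 GB and the roughly 1.1 billion rows refer to the same dataset, treats the 30-second and six-second timings as exact, and uses helper names (`speedup`, `throughput`) that are purely illustrative.

```python
# Back-of-the-envelope arithmetic using the figures quoted in this post.
# Assumption: 173 GB and ~1.1 billion rows describe the same dataset,
# and the 30 s / 6 s timings are taken at face value.

def speedup(old_seconds, new_seconds):
    """Return how many times faster the new run is than the old one."""
    return old_seconds / new_seconds

def throughput(amount, seconds):
    """Return the amount processed per second."""
    return amount / seconds

DATA_GB = 173        # dataset size quoted in the post
ROWS = 1.1e9         # approximate number of taxi trips
OLD_SECONDS = 30     # pre-Spark 2.0 timing
NEW_SECONDS = 6      # post-Spark 2.0 timing

factor = speedup(OLD_SECONDS, NEW_SECONDS)
gb_per_s = throughput(DATA_GB, NEW_SECONDS)
rows_per_s = throughput(ROWS, NEW_SECONDS)

print(f"Speedup: {factor:.1f}x")
print(f"Throughput: {gb_per_s:.1f} GB/s, about {rows_per_s / 1e6:.0f} million rows/s")
```

On these numbers the improvement sits at the lower end of the five-to-ten range quoted above.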
You can find all the details on how this is done by reading
“Using Business Dashboards to Summarize a Billion Records in Seconds,” a
whitepaper written by two of Syncfusion’s big data experts.
The data used in the speed test described in the whitepaper
was sourced from information the New York City Taxi & Limousine Commission
collected from approximately 1.1 billion taxi trips that occurred from 2009 to
2015. That’s a lot of fares.
After using the Big Data Platform to upload the data, tune
its performance, cache it, and partition it, we called upon the Dashboard
Platform to create a visualization of it.
Running on nothing more than commodity hardware that anyone
can access, these Syncfusion platforms visualized massive amounts of data in
just a few seconds. Getting started with them is easy. Simply
refer to the following Syncfusion resources, and you’ll be on your way.
- Whitepaper: “Using Business Dashboards to Summarize a Billion Records in Seconds.”
- Videos: “Summarize a Billion Records in Seconds”; “Syncfusion Tutorial: Big Data Cluster Manager Introduction.”
- Product Webpages: Big Data Platform Homepage; Dashboard Platform Homepage.
- Documentation: Big Data Platform; Dashboard Platform.
- E-books: Spark Succinctly; Hadoop Succinctly; Statistics Fundamentals Succinctly.