The Syncfusion Data Integration Platform is an easy-to-use, powerful, and reliable system for processing and distributing data, automating the flow of data between systems. It provides tools to prepare (ETL or ELT) and blend data from a variety of data sources into analytics-ready data, which can then be fed into target applications for dashboards, data warehousing, and business intelligence.
The Syncfusion Data Integration Platform enables the following use cases, which cannot be achieved with our Dashboard Platform alone:
- Complex ETL operations, such as joining two tables from two different databases.
- Fetching web data (through REST calls) from multiple sources and moving it into a single target table, which can then be connected to a dashboard for visualization.
- Support for the following data sources:
  - JDBC connections (SQL Server, MySQL, Oracle, etc.)
  - REST API
  - Social feeds (Twitter, Google Analytics, etc.)
  - Amazon
  - NoSQL (Cassandra, MongoDB, HBase, Elasticsearch, etc.)
  - File formats (JSON, XML, CSV, etc.)
  - Big data (HDFS, HBase, Flume)
- Creating your own custom processors in C# or Java (see the sketch after this list).
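As a rough illustration of that last point, here is a minimal Java sketch of a custom processor. It assumes a NiFi-style processor API; the class name, relationship, and upper-casing logic are all hypothetical, and the platform's actual SDK may differ.

```java
// Hypothetical example: a custom processor that upper-cases incoming text.
// Assumes a NiFi-style processor API; names here are illustrative only.
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class UppercaseTextProcessor extends AbstractProcessor {

    // A single outgoing route for successfully transformed flow files.
    static final Relationship SUCCESS = new Relationship.Builder()
            .name("success")
            .description("Upper-cased content")
            .build();

    @Override
    public Set<Relationship> getRelationships() {
        return Collections.singleton(SUCCESS);
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session)
            throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return; // nothing queued for this processor yet
        }
        // Rewrite the flow file's content in upper case.
        flowFile = session.write(flowFile, (in, out) -> {
            String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            out.write(text.toUpperCase().getBytes(StandardCharsets.UTF_8));
        });
        session.transfer(flowFile, SUCCESS);
    }
}
```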
The Syncfusion Dashboard Designer can establish a connection with the Syncfusion Data Integration Platform (DIP) Server and access its data flows. This makes it easy for the Dashboard Designer to consume blended data from different data connections through a data flow's target server or file on the DIP Server.
Creating a data flow with the Data Integration Platform
To try this yourself, first download and install the Syncfusion Data Integration Platform as described here. Once the installation and suggested configurations are done, you will be directed to the home page of the data integration web application in your browser. Log in with the user account you created (or the default account) and start building a data flow with the required processors.
Here is a simple illustration of a data flow where data is
fetched from a CSV file and written to a JSON file.
Figure 1: Data Integration Platform user interface for designing your workflow
In the Read input csv file processor's configuration settings, specify the path of the input CSV file. Likewise, in the Store output file processor's configuration settings, specify the JSON output file path where the data is to be saved. To make the data in the target file path consumable from the Dashboard Designer, add the PublishDataSource processor as the endpoint.
Figure 2: Processor configuration settings
Execute the data flow to fetch the data from the source CSV file and write it to the target JSON file. This process can also be scheduled to capture the latest data updates. The data will now be saved in JSON format at the specified path, ready to be consumed from the dashboard.
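To make the result concrete, here is a hypothetical input file and the JSON output such a flow might produce. Both the file names and the records are made up, and the exact output shape depends on how the Store output file processor is configured.

```
# sales.csv (hypothetical input)
id,product,amount
1,Widget,19.99
2,Gadget,24.50

# sales.json (possible output)
[
  { "id": "1", "product": "Widget", "amount": "19.99" },
  { "id": "2", "product": "Gadget", "amount": "24.50" }
]
```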
Adding the data flow as a dashboard data source
Open the Syncfusion Dashboard Designer application. Expand the Server Explorer panel from the sidebar on the left. Expand Add Server and select DIP Server to add a new DIP Server connection. In the login window that appears, enter the hosted URL of the data integration application where the data flow was created, along with your user credentials.
Figure 3: DIP login in Dashboard Designer
On successful login, the data flows accessible to that account are listed under their respective servers in the Server Explorer window. Here, the CsvToJson data flow we created is listed.
Figure 4: Viewing DIP data flow in Dashboard Designer
Right-click the CsvToJson data flow and select the Create Data Source option in the context menu. The result of the data flow, the JSON file data, is then added as a new data source in the Dashboard Designer.
Figure 5: Adding the target data source in Dashboard Designer
Figure 6: JSON data source in Dashboard Designer table canvas
With this data source added, you can design your dashboard. See our documentation for help getting started.
Scheduling in the Data Integration Platform
The Syncfusion Data Integration Platform also provides scheduling options to keep your target data source up to date. You only need to do a couple of things:
- Go to the configuration settings of a processor in the data flow and schedule a time that suits you.
- Keep the Data Integration Platform instance running on your machine.
Let’s say we want to schedule a daily data update at 4 A.M. for the data flow we created in this blog. Here’s what we have to do:
- Right-click the Read input csv file processor and select the Configure option.
Note: Typically, the initial processor in the data flow, the one that retrieves data from the source, is the preferred place to configure scheduling.
- Open the Scheduling tab and select CRON driven from the Scheduling Strategy drop-down list.
- In the Run Schedule text box, enter the CRON expression 0 0 4 1/1 * ? * to schedule a daily data update at 4 A.M. (a field-by-field breakdown follows below).
The schedule follows the time settings of the machine on which the DIP service is running.
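For reference, this is a Quartz-style CRON expression (seconds come first, and ? marks an unspecified field). Read field by field, 0 0 4 1/1 * ? * means:

```
0 0 4 1/1 * ? *
| | |  |  | | +-- year:         every year
| | |  |  | +---- day of week:  unspecified (?)
| | |  |  +------ month:        every month
| | | +---------- day of month: every day, starting from the 1st (1/1)
| | +------------ hour:         4 (4 A.M.)
| +-------------- minute:       0
+---------------- second:       0
```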
Figure 7: Scheduling refresh every day at 4 A.M.
To learn more about scheduling, see our documentation.