Data Stream Simulation

Problem Statement

In today’s IoT environment, a growing number of devices are interconnected and need to communicate with each other. Developing systems that collect and process the load generated by all those devices is a considerable and growing challenge.

  • The hardware of many control units and sensors is not available, or not yet finalized, during a project. Nevertheless, software development should start as early as possible, and the lack of proper hardware significantly slows it down.
  • When developing software that processes incoming data streams, it is often unclear how the hardware running the program must be sized to handle the load.
  • Another difficult decision is choosing the software platform on which development is based. A huge number of software systems are available on the market, from classic database management systems to big data solutions. Choosing the right platform leads to significantly shorter development times and earlier results.
  • Systems that process message streams (e.g., Apache Kafka) need a realistic input data stream to test their functionality. Data processing such as sentiment analysis on the messages needs high-quality unstructured data to produce a meaningful result.

bankmark’s Solution

The Parallel Data Generation Framework (PDGF) enables customers to simulate any kind of data stream. PDGF’s flexibility allows it to connect to other software solutions and write the generated data directly into those systems.

This way, PDGF can simulate, for example, the sensors built into a smart meter, even if the hardware producing these values is not available. The generated data closely resemble the values the real hardware would produce. As a result, the software development team can work with these data instead of waiting for the final hardware.
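
To make the idea of a simulated sensor concrete, the following is a minimal Python sketch of a repeatable smart-meter data stream. It is not PDGF code and does not use PDGF's API or configuration format; the function name simulate_meter, the field names, and the load profile (a base load with an evening peak plus Gaussian noise) are all assumptions chosen for illustration.

```python
import json
import random
from datetime import datetime, timedelta, timezone


def simulate_meter(meter_id, start, count, interval_s=60, seed=42):
    """Yield `count` simulated readings for one hypothetical smart meter."""
    # Seeding per meter makes the stream repeatable across runs.
    rng = random.Random(f"{seed}:{meter_id}")
    energy_kwh = 0.0
    for i in range(count):
        ts = start + timedelta(seconds=i * interval_s)
        # Assumed load profile: base load plus an evening peak, with noise.
        base_kw = 0.3 + (1.2 if 17 <= ts.hour < 22 else 0.0)
        power_kw = max(0.0, rng.gauss(base_kw, 0.1))
        # Integrate power over the interval to get cumulative energy.
        energy_kwh += power_kw * interval_s / 3600.0
        yield {
            "meter_id": meter_id,
            "timestamp": ts.isoformat(),
            "power_kw": round(power_kw, 3),
            "energy_kwh": round(energy_kwh, 4),
        }


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for reading in simulate_meter("meter-001", start, count=5):
        # One JSON document per line, ready to feed into a consumer under test.
        print(json.dumps(reading))
```

Because the random generator is seeded from the meter identifier, the same stream can be regenerated at any time, which keeps test runs comparable while the real hardware is still unavailable.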

Thanks to the very realistic data PDGF produces, hardware and software solutions that process such data can be tested as well. This makes it possible to better predict how a specific solution will perform when confronted with real data later on. PDGF’s high speed also makes it possible to test such solutions with huge amounts of data and to observe system behavior under heavy load.

The unstructured text that PDGF can generate also enables testing message processing software in a very realistic environment, without having to feed the system real user data or very small, hand-crafted data sets during development. The result is a much more robust application and far fewer problems when going into production.