Build Customized Performance Benchmarks

Problem Statement

Hardware and software must work together seamlessly in production environments. This is hard to achieve because of the sheer number of hardware and software variants in today’s IT environments.

  1. Hardware choice
    One problem when planning a larger system, especially in a big data environment, is sizing the hardware. Several choices have to be made, e.g., how many nodes are needed to achieve the required performance or how powerful a single node has to be.
  2. Software choice
    Whether big data software like Hadoop or Hive outperforms “classic” solutions depends on numerous factors. In general, big data tools need a larger number of nodes before they perform better than other solutions.
  3. Hardware/software interaction
    Hardware and software influence each other. To predict precisely how a specific hardware/software combination will perform in a concrete business case, the customer’s use case has to be simulated and tested realistically on a representative hardware and software environment.

bankmark’s Solution

Using its Parallel Data Generation Framework (PDGF), bankmark is able to build a realistic simulation of a customer’s use cases by replicating the customer’s concrete data along with its relationships. PDGF can generate data of any size, so it is also possible to test how a system behaves when confronted with large or small data volumes.
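
To illustrate the idea of generating related data at an arbitrary size, here is a minimal Python sketch. It is not PDGF and does not use PDGF's interfaces; the table names, fields, row counts per scale factor, and the function generate_tables are all hypothetical. It only shows two properties mentioned above: relationships between tables are preserved (every order points to an existing customer), and the data volume is controlled by a single scale factor. A fixed seed makes the output reproducible.

```python
import random


def generate_tables(scale_factor, seed=42):
    """Generate two related tables whose size grows with scale_factor.

    Hypothetical schema: 10 customers and 50 orders per unit of scale.
    Every order references a customer_id that exists in the customer table.
    """
    rng = random.Random(seed)  # fixed seed -> same data on every run
    n_customers = 10 * scale_factor
    n_orders = 50 * scale_factor

    customers = [
        {"customer_id": i, "segment": rng.choice(["retail", "business"])}
        for i in range(n_customers)
    ]
    orders = [
        {
            "order_id": j,
            # Foreign key: always drawn from the range of existing customers.
            "customer_id": rng.randrange(n_customers),
            "amount": round(rng.uniform(5.0, 500.0), 2),
        }
        for j in range(n_orders)
    ]
    return customers, orders


if __name__ == "__main__":
    small = generate_tables(scale_factor=1)
    large = generate_tables(scale_factor=100)
    print(len(small[0]), len(small[1]))
    print(len(large[0]), len(large[1]))
```

Calling generate_tables with a larger scale factor yields the same schema and the same kind of relationships, only with more rows, which is what allows one test definition to be run against both small and large data sets.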

Based on the generated data, bankmark builds a specific performance test that answers important questions about the expected performance of the defined software components on the defined hardware. This ultimately helps customers decide which hardware/software combination solves the specific use case optimally, and keeps them from investing in a new solution whose performance is unknown until it goes into production.
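
As a rough sketch of what a performance test over generated data can look like, the Python snippet below times one workload at several data sizes. It is an assumption-laden illustration, not bankmark's test procedure: the functions time_workload and run_benchmark, the sample workload, and the chosen sizes are hypothetical, and a real test would run the customer's actual queries against the candidate hardware/software combination.

```python
import time


def time_workload(workload, data, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload(data)
        best = min(best, time.perf_counter() - start)
    return best


def run_benchmark(workload, make_data, sizes, repeats=3):
    """Time the same workload on data sets of different sizes."""
    return {
        size: time_workload(workload, make_data(size), repeats)
        for size in sizes
    }


if __name__ == "__main__":
    # Hypothetical stand-ins: the data is a list of integers and the
    # workload is a sort followed by a sum.
    def make_data(size):
        return list(range(size, 0, -1))

    def workload(data):
        return sum(sorted(data))

    results = run_benchmark(workload, make_data, sizes=[1_000, 100_000])
    for size, seconds in results.items():
        print(f"{size:>8} rows: {seconds:.6f} s")
```

Running the same workload at several sizes shows how run time grows with data volume, which is the kind of measurement that feeds a sizing decision.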