ETL Testing

An automated testing framework has been established for the v2 ETL code base and data artefacts. It follows the broad outline below, with every step run by automated testing scripts.

 

  1. A Claritum ‘before’ instance (containing the product in a state of ‘A’ tests) is spun up, along with a shell Data Warehouse.

  2. The ETL code is called to process this ‘before’ instance into the shell Data Warehouse. At this point the loaded data can be tested via reports or any manual or automated query method. This is the ‘initial’ phase.

  3. A Claritum ‘after’ instance (containing the product in a state of ‘B’ tests) is spun up.

  4. The ETL code is called again to process this ‘after’ instance into the same Data Warehouse that was loaded in (2). At this point the data can again be tested via reports or manual/automated query methods. This is the ‘delta’ phase.
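
The four steps above can be sketched as follows. This is a minimal illustration only, not the framework's actual code: in-memory SQLite databases stand in for the Claritum instances and the shell Data Warehouse, and the table names (orders, dw_orders), the sample rows and the functions make_source, make_shell_warehouse and run_etl are all hypothetical.

```python
import sqlite3


def make_source(rows):
    """Stand-in for a Claritum instance: an in-memory database with one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany("INSERT INTO orders (id, status) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def make_shell_warehouse():
    """Stand-in for the shell Data Warehouse: schema only, no rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.commit()
    return conn


def run_etl(source, warehouse):
    """Hypothetical ETL step: copy every source row into the warehouse,
    overwriting any warehouse row that has the same id."""
    rows = source.execute("SELECT id, status FROM orders").fetchall()
    warehouse.executemany(
        "INSERT OR REPLACE INTO dw_orders (id, status) VALUES (?, ?)", rows
    )
    warehouse.commit()


def warehouse_rows(warehouse):
    """Return the warehouse contents in a stable order for comparison."""
    return warehouse.execute(
        "SELECT id, status FROM dw_orders ORDER BY id"
    ).fetchall()


def run_two_phase_test():
    # Steps 1 and 2: 'before' instance loaded into the shell warehouse.
    before = make_source([(1, "open"), (2, "open")])
    warehouse = make_shell_warehouse()
    run_etl(before, warehouse)
    initial = warehouse_rows(warehouse)
    assert initial == [(1, "open"), (2, "open")]  # 'initial' phase check

    # Steps 3 and 4: 'after' instance loaded into the SAME warehouse.
    after = make_source([(1, "open"), (2, "closed"), (3, "open")])
    run_etl(after, warehouse)
    delta = warehouse_rows(warehouse)
    # 'delta' phase check: row 2 updated, row 3 added, row 1 unchanged.
    assert delta == [(1, "open"), (2, "closed"), (3, "open")]
    return initial, delta
```

Calling run_two_phase_test() runs both phases in sequence. The point of the sketch is that the second ETL run targets the warehouse already populated by the first, so the ‘delta’ checks exercise update and insert behaviour rather than a fresh load.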

This testing approach scales to any number of Claritum instances, and the cycle can be repeated as many times as needed.

All of the ETL testing code and the Claritum and Data Warehouse scripts are version-controlled in Git.

At present the scripts are run manually from the command line. However, nothing prevents them from being run as part of a continuous integration test cycle in the future, along with automated test report outputs or any other testing artefacts we produce.
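
As an illustration of how a command-line run could feed a continuous integration cycle, the sketch below runs a list of phase checks in order and produces an exit code plus a short text report. It is a sketch under assumptions, not the existing scripts: the functions check_initial_phase, check_delta_phase, run_phases and main are hypothetical placeholders, and the real checks would query the Data Warehouse.

```python
def check_initial_phase():
    """Placeholder for the checks run after the 'initial' ETL load."""
    return True


def check_delta_phase():
    """Placeholder for the checks run after the 'delta' ETL load."""
    return True


# Hypothetical phase list; order matters because 'delta' builds on 'initial'.
PHASES = [("initial", check_initial_phase), ("delta", check_delta_phase)]


def run_phases(phases):
    """Run each phase check in order and return (exit_code, report_lines).

    exit_code is 0 when every check passes and 1 otherwise, which is the
    convention most CI servers use to mark a job as passed or failed.
    """
    report = []
    exit_code = 0
    for name, check in phases:
        try:
            passed = bool(check())
        except Exception as exc:  # a crashing check counts as a failure
            passed = False
            report.append(f"{name}: ERROR {exc}")
        else:
            report.append(f"{name}: {'PASS' if passed else 'FAIL'}")
        if not passed:
            exit_code = 1
    return exit_code, report


def main():
    """Print the report and return the exit code for the calling process."""
    exit_code, report = run_phases(PHASES)
    print("\n".join(report))
    return exit_code
```

A script built this way would typically end by passing the result of main() to sys.exit, so that the same command works unchanged whether it is typed at the command line or invoked by a CI job.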