System and load testing for a data science pipeline in a big data environment.