I spend a lot of my time building and evaluating implementations for research papers. Over time, I’ve developed a strategy for running evaluations. I just uploaded a project to GitHub with a skeleton for performing evaluations the way I do. The description is available here, and the GitHub project is here.