As CNN models become the standard computer visionmodel to be deployed in real-time vision applications, assessing and reporting whether the results of their performance translates fromresearch datasets to real timescenarios is crucial. Results of different CNN architectures areusually reported on standard large scale computer vision datasetsof amillion andmore static images (He et al., 2016, 2017; Szegedyet al., 2016; Howard et al., 2017). Domain specific datasets likemedical imagery or plant diseases, where transfer learning isoften applied to CNNmodels, comprise smaller datasets as expertlabeled images are more challenging to acquire