By comparing the performance across the seven algorithms, the top four were selected: Gradient Boosting, Random Forests, Extra Trees, and Bagging Classifier. For the next step, we tuned the hyperparameters of each model using cross-validated random search. Parameter tuning was performed with three-fold cross validation because of the scarcity of known clusters (n = 957); a traditional train-test-validation split would risk making the performance too dependent on a specific subset of the training data, waste data, and inhibit predictive ability [38]. A random search with 1,000 iterations was used because it is empirically and theoretically more effective than grid search: for the same number of iterations, it tests a broader value spectrum for each parameter and is less likely to waste effort on irrelevant hyperparameters [19]. We then assessed the performance of each algorithm (shown in Appendix 2, Figure 5) after training each model with its respective set of optimal parameters.

After comparing the results from the three iterations, the models trained on the oversampled datasets were discarded. Their cross-validation performance (Appendix 2, Figure 1 and Figure 4) shows mean cross-validation scores of ≥90%, indicating that these models are likely overfitting the training data and may not perform well on unseen data. One could argue that the overfitting is caused by a disproportionate share of synthetic samples relative to the original dataset (Appendix 2, Table 7): with SMOTE (minorities), the stolen-bitcoin class grew from 4 to 306 observations, so almost 24% of the observations in the resulting dataset are synthetic, while with SMOTE (auto) almost 74% of the observations are synthetic, as shown in Appendix 2, Table 7.

After discarding the models trained on the oversampled datasets, the candidate winning models are the top four trained on the original dataset, with or without tuned hyperparameters. As shown in Appendix 2, Table 6, the algorithm with the best mean cross-validation accuracy score is Gradient Boosting (GBC) with the default parameters described in Figure 4. For comparison, the classification reports (Classification Report 4 and Appendix 2, Table 8) and the plotted receiver operating characteristic (ROC) curves (Figure 5 and Appendix 2, Figure 8) contrast the Gradient Boosting algorithm with default parameters, as implemented by scikit-learn, against its version with hyperparameters tuned by random search. Because the classes of interest (darknet-market, scam, ransomware, stolen-bitcoins) relate to illicit activities, it is important to maximize both precision and recall, since minimizing false positives matters as much as maximizing true positives.
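The tuning procedure described above can be sketched as follows with scikit-learn's RandomizedSearchCV, assuming a feature matrix X and label vector y for the 957 labelled clusters; the parameter ranges and the random_state are illustrative assumptions, not the values used in the study.

# Minimal sketch of the cross-validated random search described above.
# X, y are assumed to hold the features and entity labels of the 957
# labelled clusters; the parameter ranges below are illustrative only.
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

param_distributions = {
    "n_estimators": randint(50, 500),      # number of boosting stages
    "learning_rate": uniform(0.01, 0.29),  # shrinkage sampled in (0.01, 0.30)
    "max_depth": randint(2, 8),            # depth of the individual trees
    "min_samples_leaf": randint(1, 20),
    "subsample": uniform(0.6, 0.4),        # stochastic gradient boosting
}

search = RandomizedSearchCV(
    estimator=GradientBoostingClassifier(random_state=42),
    param_distributions=param_distributions,
    n_iter=1000,         # 1,000 random parameter draws, as in the text
    cv=3,                # three-fold cross validation (n = 957 labels)
    scoring="accuracy",  # mean cross-validation accuracy
    n_jobs=-1,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)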
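The two oversampled variants discussed above could be reproduced with imbalanced-learn's SMOTE; which preset corresponds to "SMOTE (minorities)" is an assumption here (sampling_strategy="minority"), and k_neighbors is reduced because the smallest class has only 4 observations.

# Sketch of the two SMOTE variants compared in the text; the mapping of
# "SMOTE (minorities)" to sampling_strategy="minority" is an assumption.
from collections import Counter
from imblearn.over_sampling import SMOTE

for label, strategy in [("minorities", "minority"), ("auto", "auto")]:
    # k_neighbors must be smaller than the smallest class (4 observations),
    # otherwise SMOTE cannot find enough within-class neighbours.
    smote = SMOTE(sampling_strategy=strategy, k_neighbors=3, random_state=42)
    X_res, y_res = smote.fit_resample(X, y)
    synthetic = len(y_res) - len(y)
    print(f"SMOTE ({label}): {synthetic / len(y_res):.1%} synthetic samples")
    print(Counter(y_res))  # per-class counts after oversampling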
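Since the final point stresses per-class precision and recall, the comparison between the default and tuned Gradient Boosting models could be carried out along the following lines; using out-of-fold predictions from cross_val_predict (rather than a held-out test set) is an assumption for illustration, and search.best_params_ refers to the tuning sketch above.

# Sketch of comparing default vs. tuned Gradient Boosting on per-class
# precision and recall; variable names are illustrative assumptions.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_predict

default_gbc = GradientBoostingClassifier(random_state=42)
tuned_gbc = GradientBoostingClassifier(random_state=42, **search.best_params_)

for name, model in [("default", default_gbc), ("tuned", tuned_gbc)]:
    # Out-of-fold predictions over the same three-fold split, so every
    # labelled cluster is scored by a model that has not seen it.
    y_pred = cross_val_predict(model, X, y, cv=3)
    print(f"--- Gradient Boosting ({name}) ---")
    print(classification_report(y, y_pred))  # per-class precision and recall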