Fig. 5 | Biology Direct

Fig. 5

From: Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data


Source prediction of mystery samples from new cities using the species models. Latitude (a) and longitude (b) predictions from the multivariate regression model on species abundance data are plotted against the true geographic coordinates (x-axis). Each data point represents a sample. The dashed line marks where predictions would be exactly correct. c Predictions from the classification model compared with the true sources. Each entry shows the number of samples predicted as the corresponding city (row) and originally from the corresponding reference (column). Cities on the same continent as the reference are boxed in blue. Squared errors for latitude and longitude from the regression (d, e) and classification (f, g) models are shown as boxplots for each city.
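The quantities summarized in the panels can be illustrated with a minimal sketch. It is not the authors' code: the function names, city labels, and coordinates below are hypothetical. It also assumes that a classification prediction is converted to coordinates by looking up the predicted city's latitude and longitude, which the caption does not state.

```python
from collections import Counter, defaultdict


def squared_errors(true_coords, pred_coords):
    """Per-sample squared error for latitude and longitude (squared degrees)."""
    return [
        ((t_lat - p_lat) ** 2, (t_lon - p_lon) ** 2)
        for (t_lat, t_lon), (p_lat, p_lon) in zip(true_coords, pred_coords)
    ]


def count_table(predicted_cities, origins):
    """Count samples for each (predicted city, true origin) pair, as in panel c."""
    return Counter(zip(predicted_cities, origins))


def group_by_city(values, cities):
    """Group per-sample values by city, e.g. as input to one boxplot per city."""
    grouped = defaultdict(list)
    for value, city in zip(values, cities):
        grouped[city].append(value)
    return dict(grouped)


# Hypothetical example data (not taken from the paper).
city_coords = {"city_A": (10.0, 20.0), "city_B": (-30.0, 150.0)}
origins = ["new_X", "new_X", "new_Y"]                      # true origin of each sample
true_coords = [(12.0, 18.0), (12.0, 18.0), (-33.0, 151.0)]  # true (lat, lon) per sample

# Regression: the model outputs coordinates directly.
regression_preds = [(11.0, 20.0), (14.0, 18.0), (-30.0, 149.0)]
regression_errors = squared_errors(true_coords, regression_preds)

# Classification: the model outputs a reference city; the lookup below is the
# assumption named in the lead-in.
predicted_cities = ["city_A", "city_A", "city_B"]
classification_preds = [city_coords[c] for c in predicted_cities]
classification_errors = squared_errors(true_coords, classification_preds)

table = count_table(predicted_cities, origins)
lat_errors_by_city = group_by_city([e[0] for e in regression_errors], origins)
```

Here `table` holds the counts that would fill the grid in panel c, and `lat_errors_by_city` holds the per-city latitude errors that a boxplot such as panel d would summarize.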
