Fig. 4
From: Efficient differentially private learning improves drug sensitivity prediction

The effect of data bounding on regression model accuracy. The figure illustrates the effect of projecting outliers to within the bounds in linear regression, for different sample sizes n with 10-dimensional synthetic data. Accuracy is evaluated by Spearman’s rank correlation between the predicted and true values (higher is better), for both DP regression (solid lines) and non-private regression (dashed lines). The accuracy of the non-private algorithm decreases slightly as the projection threshold becomes tighter, but this minor decrease is eclipsed by a dramatic increase in the accuracy of the DP algorithm. Similar plots for higher-dimensional data and for samples from a heavy-tailed distribution are included as Additional file 1: Figures S1 and S2.
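The kind of experiment summarised in the caption can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the article's implementation: it assumes ridge regression fitted from the sufficient statistics X^T X and X^T y, Laplace noise added to those statistics, an even split of the privacy budget between them, and simple worst-case sensitivity bounds under an add-or-remove-one-record neighbourhood. The names `clip`, `spearman` and `fit_linear`, and the parameters `bx`, `by` and `ridge`, are hypothetical and chosen only for this example.

```python
import numpy as np


def clip(a, bound):
    """Project values outside [-bound, bound] onto the interval."""
    return np.clip(a, -bound, bound)


def spearman(a, b):
    """Spearman rank correlation (assumes no ties, as with continuous data)."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]


def fit_linear(X, y, bx, by, epsilon=None, rng=None, ridge=1.0):
    """Ridge regression on data clipped to [-bx, bx] and [-by, by].

    If epsilon is given, Laplace noise is added to the sufficient
    statistics (an assumption of this sketch, with the budget split
    evenly between X^T X and X^T y).
    """
    rng = np.random.default_rng() if rng is None else rng
    Xc, yc = clip(X, bx), clip(y, by)
    d = X.shape[1]
    xx = Xc.T @ Xc
    xy = Xc.T @ yc
    if epsilon is not None:
        # Worst-case L1 sensitivities when one record is added or removed:
        # each entry of x x^T is at most bx^2, each entry of x*y at most bx*by.
        sens_xx = d * (d + 1) / 2 * bx**2  # upper triangle only
        sens_xy = d * bx * by
        iu = np.triu_indices(d)
        noise = np.zeros((d, d))
        noise[iu] = rng.laplace(scale=sens_xx / (epsilon / 2), size=len(iu[0]))
        noise = noise + noise.T - np.diag(np.diag(noise))  # keep it symmetric
        xx = xx + noise
        xy = xy + rng.laplace(scale=sens_xy / (epsilon / 2), size=d)
    return np.linalg.solve(xx + ridge * np.eye(d), xy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 1000, 10
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)  # unit-norm weights keep y on a scale close to x
    X, X_test = rng.normal(size=(n, d)), rng.normal(size=(n, d))
    y = X @ w + 0.5 * rng.normal(size=n)
    y_test = X_test @ w + 0.5 * rng.normal(size=n)

    # Tighter thresholds mean more projection but less noise in the DP fit.
    for bound in (4.0, 2.0, 1.0, 0.5):
        w_np = fit_linear(X, y, bound, bound)
        w_dp = fit_linear(X, y, bound, bound, epsilon=2.0, rng=rng)
        print(bound,
              spearman(X_test @ w_np, y_test),
              spearman(X_test @ w_dp, y_test))
```

In this sketch the noise scale grows with the square of the bound, which is one way to see why a tighter projection threshold can help the DP fit even though it distorts the data. The actual numbers depend on the seed, the privacy budget and the sample size, so no particular output is claimed here.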