Machine Learning

Machine learning is about pattern matching. Using a "training set" of publications that have been labelled as "relevant" or "irrelevant" to a given topic, a machine-learning algorithm can be trained to identify relevant publications, based on the pattern of words in the titles and/or abstracts of these publications.

Machine learning is not perfect. Like humans, a machine-learning algorithm can make mistakes. We can quantify these mistakes by testing the performance of the trained algorithm on a "test set" of publications that have also been labelled as "relevant" or "irrelevant" (different publications than those in the "training set"). When we do this test, we can see that there is a trade-off between "precision" and "recall". Precision is the proportion of publications that the algorithm predicts to be relevant that are truly relevant. Recall is the proportion of publications that are truly relevant that the algorithm predicts to be relevant.

Table 1: Trade-off between precision and recall

Topic	Model	Recall	Precision	Positives	True Positives
Existential risk	"High recall"	0.9514	0.2021	1140	230
Existential risk	"Medium recall"	0.7500	0.3034	477	145
Existential risk	"Low recall"	0.5000	0.4286	162	69

By selecting a model with high recall, you will get more "false positives" (irrelevant publications that the algorithm predicts to be relevant), but you will also get fewer "false negatives" (relevant publications that the algorithm predicts to be irrelevant). By selecting a model with low recall, you will get fewer irrelevant publications to sort through (and thus you will save time), but you will also lose some relevant publications. You should select a model based on the amount of time you have and your preference for either high precision or high recall. Please note that publications are shown in order of decreasing predicted relevance. Therefore, the same publications (those with the highest predicted relevance) are shown on the first pages of the bibliographies for all models (low, medium, and high recall).

In Table 1, the number of publications that are likely to be truly relevant are in the column called "True Positives". Thus, in the first row in the table, the model predicts that 1140 publications are relevant ("Positives"). Of these publications, 230 are likely to be truly relevant, based on the precision of the model (true positives = positives x precision). However, this is only an estimate, based on the performance of this model on the test set. This model is not likely to perform identically on the test set and the new set of unassessed publications.

Existential risk

Download Existential Risk (CSV | RIS)

Other bibliographies are to be announced for specific x-risks, e.g., artificial intelligence or asteroid impact.

Publications predicted to be relevant by our Machine Learning (ML) model, but not yet assessed by humans and thus not yet in the non-ML bibliography

Existential risk

Low recall (CSV | RIS)

Medium recall (CSV | RIS)

High recall (CSV | RIS)

Machine Learning

Table 1: Trade-off between precision and recall

Manually-curated Bibliography

ML Bibliography

Existential risk