About: The Existential Risk Research Assessment (TERRA)

The problem: an overwhelming volume of research

An overwhelming volume of research has been published, and more is being published all the time. It is taking an increasingly long time to find all of the publications that are relevant to our research. We need new methods of efficiently searching for relevant publications, and we need these methods to be systematic, to minimize bias in the publications that we read and the conclusions that we reach¹.

The solution: a semi-automated system for finding relevant research

In this system, volunteers from The Existential Risk Research Network identify relevant publications in a set of search results, and machine learning then identifies similar publications in new sets of search results. It is a "recommender system" or "recommendation engine".

1. Humans do some of the work

We search a database² for publications that might be relevant to existential risk. The results of this search are a "corpus" of publications that might be relevant. Using the titles and/or abstracts of the publications in this corpus, we assess the relevance of each publication, labelling it as "relevant" or "irrelevant".

2. Machines do some of the work

We "train" a machine-learning algorithm to identify relevant publications in the corpus by telling it which publications are relevant and which are not, and by giving it the titles and/or abstracts of these publications³. We then set up an automated, regularly scheduled search for new publications, using the same search strategy⁴ as before, and the trained algorithm tells us which of the new publications are likely to be relevant.
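To make the idea of "training" concrete, here is a minimal sketch in Python. It is not the model that TERRA uses (please see Methods and Machine Learning for that). It assumes a simple naive Bayes text classifier as a stand-in, and the class name, function names, and example titles are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesRelevance:
    """Tiny multinomial naive Bayes classifier with add-one smoothing.

    A stand-in for illustration only, not the TERRA model.
    """

    def fit(self, texts, labels):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter(labels)
        if not (self.doc_counts[True] and self.doc_counts[False]):
            raise ValueError("need both relevant and irrelevant examples")
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        return self

    def _log_score(self, tokens, label):
        counts = self.word_counts[label]
        total = sum(counts.values()) + len(self.vocab)
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for tok in tokens:
            if tok in self.vocab:  # ignore words never seen in training
                score += math.log((counts[tok] + 1) / total)
        return score

    def predict_proba(self, text):
        """Return the estimated probability that the text is relevant."""
        tokens = tokenize(text)
        log_rel = self._log_score(tokens, True)
        log_irr = self._log_score(tokens, False)
        top = max(log_rel, log_irr)  # normalise in log space to avoid underflow
        rel, irr = math.exp(log_rel - top), math.exp(log_irr - top)
        return rel / (rel + irr)


# Hypothetical human-labelled corpus: (title, is_relevant).
labelled = [
    ("Global catastrophic risk from engineered pandemics", True),
    ("Existential risk and the long-term future of humanity", True),
    ("Asteroid impact as a threat to human civilisation", True),
    ("Credit risk models for retail banking", False),
    ("Risk factors for knee injury in football players", False),
    ("Marketing strategy for consumer electronics", False),
]
model = NaiveBayesRelevance().fit(
    [title for title, _ in labelled], [label for _, label in labelled]
)

# Hypothetical results of a later, scheduled search.
new_titles = [
    "Catastrophic risk to humanity from asteroid impact",
    "Injury risk in retail football marketing",
]
scores = {title: model.predict_proba(title) for title in new_titles}
```

In this toy example the first new title scores well above 0.5 and the second well below it, because their words overlap with the "relevant" and "irrelevant" training titles respectively. A real corpus would be far larger and would probably use abstracts as well as titles.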

3. Humans do some more of the work

Because the algorithm is not perfect, we double-check the publications that it predicts to be relevant, as a method of quality control. If we agree that a publication is relevant, then we add it to our bibliography, which is published on this website as a resource for the research community. Thus, this is a semi-automated system⁵. Humans are still part of the process, but we save time by not searching through all of the publications that the machine has (correctly or incorrectly) identified as irrelevant.
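The quality-control step can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the `Publication` and `Bibliography` types, the 0.5 threshold, and the `human_judgement` callback are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    title: str
    score: float  # model's estimated probability of relevance (hypothetical)


@dataclass
class Bibliography:
    entries: list = field(default_factory=list)


def select_for_review(publications, threshold=0.5):
    """Return the publications predicted to be relevant, highest score first."""
    predicted = [p for p in publications if p.score >= threshold]
    return sorted(predicted, key=lambda p: p.score, reverse=True)


def review(queue, human_judgement, bibliography):
    """Add a publication only if a human agrees that it is relevant."""
    for pub in queue:
        if human_judgement(pub):
            bibliography.entries.append(pub.title)
    return bibliography


# Hypothetical new search results with model scores.
new_results = [
    Publication("Paper C", 0.10),
    Publication("Paper B", 0.70),
    Publication("Paper A", 0.90),
]
queue = select_for_review(new_results)  # "Paper C" is never shown to a human

# Stand-in for a volunteer's decision: only "Paper A" is judged relevant.
bibliography = review(queue, lambda pub: pub.title == "Paper A", Bibliography())
```

Only the publications above the threshold reach a human, which is where the time saving comes from, and only those that a human confirms reach the bibliography.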

Why not Google it?

Why do we do all this, when we could search for "existential risk" or any other topic in a search engine?

Transparent and repeatable searching

The algorithms that are used by search engines are "black boxes" that are not open to public scrutiny and may give different results for different users. In contrast, our methods are transparent, and therefore they are open to scrutiny and they are repeatable, both of which are of critical importance to scientific progress (improving the methods over time, and gauging our confidence in the results).

Collaborative and cumulative results

It is inefficient or impossible for everyone who is interested in a topic, such as existential risk, to go through all of the search results by themselves. We share the work between many people (and machines), and we also share the results. If someone wants to know about this topic, then they will have access to a systematically and transparently collected bibliography that represents a vast amount of collective work and knowledge, rather than having to "reinvent the wheel" by doing their own search.

Notes

¹ For more information on this problem and its solutions, please see O'Mara-Eves, A., Thomas, J., McNaught, J., Miwa, M., Ananiadou, S. (2015). Using text mining for study identification in systematic reviews: a systematic review of current approaches. Systematic Reviews, 4:5 (DOI).

² We search the Scopus database at present, but we plan to search additional databases (such as Web of Science) in the future.

³ Please see Methods and Machine Learning for details, or our publication in Futures for more information.

⁴ We use a set of keywords, defined by members of the research community and refined over time. Please see Methods for details.

⁵ For examples of similar semi-automated systems, please see Lyon, A., Grossel, G., Burgman, M., Nunn, M. (2013). Using internet intelligence to manage biosecurity risks: a case study for aquatic animal health. Diversity and Distributions, 19, 640-650 (DOI).

Manually-curated Bibliography

Existential risk

Download Existential Risk (CSV | RIS)

Bibliographies for specific existential risks (e.g., artificial intelligence or asteroid impact) are to be announced.

ML Bibliography

These publications are predicted to be relevant by our machine learning (ML) model, but they have not yet been assessed by humans, and thus they are not yet in the manually-curated bibliography.

Existential risk

Low recall (CSV | RIS)

Medium recall (CSV | RIS)

High recall (CSV | RIS)
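One plausible reading of these three levels is that they correspond to different score thresholds: a lower threshold catches more of the truly relevant publications (higher recall) at the cost of a longer list to check. The sketch below illustrates that trade-off only. The thresholds, scores, and labels are made up, and the way TERRA actually defines its levels is described in Methods and Machine Learning.

```python
def recall_at_threshold(scores, truly_relevant, threshold):
    """Return (recall, number of publications retrieved) at a score threshold.

    Recall is the fraction of truly relevant publications whose score
    is at or above the threshold.
    """
    retrieved = [rel for score, rel in zip(scores, truly_relevant)
                 if score >= threshold]
    return sum(retrieved) / sum(truly_relevant), len(retrieved)


# Hypothetical model scores and human judgements for six publications.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
truly_relevant = [True, True, False, True, False, False]

# Hypothetical thresholds for the three levels.
thresholds = {"low": 0.9, "medium": 0.5, "high": 0.2}

results = {
    level: recall_at_threshold(scores, truly_relevant, threshold)
    for level, threshold in thresholds.items()
}
for level, (recall, n_retrieved) in results.items():
    print(f"{level} recall: {recall:.2f} ({n_retrieved} publications to check)")
```

With these made-up numbers, the "high recall" list finds every relevant publication but contains five items to check, whereas the "low recall" list contains one item and finds only a third of the relevant publications.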
