Text and Data Mining (TDM)

General guidelines for text and data mining projects at UBC.

New machine-readable COVID-19 dataset

Jump to dataset: COVID-19 Open Research Dataset (CORD-19)

A new fully open dataset has been made available for researchers. Requested by The White House Office of Science and Technology Policy (OSTP), it represents the most extensive machine-readable Coronavirus literature collection available for data and text mining to date, with over 29,000 articles, more than 13,000 of which have full text. The OSTP has also issued a call to action to develop new text and data mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. Read the call to action here.

To participate, researchers should submit the text and data mining tools and insights they develop through the COVID-19 Open Research Dataset Challenge (CORD-19) on Kaggle.

What is this guide for?

This guide provides general information on text and data mining (TDM) resources and on support for TDM activities at UBC Library. It also maintains an up-to-date list of library-licensed resources that allow TDM activity, as well as information on legal support.

What is text and data mining (TDM)?

TDM is a broad label for the bulk collection and analysis of a corpus of data. A corpus can be anything from the full text of a set of journal articles to public social media posts to census data. The work of text and data mining is to programmatically extract previously unseen relationships in the data.
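As a rough illustration of what "programmatically extracting relationships" can mean, the sketch below counts how often pairs of terms appear together in the same document. The three-document corpus, the stopword list, and the function names are all hypothetical and chosen only for this example; a real project would work with a licensed or open corpus and would likely use more robust tooling.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical three-document corpus, invented purely for illustration.
corpus = [
    "Coronavirus transmission in hospital settings",
    "Hospital ventilation and coronavirus transmission",
    "Census data on hospital staffing",
]

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"in", "and", "on"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def cooccurrence(docs):
    """Count how many documents each pair of terms appears in together."""
    counts = Counter()
    for doc in docs:
        # Sort the terms so each pair has one canonical ordering.
        for pair in combinations(sorted(tokenize(doc)), 2):
            counts[pair] += 1
    return counts


pair_counts = cooccurrence(corpus)
top_pairs = pair_counts.most_common(3)

for (term_a, term_b), n in top_pairs:
    print(f"{term_a} + {term_b}: appears together in {n} documents")
```

On a corpus this small the result is obvious by eye, but the same counting approach can surface term associations across thousands of articles that no one could read in full.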

Getting help with TDM

The library can help get your project off the ground legally and safely. We can negotiate licenses for access to resources, develop agreements with providers who don't normally allow TDM, consult on project planning and tool selection, and help train project members in the use of TDM tools.

To get in touch, contact the subject specialist librarian for your subject or schedule a consultation in the Research Commons. For legal questions, contact Michael Serebriakov in the Office of the University Counsel.

For more information about: