
Working with OpenRefine

This LibGuide walks researchers through using OpenRefine.

What is OpenRefine?

OpenRefine is a free, open-source tool for cleaning data. The program runs locally on your computer and uses your web browser as its interface, so it does not require an internet connection.

Common scenarios for using OpenRefine include:

  • Separating multiple values combined in a single cell
  • Correcting inconsistencies in a data format
  • Adding data from an external format in bulk
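
To make the first two scenarios concrete, here is a minimal sketch in plain Python. It is not OpenRefine's own interface (OpenRefine does this work through its menus and expression editor); the sample rows, the column names, and the helper functions split_cell and normalize_state are all made up for illustration.

```python
# Illustrative only: plain Python, not OpenRefine's own interface.
# The sample rows, column names, and helper names are hypothetical.

rows = [
    {"author": "Smith, Jane; Doe, John", "state": " ny "},
    {"author": "Lee, Ann", "state": "N.Y."},
    {"author": "Garcia, Luis; Chen, Wei", "state": "NY"},
]


def split_cell(value, separator=";"):
    """Separate values combined in one cell into a list of trimmed parts."""
    return [part.strip() for part in value.split(separator) if part.strip()]


def normalize_state(value):
    """Collapse spelling variants of the same value into one consistent form."""
    return value.strip().replace(".", "").upper()


cleaned = [
    {"authors": split_cell(row["author"]), "state": normalize_state(row["state"])}
    for row in rows
]

for row in cleaned:
    print(row)
```

After this runs, each row holds a list of individual authors, and the three spellings of the state share a single form. In OpenRefine itself, the same kinds of fixes are applied to whole columns at once rather than written out row by row.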

What can it help you do with data in a simple tabular format?

  • Get an overview of a data set
  • Resolve inconsistencies in formatting, in where data appears, and in the terminology used
  • Split data up into more granular parts using filters & facets
  • Match local data up to other datasets
  • Enhance a dataset with data from other sources
  • Export cleaned data as files or as projects that other OpenRefine users can edit. The changes and transformations applied to a dataset can also be exported (as scripts) and reused on other datasets within OpenRefine.

Please note: OpenRefine works best in Google Chrome or Mozilla Firefox. It is highly unstable in Internet Explorer and Microsoft Edge.

 

Workshop Resources