Working with OpenRefine: Home

This libguide will walk researchers through the program OpenRefine.

What is OpenRefine?

OpenRefine is a free, open-source tool for cleaning data. The program runs in your web browser, but because it runs locally on your own computer, it does not require an internet connection.

Common scenarios when working with OpenRefine are:

  • Separating data combined together in a single cell
  • Correcting inconsistencies in a data format
  • Adding data from an external source in bulk
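
To make the first two scenarios concrete, here is a minimal sketch in plain Python. It is not OpenRefine itself and does not use OpenRefine's own expression language; the function names, the semicolon separator, and the sample values are assumptions chosen only for illustration.

```python
import re

def split_cell(cell, sep=";"):
    """Split one multi-valued cell into a list of trimmed, non-empty values."""
    return [part.strip() for part in cell.split(sep) if part.strip()]

def normalize_value(value):
    """Collapse repeated whitespace and lower-case the text so variants match."""
    return re.sub(r"\s+", " ", value).strip().lower()

# Hypothetical sample column: two cells, one holding two values at once.
cells = ["data cleaning; Metadata", " DATA   cleaning "]

cleaned = [normalize_value(v) for cell in cells for v in split_cell(cell)]
print(cleaned)
```

In OpenRefine the same kind of result is reached through its menus and transformations rather than by writing Python; the sketch only shows what "separating" and "correcting inconsistencies" mean for the data.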

What can it help you do with a simple tabular format?

  • Get an overview of a data set
  • Resolve inconsistencies in dataset formats, in where data appears, and in terminology used in data
  • Split data up into more granular parts using filters & facets
  • Match local data up to other datasets
  • Enhance a dataset with data from other sources
  • Export cleaned data as files or as projects that other OpenRefine users can edit. The changes and transformations applied to a dataset can also be exported as scripts and reused on other datasets within OpenRefine.
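
As a rough illustration of how a facet gives an overview of a column, the sketch below counts how often each distinct value occurs. It is plain Python rather than OpenRefine, and the function name and sample values are hypothetical.

```python
from collections import Counter

def text_facet(values):
    """Count how often each distinct value occurs, most frequent first."""
    return Counter(values).most_common()

# Hypothetical sample column with an inconsistent spelling of one city.
column = ["Chicago", "chicago", "Chicago", "Urbana"]

for value, count in text_facet(column):
    print(f"{value}: {count}")
```

Listing the counts this way makes variants such as "Chicago" and "chicago" easy to spot, which is the kind of inconsistency a facet is meant to surface.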

Please note: OpenRefine works best in the Google Chrome or Mozilla Firefox browsers. It is highly unstable in Internet Explorer and Microsoft Edge.

 

Workshop Resources

Associate Librarian, Director of Digital Scholarship Lab

Getting Started with OpenRefine

  • Download OpenRefine
    • Use this link to download OpenRefine to your own computer. Version 3.0 is the current stable version.
  • Make Chrome your default browser
    • If you make Chrome your default browser, OpenRefine and any links you click will open in Chrome automatically.

 

Acknowledgement

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). The data and workbook are shareable under the same license.

 

Most of the material was created by John Little, Data & Visualization Services Department, Duke University Libraries; Hilla Sang, Data Visualization & GIS Specialist, UNLV Libraries; and Tricia Lampron, Metadata Services Specialist, University of Illinois at Urbana-Champaign.

John Little GitHub

Hilla Sang Libguide

University of Illinois at Urbana-Champaign Libguide