Working with Regular Expressions


Regular expressions let you work with pieces of text, known as “strings”, based on the patterns of characters they contain. You can find strings that match a given pattern, and you can change the parts of a string that match.

Ingredients

Datasets

Any textual data set

Tools

Most programming languages

OpenRefine

Notepad++

Example uses

Find all strings made up of two consecutive words that each start with a capital letter

Eg. Firstname Lastname

Find regular expression is /[A-Z][a-z]+ [A-Z][a-z]+/
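
As a rough illustration of the name-matching pattern, here is a minimal sketch in Python using the standard re module. The sample sentence and the variable names are made up for the example and do not come from any particular data set.

```python
import re

# Pattern from the example: a capitalised word, one space, another capitalised word.
name_pattern = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")

# Hypothetical sample text, invented for illustration.
text = "The report quotes John Howard and Julia Gillard on the budget."

# findall returns every non-overlapping match, left to right.
matches = name_pattern.findall(text)
print(matches)  # ['John Howard', 'Julia Gillard']
```

Note that this pattern only allows lower-case letters after each initial capital, so names containing hyphens, apostrophes or internal capitals would not be matched in full.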

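Find all strings that are in uppercase, and change them to lower case

Eg. SYDNEY

becomes

sydney

Find regular expression is /([A-Z]+)/

Replace expression is $1.toLowerCase(), meaning the captured text converted to lower case. This is not a literal replace string: in most programming languages it is written as a replacement function, and in Notepad++ the replace string is \L$1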
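
In most programming languages the lower-casing step is done by a replacement function rather than a literal replace string. The sketch below shows one way to do this in Python; the sample text, the to_lower helper and the stricter second pattern are assumptions added for illustration.

```python
import re

# Pattern from the example: one or more capital letters, captured as group 1.
upper_pattern = re.compile(r"([A-Z]+)")

def to_lower(match):
    """Return the captured text converted to lower case (hypothetical helper)."""
    return match.group(1).lower()

# Hypothetical sample text, invented for illustration.
text = "Flights from SYDNEY to PERTH"

# The pattern matches every run of capitals, including the "F" in "Flights".
result = upper_pattern.sub(to_lower, text)
print(result)  # flights from sydney to perth

# Assumed stricter variant: only whole words of two or more capitals.
strict_result = re.sub(r"\b[A-Z]{2,}\b", lambda m: m.group(0).lower(), text)
print(strict_result)  # Flights from sydney to perth
```

The first result shows that the original pattern also lower-cases the initial capital of ordinary words, so the stricter variant may be closer to what is wanted when only fully upper-case words should change.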

Find all full stops and change them to commas

Eg. Howard. John

becomes

Howard, John

Find regular expression is /(\.)/

Replace string is ,
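
The backslash in this pattern matters, because an unescaped full stop matches any character. A minimal Python sketch of the difference follows; the variable names are illustrative only.

```python
import re

text = "Howard. John"

# Escaped full stop: matches only a literal "." character.
result = re.sub(r"\.", ",", text)
print(result)  # Howard, John

# Unescaped full stop: matches every character, so everything becomes a comma.
unescaped_result = re.sub(r".", ",", text)
print(unescaped_result)  # ,,,,,,,,,,,,
```

For a fixed character like this, a plain string replacement such as text.replace(".", ",") gives the same result without a regular expression.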

Learning curve

Intermediate.

Other useful resources