Regular Expressions

From DevSummit

In his spare time, Seth uses regular expressions to solve crossword puzzles and perform hostage rescue.

Regular Expressions are used for pattern recognition in text



applications:

textmate, notepad++, squirrelmail, vi, emacs

programming languages:

PHP, JS, Perl, python

command line tools:

grep, egrep, sed, awk, bash

each of these languages and tools has its own slightly different version of regular expressions

toolkits and apis

Beautiful Soup (for scraping stuff out of web pages), useful for pulling out things like temperature readings or other data


coming soon to a computer near you

Ways of using regular expressions



. means any single character, anything at all

.at would match

   cat, mat, bat, hat

all 3 letter words that end with "at"


it will match non-words too, like 4at.
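
As a small illustration of the any-character wildcard, here is a sketch using Python's re module. The word list is made up for the example, and fullmatch is used so that the whole string has to fit the pattern.

```python
import re

# Hypothetical word list for illustration
words = ["cat", "mat", "bat", "hat", "4at", "at", "chat"]

# "." stands for exactly one character, so ".at" needs three characters total
dot_matches = [w for w in words if re.fullmatch(r".at", w)]

print(dot_matches)  # ['cat', 'mat', 'bat', 'hat', '4at']
```

Note that "at" is rejected because the dot must consume one character, and "4at" is accepted because the dot does not care whether that character is a letter.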


? means the character before it is optional


colou?r will match both color and colour but not colouur.

.?.at would match

   chat, spat, that
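
The optional marker can be tried out with a short Python sketch. The pattern colou?r is the usual way to write the color/colour example, and the candidate lists are made up for illustration.

```python
import re

spellings = ["color", "colour", "colouur"]
# "u?" means zero or one "u"
optional_u = [s for s in spellings if re.fullmatch(r"colou?r", s)]
print(optional_u)  # ['color', 'colour']

candidates = ["chat", "spat", "that", "cat", "at", "splat"]
# ".?" is an optional any-character, followed by one required any-character
optional_dot = [w for w in candidates if re.fullmatch(r".?.at", w)]
print(optional_dot)  # ['chat', 'spat', 'that', 'cat']
```

Because the first dot is optional, .?.at also accepts three letter words like cat, but it still needs at least one character before "at".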


+ means one or more times


* means any number of times, including 0


s+ would match one or more s


a.*n would match "aspiration" but also "an"
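
To see the difference between "one or more" (+) and "any number of times" (*), here is a minimal Python sketch. The sample strings are assumptions chosen for the example.

```python
import re

# "s+" grabs each run of one or more "s"
s_runs = re.findall(r"s+", "misses")
print(s_runs)  # ['ss', 's']

words = ["aspiration", "an", "a", "ant"]
# ".*" may match nothing at all, so "a.*n" accepts "an" as well
a_to_n = [w for w in words if re.fullmatch(r"a.*n", w)]
print(a_to_n)  # ['aspiration', 'an']
```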


() parentheses group operations together like in arithmetic

   (1+3) x 4 = 16
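
Just as the parentheses change what gets multiplied, they change what a repeat applies to. A small Python sketch, with made-up test strings:

```python
import re

samples = ["abbb", "abab"]

# Without a group, "+" repeats only the "b"
without_group = [s for s in samples if re.fullmatch(r"ab+", s)]
# With a group, "+" repeats the whole "ab"
with_group = [s for s in samples if re.fullmatch(r"(ab)+", s)]

print(without_group)  # ['abbb']
print(with_group)     # ['abab']
```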


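[] matches any single character listed inside the brackets

  • [abc] matches a, b, or c
  • [0123456789] matches any single digit
  • [0-9] matches any single digit too; the hyphen means the whole range 0 through 9, not just the two characters 0 and 9
  • [a-z] matches any lowercase letter a THROUGH z
  • [aeiou0-9] matches any lowercase vowel or digit
  • [A-Za-z] matches any letter, upper or lower case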

[Aa][Ss][Ii][Aa][Nn] will match any capitalization of the word asian.
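
The bracket sets above can be checked with a short Python sketch. The sample text "Room 42b" and the list of spellings are assumptions for the example, and the bracket-per-letter pattern is one straightforward way to write the capitalization example.

```python
import re

text = "Room 42b"

digits = re.findall(r"[0-9]", text)                 # one digit at a time
vowels_or_digits = re.findall(r"[aeiou0-9]", text)  # lowercase vowels or digits
letters = re.findall(r"[A-Za-z]", text)             # letters only, either case

print(digits)            # ['4', '2']
print(vowels_or_digits)  # ['o', 'o', '4', '2']
print(letters)           # ['R', 'o', 'o', 'm', 'b']

spellings = ["asian", "ASIAN", "Asian", "aSiAn", "asians"]
# One bracket set per letter, each allowing upper or lower case
any_case = [s for s in spellings if re.fullmatch(r"[Aa][Ss][Ii][Aa][Nn]", s)]
print(any_case)  # ['asian', 'ASIAN', 'Asian', 'aSiAn']
```

Each bracket set matches exactly one character, which is why [0-9] returns 4 and 2 separately rather than 42.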


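\ is the escape character. it cancels the special meaning of the character after it, so we can find a literal period, for example


\. matches .


\\ matches \

these are called escape sequences. back references (\1, \2, ...) are a different use of the backslash: they refer back to whatever an earlier parenthesized group matched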
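
Escaping can be sketched in Python as follows. Raw strings (the r prefix) are used so that Python itself leaves the backslashes alone; the sample strings are made up for the example.

```python
import re

sample = "a.b"
# Unescaped, the dot matches every character
unescaped = re.findall(r".", sample)
# Escaped, it matches only a literal period
escaped = re.findall(r"\.", sample)

print(unescaped)  # ['a', '.', 'b']
print(escaped)    # ['.']

path = r"C:\temp\logs"
# Two backslashes in the pattern match one literal backslash in the text
backslashes = re.findall(r"\\", path)
print(len(backslashes))  # 2
```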


^ as the first character inside brackets means negation. anything except what follows:


[^aeiou] matches any single character besides a, e, i, o, u
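
A quick Python sketch of a negated set, using a made-up sample string. It is worth noticing that "anything except a vowel" includes spaces and digits, not only consonants.

```python
import re

# Everything in the text that is not a lowercase vowel
non_vowels = re.findall(r"[^aeiou]", "queue 2")

print(non_vowels)  # ['q', ' ', '2']
```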


{} means a certain number of times

to find 4 vowels in a row:

[aeiou][aeiou][aeiou][aeiou]

can more succinctly be expressed as

[aeiou]{4}

add a comma (,) to create a range

{4,6} after a character class for consonants matches all words that contain between 4 and 6 consonants in a row
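
The repeat counts can be tried out in Python. This is a sketch under assumptions: the explicit consonant class below and the sample words are chosen for the example, not taken from the session.

```python
import re

long_form = r"[aeiou][aeiou][aeiou][aeiou]"
short_form = r"[aeiou]{4}"

# Both forms find the same first run of four vowels
long_hit = re.search(long_form, "queueing").group()
short_hit = re.search(short_form, "queueing").group()
print(long_hit, short_hit)  # ueue ueue

# Assumed consonant class: every lowercase letter except a, e, i, o, u
consonants_4_to_6 = r"[b-df-hj-np-tv-z]{4,6}"
words = ["watchstrap", "strengths", "banana"]
clusters = [
    re.search(consonants_4_to_6, w).group()
    for w in words
    if re.search(consonants_4_to_6, w)
]
print(clusters)  # ['tchstr', 'ngths']
```

The range is greedy, so it takes as many consonants as it can up to the upper limit of 6.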


the pipe (|) character is the "logical operator" for alternation, aka OR

can also be understood as an "OR gate"

it's also great for combining expressions!
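
A minimal Python sketch of OR, both on its own and combined with a group. The words are made up for the example.

```python
import re

animals = ["cat", "dog", "cow"]
# Either whole alternative may match
cat_or_dog = [a for a in animals if re.fullmatch(r"cat|dog", a)]
print(cat_or_dog)  # ['cat', 'dog']

spellings = ["gray", "grey", "griy"]
# Inside a group, the OR applies only to that one position
gray_or_grey = [s for s in spellings if re.fullmatch(r"gr(a|e)y", s)]
print(gray_or_grey)  # ['gray', 'grey']
```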

More Examples/Use cases

Let's identify misspellings of the word banana, e.g. bananna, bananana
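
One possible way to do this in Python is sketched below. The pattern ba(n+a)+ is an assumption made for this example and is not necessarily the one used in the session; the candidate list is also made up.

```python
import re

candidates = ["banana", "bananna", "bananana", "bandana"]

# Hypothetical pattern: "ba" followed by one or more chunks of n's ending in "a"
banana_like = r"ba(n+a)+"

lookalikes = [c for c in candidates if re.fullmatch(banana_like, c)]
# The correct spelling fits the pattern too, so filter it out afterwards
misspelled = [c for c in lookalikes if c != "banana"]

print(lookalikes)  # ['banana', 'bananna', 'bananana']
print(misspelled)  # ['bananna', 'bananana']
```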


or, we can also use


[0-9]{4} matches a 4 digit number

to match a credit card number:


matches 4 groups of 4 digits, but what if we want spaces?


though this allows us to mix both spaces and hyphens.
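
The credit card idea can be sketched in Python as below. All three patterns are assumptions written for this example, and the numbers are fake. The third pattern goes one step further than the notes: it uses a back reference (\1) so that whichever separator appears first has to be reused.

```python
import re

def fits(pattern, text):
    """Return True when the whole text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

# Hypothetical patterns, not the ones from the session
plain = r"([0-9]{4}){4}"            # 16 digits, no separators
loose = r"([0-9]{4}[ -]?){4}"       # optional space or hyphen after each group
strict = r"[0-9]{4}([ -]?)[0-9]{4}\1[0-9]{4}\1[0-9]{4}"  # same separator throughout

print(fits(plain, "1234567812345678"))      # True
print(fits(plain, "1234 5678 1234 5678"))   # False
print(fits(loose, "1234 5678 1234 5678"))   # True
print(fits(loose, "1234 5678-1234 5678"))   # True, separators can be mixed
print(fits(strict, "1234-5678-1234-5678"))  # True
print(fits(strict, "1234 5678-1234 5678"))  # False, separators must agree
```

A pattern like this only checks the shape of the number, not whether it is a valid card.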

Learning tools