Regular expression magic

From DevSummit
Revision as of 23:37, 4 May 2015 by Vivian (talk | contribs) (1 revision imported)

REGEX: Regular Expression Patterns

also called REGEXP


  • String matching: patterns are flexible and not limited to specific literal characters


  • Standard syntax
    • PCRE: Perl compatible (the preg functions in PHP), or
    • Vim syntax: used inside Vim
  • About
    • most languages support regex matching
    • a powerful way to do mass replacement inline, ex. redact a name, or change the formatting of every address
  • SAMPLE SET
    • apple pear lemon orange
  • USING REGEX COMMAND CHARACTERS + examples
  • ^ matches the beginning of a line
    • ex. ^l matches any entry whose first letter is l: lemon
  • $ matches the end of a string
    • ex. e$ matches apple, orange
  • [ ] defines a set: the characters listed inside the brackets mean any one of those characters
    • ex. [er]$ matches apple, pear, orange
    • ex. [a-e]$ matches apple, orange
    • ex. york matches only "york", but [york] matches y, o, r, or k
  • + one or more of the preceding character or set
    • ex. [aer]+$ matches apple, pear, orange
  • * zero or more of the preceding character or set; note that every string is returned because zero occurrences also count
    • ex. [aer]*$ matches apple, pear, lemon, orange
  • {} repeats the preceding character or set a given number of times
    • ex. [a-z]{2,3}$ matches the last 2 or 3 characters, drawn from the set to search within, in this case a-z
  • \ escapes the single character that comes directly after it
  • \w is a shorthand rather than an escaped w: it matches a word character (letter, digit, or underscore)
  • ! negation, used inside a lookaround such as (?!...) to ensure the pattern does not appear
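
The command characters above can be tried against the sample set in a few lines. This is a minimal sketch using Python's re module, which follows Perl-style syntax; the helper name matching is made up for illustration.

```python
import re

# Sample set from the notes above.
words = ["apple", "pear", "lemon", "orange"]

def matching(pattern):
    """Return the sample words in which the pattern is found anywhere."""
    return [w for w in words if re.search(pattern, w)]

print(matching(r"^l"))       # ['lemon']
print(matching(r"e$"))       # ['apple', 'orange']
print(matching(r"[er]$"))    # ['apple', 'pear', 'orange']
print(matching(r"[a-e]$"))   # ['apple', 'orange']
print(matching(r"[aer]+$"))  # ['apple', 'pear', 'orange']
print(matching(r"[aer]*$"))  # all four words, since zero occurrences count

# {2,3} is greedy, so it takes 3 characters when they are available.
print(re.search(r"[a-z]{2,3}$", "lemon").group())  # 'mon'

# \w matches word characters, so \w+ picks out whole words.
print(re.findall(r"\w+", "apple, pear"))  # ['apple', 'pear']
```

re.search looks for the pattern anywhere in the string, which is why the ^ and $ anchors matter in these examples.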

EXAMPLES

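  • set: nick_newyork betty_texas anne_houston buffalo
    • [a-z0-9_]$ matches the last character of every entry
    • ^[a-z]+$ selects only buffalo, the one entry made up entirely of letters; without the $, ^[a-z]+ selects just the names (nick, betty, anne, buffalo)
    • ^[a-z_]+$ selects all four entries; including _ in the set lets the match continue along the whole string searched
    • [a-z ]+$ (note the space in the set) selects the run of letters at the end of each entry: newyork, texas, houston, buffalo
    • ^[a-z ] (note the space) finds a match in every entry, but only the first character, looking from the beginning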
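
To see exactly which part of each entry a pattern selects, it helps to print the matched text rather than just yes or no. A small sketch, again assuming Python's re module; matched_text is a hypothetical helper name.

```python
import re

# Example set from the notes above.
entries = ["nick_newyork", "betty_texas", "anne_houston", "buffalo"]

def matched_text(pattern):
    """Return the matched substring for each entry where the pattern is found."""
    results = []
    for entry in entries:
        match = re.search(pattern, entry)
        if match:
            results.append(match.group())
    return results

print(matched_text(r"^[a-z]+$"))   # ['buffalo']
print(matched_text(r"^[a-z]+"))    # ['nick', 'betty', 'anne', 'buffalo']
print(matched_text(r"^[a-z_]+$"))  # all four entries, in full
print(matched_text(r"[a-z ]+$"))   # ['newyork', 'texas', 'houston', 'buffalo']
print(matched_text(r"^[a-z ]"))    # ['n', 'b', 'a', 'b']
```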


MORE USE -- SEARCH AND REPLACE

  • Look Behind and Look Ahead
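  • (?<=...) is a lookbehind and (?=...) is a lookahead
  • The angle bracket < points the search backward; without it, the search looks ahead
  • ( ) parentheses group a sequence of multiple characters; note that [ ] brackets are for sets
    • ex. ([A-Z]{2})_([A-Z ])+$ again, note the space in the second group's set, which lets it match nick_new york as well as anne_houston (with lowercase data, the uppercase ranges need the case-insensitive flag)
  • ? makes the preceding item optional: it might not appear
    • ex. ([A-Z])?_([A-Z ])+$ says the first group might not be present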
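
Lookarounds, groups, and the optional marker can be sketched together in one search-and-replace example. This assumes Python's re module and lowercase versions of the patterns above; the helper first_match and the output format "place (name)" are made up for illustration.

```python
import re

# Example set with a space in one place name, as in the group example above.
entries = ["nick_new york", "betty_texas", "anne_houston", "buffalo"]

def first_match(pattern):
    """Return the matched text for each entry where the pattern is found."""
    return [m.group() for m in (re.search(pattern, e) for e in entries) if m]

# Lookbehind: text that follows an underscore, without the underscore itself.
places = first_match(r"(?<=_)[a-z ]+$")
print(places)  # ['new york', 'texas', 'houston']

# Lookahead: letters that are followed by an underscore.
names = first_match(r"^[a-z]+(?=_)")
print(names)  # ['nick', 'betty', 'anne']

# Groups in a replacement: \1 and \2 refer back to the two parenthesized groups.
swapped = [re.sub(r"^([a-z]+)_([a-z ]+)$", r"\2 (\1)", e) for e in entries]
print(swapped)  # ['new york (nick)', 'texas (betty)', 'houston (anne)', 'buffalo']

# Optional group: the "name_" prefix may or may not be present.
with_optional = first_match(r"^([a-z]+_)?[a-z ]+$")
print(with_optional)  # all four entries, including 'buffalo'
```

Entries that do not match the replacement pattern, such as buffalo, are returned unchanged by re.sub.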


MORE USE EXAMPLES

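  • 1. Pick out email addresses in a document
    • [\w.] is a set: any word character or a literal .
    • [\w.]+@([a-z-]+\.)+[a-z]{2,4}
    • reads as: one or more word characters or dots, then @, then a group (letters or hyphens followed by a dot) that may occur multiple times, then the TLD of 2 to 4 letters
    • outside a set, a literal dot must be escaped as \. because a bare . matches any character
    • [\w.+]+@([a-z-]+\.)+[a-z]{2,4}
    • here a + is added inside the set, where it is a literal character to search for rather than a quantifier, to include addresses such as testing+myemail.com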
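
A short sketch of the email pattern in use, assuming Python's re module. The sample text and both addresses are invented for illustration, and the group is written as non-capturing, (?:...), only so that findall returns whole addresses instead of the last group.

```python
import re

# The pattern from the notes, with the dot escaped and {2,4} as the TLD length.
# (?:...) is a non-capturing group: same grouping, but findall returns full matches.
EMAIL = re.compile(r"[\w.+]+@(?:[a-z-]+\.)+[a-z]{2,4}")

# Hypothetical sample text; the addresses are made up.
text = "Write to jane.doe@example.org or testing+myemail@mail.example.com today."

found = EMAIL.findall(text)
print(found)  # ['jane.doe@example.org', 'testing+myemail@mail.example.com']
```

This pattern is deliberately simple: it only accepts lowercase domains and TLDs of 2 to 4 letters, so it will miss some valid addresses.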
               
               

USE NOTES

  • case matters: a and A are different characters
  • a lot of regex work is fighting with regex; it can be frustrating
  • try expressing the logic of what you're searching for ahead of time -- it can help you formulate the pattern


DELIMITERS

  • /i: an i placed after the closing delimiter means case insensitive, ex. /apple/i
  • REFERENCE URLs
    • Regexper.com -- demonstrates the logic of a regex
    • Regular-Expressions.info
    • Rubular - Ruby regex
  • Command line tools
    • sed [replace] ex. echo "hello there" | sed "s/hello/whatsup/g" -> whatsup there
    • egrep [search], or grep -E [extended regexp] ex. egrep '^a' /file/path returns any line that starts with a
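
For comparison, the sed and egrep commands and the /i modifier have rough counterparts in a scripting language. A minimal sketch with Python's re module; the list of lines is invented sample data.

```python
import re

# Rough counterpart of: echo "hello there" | sed "s/hello/whatsup/g"
replaced = re.sub(r"hello", "whatsup", "hello there")
print(replaced)  # whatsup there

# Rough counterpart of: egrep '^a' on a file, using made-up lines instead of a file.
lines = ["apple", "pear", "avocado", "Apricot"]
starts_with_a = [line for line in lines if re.search(r"^a", line)]
print(starts_with_a)  # ['apple', 'avocado']

# Counterpart of the /i modifier: re.IGNORECASE makes the match case insensitive.
any_case = [line for line in lines if re.search(r"^a", line, re.IGNORECASE)]
print(any_case)  # ['apple', 'avocado', 'Apricot']
```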

VIM

  • In Vim's own syntax some operators must themselves be escaped with a backslash, ex. \+ instead of +
    • ex. ^[aeiou]\+$
    • looks for lines made up entirely of one or more vowels, from beginning to end
    • ex. Do this
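    • ex. Returning odd numbers: [0-9]*[13579] - you'll see 0-9 zero or more times, then a final 1, 3, 5, 7, or 9. [0-9]+?[13579] is close but not the same: it requires at least one digit before the final odd digit, so it misses single-digit numbers. Anchor the pattern (^ and $) to test a whole number rather than part of one.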
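
The difference between the two odd-number patterns shows up on single-digit input. A small sketch using Python's re.fullmatch, which behaves as if the pattern were anchored at both ends; the list of numbers is made-up test data.

```python
import re

# Hypothetical test data.
numbers = ["7", "12", "35", "408", "1999"]

# Zero or more digits, then a final odd digit.
odd = [n for n in numbers if re.fullmatch(r"[0-9]*[13579]", n)]
print(odd)  # ['7', '35', '1999']

# One or more digits (lazy), then a final odd digit: needs at least two digits.
odd_two_plus = [n for n in numbers if re.fullmatch(r"[0-9]+?[13579]", n)]
print(odd_two_plus)  # ['35', '1999']
```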