Regular expression magic
REGEX: Regular Expression Patterns
also called REGEXP
- String matching - flexible patterns, not limited to specific literal characters
- Standard syntax
- Preg - Perl-compatible (PCRE), or
- VIM syntax - when using VIM
- About
- most languages support regex matching
- powerful way to do mass replacement or to replace inline, ex. redact a name, change all address formatting
- SAMPLE SET
- apple pear lemon orange
- USING REGEX COMMAND CHARACTERS + examples
- ^ matches the beginning of a line
- ex. ^l matches when the first letter is l: lemon
- $ matches the end of a string ex. e$ matches apple orange
- [ ] a set - note that characters listed together inside the brackets mean any one of those characters
- ex. [er]$ matches apple pear orange
- ex. [a-e]$ matches apple orange
- ex. york matches only "york" but [york] matches y, o, r, or k
- + one or more characters
- ex. [aer]+$ apple pear orange
- * 0 or more characters - note that all strings are returned using this b/c of including "0" instances
- ex. [aer]*$ apple pear lemon orange
- {} uses a number to set how many times to match ex. [a-z]{2,3}$ matches the last 2 or 3 characters, drawn from the set defined to search within, in this case a-z
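- \ escapes the single character that comes directly after it
- \w is a shorthand rather than an escaped w: it matches any one word character (letter, digit, or underscore)
- ! negative, used inside a lookaround such as (?!...) to ensure something does not appear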
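The command characters above can be tried against the sample set. This is a minimal sketch using Python's re module (the notes do not name a language, so Python is an assumption); the helper name matching is made up for illustration.

```python
import re

# The sample set from the notes above.
words = ["apple", "pear", "lemon", "orange"]

def matching(pattern):
    """Return the sample words in which the pattern is found (hypothetical helper)."""
    return [w for w in words if re.search(pattern, w)]

print(matching(r"^l"))            # ['lemon']
print(matching(r"e$"))            # ['apple', 'orange']
print(matching(r"[er]$"))         # ['apple', 'pear', 'orange']
print(matching(r"[a-e]$"))        # ['apple', 'orange']
print(matching(r"[aer]+$"))       # ['apple', 'pear', 'orange']
print(matching(r"[aer]*$"))       # all four, because zero occurrences also counts
print(matching(r"[a-z]{2,3}$"))   # all four

# {2,3} is greedy, so the match is the last 3 characters when 3 are available.
last_chars = re.search(r"[a-z]{2,3}$", "apple").group(0)
print(last_chars)                 # ple
```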
EXAMPLES
- set: nick_newyork betty_texas anne_houston buffalo [a-z0-9_]$
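- ^[a-z]+$ selects only strings made up entirely of lowercase letters, so just buffalo (the underscore stops the others)
- ^[a-z_]+$ selects all, includes _ to continue along the string searched
- [a-z ]+$ note the space; with no ^ this selects only the trailing run of letters and spaces, looking at the end (ex. newyork, texas)
- ^[a-z ] note the space; finds a match in every string, looking from the beginning, but only the first character is selected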
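To see exactly which part of each string a pattern selects, the examples above can be run one string at a time. This is a sketch in Python's re module, assuming each name in the set is tested as its own string; the helper name found is hypothetical.

```python
import re

# The example set from the notes, one string per entry.
names = ["nick_newyork", "betty_texas", "anne_houston", "buffalo"]

def found(pattern):
    """Return the matched text for each string, or None where nothing matches."""
    results = []
    for n in names:
        m = re.search(pattern, n)
        results.append(m.group(0) if m else None)
    return results

print(found(r"^[a-z]+$"))    # [None, None, None, 'buffalo']
print(found(r"^[a-z_]+$"))   # every whole string
print(found(r"[a-z ]+$"))    # ['newyork', 'texas', 'houston', 'buffalo']
print(found(r"^[a-z ]"))     # ['n', 'b', 'a', 'b']
```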
MORE USE -- SEARCH AND REPLACE
- Look Behind and Look Ahead
- (?<=...) looks behind, (?=...) looks ahead
- The angle bracket < points the search backward; without it the search looks ahead
- () parens are for groups of characters, i.e. sequences of multiple characters; note [] brackets are for sets
- ex. ([A-Z]{2})_([A-Z ])+$ again, note the space in the second group; with case-insensitive matching this finds matches in nick_new york and anne_houston
- ? the preceding item might not appear
- ex. ([A-Z])?_([A-Z ])+$ says you might not see the first group
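Groups, optional groups, and lookarounds are easier to follow with a concrete string. The sketch below uses Python's re module and makes two assumptions: the + is moved inside the second group so the group captures the whole run rather than only its last character, and the replacement format "state, name" is invented purely for illustration.

```python
import re

text = "nick_new york"

# Groups: two letters before the underscore, then everything after it.
m = re.search(r"([A-Z]{2})_([A-Z ]+)$", text, re.IGNORECASE)
groups = (m.group(1), m.group(2))
print(groups)                      # ('ck', 'new york')

# Optional group: the part before the underscore may be missing.
opt = re.search(r"([a-z]+)?_([a-z ]+)$", "_texas")
print(opt.group(1), opt.group(2))  # None texas

# Lookbehind: match only what follows an underscore, without selecting the underscore.
after = re.search(r"(?<=_)[a-z ]+$", text).group(0)
print(after)                       # new york

# Lookahead: match only what precedes an underscore.
before = re.search(r"^[a-z]+(?=_)", text).group(0)
print(before)                      # nick

# Negative lookahead: (?!...) ensures something does not appear.
not_nick = re.search(r"^(?!nick)[a-z]+", text)
print(not_nick)                    # None

# Search and replace: groups are reused in the replacement as \1 and \2.
swapped = re.sub(r"([a-z]+)_([a-z ]+)", r"\2, \1", text)
print(swapped)                     # new york, nick
```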
MORE USE EXAMPLES
- 1. Pick out email addresses in a document
- [\w.] - any word character or a .
- [\w.]+@([a-z-]+\.)+[a-z]{2,4}
- one or more word characters or dots, then @, then a group (letters or hyphens followed by a literal dot) that may occur multiple times,
- then the TLD of 2 to 4 letters
- [\w.+]+@([a-z-]+\.)+[a-z]{2,4}
- here we added a + in the set so that it is treated as a literal character to match, to include addresses such as
testing+my@email.com
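A quick way to check the email pattern is to run it over a line of text. This is a sketch in Python's re module with the dot escaped and the TLD length written as {2,4}; the sample document and both addresses are made up for illustration, and the pattern is the simple one from these notes, not a full email validator.

```python
import re

# Pattern from the notes: local part, @, one or more "label." groups, then a 2-4 letter TLD.
EMAIL = re.compile(r"[\w.+]+@([a-z-]+\.)+[a-z]{2,4}", re.IGNORECASE)

# Hypothetical document text.
document = "Contact testing+my@email.com or bob.smith@mail.example.org today."

# finditer is used instead of findall because the pattern contains a group,
# and findall would return only the group contents.
emails = [m.group(0) for m in EMAIL.finditer(document)]
print(emails)   # ['testing+my@email.com', 'bob.smith@mail.example.org']
```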
USE NOTES
- case matters: a and A are different characters unless matching is case insensitive
- a lot of regex is fighting with regex; it can be frustrating
- try expressing the logic of what you're searching for ahead of time -- it can help you formulate the pattern
DELIMITERS
- an i placed after the closing delimiter, as in /pattern/i, means case insensitive
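Not every language writes patterns between / delimiters. As a sketch of the same idea in Python (an assumption, since the notes do not name a language), the re module takes a flag argument or an inline (?i) instead of a trailing i.

```python
import re

# Case matters by default: a lowercase a does not match an uppercase A.
strict = re.search(r"^a", "Apple")
print(strict)                       # None

# re.IGNORECASE plays the role of the trailing i flag.
relaxed = re.search(r"^a", "Apple", re.IGNORECASE)
print(relaxed.group(0))             # A

# The same flag written inline at the start of the pattern.
inline = re.search(r"(?i)^a", "Apple")
print(inline.group(0))              # A
```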
- REFERENCE URLs
- Regexper.com -- demonstrates logic of a regex
- Regular-Expressions.info
- Rubular - Ruby regex
- Command line tools
- sed [replace] ex. echo "hello there" | sed "s/hello/whatsup/g" -> whatsup there
- egrep [search] or grep -E [extended regexp] ex. egrep '^a' /file/path - returns every line that starts with a
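For comparison, the two command line examples can be approximated in a script. This is a rough Python sketch, not a replacement for sed or egrep; the list of lines stands in for the contents of a file and is made up for illustration.

```python
import re

# Like sed "s/hello/whatsup/g": replace every occurrence in the input.
replaced = re.sub(r"hello", "whatsup", "hello there")
print(replaced)          # whatsup there

# Like egrep '^a': keep only the lines that start with a.
lines = ["apple", "banana", "avocado"]   # hypothetical file contents
starts_with_a = [line for line in lines if re.search(r"^a", line)]
print(starts_with_a)     # ['apple', 'avocado']
```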
VIM
- Will need to escape some characters in VIM's own command line; use \ as the escape character
- ex. ^[aeiou]\+$
- matches lines made up entirely of one or more vowels, from beginning to end
- ex. Do this
- ex. Returning odd numbers: ([0-9]+)?[13579] is the same as [0-9]*[13579] - you'll see 0-9 zero or more times, then 1, 3, 5, 7, or 9 after.
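The two examples above can be checked outside VIM as well. This sketch uses Python's re module, where + needs no backslash, and fullmatch to require that the whole string fits the pattern; the word and number lists are made up for illustration. It also shows that a lazy [0-9]+? without the grouping still demands at least one leading digit, so it misses single-digit odd numbers.

```python
import re

# Vowel-only strings: the Python spelling of VIM's ^[aeiou]\+$
words = ["aeiou", "apple", "io"]
vowel_only = [w for w in words if re.fullmatch(r"[aeiou]+", w)]
print(vowel_only)   # ['aeiou', 'io']

numbers = ["7", "42", "135", "2048", "9001"]

# Any digits zero or more times, then a final odd digit.
odd = [n for n in numbers if re.fullmatch(r"[0-9]*[13579]", n)]
print(odd)          # ['7', '135', '9001']

# Optional group of digits: equivalent to the pattern above.
odd_grouped = [n for n in numbers if re.fullmatch(r"([0-9]+)?[13579]", n)]
print(odd_grouped)  # ['7', '135', '9001']

# Lazy one-or-more: needs at least one digit before the odd digit, so "7" is missed.
lazy = [n for n in numbers if re.fullmatch(r"[0-9]+?[13579]", n)]
print(lazy)         # ['135', '9001']
```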