The idea of regex is pretty simple: you use a few special characters to stand for a whole class of characters. It's like when you do cp folder_a/* folder_b/ to copy everything, where * stands for any file name. (Strictly speaking, that * is a shell wildcard rather than a regex, and as you'll see below, * means something different in regex.) Regex is a much more systematic version of the same idea, which makes it much more powerful. Of course, our skepticism always warns us that this may not be the best system theoretically, but since all the tools we use rely on it, we had better study it first :). Besides, the most famous Chinese book on warfare says that to win a battle, you should always first study your rival.
- asterisk * : follows a character or meta-character (a meta-character is a character that stands for more than one character; for example, a dot . is any character) and means zero or more repeats of it. So 1* is the empty string, or 1, or 11, or 111...
- ? : zero or one occurrence of the preceding character, i.e., ba? matches b and ba.
- + : one or more occurrences of the preceding character.
- {n} : exactly n occurrences of the preceding character. For example, [0-9]{3}-[0-9]{4} matches any number of the form XXX-XXXX.
- {n,m} : n to m occurrences of the preceding character.
- dot . : any single character, as I said, except a newline. So 1. is a 1 followed by exactly one other character (not a newline); for a 1 followed by any number of characters, write 1.*
- [] : any one of the characters inside. [abc] matches a, b or c. [0-9] matches any digit from 0 to 9 (note that the dash means a range).
- ^ has two usages. Inside [] it means negation, i.e., [^a] means any single character but a. Outside brackets, ^ means the beginning of a line, so ^A matches a line that starts with an A.
- $ means the end of a line. So a$ means a line that ends with an a. ^$ means an empty line.
- | means or. (Gilmoregirls|Smallville) matches Gilmoregirls or Smallville. (Which is your favorite TV show?)
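If you want to try the list above yourself, here is a small sketch using grep. The sample lines and the two helper functions (whole and part) are made up for this demo; they are not from any tool. I use grep -E (extended regex) because plain grep expects some of these operators, such as ?, +, {n} and |, to be written differently.

```shell
# Hypothetical sample data, made up for this demo (note the one empty line).
sample=$(printf '%s\n' 'b' 'ba' 'baa' '555-1234' 'Apple' 'banana' '' 'Smallville')

# Hypothetical helpers: print the lines of $sample that match a pattern.
# -E enables extended regex; -x requires the pattern to match the whole line.
whole() { printf '%s\n' "$sample" | grep -E -x -- "$1"; }
part()  { printf '%s\n' "$sample" | grep -E -- "$1"; }

whole 'ba?'                      # b, ba
whole 'ba+'                      # ba, baa
whole 'ba*'                      # b, ba, baa
whole 'b.'                       # ba (b plus exactly one character)
whole '[0-9]{3}-[0-9]{4}'        # 555-1234
part  '^A'                       # Apple
part  'a$'                       # ba, baa, banana
part  '^[^ab]'                   # 555-1234, Apple, Smallville
part  '^$'                       # the empty line
part  'Gilmoregirls|Smallville'  # Smallville
```

The whole helper is there because grep normally reports a line if the pattern matches anywhere inside it; without -x, ba? would also pick up banana.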
One good feature of regex is that you can group things together with parentheses and refer back to them later, as I did in tutorial 3. There are also POSIX character classes, which look like this: [:digit:], [:alpha:] ... (they are used inside brackets, as in [[:digit:]]). I won't go through them here; you can always look them up, and they'll show up in the next tutorial too.
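Here is a minimal sketch of both ideas with grep, just to give a taste. The sample words and the in_words helper are invented for this demo. In plain grep (basic regex), a group is written \( \) and \1 refers back to whatever the first group matched.

```shell
# Hypothetical sample lines, made up for this demo.
words=$(printf '%s\n' 'abab' 'abba' 'room 42' 'hello')

# Hypothetical helper: run grep with the given arguments over $words.
in_words() { printf '%s\n' "$words" | grep "$@"; }

# \(.\) captures any one character, \1 demands the same character again,
# so this finds lines containing a doubled character.
in_words '\(.\)\1'           # abba, room 42, hello

# POSIX classes go inside a bracket expression, hence the double brackets.
in_words '[[:digit:]]'       # room 42
in_words -x '[[:alpha:]]*'   # abab, abba, hello (letters only, whole line)
```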
Linux Bash tutorial 5: If you can't find it with grep, it's lost. (Under Construction)