
Q&A - Greedy matching in regular expressions

This came in the mail, thought other folks might be interested.

Hi Roy. I need to check a line of html and make the value of the style attribute lowercase.  I've tried to come up with a regex that will work but I keep making the entire line of html lowercase instead of just the stuff in the style value.  I can't get the match to end with the correct quote, instead it goes to the last quote on the line.  So something like this:

[Tag style="WIDTH:20px; color:blue;" href=""]

I want to change to this:
[Tag style="width:20px; color:blue;" href=""]

But instead I get this:
[Tag style="width:20px; color:blue;" href=""]

Because the match ends with the end quote of the href.

If you can point me in the right direction (or have something like this lying around), I would GREATLY appreciate it.


It's called "greedy matching" - by default a quantifier grabs as much text as it can, so the match runs all the way to the *last* possible closing character.
Try adding a "?" after the quantifier (probably '*'). That makes it lazy, so the match ends at the *first* place it can.
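Here is a minimal sketch of that fix applied to the line from the question, written in Python purely for illustration (the question doesn't say which regex engine is in use, but the "?" trick works the same way in most of them). The `lowercase_style` helper is a hypothetical name, and the pattern assumes the attribute is spelled `style` in lowercase and its value is wrapped in double quotes on a single line.

```python
import re

# Sample line from the question above.
line = '[Tag style="WIDTH:20px; color:blue;" href=""]'

# Greedy: ".*" grabs as much as it can, so the match runs to the last quote on the line.
greedy_match = re.search(r'style=".*"', line).group(0)

# Lazy: ".*?" grabs as little as it can, so the match stops at the first closing quote.
lazy_match = re.search(r'style=".*?"', line).group(0)


def lowercase_style(html_line):
    """Lowercase only the value of each double-quoted style attribute (hypothetical helper)."""
    return re.sub(
        r'style="(.*?)"',
        lambda m: 'style="' + m.group(1).lower() + '"',
        html_line,
    )


print(greedy_match)           # the match swallows the href attribute as well
print(lazy_match)             # the match covers the style attribute only
print(lowercase_style(line))  # only the style value is lowercased
```

Another common way to get the same effect is a negated character class such as `[^"]*` in place of `.*?`, which says outright "anything except a quote".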

For example, given the following string as input:
The following regex (greedy by default) will match up until the last 'd':

However, this regex will find several matches, the first of which is "abcd":
(you can do without the braces if you want).
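To make the difference concrete, here is a small Python sketch. The input string and both patterns are assumptions chosen for illustration, not the ones from the original mail; they are just one pair that behaves the way described above.

```python
import re

# Hypothetical input: the same four letters repeated three times.
text = "abcd abcd abcd"

# Greedy (the default): one match, from the first 'a' up to the last 'd'.
greedy = re.findall(r"a.*d", text)

# Lazy: several matches, each one ending at the nearest 'd'.
lazy = re.findall(r"a.*?d", text)

# Wrapping the pattern in a group does not change what gets matched here.
lazy_grouped = re.findall(r"(a.*?d)", text)

print(greedy)        # ['abcd abcd abcd']
print(lazy)          # ['abcd', 'abcd', 'abcd']
print(lazy_grouped)  # ['abcd', 'abcd', 'abcd']
```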

I'd also suggest adding two good regex mailing lists to your arsenal instead of sending help messages to various people:

There are people there who know a whole lot more about regular expressions than I do.

