@RoyOsherove

Monday, Jan 10, 2005

Q&A - Greedy matching in regular expressions

This came in the mail; I thought other folks might be interested.

Hi Roy. I need to check a line of html and make the value of the style attribute lowercase.  I've tried to come up with a regex that will work but I keep making the entire line of html lowercase instead of just the stuff in the style value.  I can't get the match to end with the correct quote, instead it goes to the last quote on the line.  So something like this:

[Tag style="WIDTH:20px; color:blue;" href="blah.com/PageTWO"]

I want to change to this:
[Tag style="width:20px; color:blue;" href="blah.com/PageTWO"]

But instead I get this:
[Tag style="width:20px; color:blue;" href="blah.com/pagetwo"]

Because the match ends with the end quote of the href.


If you can point me in the right direction (or have something like this lying around), I would GREATLY appreciate it.

Answer:

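It's called "greedy matching": by default a quantifier such as '*' matches as much as it can, so the match runs to the *last* quote on the line.
Try adding a "?" after the quantity specifier (probably '*'). That makes the quantifier lazy, so the match ends at the *first* closing quote it finds.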
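
Applied to the line from the question, the difference might look like the sketch below. It uses Python's re module purely for illustration; the original question does not say which language or pattern was used, so the patterns and the variable names (line, greedy, lazy) are assumptions.

```python
import re

# The sample line from the question.
line = '[Tag style="WIDTH:20px; color:blue;" href="blah.com/PageTWO"]'

# Greedy: .* runs to the LAST quote on the line,
# so the href value gets lowercased as well.
greedy = re.sub(r'style=".*"', lambda m: m.group(0).lower(), line)

# Lazy: .*? stops at the FIRST closing quote,
# so only the style value is lowercased.
lazy = re.sub(
    r'style="(.*?)"',
    lambda m: 'style="' + m.group(1).lower() + '"',
    line,
)

print(greedy)  # [Tag style="width:20px; color:blue;" href="blah.com/pagetwo"]
print(lazy)    # [Tag style="width:20px; color:blue;" href="blah.com/PageTWO"]
```

A negated character class such as [^"]* in place of .*? is another common way to stop at the first closing quote.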

For example, given the following string as input:
"abcdfgdrbdtargd"
The following regex (greedy by default) will match up to the last 'd', which here is the whole string:
(.*d)

However, this lazy regex will find several matches, the first of which is "abcd":
(.*?d)
(You can do without the parentheses if you want.)
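
To check this behaviour yourself, here is a small sketch, again in Python's re module as an assumed stand-in for whatever regex engine you use. The variable names are only illustrative.

```python
import re

text = "abcdfgdrbdtargd"

# Greedy: .* grabs as much as possible, then backs off only
# as far as needed to end on a 'd' (here, the very last one).
greedy_match = re.search(r".*d", text).group(0)

# Lazy: .*? grabs as little as possible, so each match
# ends at the next 'd' it meets.
lazy_matches = re.findall(r".*?d", text)

print(greedy_match)  # abcdfgdrbdtargd
print(lazy_matches)  # ['abcd', 'fgd', 'rbd', 'targd']
```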

I'd also suggest adding two good regex mailing lists to your arsenal instead of sending help messages to various people:
http://groups.yahoo.com/group/dotnetregex/
http://lists.aspadvice.com/SignUp/list.aspx?l=68&c=16

There are people there who know a whole lot more than I do about regular expressions.

 
