Roy Osherove

When Regular Expressions Attack

It's not that I'm BAD at regular expressions; I just learn enough to get by when I need them.

I'm nothing like Darren, who decided to do something like this:

While reading the VB Language Spec today I decided to attempt to follow the grammar syntax for matching DateTime Literals:

    
http://msdn.microsoft.com/library/en-us/vbls7/html/vblrfvbspec2_4_6.asp?frame=true

The resulting pattern can be seen (fully commented) here:

    
http://www.regexlib.com/REDetails.aspx?regexp_id=638

OK. I can't resist. Here's the full pattern (and it looks great in The Regulator):

(?'DateLiteral'     (?# Per the VB Spec : DateLiteral ::= '#' [ Whitespace+ ] DateOrTime [ Whitespace+ ] '#' )

    \#\s*

    (?'DateOrTime'  (?# DateOrTime ::= DateValue Whitespace+ TimeValue | DateValue | TimeValue )

        (?'DateValue'
            
            (?# DateOrTime ::= DateValue Whitespace+ TimeValue | DateValue | TimeValue )
            (

                (?# DateValue ::= MonthValue / DayValue / YearValue | MonthValue - DayValue - YearValue )
                    
                    (?'Month'(0?[1-9])|1[0-2])      (?# Month 01 - 12 )
                    (?'Sep'[-/])                    (?# Date separator '-' or '/' )
                    (?'Day'0?[1-9]|[12]\d|3[01])    (?# Day 01 - 31 )
                    \k'Sep'                         (?# whatever date separator was previously matched )
                    (?'Year'\d{1,4})

                \s+

                (?# TimeValue ::= HourValue : MinuteValue [ : SecondValue ] [ WhiteSpace+ ] [ AMPM ] )

                    (?'HourValue'(0?[1-9])|1[0-9]|2[0-4])    (?# Hour 01 - 24 )
                    [:]
                    (?'MinuteValue'0?[1-9]|[1-5]\d|60)       (?# Minute 01 - 60 )
                    (?'SecondValue':(?:0?[1-9]|[1-5]\d|60))?     (?# Optional Second :01 - :60 )
                    \s*
                    (?'AMPM'[AP]M)?

            )    
            |
            (     
                (?# DateValue ::= MonthValue / DayValue / YearValue | MonthValue - DayValue - YearValue )

                   (?'Month'(0?[1-9])|1[0-2])      (?# Month 01 - 12 )
                   (?'Sep'[-/])                    (?# Date separator '-' or '/' )
                   (?'Day'0?[1-9]|[12]\d|3[01])    (?# Day 01 - 31 )
                   \k'Sep'                         (?# whatever date separator was previously matched )
                   (?'Year'\d{4})
            )     
            |
            (
                (?# TimeValue ::= HourValue : MinuteValue [ : SecondValue ] [ WhiteSpace+ ] [ AMPM ] )

                    (?'HourValue'(0?[1-9])|1[0-9]|2[0-4])    (?# Hour 01 - 24 )
                    [:]
                    (?'MinuteValue'0?[1-9]|[1-5]\d|60)       (?# Minute 01 - 60 )
                    (?'SecondValue':(?:0?[1-9]|[1-5]\d|60))?     (?# Optional Second :01 - :60 )
                    \s*
                    (?'AMPM'[AP]M)?
            )     

       )

    )

    \s*\#
)
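
For anyone who wants to poke at the grammar without a .NET regex tool, here is a rough Python sketch of the same idea. It is not Darren's pattern: Python's re module does not allow duplicate group names, so the fields are collapsed into three groups (date, time, time_only), and all names here, including the parse helper, are made up for illustration. The value ranges are copied from the pattern above (hours 1-24, minutes and seconds 1-60), and the seconds part is treated as an optional ':' plus value, as the TimeValue rule describes.

```python
import re

# Date fields. "sep" is named so the second separator must repeat the first.
DATE = r"""
    (?:0?[1-9]|1[0-2])        # month 1-12
    (?P<sep>[-/])             # date separator, '-' or '/'
    (?:0?[1-9]|[12]\d|3[01])  # day 1-31
    (?P=sep)                  # same separator as before
    \d{1,4}                   # year
"""

# Time fields. Ranges are kept from the quoted pattern, not checked against the spec.
TIME = r"""
    (?:0?[1-9]|1\d|2[0-4])        # hour 1-24
    :
    (?:0?[1-9]|[1-5]\d|60)        # minute 1-60
    (?::(?:0?[1-9]|[1-5]\d|60))?  # optional ':' plus second
    (?:\s*[AP]M)?                 # optional AM/PM
"""

# DateOrTime ::= DateValue Whitespace+ TimeValue | DateValue | TimeValue
DATE_LITERAL = re.compile(
    r"\#\s*(?:(?P<date>" + DATE + r")(?:\s+(?P<time>" + TIME + r"))?"
    r"|(?P<time_only>" + TIME + r"))\s*\#",
    re.VERBOSE,
)


def parse(literal):
    """Return (date, time) as strings, or None if the literal does not match."""
    m = DATE_LITERAL.fullmatch(literal)
    if m is None:
        return None
    return m.group("date"), m.group("time") or m.group("time_only")


print(parse("#8/23/1970 3:45:39 AM#"))
print(parse("# 8-23-1970 #"))
print(parse("#3:45 PM#"))
print(parse("#8/23-1970#"))  # mixed separators are rejected by the backreference
```

The backreference to sep is the part worth stealing: it is what stops a literal like #8/23-1970# from slipping through with two different separators.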