Written by xlingfairy
Wednesday, 28 October 2009 11:10
Here is what the PHP Manual says about conditional subpatterns in regular expressions:
 
Conditional subpatterns
It is possible to cause the matching process to obey a subpattern conditionally or to choose between two alternative subpatterns, depending on the result of an assertion, or whether a previous capturing subpattern matched or not. The two possible forms of conditional subpattern are 
 
 
       (?(condition)yes-pattern)
       (?(condition)yes-pattern|no-pattern)
    
 
If the condition is satisfied, the yes-pattern is used; otherwise the no-pattern (if present) is used. If there are more than two alternatives in the subpattern, a compile-time error occurs. 
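
As a rough illustration of the "previous capturing subpattern matched or not" form, here is a minimal Python sketch. Python is used only for demonstration; its built-in re module supports the group-reference condition (?(1)...) but not the assertion form. The pattern and the helper name is_balanced are made up for this example.

```python
import re

# Group 1 is an optional opening parenthesis. The conditional (?(1)\))
# demands a closing parenthesis only when group 1 actually matched.
PATTERN = re.compile(r"^(\()?[^()]+(?(1)\))$")

def is_balanced(text):
    """True when text is either bare or wrapped in exactly one pair of parentheses."""
    return PATTERN.match(text) is not None

print(is_balanced("abc"))    # True
print(is_balanced("(abc)"))  # True
print(is_balanced("(abc"))   # False
print(is_balanced("abc)"))   # False
```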
 
Example (the whitespace inside the pattern is only for readability and assumes the x modifier is set):
(?(?=[^a-z]*[a-z])\d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} )
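
To see what this example does, here is a small Python sketch. Python's built-in re module has no assertion-based conditional, so this is an emulation rather than the same syntax: the condition becomes a lookahead guarding the first alternative, and its negation guards the second. The name find_dates and the test strings are assumptions for illustration.

```python
import re

# Emulates (?(?=cond)yes|no) as (?:(?=cond)yes|(?!cond)no).
DATE = re.compile(
    r"""
    (?: (?=[^a-z]*[a-z])  \d{2}-[a-z]{3}-\d{2}   # a letter lies ahead: dd-aaa-dd
      | (?![^a-z]*[a-z])  \d{2}-\d{2}-\d{2}      # no letter ahead: dd-dd-dd
    )
    """,
    re.VERBOSE,
)

def find_dates(text):
    """Return every substring that matches the conditional date pattern."""
    return [m.group(0) for m in DATE.finditer(text)]

print(find_dates("12-aug-99"))   # ['12-aug-99']
print(find_dates("12-08-99"))    # ['12-08-99']
# A letter anywhere ahead selects the dd-aaa-dd branch, which then fails here.
print(find_dates("12-08-99 x"))  # []
```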
 
 
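Yesterday, while scraping data from that page, I found that only 5 of the items were captured. This is the regular expression I had written: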
<div class="title">[\s\S]*?<a href="/(?<url>[^"]*)">(?<name>[\s\S]*?)</a>[\S\s]*?<span class="myerror">\$(?<price>[^<]*)</span>
 
When capturing price, the expression above only picks up prices marked with myerror. But a price marked with myerror is only one of the cases; prices on the page come in two forms:
 
1,
<div class="sprice">         Manufacturers RRP*: <strike>$29.95</strike><br/>Our Price: <span class="myerror">$24.00</span>         </div>

2,
<div class="sprice">           $11.95              </div>
 
So the expression above can only match the first case.
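
The limitation is easy to reproduce. The sketch below keeps only the price part of the first expression and runs it against the two snippets above. It is written in Python for illustration, so named groups are spelled (?P<name>...) instead of (?<name>...); the names FIRST_TRY and first_try_price are hypothetical.

```python
import re

# Price part of the first attempt: it insists on a myerror span.
FIRST_TRY = re.compile(r'<span class="myerror">\$(?P<price>[^<]*)</span>')

CASE_1 = ('<div class="sprice">         Manufacturers RRP*: <strike>$29.95</strike>'
          '<br/>Our Price: <span class="myerror">$24.00</span>         </div>')
CASE_2 = '<div class="sprice">           $11.95              </div>'

def first_try_price(html):
    """Return the myerror price, or None when the block has no myerror span."""
    m = FIRST_TRY.search(html)
    return m.group("price") if m else None

print(first_try_price(CASE_1))  # 24.00
print(first_try_price(CASE_2))  # None
```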
 
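I had never used conditional subpatterns before, and lookahead and lookbehind assertions had me completely confused. It took almost two hours of trial and error to get them working.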
 
 
The final expression (price part only):
<div class="sprice">[\s]*(?(?=M)[\s\S]*?<span class="myerror">\$(?<price>[^<]*)|\$(?<price>[^<]*))
 
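Everything after <div class="sprice">[\s]* is a conditional subpattern.
(?=M) is the condition: it checks that the character immediately following the whitespace matched by [\s]* is the letter M. If the condition holds, the price marked with myerror is captured; otherwise the price from case 2 above is captured.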
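
Here is a sketch of the same idea in Python, under two stated assumptions. First, Python's re module has no (?(?=M)...) conditional, so the condition is emulated with (?=M) on one alternative and (?!M) on the other. Second, Python rejects two groups with the same name, so the shared \$(?P<price>...) part is moved out of the branches. The names PRICE and extract_prices are made up for the example.

```python
import re

PRICE = re.compile(
    r'<div class="sprice">\s*'
    # "M" ahead (Manufacturers RRP): skip lazily to the myerror span.
    # Otherwise: consume nothing, the price follows directly.
    r'(?:(?=M)[\s\S]*?<span class="myerror">|(?!M))'
    r'\$(?P<price>[^<]*)'
)

CASE_1 = ('<div class="sprice">         Manufacturers RRP*: <strike>$29.95</strike>'
          '<br/>Our Price: <span class="myerror">$24.00</span>         </div>')
CASE_2 = '<div class="sprice">           $11.95              </div>'

def extract_prices(html):
    """Return the selling price of every sprice block, whitespace trimmed."""
    return [m.group("price").strip() for m in PRICE.finditer(html)]

print(extract_prices(CASE_1 + "\n" + CASE_2))  # ['24.00', '11.95']
```

The strip() call is there because [^<]* in case 2 also swallows the spaces before </div>.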
 
 
There is also another version:
<div class="sprice">(?([\s\S]*?(?=myerror))[\s\S]*?<span class="myerror">\$(?<price>[^<]*)</span>|[\s\S]*?\$(?<price>[^<]*)</div>)
However, it only works on the following snippet; matching against the source page goes wrong. I suspect [\s\S]*? is the cause:
 
1,
<div class="sprice">         Manufacturers RRP*: <strike>$29.95</strike><br/>Our Price: <span class="myerror">$24.00</span>         </div>
 
2,
<div class="sprice">           $11.95              </div>
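
One way to test that suspicion is the Python sketch below. It assumes the engine evaluates the condition as a lookahead, and emulates it with (?=...) and (?!...) because Python's re module has no such conditional. The group names sale and plain and the function prices are hypothetical. Since [\s\S]*? is free to run past the closing </div>, the condition sees any myerror later in the document, not just the one in the current block.

```python
import re

ALT = re.compile(
    r'<div class="sprice">'
    r'(?:(?=[\s\S]*?myerror)[\s\S]*?<span class="myerror">\$(?P<sale>[^<]*)</span>'
    r'|(?![\s\S]*?myerror)[\s\S]*?\$(?P<plain>[^<]*)</div>)'
)

CASE_1 = ('<div class="sprice">         Manufacturers RRP*: <strike>$29.95</strike>'
          '<br/>Our Price: <span class="myerror">$24.00</span>         </div>')
CASE_2 = '<div class="sprice">           $11.95              </div>'

def prices(html):
    """Return the captured price of each match, whichever branch produced it."""
    found = []
    for m in ALT.finditer(html):
        value = m.group("sale") if m.group("sale") is not None else m.group("plain")
        found.append(value.strip())
    return found

# Same order as the snippet: both prices are found.
print(prices(CASE_1 + "\n" + CASE_2))  # ['24.00', '11.95']
# Plain price first: its block is swallowed by the lazy run to the later myerror span.
print(prices(CASE_2 + "\n" + CASE_1))  # ['24.00']
```

On a full page, where plain and myerror blocks are interleaved, this would lose every plain price that is followed by a myerror block, which is consistent with the guess above.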
Last Updated ( Wednesday, 28 October 2009 11:18 )
 
