Tess-two OCR not working - java

I'm trying to extract text from an image using tess-two on Android, but it gives me a really bad result. The log shows:
01-16 12:00:25.339: I/Tesseract(native)(29038): Initialized Tesseract API with language=spa
and about 30 seconds later it returns this as the result string:
{ga
.,
r¿
y“: A
r M í
:3
' ‘Ev’.-:.. -: A 7
» w- ?" _
Á.» ¿"A ¿rw-V r
mjÏfn 'n’n . Y
' "\'ZA".‘.¡ A‘ :‘ïvAv- « ‘
:"Éf‘Ï'" -Ï«l :‘,.v:...»- .
' RFI' .. ’ g)" 3;:- 1-;4',
= * ¿,arifgggk mw; .1. ,
' "53» "J
't‘ ‘ ¿Las ;.‘».L',-‘»
' ' 'N‘“ "“=: - '. V . ‘9!
5.? ' “F a .“
Y , <_ 7- . 7.-, .
;« z "1:;2wr . A - . ' -»‘ 5“:
“4-”, ¿rn 73:33: w v'.‘ ¿a ‘ A ,z, v VA
...,,« ' 'Q ' ‘ 4 214€. 5 . AV ¿JL y .13:
1 » . 21mm; » ¿ati-“fl ¿ab-1377*“ w”
. x ‘ ‘ ú F v'v:
1 . ' . ; (“ya í .
Of course that's not correct. I'm using this photo:
I have tried it many times, always with a similar result.
What could be wrong? This is my code using tess-two:
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init("/mnt/sdcard/external_sd/tess/", "spa",TessBaseAPI.OEM_TESSERACT_ONLY);
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
Log.d("Texto leido", "texto: "+recognizedText);
baseApi.end();
And this is how I get the bitmap from the file:
BitmapFactory.Options options = new BitmapFactory.Options();
options.inPreferredConfig = Bitmap.Config.ARGB_8888;
Bitmap bitmap = BitmapFactory.decodeFile(photopath.getAbsolutePath(), options);
I'm displaying that same bitmap in an ImageView and it looks correct, so I can't work out why the OCR performs so badly.
Any ideas?

1) Set the language code to match the language of the text in the image. For example, use 'eng' for English text or 'spa' for Spanish text:
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init("/mnt/sdcard/external_sd/tess/", "eng");
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
Log.d("Texto leido", "texto: "+recognizedText);
baseApi.end();
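2) Download the language package files. You need osd.traineddata.zip and tesseract-ocr-3.01.eng.tar.zip (here eng is for English, spa for Spanish, etc.) and should place them in the assets folder. Note that tess-two reads the unpacked .traineddata files from a tessdata subfolder of the data path passed to init(), so they have to end up there at runtime.
3) Before calling setImage(), convert the bitmap to grayscale.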
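Step 3 can be sketched roughly as follows. This is only an illustration, not part of tess-two: the class and method names (GrayscaleSketch, toGrayscale) are made up for this example, and it assumes you have already copied the ARGB_8888 pixels into an int array (on Android, Bitmap.getPixels can do that). It uses the common BT.601 luma weights; other weightings would also work.

```java
class GrayscaleSketch {

    /**
     * Converts packed ARGB pixels to grayscale in place.
     * Each pixel keeps its alpha; R, G and B are replaced by one luma value
     * computed with the BT.601 weights (0.299, 0.587, 0.114).
     */
    static void toGrayscale(int[] argb) {
        for (int i = 0; i < argb.length; i++) {
            int p = argb[i];
            int a = (p >>> 24) & 0xFF;
            int r = (p >> 16) & 0xFF;
            int g = (p >> 8) & 0xFF;
            int b = p & 0xFF;
            // Integer arithmetic with rounding; result stays within 0..255.
            int y = (299 * r + 587 * g + 114 * b + 500) / 1000;
            argb[i] = (a << 24) | (y << 16) | (y << 8) | y;
        }
    }
}
```

On Android the array would typically come from bitmap.getPixels(...) and be written back with setPixels(...) on a mutable bitmap before the bitmap is handed to setImage(). Whether grayscale alone fixes the recognition depends on the photo; resolution, contrast and orientation matter as well.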

Related

Extract Image from webRTC Stream

I am trying to extract a single image from a webRtc stream.
The stream looks something like this:
--boundary
Content-Type: image/jpeg
Content-Length: 27778
[binary JPEG data (JFIF), omitted]
--boundary
"next image"
What I have tried so far is removing --boundary, Content-Type and Content-Length and saving the rest as an image, but that doesn't work. Now I don't know how I should approach this problem. Any help would be welcome. Thanks.
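The dump above looks like a multipart stream in which every part carries one JPEG. A likely reason the saved file is broken is that the data was handled as text at some point, which corrupts binary bytes. Below is a minimal sketch of reading one part as raw bytes. It assumes, beyond what the question shows, that each part has a Content-Length header and that a blank line separates the headers from the image data, as is usual for multipart bodies. The class and method names are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class MultipartJpegSketch {

    /** Reads one line byte by byte, ignoring carriage returns; returns null at end of stream. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            return null;
        }
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    /** Returns the raw bytes of the next part, or null if no further boundary line is found. */
    static byte[] readNextJpeg(InputStream in) throws IOException {
        // Skip forward to the next line that starts with "--" (the boundary).
        String line;
        do {
            line = readLine(in);
            if (line == null) {
                return null;
            }
        } while (!line.startsWith("--"));

        // Read the part headers up to the blank line and pick out Content-Length.
        int length = -1;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                length = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (length < 0) {
            throw new IOException("part has no Content-Length header");
        }

        // Read exactly 'length' bytes of image data; never decode them as text.
        byte[] data = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(data, off, length - off);
            if (n < 0) {
                throw new EOFException("stream ended inside image data");
            }
            off += n;
        }
        return data;
    }
}
```

The returned array can be written unchanged to a .jpg file with a FileOutputStream. For a real network stream it would be sensible to wrap the input in a BufferedInputStream first, since the header lines are read one byte at a time.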

Regex for domain extraction [duplicate]

How can I check if a given string is a valid URL address?
My knowledge of regular expressions is basic and doesn't allow me to choose from the hundreds of regular expressions I've already seen on the web.
I wrote my URL (actually IRI, internationalized) pattern to comply with RFC 3987 (http://www.faqs.org/rfcs/rfc3987.html). These are in PCRE syntax.
For absolute IRIs (internationalized):
/^[a-z](?:[-a-z0-9\+\.])*:(?:\/\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:])*#)?(?:\[(?:(?:(?:[0-9a-f]{1,4}:){6}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|::(?:[0-9a-f]{1,4}:){5}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){4}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,1}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){3}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,2}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){2}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,3}[0-9a-f]{1,4})?::[0-9a-f]{1,4}:(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,4}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,5}[0-9a-f]{1,4})?::[0-9a-f]{1,4}|(?:(?:[0-9a-f]{1,4}:){0,6}[0-9a-f]{1,4})?::)|v[0-9a-f]+\.[-a-z0-9\._~!\$&'\(\)\*\+,;=:]+)\]|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-
9]|25[0-5])){3}|(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=])*)(?::[0-9]*)?(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*|\/(?:(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*)?|(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}
-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*|(?!(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])))(?:\?(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])|[\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}\x{100000}-\x{10FFFD}\/\?])*)?(?:\#(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])|[\/\?])*)?$/i
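If a full RFC 3987 pattern is more than a given task needs, a parser-based check can serve as a lighter alternative. The sketch below uses Java's java.net.URI, which follows an older URI grammar and is not an RFC 3987 validator, so it will accept and reject a somewhat different set of strings than the pattern above. The class and method names are made up for this example, and the restriction to http and https is an assumption about what "valid URL" means here.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UrlCheckSketch {

    /** True if s parses as a URI with an http or https scheme and a host component. */
    static boolean looksLikeHttpUrl(String s) {
        try {
            URI uri = new URI(s);
            String scheme = uri.getScheme();
            return scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // The string is not syntactically acceptable to java.net.URI.
            return false;
        }
    }
}
```

For instance, looksLikeHttpUrl("http://example.com/path?q=1#frag") is accepted, while a bare "example.com" is rejected because it has no scheme.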
To also allow relative IRIs:
/^(?:[a-z](?:[-a-z0-9\+\.])*:(?:\/\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:])*#)?(?:\[(?:(?:(?:[0-9a-f]{1,4}:){6}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|::(?:[0-9a-f]{1,4}:){5}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){4}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,1}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){3}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,2}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){2}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,3}[0-9a-f]{1,4})?::[0-9a-f]{1,4}:(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,4}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,5}[0-9a-f]{1,4})?::[0-9a-f]{1,4}|(?:(?:[0-9a-f]{1,4}:){0,6}[0-9a-f]{1,4})?::)|v[0-9a-f]+\.[-a-z0-9\._~!\$&'\(\)\*\+,;=:]+)\]|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4]
[0-9]|25[0-5])){3}|(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=])*)(?::[0-9]*)?(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*|\/(?:(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*)?|(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FD
F0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*|(?!(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])))(?:\?(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])|[\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}\x{100000}-\x{10FFFD}\/\?])*)?(?:\#(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])|[\/\?])*)?|(?:\/\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:])*#)?(?:\[(?:(?:(?:[0-9a-f]{1,4}:){6}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0
-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|::(?:[0-9a-f]{1,4}:){5}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){4}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,1}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){3}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,2}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:){2}(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,3}[0-9a-f]{1,4})?::[0-9a-f]{1,4}:(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,4}[0-9a-f]{1,4})?::(?:[0-9a-f]{1,4}:[0-9a-f]{1,4}|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3})|(?:(?:[0-9a-f]{1,4}:){0,5}[0-9a-f]{1,4})?::[0-9a-f]{1,4}|(?:(?:[0-9a-f]{1,4}:){0,6}[0-9a-f]{1,4})?::)|v[0-9a-f]+\.[-a-z0-9\._~!\$&'\(\)\*\+,;=:]+)\]|(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])(?:\.(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])){3}|(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=])*)(?::[0-9]*)?(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{
2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*|\/(?:(?:(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*)?|(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=#])+)(?:\/(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#]))*)*|(?!(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80
000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])))(?:\?(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])|[\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}\x{100000}-\x{10FFFD}\/\?])*)?(?:\#(?:(?:%[0-9a-f][0-9a-f]|[-a-z0-9\._~\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}!\$&'\(\)\*\+,;=:#])|[\/\?])*)?)$/i
How they were compiled (in PHP):
<?php
/* Regex convenience functions (character class, non-capturing group) */
function cc($str, $suffix = '', $negate = false) {
return '[' . ($negate ? '^' : '') . $str . ']' . $suffix;
}
function ncg($str, $suffix = '') {
return '(?:' . $str . ')' . $suffix;
}
/* Preserved from RFC3986 */
$ALPHA = 'a-z';
$DIGIT = '0-9';
$HEXDIG = $DIGIT . 'a-f';
$sub_delims = '!\\$&\'\\(\\)\\*\\+,;=';
$gen_delims = ':\\/\\?\\#\\[\\]@';
$reserved = $gen_delims . $sub_delims;
$unreserved = '-' . $ALPHA . $DIGIT . '\\._~';
$pct_encoded = '%' . cc($HEXDIG) . cc($HEXDIG);
$dec_octet = ncg(implode('|', array(
cc($DIGIT),
cc('1-9') . cc($DIGIT),
'1' . cc($DIGIT) . cc($DIGIT),
'2' . cc('0-4') . cc($DIGIT),
'25' . cc('0-5')
)));
$IPv4address = $dec_octet . ncg('\\.' . $dec_octet, '{3}');
$h16 = cc($HEXDIG, '{1,4}');
$ls32 = ncg($h16 . ':' . $h16 . '|' . $IPv4address);
$IPv6address = ncg(implode('|', array(
ncg($h16 . ':', '{6}') . $ls32,
'::' . ncg($h16 . ':', '{5}') . $ls32,
ncg($h16, '?') . '::' . ncg($h16 . ':', '{4}') . $ls32,
ncg($h16 . ':' . $h16, '?') . '::' . ncg($h16 . ':', '{3}') . $ls32,
ncg(ncg($h16 . ':', '{0,2}') . $h16, '?') . '::' . ncg($h16 . ':', '{2}') . $ls32,
ncg(ncg($h16 . ':', '{0,3}') . $h16, '?') . '::' . $h16 . ':' . $ls32,
ncg(ncg($h16 . ':', '{0,4}') . $h16, '?') . '::' . $ls32,
ncg(ncg($h16 . ':', '{0,5}') . $h16, '?') . '::' . $h16,
ncg(ncg($h16 . ':', '{0,6}') . $h16, '?') . '::',
)));
$IPvFuture = 'v' . cc($HEXDIG, '+') . cc($unreserved . $sub_delims . ':', '+');
$IP_literal = '\\[' . ncg(implode('|', array($IPv6address, $IPvFuture))) . '\\]';
$port = cc($DIGIT, '*');
$scheme = cc($ALPHA) . ncg(cc('-' . $ALPHA . $DIGIT . '\\+\\.'), '*');
/* New or changed in RFC3987 */
$iprivate = '\x{E000}-\x{F8FF}\x{F0000}-\x{FFFFD}\x{100000}-\x{10FFFD}';
$ucschar = '\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}' .
'\x{10000}-\x{1FFFD}\x{20000}-\x{2FFFD}\x{30000}-\x{3FFFD}' .
'\x{40000}-\x{4FFFD}\x{50000}-\x{5FFFD}\x{60000}-\x{6FFFD}' .
'\x{70000}-\x{7FFFD}\x{80000}-\x{8FFFD}\x{90000}-\x{9FFFD}' .
'\x{A0000}-\x{AFFFD}\x{B0000}-\x{BFFFD}\x{C0000}-\x{CFFFD}' .
'\x{D0000}-\x{DFFFD}\x{E1000}-\x{EFFFD}';
$iunreserved = '-' . $ALPHA . $DIGIT . '\\._~' . $ucschar;
$ipchar = ncg($pct_encoded . '|' . cc($iunreserved . $sub_delims . ':@'));
$ifragment = ncg($ipchar . '|' . cc('\\/\\?'), '*');
$iquery = ncg($ipchar . '|' . cc($iprivate . '\\/\\?'), '*');
$isegment_nz_nc = ncg($pct_encoded . '|' . cc($iunreserved . $sub_delims . '@'), '+');
$isegment_nz = ncg($ipchar, '+');
$isegment = ncg($ipchar, '*');
$ipath_empty = '(?!' . $ipchar . ')';
$ipath_rootless = ncg($isegment_nz) . ncg('\\/' . $isegment, '*');
$ipath_noscheme = ncg($isegment_nz_nc) . ncg('\\/' . $isegment, '*');
$ipath_absolute = '\\/' . ncg($ipath_rootless, '?'); // Spec says isegment-nz *( "/" isegment )
$ipath_abempty = ncg('\\/' . $isegment, '*');
$ipath = ncg(implode('|', array(
$ipath_abempty,
$ipath_absolute,
$ipath_noscheme,
$ipath_rootless,
$ipath_empty
)));
$ireg_name = ncg($pct_encoded . '|' . cc($iunreserved . $sub_delims), '*');
$ihost = ncg(implode('|', array($IP_literal, $IPv4address, $ireg_name)));
$iuserinfo = ncg($pct_encoded . '|' . cc($iunreserved . $sub_delims . ':'), '*');
$iauthority = ncg($iuserinfo . '@', '?') . $ihost . ncg(':' . $port, '?');
$irelative_part = ncg(implode('|', array(
'\\/\\/' . $iauthority . $ipath_abempty . '',
'' . $ipath_absolute . '',
'' . $ipath_noscheme . '',
'' . $ipath_empty . ''
)));
$irelative_ref = $irelative_part . ncg('\\?' . $iquery, '?') . ncg('\\#' . $ifragment, '?');
$ihier_part = ncg(implode('|', array(
'\\/\\/' . $iauthority . $ipath_abempty . '',
'' . $ipath_absolute . '',
'' . $ipath_rootless . '',
'' . $ipath_empty . ''
)));
$absolute_IRI = $scheme . ':' . $ihier_part . ncg('\\?' . $iquery, '?');
$IRI = $scheme . ':' . $ihier_part . ncg('\\?' . $iquery, '?') . ncg('\\#' . $ifragment, '?');
$IRI_reference = ncg($IRI . '|' . $irelative_ref);
Edit 7 March 2011: Because of the way PHP handles backslashes in quoted strings, these are unusable by default. You'll need to double-escape backslashes except where the backslash has a special meaning in regex. You can do that this way:
$escape_backslash = '/(?<!\\)\\(?![\[\]\\\^\$\.\|\*\+\(\)QEnrtaefvdwsDWSbAZzB1-9GX]|x\{[0-9a-f]{1,4}\}|\c[A-Z]|)/';
$absolute_IRI = preg_replace($escape_backslash, '\\\\', $absolute_IRI);
$IRI = preg_replace($escape_backslash, '\\\\', $IRI);
$IRI_reference = preg_replace($escape_backslash, '\\\\', $IRI_reference);
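The build-from-small-pieces approach used above is not specific to PHP. As a minimal sketch, here is the dec-octet / IPv4address part of the construction in Python. The `cc` and `ncg` helpers mirror the PHP ones; the `is_ipv4` wrapper is a hypothetical name added only for illustration.

```python
import re


def cc(chars, suffix=""):
    """Wrap chars in a character class, e.g. cc('0-9') -> '[0-9]'."""
    return "[" + chars + "]" + suffix


def ncg(body, suffix=""):
    """Wrap body in a non-capturing group, e.g. ncg('a|b', '?') -> '(?:a|b)?'."""
    return "(?:" + body + ")" + suffix


DIGIT = "0-9"

# dec-octet from RFC 3986: 0-9, 10-99, 100-199, 200-249, 250-255
dec_octet = ncg("|".join([
    cc(DIGIT),
    cc("1-9") + cc(DIGIT),
    "1" + cc(DIGIT) + cc(DIGIT),
    "2" + cc("0-4") + cc(DIGIT),
    "25" + cc("0-5"),
]))

# IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
ipv4_address = dec_octet + ncg(r"\." + dec_octet, "{3}")
IPV4_RE = re.compile(ipv4_address)


def is_ipv4(text):
    """Hypothetical helper: True if the whole string is a dotted-quad IPv4 address."""
    return IPV4_RE.fullmatch(text) is not None
```

Naming each grammar rule keeps the final pattern traceable back to the RFC, which is hard to do with the compiled one-liner.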
I've just written a blog post with a solution for recognizing URLs in the most commonly used formats, such as:
www.google.com
http://www.google.com
mailto:somebody@google.com
somebody@google.com
www.url-with-querystring.com/?url=has-querystring
The regular expression used is:
/((([A-Za-z]{3,9}:(?:\/\/)?)(?:[-;:&=\+\$,\w]+@)?[A-Za-z0-9.-]+|(?:www.|[-;:&=\+\$,\w]+@)[A-Za-z0-9.-]+)((?:\/[\+~%\/.\w-_]*)?\??(?:[-\+=&;%@.\w_]*)#?(?:[\w]*))?)/
What platform? If using .NET, use System.Uri.TryCreate, not a regex.
For example:
static bool IsValidUrl(string urlString)
{
Uri uri;
return Uri.TryCreate(urlString, UriKind.Absolute, out uri)
&& (uri.Scheme == Uri.UriSchemeHttp
|| uri.Scheme == Uri.UriSchemeHttps
|| uri.Scheme == Uri.UriSchemeFtp
|| uri.Scheme == Uri.UriSchemeMailto
/*...*/);
}
// In test fixture...
[Test]
void IsValidUrl_Test()
{
Assert.True(IsValidUrl("http://www.example.com"));
Assert.False(IsValidUrl("javascript:alert('xss')"));
Assert.False(IsValidUrl(""));
Assert.False(IsValidUrl(null));
}
(Thanks to @Yoshi for the tip about javascript:)
Here's what RegexBuddy uses.
(\b(https?|ftp|file)://)?[-A-Za-z0-9+&@#/%?=~_|!:,.;]+[-A-Za-z0-9+&@#/%=~_|]
It matches these below (inside the ** ** marks):
**http://www.regexbuddy.com**
**http://www.regexbuddy.com/**
**http://www.regexbuddy.com/index.html**
**http://www.regexbuddy.com/index.html?source=library**
**http://www.regexbuddy.com/index.html?source=library#copyright**
You can download RegexBuddy at http://www.regexbuddy.com/download.html.
Mathias Bynens has a great article comparing a large number of URL-matching regular expressions: In search of the perfect URL validation regex
The best one posted is a little long, but it matches just about anything you can throw at it.
JavaScript version
/^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+(?:[a-z\u00a1-\uffff]{2,}\.?))(?::\d{2,5})?(?:[/?#]\S*)?$/i
PHP version (uses % symbol as delimiter)
%^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z0-9\x{00a1}-\x{ffff}][a-z0-9\x{00a1}-\x{ffff}_-]{0,62})?[a-z0-9\x{00a1}-\x{ffff}]\.)+(?:[a-z\x{00a1}-\x{ffff}]{2,}\.?))(?::\d{2,5})?(?:[/?#]\S*)?$%iuS
With regard to eyelidness' answer post that reads "This is based on my reading of the URI specification.": Thanks Eyelidness, yours is the perfect solution I sought, as it is based on the URI spec! Superb work. :)
I had to make two amendments. The first was to get the regexp to match IP address URLs correctly in PHP (v5.2.10) with the preg_match() function.
I had to add one more set of parentheses to the line above "IP Address", around the pipes:
)|((\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])\.){3}(?#
Not sure why.
I have also reduced the top level domain minimum length from 3 to 2 letters to support .co.uk and similar.
Final code:
/^(https?|ftp):\/\/(?# protocol
)(([a-z0-9$_\.\+!\*\'\(\),;\?&=-]|%[0-9a-f]{2})+(?# username
)(:([a-z0-9$_\.\+!\*\'\(\),;\?&=-]|%[0-9a-f]{2})+)?(?# password
)@)?(?# auth requires @
)((([a-z0-9]\.|[a-z0-9][a-z0-9-]*[a-z0-9]\.)*(?# domain segments AND
)[a-z][a-z0-9-]*[a-z0-9](?# top level domain OR
)|((\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])\.){3}(?#
)(\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])(?# IP address
))(:\d+)?(?# port
))(((\/+([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)*(?# path
)(\?([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)(?# query string
)?)?)?(?# path and query string optional
)(#([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)?(?# fragment
)$/i
This modified version was not checked against the URI specification, so I can't vouch for its compliance. It was altered to handle URLs on local network environments and two-letter TLDs as well as other kinds of Web URL, and to work better in the PHP setup I use.
As PHP code:
define('URL_FORMAT',
'/^(https?):\/\/'. // protocol
'(([a-z0-9$_\.\+!\*\'\(\),;\?&=-]|%[0-9a-f]{2})+'. // username
'(:([a-z0-9$_\.\+!\*\'\(\),;\?&=-]|%[0-9a-f]{2})+)?'. // password
'@)?(?#'. // auth requires @
')((([a-z0-9]\.|[a-z0-9][a-z0-9-]*[a-z0-9]\.)*'. // domain segments AND
'[a-z][a-z0-9-]*[a-z0-9]'. // top level domain OR
'|((\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])\.){3}'.
'(\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])'. // IP address
')(:\d+)?'. // port
')(((\/+([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)*'. // path
'(\?([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)'. // query string
'?)?)?'. // path and query string optional
'(#([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)?'. // fragment
'$/i');
Here is a test program in PHP which validates a variety of URLs using the regex:
<?php
define('URL_FORMAT',
'/^(https?):\/\/'. // protocol
'(([a-z0-9$_\.\+!\*\'\(\),;\?&=-]|%[0-9a-f]{2})+'. // username
'(:([a-z0-9$_\.\+!\*\'\(\),;\?&=-]|%[0-9a-f]{2})+)?'. // password
'@)?(?#'. // auth requires @
')((([a-z0-9]\.|[a-z0-9][a-z0-9-]*[a-z0-9]\.)*'. // domain segments AND
'[a-z][a-z0-9-]*[a-z0-9]'. // top level domain OR
'|((\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])\.){3}'.
'(\d|[1-9]\d|1\d{2}|2[0-4][0-9]|25[0-5])'. // IP address
')(:\d+)?'. // port
')(((\/+([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)*'. // path
'(\?([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)'. // query string
'?)?)?'. // path and query string optional
'(#([a-z0-9$_\.\+!\*\'\(\),;:@&=-]|%[0-9a-f]{2})*)?'. // fragment
'$/i');
/**
* Verify the syntax of the given URL.
*
* @access public
* @param $url The URL to verify.
* @return boolean
*/
function is_valid_url($url) {
if (str_starts_with(strtolower($url), 'http://localhost')) {
return true;
}
return preg_match(URL_FORMAT, $url);
}
/**
* String starts with something
*
* This function will return true only if input string starts with
* needle
*
* @param string $string Input string
* @param string $needle Needle string
* @return boolean
*/
if (!function_exists('str_starts_with')) {
// PHP 8 ships its own str_starts_with(); only define this one on older versions.
function str_starts_with($string, $needle) {
return substr($string, 0, strlen($needle)) == $needle;
}
}
/**
* Test a URL for validity and count results.
* @param url url
* @param expected expected result (true or false)
*/
$numtests = 0;
$passed = 0;
function test_url($url, $expected) {
global $numtests, $passed;
$numtests++;
$valid = is_valid_url($url);
echo "URL Valid?: " . ($valid?"yes":"no") . " for URL: $url. Expected: ".($expected?"yes":"no").". ";
if($valid == $expected) {
echo "PASS\n"; $passed++;
} else {
echo "FAIL\n";
}
}
echo "URL Tests:\n\n";
test_url("http://localserver/projects/public/assets/javascript/widgets/UserBoxMenu/widget.css", true);
test_url("http://www.google.com", true);
test_url("http://www.google.co.uk/projects/my%20folder/test.php", true);
test_url("https://myserver.localdomain", true);
test_url("http://192.168.1.120/projects/index.php", true);
test_url("http://192.168.1.1/projects/index.php", true);
test_url("http://projectpier-server.localdomain/projects/public/assets/javascript/widgets/UserBoxMenu/widget.css", true);
test_url("https://2.4.168.19/project-pier?c=test&a=b", true);
test_url("https://localhost/a/b/c/test.php?c=controller&arg1=20&arg2=20", true);
test_url("http://user:password@localhost/a/b/c/test.php?c=controller&arg1=20&arg2=20", true);
echo "\n$passed out of $numtests tests passed.\n\n";
?>
Thanks again to eyelidness for the regex!
The post Getting parts of a URL (Regex) discusses parsing a URL to identify its various components. If you want to check if a URL is well-formed, it should be sufficient for your needs.
If you need to check if it's actually valid, you'll eventually have to try to access whatever's on the other end.
In general, though, you'd probably be better off using a function that's supplied to you by your framework or another library. Many platforms include functions that parse URLs. For example, there's Python's urlparse module (urllib.parse in Python 3), and in .NET you could use the System.Uri class's constructor as a means of validating the URL.
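As a rough illustration of the library route, here is a minimal Python 3 sketch built on the standard `urllib.parse.urlsplit`. The function name `is_well_formed_url` and the scheme whitelist are assumptions made for this example, not part of any library; it checks well-formedness only, not whether anything answers at the other end.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only these schemes count as acceptable.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_well_formed_url(url):
    """Return True if url parses with an allowed scheme and a non-empty host."""
    if not isinstance(url, str):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. an unclosed IPv6 bracket.
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.hostname)
```

Whitelisting schemes rather than accepting anything that parses is what keeps inputs such as `javascript:` URIs out.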
This might not be a job for regexes, but for existing tools in your language of choice. You probably want to use existing code that has already been written, tested, and debugged.
In PHP, use the parse_url function.
Perl: URI module.
Ruby: URI module.
.NET: 'Uri' class
Regexes are not a magic wand you wave at every problem that happens to involve strings.
This will match all URLs
with or without http/https
with or without www
...including sub-domains and those new top-level domain name extensions such as
.museum,
.academy,
.foundation
etc. which can have up to 63 characters (not just .com, .net, .info etc.)
(([\w]+:)?//)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,63}(:[\d]+)?(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?
Because the longest top-level domain extension available today is 13 characters long (for example .international), you can change the 63 in the expression to 13 to prevent someone from misusing it.
as javascript
var urlreg=/(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,63}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?/;
$('textarea').on('input',function(){
var url = $(this).val();
$(this).toggleClass('invalid', urlreg.test(url) == false)
});
$('textarea').trigger('input');
textarea{color:green;}
.invalid{color:red;}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea>http://www.google.com</textarea>
<textarea>http//www.google.com</textarea>
<textarea>googlecom</textarea>
<textarea>https://www.google.com</textarea>
Wikipedia Article: List of all internet top-level domains
Non-validating URI-reference Parser
For reference purposes, here's the IETF Spec: (TXT | HTML). In particular, Appendix B. Parsing a URI Reference with a Regular Expression demonstrates how to break a URI reference into its components with a regular expression. This is described as,
for an example of a non-validating URI-reference parser that will take any given string and extract the URI components.
Here's the regex they provide:
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
As someone else said, it's probably best to leave this to a lib/framework you're already using.
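To make the group numbering concrete, here is a small Python sketch that applies the Appendix B expression quoted above. The `split_uri` name and the dictionary keys are chosen for this example; the group numbers (2, 4, 5, 7, 9) are the ones the RFC assigns to scheme, authority, path, query and fragment.

```python
import re

# The expression from RFC 3986, Appendix B, unchanged.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(reference):
    """Split a URI reference into components without validating it."""
    match = URI_REFERENCE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }
```

Because every part of the pattern is optional, it matches any input; a plain sentence simply comes back as a path with no scheme, which is what "non-validating" means here.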
^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$
live demo: https://regex101.com/r/HUNasA/2
I have tested various expressions to match my requirements.
As a user I can hit browser search bar with following strings:
valid urls
https://www.google.com
http://www.google.com
http://google.com/
https://google.com/
www.google.com
google.com
https://www.google.com.ua
http://www.google.com.ua
http://google.com.ua
https://google.com.ua/
www.google.com.ua
google.com.ua
https://mail.google.com
http://mail.google.com
mail.google.com
invalid urls
http://google
https://google.c
google
google.
.google
.google.com
goole.c
...
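The lists above can be checked mechanically. This is a minimal Python sketch, assuming the expression from the start of this answer is used unchanged; the `matches` helper and the list names are made up for the example, and the elided "..." entry is left out.

```python
import re

URL_RE = re.compile(
    r"^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?"
    r"[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$"
)

VALID = [
    "https://www.google.com", "http://www.google.com", "http://google.com/",
    "https://google.com/", "www.google.com", "google.com",
    "https://www.google.com.ua", "http://www.google.com.ua",
    "http://google.com.ua", "https://google.com.ua/", "www.google.com.ua",
    "google.com.ua", "https://mail.google.com", "http://mail.google.com",
    "mail.google.com",
]

INVALID = [
    "http://google", "https://google.c", "google", "google.",
    ".google", ".google.com", "goole.c",
]


def matches(url):
    """True if the whole string is accepted by URL_RE."""
    return URL_RE.match(url) is not None


if __name__ == "__main__":
    for url in VALID + INVALID:
        print(url, "->", "match" if matches(url) else "no match")
```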
The best regular expression for URL for me would be:
"(([\w]+:)?//)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?"
Here is a rule that covers the common cases, including ports and query parameters:
/(https?:\/\/(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z0-9][a-z0-9-]{0,61}[a-z0-9])(:?\d*)\/?([a-z_\/0-9\-#.]*)\??([a-z_\/0-9\-#=&]*)/g
I wrote a little Groovy version that you can run.
It matches the URLs listed after the code (which is good enough for me):
public static void main(args) {
String url = "go to http://www.m.abut.ly/abc its awesome"
url = url.replaceAll(/https?:\/\/w{0,3}\w*?\.(\w*?\.)?\w{2,3}\S*|www\.(\w*?\.)?\w*?\.\w{2,3}\S*|(\w*?\.)?\w*?\.\w{2,3}[\/\?]\S*/ , { it ->
"woof${it}woof"
})
println url
}
http://google.com
http://google.com/help.php
http://google.com/help.php?a=5
http://www.google.com
http://www.google.com/help.php
http://www.google.com?a=5
google.com?a=5
google.com/help.php
google.com/help.php?a=5
http://www.m.google.com/help.php?a=5 (and all its permutations)
www.m.google.com/help.php?a=5 (and all its permutations)
m.google.com/help.php?a=5 (and all its permutations)
The important thing for any URLs that don't start with http or www is that they must include a / or ?
I bet this can be tweaked a little more but it does the job pretty nice for being so short and compact... because you can pretty much split it in 3:
find anything that starts with http:
https?:\/\/w{0,3}\w*?\.\w{2,3}\S*
find anything that starts with www:
www\.\w*?\.\w{2,3}\S*
or find anything that must have a text then a dot then at least 2 letters and then a ? or /:
\w*?\.\w{2,3}[\/\?]\S*
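For readers without Groovy at hand, here is a rough Python equivalent of the snippet above. It is a sketch that reuses the full three-branch expression from the Groovy code unchanged; `mark_urls` is a hypothetical name for this example.

```python
import re

# The three alternatives, in the same order as in the Groovy version:
# starts with http(s), starts with www, or bare host followed by / or ?
URL_IN_TEXT = re.compile(
    r"https?:\/\/w{0,3}\w*?\.(\w*?\.)?\w{2,3}\S*"
    r"|www\.(\w*?\.)?\w*?\.\w{2,3}\S*"
    r"|(\w*?\.)?\w*?\.\w{2,3}[\/\?]\S*"
)


def mark_urls(text):
    """Wrap every URL-looking substring in 'woof' markers, as the Groovy code does."""
    return URL_IN_TEXT.sub(lambda m: "woof" + m.group(0) + "woof", text)


if __name__ == "__main__":
    print(mark_urls("go to http://www.m.abut.ly/abc its awesome"))
```

A bare host such as google.com with no path or query is left untouched, which is the restriction described above for URLs that do not start with http or www.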
I was not able to find the regex I was looking for, so I modified an existing one to fulfill my requirements, and it seems to work fine now. My requirements were:
Match URLs w/o protocol (www.gooogle.com)
Match URLs with query parameters and path (http://subdomain.web-site.com/cgi-bin/perl.cgi?key1=value1&key2=value2e)
Don't match URLs where there are not acceptable characters (e.g. "'£), for instance: (www.google.com/somthing"/somethingmore)
Here is what I came up with; any suggestions are appreciated:
@Test
public void testWebsiteUrl(){
String regularExpression = "((http|ftp|https):\\/\\/)?[\\w\\-_]+(\\.[\\w\\-_]+)+([\\w\\-\\.,@?^=%&:/~\\+#]*[\\w\\-\\@?^=%&/~\\+#])?";
assertTrue("www.google.com".matches(regularExpression));
assertTrue("www.google.co.uk".matches(regularExpression));
assertTrue("http://www.google.com".matches(regularExpression));
assertTrue("http://www.google.co.uk".matches(regularExpression));
assertTrue("https://www.google.com".matches(regularExpression));
assertTrue("https://www.google.co.uk".matches(regularExpression));
assertTrue("google.com".matches(regularExpression));
assertTrue("google.co.uk".matches(regularExpression));
assertTrue("google.mu".matches(regularExpression));
assertTrue("mes.intnet.mu".matches(regularExpression));
assertTrue("cse.uom.ac.mu".matches(regularExpression));
assertTrue("http://www.google.com/path".matches(regularExpression));
assertTrue("http://subdomain.web-site.com/cgi-bin/perl.cgi?key1=value1&key2=value2e".matches(regularExpression));
assertTrue("http://www.google.com/?queryparam=123".matches(regularExpression));
assertTrue("http://www.google.com/path?queryparam=123".matches(regularExpression));
assertFalse("www..dr.google".matches(regularExpression));
assertFalse("www:google.com".matches(regularExpression));
assertFalse("https://www#.google.com".matches(regularExpression));
assertFalse("https://www.google.com\"".matches(regularExpression));
assertFalse("https://www.google.com'".matches(regularExpression));
assertFalse("http://www.google.com/path'".matches(regularExpression));
assertFalse("http://subdomain.web-site.com/cgi-bin/perl.cgi?key1=value1&key2=value2e'".matches(regularExpression));
assertFalse("http://www.google.com/?queryparam=123'".matches(regularExpression));
assertFalse("http://www.google.com/path?queryparam=12'3".matches(regularExpression));
}
function validateURL(textval) {
var urlregex = new RegExp(
"^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|localhost|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*$");
return urlregex.test(textval);
}
Matches
http://site.com/dir/file.php?var=moo | ftp://user:pass@site.com:21/file/dir
Non-Matches
site.com | http://site.com/dir//
function validateURL(textval) {
var urlregex = new RegExp(
"^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$");
return urlregex.test(textval);
}
Matches
http://www.asdah.com/~joe | ftp://ftp.asdah.co.uk:2828/asdah%20asdah.gif | https://asdah.gov/asdh-ah.as
If you are really looking for the ultimate match, you will probably find it in "A Good Url Regular Expression?".
But a regex that really matches all possible domains and allows everything the RFCs allow is horribly long and unreadable, trust me ;-)
I hope it's helpful for you...
^(http|https):\/\/+[\www\d]+\.[\w]+(\/[\w\d]+)?
Here is a regex I made which extracts the different parts of a URL:
^((?:(?:http|ftp|ws)s?|sftp):\/\/?)?([^:/\s.#?]+\.[^:/\s#?]+|localhost)(:\d+)?((?:\/\w+)*\/)?([\w\-.]+[^#?\s]+)?([^#]+)?(#[\w-]*)?$
((?:(?:http|ftp|ws)s?|sftp):\/\/?)?(group 1): extracts the protocol
([^:/\s.#?]+\.[^:/\s#?]+|localhost)(group 2): extracts the hostname
(:\d+)?(group 3): extracts the port number
((?:\/\w+)*\/)?([\w\-.]+[^#?\s]+)?(groups 4 & 5): extracts the path part
([^#]+)?(group 6): extracts the query part
(#[\w-]*)?(group 7): extracts the hash part
For every part of the regex listed above, you can remove the trailing ? to make that part mandatory (or add one to make it optional). You can also remove the ^ at the beginning and the $ at the end so that the regex does not need to match the whole string.
See it on regex101.
Note: this regex is not 100% safe and may accept some strings which are not valid URLs, although it does check some criteria. Its main goal is to extract the different parts of a URL, not to validate it.
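As a rough illustration of the group layout described above, here is a minimal Python sketch that runs the same pattern through the `re` module. The function name `split_url`, the key names and the sample URL are made up for the example; it only demonstrates extraction, not validation.

```python
import re

# The pattern from the answer above, unchanged; groups 1-7 are
# protocol, hostname, port, directory path, file name, query and hash.
URL_PARTS = re.compile(
    r"^((?:(?:http|ftp|ws)s?|sftp):\/\/?)?"
    r"([^:/\s.#?]+\.[^:/\s#?]+|localhost)"
    r"(:\d+)?"
    r"((?:\/\w+)*\/)?"
    r"([\w\-.]+[^#?\s]+)?"
    r"([^#]+)?"
    r"(#[\w-]*)?$"
)

def split_url(url):
    """Return a dict of the captured parts, or None if the pattern does not match."""
    match = URL_PARTS.match(url)
    if match is None:
        return None
    names = ("protocol", "host", "port", "directory", "file", "query", "hash")
    return dict(zip(names, match.groups()))

parts = split_url("http://www.example.com:8080/path/to/file.html?x=1#frag")
print(parts)
```

Optional groups that did not take part in the match come back as `None`, so a caller can tell "no port given" apart from an empty string.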
I've been working on an in-depth article discussing URI validation using regular expressions. It is based on RFC3986.
Regular Expression URI Validation
Although the article is not yet complete, I have come up with a PHP function which does a pretty good job of validating HTTP and FTP URLs. Here is the current version:
// function url_valid($url) { Rev:20110423_2000
//
// Return associative array of valid URI components, or FALSE if $url is not
// RFC-3986 compliant. If the passed URL begins with: "www." or "ftp.", then
// "http://" or "ftp://" is prepended and the corrected full-url is stored in
// the return array with a key name "url". This value should be used by the caller.
//
// Return value: FALSE if $url is not valid, otherwise array of URI components:
// e.g.
// Given: "http://www.jmrware.com:80/articles?height=10&width=75#fragone"
// Array(
// [scheme] => http
// [authority] => www.jmrware.com:80
// [userinfo] =>
// [host] => www.jmrware.com
// [IP_literal] =>
// [IPV6address] =>
// [ls32] =>
// [IPvFuture] =>
// [IPv4address] =>
// [regname] => www.jmrware.com
// [port] => 80
// [path_abempty] => /articles
// [query] => height=10&width=75
// [fragment] => fragone
// [url] => http://www.jmrware.com:80/articles?height=10&width=75#fragone
// )
function url_valid($url) {
if (strpos($url, 'www.') === 0) $url = 'http://'. $url;
if (strpos($url, 'ftp.') === 0) $url = 'ftp://'. $url;
if (!preg_match('/# Valid absolute URI having a non-empty, valid DNS host.
^
(?P<scheme>[A-Za-z][A-Za-z0-9+\-.]*):\/\/
(?P<authority>
(?:(?P<userinfo>(?:[A-Za-z0-9\-._~!$&\'()*+,;=:]|%[0-9A-Fa-f]{2})*)#)?
(?P<host>
(?P<IP_literal>
\[
(?:
(?P<IPV6address>
(?: (?:[0-9A-Fa-f]{1,4}:){6}
| ::(?:[0-9A-Fa-f]{1,4}:){5}
| (?: [0-9A-Fa-f]{1,4})?::(?:[0-9A-Fa-f]{1,4}:){4}
| (?:(?:[0-9A-Fa-f]{1,4}:){0,1}[0-9A-Fa-f]{1,4})?::(?:[0-9A-Fa-f]{1,4}:){3}
| (?:(?:[0-9A-Fa-f]{1,4}:){0,2}[0-9A-Fa-f]{1,4})?::(?:[0-9A-Fa-f]{1,4}:){2}
| (?:(?:[0-9A-Fa-f]{1,4}:){0,3}[0-9A-Fa-f]{1,4})?:: [0-9A-Fa-f]{1,4}:
| (?:(?:[0-9A-Fa-f]{1,4}:){0,4}[0-9A-Fa-f]{1,4})?::
)
(?P<ls32>[0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}
| (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
)
| (?:(?:[0-9A-Fa-f]{1,4}:){0,5}[0-9A-Fa-f]{1,4})?:: [0-9A-Fa-f]{1,4}
| (?:(?:[0-9A-Fa-f]{1,4}:){0,6}[0-9A-Fa-f]{1,4})?::
)
| (?P<IPvFuture>[Vv][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&\'()*+,;=:]+)
)
\]
)
| (?P<IPv4address>(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))
| (?P<regname>(?:[A-Za-z0-9\-._~!$&\'()*+,;=]|%[0-9A-Fa-f]{2})+)
)
(?::(?P<port>[0-9]*))?
)
(?P<path_abempty>(?:\/(?:[A-Za-z0-9\-._~!$&\'()*+,;=:#]|%[0-9A-Fa-f]{2})*)*)
(?:\?(?P<query> (?:[A-Za-z0-9\-._~!$&\'()*+,;=:#\\/?]|%[0-9A-Fa-f]{2})*))?
(?:\#(?P<fragment> (?:[A-Za-z0-9\-._~!$&\'()*+,;=:#\\/?]|%[0-9A-Fa-f]{2})*))?
$
/mx', $url, $m)) return FALSE;
switch ($m['scheme']) {
case 'https':
case 'http':
if ($m['userinfo']) return FALSE; // HTTP scheme does not allow userinfo.
break;
case 'ftps':
case 'ftp':
break;
default:
return FALSE; // Unrecognized URI scheme. Default to FALSE.
}
// Validate host name conforms to DNS "dot-separated-parts".
if ($m['regname']) { // If host regname specified, check for DNS conformance.
if (!preg_match('/# HTTP DNS host name.
^ # Anchor to beginning of string.
(?!.{256}) # Overall host length is less than 256 chars.
(?: # Group dot separated host part alternatives.
[A-Za-z0-9]\. # Either a single alphanum followed by dot
| # or... part has more than one char (63 chars max).
[A-Za-z0-9] # Part first char is alphanum (no dash).
[A-Za-z0-9\-]{0,61} # Internal chars are alphanum plus dash.
[A-Za-z0-9] # Part last char is alphanum (no dash).
\. # Each part followed by literal dot.
)* # Zero or more parts before top level domain.
(?: # Explicitly specify top level domains.
com|edu|gov|int|mil|net|org|biz|
info|name|pro|aero|coop|museum|
asia|cat|jobs|mobi|tel|travel|
[A-Za-z]{2}) # Country codes are exactly two alpha chars.
\.? # Top level domain can end in a dot.
$ # Anchor to end of string.
/ix', $m['host'])) return FALSE;
}
$m['url'] = $url;
for ($i = 0; isset($m[$i]); ++$i) unset($m[$i]);
return $m; // return TRUE == array of useful named $matches plus the valid $url.
}
This function utilizes two regexes; one to match a subset of valid generic URIs (absolute ones having a non-empty host), and a second to validate the DNS "dot-separated-parts" host name. Although this function currently validates only HTTP and FTP schemes, it is structured such that it can be easily extended to handle other schemes.
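The two-stage idea (parse the URI generically first, then check the host separately) can be sketched in Python as follows. This is not a port of the PHP function: it swaps the first regex for the standard library's `urllib.parse.urlsplit`, skips the top-level-domain whitelist, and the names `LABEL` and `is_plausible_url` are invented for the illustration.

```python
import re
from urllib.parse import urlsplit

# One DNS label: 1-63 chars, alphanumeric at both ends, dashes allowed inside.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_plausible_url(url):
    """Stage 1: generic split into components. Stage 2: check the host labels."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https", "ftp", "ftps"):
        return False  # unrecognized scheme
    host = parts.hostname
    if not host or len(host) > 255:
        return False  # empty or over-long host
    if parts.scheme in ("http", "https") and parts.username is not None:
        return False  # mirrors the PHP check: no userinfo for HTTP
    if host.endswith("."):
        host = host[:-1]  # a single trailing dot is allowed
    return all(LABEL.fullmatch(label) for label in host.split("."))

print(is_plausible_url("http://www.example.com/articles?height=10"))
print(is_plausible_url("http://-bad-.example.com/"))
```

Keeping the host check separate is what makes the structure easy to extend: adding another scheme only touches the first stage.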
I use this regex:
((https?:)?//)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?#)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,63}(:[\d]+)?(/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?
To support both:
http://stackoverflow.com
https://stackoverflow.com
And:
//stackoverflow.com
Here's a ready-to-go Java version from the Android source code. This is the best one I've found.
public static final Matcher WEB = Pattern.compile(new StringBuilder()
.append("((?:(http|https|Http|Https|rtsp|Rtsp):")
.append("\\/\\/(?:(?:[a-zA-Z0-9\\$\\-\\_\\.\\+\\!\\*\\'\\(\\)")
.append("\\,\\;\\?\\&\\=]|(?:\\%[a-fA-F0-9]{2})){1,64}(?:\\:(?:[a-zA-Z0-9\\$\\-\\_")
.append("\\.\\+\\!\\*\\'\\(\\)\\,\\;\\?\\&\\=]|(?:\\%[a-fA-F0-9]{2})){1,25})?\\#)?)?")
.append("((?:(?:[a-zA-Z0-9][a-zA-Z0-9\\-]{0,64}\\.)+") // named host
.append("(?:") // plus top level domain
.append("(?:aero|arpa|asia|a[cdefgilmnoqrstuwxz])")
.append("|(?:biz|b[abdefghijmnorstvwyz])")
.append("|(?:cat|com|coop|c[acdfghiklmnoruvxyz])")
.append("|d[ejkmoz]")
.append("|(?:edu|e[cegrstu])")
.append("|f[ijkmor]")
.append("|(?:gov|g[abdefghilmnpqrstuwy])")
.append("|h[kmnrtu]")
.append("|(?:info|int|i[delmnoqrst])")
.append("|(?:jobs|j[emop])")
.append("|k[eghimnrwyz]")
.append("|l[abcikrstuvy]")
.append("|(?:mil|mobi|museum|m[acdghklmnopqrstuvwxyz])")
.append("|(?:name|net|n[acefgilopruz])")
.append("|(?:org|om)")
.append("|(?:pro|p[aefghklmnrstwy])")
.append("|qa")
.append("|r[eouw]")
.append("|s[abcdeghijklmnortuvyz]")
.append("|(?:tel|travel|t[cdfghjklmnoprtvwz])")
.append("|u[agkmsyz]")
.append("|v[aceginu]")
.append("|w[fs]")
.append("|y[etu]")
.append("|z[amw]))")
.append("|(?:(?:25[0-5]|2[0-4]") // or ip address
.append("[0-9]|[0-1][0-9]{2}|[1-9][0-9]|[1-9])\\.(?:25[0-5]|2[0-4][0-9]")
.append("|[0-1][0-9]{2}|[1-9][0-9]|[1-9]|0)\\.(?:25[0-5]|2[0-4][0-9]|[0-1]")
.append("[0-9]{2}|[1-9][0-9]|[1-9]|0)\\.(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}")
.append("|[1-9][0-9]|[0-9])))")
.append("(?:\\:\\d{1,5})?)") // plus option port number
.append("(\\/(?:(?:[a-zA-Z0-9\\;\\/\\?\\:\\#\\&\\=\\#\\~") // plus option query params
.append("\\-\\.\\+\\!\\*\\'\\(\\)\\,\\_])|(?:\\%[a-fA-F0-9]{2}))*)?")
.append("(?:\\b|$)").toString()
).matcher("");
For Python, this is the actual URL validating regex used in Django 1.5.1:
import re
regex = re.compile(
r'^(?:http|ftp)s?://' # http:// or https://
r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|' # domain...
r'localhost|' # localhost...
r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}|' # ...or ipv4
r'\[?[A-F0-9]*:[A-F0-9:]+\]?)' # ...or ipv6
r'(?::\d+)?' # optional port
r'(?:/?|[/?]\S+)$', re.IGNORECASE)
This does both ipv4 and ipv6 addresses as well as ports and GET parameters.
Found in the code here, Line 44.
This one works for me very well. (https?|ftp)://(www\d?|[a-zA-Z0-9]+)?\.[a-zA-Z0-9-]+(\:|\.)([a-zA-Z0-9.]+|(\d+)?)([/?:].*)?
For convenience, here's a one-liner regexp for URLs that will also match localhost, where you're more likely to have a port than .com or similar.
(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}(\.[a-z]{2,6}|:[0-9]{3,4})\b([-a-zA-Z0-9#:%_\+.~#?&\/\/=]*)
I found the following Regex for URLs, tested successfully with 500+ URLs:
/\b(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?#)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:\/[^\s]*)?\b/gi
I know it looks ugly, but the good thing is that it works. :)
Explanation and demo with 581 random URLs on regex101.
Source: In search of the perfect URL validation regex
There are various options for matching a URL, and the right one depends on your requirements.
Below are a few.
_(^|[\s.:;?\-\]<\(])(https?://[-\w;/?:#&=+$\|\_.!~*\|'()\[\]%#,☺]+[\w/#](\(\))?)(?=$|[\s',\|\(\).:;?\-\[\]>\)])_i
#\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))#iS
And here is a link that compares more than 10 different URL validation regexes:
https://mathiasbynens.be/demo/url-regex
I tried to formulate my own version of a URL regex. My requirement was to capture instances in a String where a possible URL can be cse.uom.ac.mu, noting that it is preceded by neither http nor www.
String regularExpression = "((((ht{2}ps?://)?)((w{3}\\.)?))?)[^.&&[a-zA-Z0-9]][a-zA-Z0-9.-]+[^.&&[a-zA-Z0-9]](\\.[a-zA-Z]{2,3})";
assertTrue("www.google.com".matches(regularExpression));
assertTrue("www.google.co.uk".matches(regularExpression));
assertTrue("http://www.google.com".matches(regularExpression));
assertTrue("http://www.google.co.uk".matches(regularExpression));
assertTrue("https://www.google.com".matches(regularExpression));
assertTrue("https://www.google.co.uk".matches(regularExpression));
assertTrue("google.com".matches(regularExpression));
assertTrue("google.co.uk".matches(regularExpression));
assertTrue("google.mu".matches(regularExpression));
assertTrue("mes.intnet.mu".matches(regularExpression));
assertTrue("cse.uom.ac.mu".matches(regularExpression));
//cannot contain 2 '.' after www
assertFalse("www..dr.google".matches(regularExpression));
//cannot contain 2 '.' just before com
assertFalse("www.dr.google..com".matches(regularExpression));
// to test case where url www must be followed with a '.'
assertFalse("www:google.com".matches(regularExpression));
// to test case where url www must be followed with a '.'
//assertFalse("http://wwwe.google.com".matches(regularExpression));
// to test case where www must be preceded with a '.'
assertFalse("https://www#.google.com".matches(regularExpression));
What's wrong with plain and simple FILTER_VALIDATE_URL?
$url = "http://www.example.com";
if(!filter_var($url, FILTER_VALIDATE_URL))
{
echo "URL is not valid";
}
else
{
echo "URL is valid";
}
I know it's not exactly what the question asked, but it did the job for me when I needed to validate URLs, so I thought it might be useful to others who come across this post looking for the same thing.

Run a jar for specific emails in Outlook

I have to run some java code every time I receive a specific email.
I just download the attachment in the email, run the jar and reply with the response from executing that jar with the attachment.
Is it possible to automate this somehow?
I have checked out VBA routines that can be called from Outlook rules, but I am not sure whether I can execute my jar file that way.
Any ideas?
Here is the structure, I'll let you tune it!
You can set a rule to use SaveToDiskAndReply which is the main program.
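Paste this at the start of your Outlook module (WshShell and WshExec need a reference to "Windows Script Host Object Model", set under Tools > References):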
Option Explicit
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
Private Sub RunSleep( _
exec As WshExec, _
Optional timeSegment As Long = 800 _
)
Do While exec.Status = WshRunning
Sleep timeSegment
Loop
End Sub
Private Function RunProgram( _
program As String, _
Optional command As String = "" _
) As WshExec
Dim wsh As New WshShell
Dim exec As WshExec
Set exec = wsh.exec(program)
Call exec.StdIn.WriteLine(command)
Call RunSleep(exec)
Set RunProgram = exec
End Function
Public Function Run_Jar() As String
Dim program As WshExec
Dim value As String
'''Set the path (jar and log)
Set program = RunProgram("java -jar ""D:\\Demo.jar"" 8861ccd621")
DoEvents
Run_Jar = program.StdOut.ReadAll
End Function
And use that as the script launched by the rule :
Public Sub SaveToDiskAndReply(ItM As Outlook.MailItem)
Dim oAttS As Outlook.Attachments
Dim objAtt As Outlook.Attachment
Dim oItM As Outlook.MailItem
Dim saveFolder As String
Dim dateFormat As String
Dim JarReturn As String
dateFormat = Format(Now, "yyyy-mm-dd")
saveFolder = "c:\temp\"
Set oAttS = ItM.Attachments
'''Save the attachments
For Each objAtt In oAttS
objAtt.SaveAsFile saveFolder & objAtt.FileName
Next objAtt
'''Run your jar
JarReturn = Run_Jar
'''Fill the email
Set oItM = Application.CreateItem(olMailItem)
'''Uncomment the next line when you're done testing
'On Error Resume Next
With oItM
.To = ItM.SenderEmailAddress
.CC = ""
.BCC = ""
.Subject = ItM.Subject
.Body = JarReturn
For Each objAtt In oAttS
.Attachments.Add saveFolder & objAtt.FileName
Next objAtt
.Send 'or use .Display
End With
On Error GoTo 0
Set oAttS = Nothing
Set objAtt = Nothing
End Sub

searching and replacing content from one file to another file using a keyword using python or java

I have 2 files which contain data like this:
File 1 contains:
/begin MENT AE0DAQ0O41 ""
ECU_ADDRESS 0x8111DSCC
ECU_ADDRESS_EXTENSION 0x0
/begin IF_DATA CAN_EXT
120
LINK_MAP "AE0DAQ0O41" 0x8111DSCC 0x0 0 0x2 1 0x2F 0x1
DISPLAY 0 0 655
/end IF_DATA
SYMBOL_LINK "AE0DAQ0O41" 0
/end MENT
File 2 contains:
name value line keyword
.data 80008114+000005 AE0DAQ0O43
.data 80008116+000005 AE0DAQ0O41
.data 80008118+000005 EA0DAQ0O45
.data 8000811a+000005 AE0DAF0O89
What we need to do is take a keyword, AE0DAQ0O41, from File 1 and search for it in File 2.
File 2 has a value before the keyword, so we need to take that value, 80008116, and use it to replace the address in
ECU_ADDRESS 0x8111DSCC and in LINK_MAP "AE0DAQ0O41" 0x8111DSCC 0x0 0 0x2 1 0x2F 0x1 (0x8111DSCC needs to become 0x80008116), then save the result to File 1.
File 1 should then look like this:
/begin MENT AE0DAQ0O41 ""
ECU_ADDRESS 0x80008116
ECU_ADDRESS_EXTENSION 0x0
/begin IF_DATA CAN_EXT
120
LINK_MAP "AE0DAQ0O41" 0x80008116 0x0 0 0x2 1 0x2F 0x1
DISPLAY 0 0 655
/end IF_DATA
SYMBOL_LINK "AE0DAQ0O41" 0
/end MENT
How do we do that, given that the file has multiple blocks like this?
Thanks in advance!
If you treat File 2 as a tab-separated value file, you can read File 1 line by line and compare the keyword in File 1 with each line in File 2.
When you get a match, write the updated line to a new file.
Quick and dirty solution:
(Assuming that the inputs are both text files...)
The code creates a dictionary by mining the second file.
The first file is processed line by line and written to the output file after the required modifications.
This is certainly not the best way to go about it.
If you know the exact format of the files, you can optimize the code to run a lot faster.
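# Build a keyword -> address lookup from file 2, then rewrite file 1 line by line.
beg, ecu, lnk = '/begin', 'ECU_ADDRESS', 'LINK_MAP'
keyVal = dict()
with open('file2.txt') as f2:
    for line in f2:
        b = line.split()
        if len(b) < 2:
            continue  # skip blank or malformed lines
        # the keyword is the last column; the address is the part before '+'
        newK, newV = b[-1], b[-2].split('+')[0]
        keyVal[newK] = newV
with open('file1.txt') as f1, open('output.txt', 'w') as fout:
    value, keyword = '', ''
    for line in f1:
        a = line.split(' ')
        loc = 0
        if beg in a and 'MENT' in a:
            # remember the replacement address for the block that starts here
            keyword = a[a.index(beg) + 2].strip()
            value = '0x' + keyVal[keyword] if keyword in keyVal else ''
        elif ecu in a:
            loc = a.index(ecu) + 1
        elif lnk in a:
            loc = a.index(lnk) + 2
        if loc != 0 and value:
            # keep the line break if the replaced token was the last one on the line
            a[loc] = value + ('\n' if a[loc].endswith('\n') else '')
        fout.write(' '.join(a))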
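For comparison, here is a sketch of the same dictionary-then-rewrite idea that works on strings instead of files and uses regular expressions, so it does not depend on single-space separators. It assumes the layout shown in the question (address and keyword as the last two columns of File 2, `/begin MENT <keyword>` opening each block); the function names `build_address_map` and `patch_addresses` are made up for the example.

```python
import re

def build_address_map(map_text):
    """Map each keyword (last column) to the address before the '+'."""
    addresses = {}
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and "+" in fields[-2]:
            addresses[fields[-1]] = fields[-2].split("+")[0]
    return addresses

def patch_addresses(block_text, addresses):
    """Rewrite ECU_ADDRESS and LINK_MAP inside each '/begin MENT <keyword>' block."""
    out = []
    new_value = None
    for line in block_text.splitlines(keepends=True):
        begin = re.match(r"\s*/begin\s+MENT\s+(\S+)", line)
        if begin:
            addr = addresses.get(begin.group(1))
            new_value = "0x" + addr if addr else None
        elif re.match(r"\s*/end\s+MENT", line):
            new_value = None
        elif new_value:
            # replace the first token after ECU_ADDRESS ...
            line = re.sub(r"^(\s*ECU_ADDRESS\s+)\S+", r"\g<1>" + new_value, line)
            # ... and the second token after LINK_MAP
            line = re.sub(r"^(\s*LINK_MAP\s+\S+\s+)\S+", r"\g<1>" + new_value, line)
        out.append(line)
    return "".join(out)

file2 = (
    "name value line keyword\n"
    ".data 80008114+000005 AE0DAQ0O43\n"
    ".data 80008116+000005 AE0DAQ0O41\n"
)
file1 = (
    '/begin MENT AE0DAQ0O41 ""\n'
    "ECU_ADDRESS 0x8111DSCC\n"
    "ECU_ADDRESS_EXTENSION 0x0\n"
    'LINK_MAP "AE0DAQ0O41" 0x8111DSCC 0x0 0 0x2 1 0x2F 0x1\n'
    "/end MENT\n"
)
patched = patch_addresses(file1, build_address_map(file2))
print(patched)
```

Blocks whose keyword is missing from File 2 are passed through unchanged, which is usually safer than writing a guessed address.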

Get contents of brackets using regex in a list of values

I'm trying to look for a regex (Coldfusion or Java) that can get me the contents between the brackets for each (param \d+) without fail. I've tried dozens of different types of regexes and the closest one I got is this one:
\(param \d+\) = \[(type='[^']*', class='[^']*', value='(?:[^']|'')*', sqltype='[^']*')\]
Which would be perfect, if the string that I get back from CF escaped single quotes from the value parameter. But it doesn't so it fails miserably. Going the route of a negative lookahead like so:
\[(type='[^']*', class='[^']*', value='(?:(?!', sqltype).)*', sqltype='[^']*')\]
Is great, unless for some unnatural reason there's a piece of code that quite literally has , sqltype in the value. I find it hard to believe I can't simply tell regex to scoop out the contents of every open and closed bracket it finds but then again, I don't know enough regex to know its limits.
Here's an example string of what I'm trying to parse:
(param 1) = [type='IN', class='java.lang.Integer', value='47', sqltype='cf_sql_integer'] , (param 2) = [type='IN', class='java.lang.String', value='asf , O'Reilly, really?', sqltype='cf_sql_varchar'] , (param 3) = [type='IN', class='java.lang.String', value='Th[is]is Ev'ery'thing That , []can break it ', sqltype= ', sqltype='cf_sql_varchar']
For the curious this is a sub-question to Copyable Coldfusion SQL Exception.
EDIT
This is my attempt at implementing @Mena's answer in CF9.1. Sadly it doesn't finish processing the string. I had to replace the \\ with \ just to get it to run at first, but my implementation might still be at fault.
This is the string given (pipes are just to denote boundary):
| (param 1) = [type='IN', class='java.lang.Integer', value='47', sqltype='cf_sql_integer'] , (param 2) = [type='IN', class='java.lang.String', value='asf , O'Reilly], really?', sqltype='cf_sql_varchar'] , (param 3) = [type='IN', class='java.lang.String', value='Th[is]is Ev'ery'thing That , []can break it ', sqltype ', sqltype='cf_sql_varchar'] |
This is my implementation:
<cfset var outerPat = createObject("java","java.util.regex.Pattern").compile(javaCast("string", "\((.+?)\)\s?\=\s?\[(.+?)\](\s?,|$)"))>
<cfset var innerPat = createObject("java","java.util.regex.Pattern").compile(javaCast("string", "(.+?)\s?\=\s?'(.+?)'\s?,\s?"))>
<cfset var outerMatcher = outerPat.matcher(javaCast("string", arguments.params))>
<cfdump var="Start"><br />
<cfloop condition="outerMatcher.find()">
<cfdump var="#outerMatcher.group(1)#"> (<cfdump var="#outerMatcher.group(2)#">)<br />
<cfset var innerMatcher = innerPat.matcher(javaCast("string", outerMatcher.group(2)))>
<cfloop condition="innerMatcher.find()">
<cfoutput>|__</cfoutput><cfdump var="#innerMatcher.group(1)#"> --> <cfdump var="#innerMatcher.group(2)#"><br />
</cfloop>
<br />
</cfloop>
<cfabort>
And this is what printed:
Start
param 1 ( type='IN', class='java.lang.Integer', value='47', sqltype='cf_sql_integer' )
|__ type --> IN
|__ class --> java.lang.Integer
|__ value --> 47
param 2 ( type='IN', class='java.lang.String', value='asf , O'Reilly )
|__ type --> IN
|__ class --> java.lang.String
End
Here's a Java regex pattern that works for your sample input.
(?x)
# lookbehind to check for start of string or previous param
# java lookbehinds must have max length, so limits sqltype
(?<=^|sqltype='cf_sql_[a-z]{1,16}']\ ,\ )
# capture the full string for replacing in the orig sql
# and just the position to verify against the match position
(\(param\ (\d+)\))
\ =\ \[
# type and class wont contain quotes
type='([^']++)'
,\ class='([^']++)'
# match any non-quote, then lazily keep going
,\ value='([^']++.*?)'
# sqltype is always alphanumeric
,\ sqltype='cf_sql_[a-z]+'
\]
# lookahead to check for end of string or next param
(?=$|\ ,\ \(param\ \d+\)\ =\ \[)
(The (?x) flag turns on comment mode, which ignores unescaped whitespace and anything between a hash and the end of the line.)
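Python's `re` module supports the same inline `(?x)` flag, so the behaviour can be checked quickly there. A small sketch (the pattern is a cut-down piece of the one above, and `PARAM_LABEL` and the sample text are just chosen for the example):

```python
import re

# In comment mode the layout whitespace and the trailing comments are ignored;
# only the escaped spaces ("\ ") have to match a real space in the input.
PARAM_LABEL = re.compile(r"""(?x)
    \(param\ (\d+)\)   # the "(param N)" label, capturing N
    \ =\ \[            # followed by " = ["
""")

text = "(param 1) = [type='IN'] , (param 2) = [type='IN']"
numbers = PARAM_LABEL.findall(text)
print(numbers)
```

Without the backslash before each space, the pattern would silently stop requiring those spaces, which is the most common surprise when switching a regex to comment mode.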
And here's that pattern implemented in CFML (tested on CF9,0,1,274733). It uses cfRegex (a library which makes it easier to work with Java regex in CFML) to get the results of that pattern, and then does a couple of checks to make sure the expected number of params are found.
<cfsavecontent variable="Input">
(param 1) = [type='IN', class='java.lang.Integer', value='47', sqltype='cf_sql_integer']
, (param 2) = [type='IN', class='java.lang.String', value='asf , O'Reilly, really?', sqltype='cf_sql_varchar']
, (param 3) = [type='IN', class='java.lang.String', value='Th[is]is Ev'ery'thing That , []can break it ', sqltype= ', sqltype='cf_sql_varchar']
</cfsavecontent>
<cfset Input = trim(Input).replaceall('\n','')>
<cfset cfcatch =
{ params = input
, sql = 'SELECT stuff FROM wherever WHERE (param 3) is last param'
}/>
<cfsavecontent variable="ParamRx">(?x)
# lookbehind to check for start or previous param
# java lookbehinds must have max length, so limits sqltype
(?<=^|sqltype='cf_sql_[a-z]{1,16}']\ ,\ )
# capture the full string for replacing in the orig sql
# and just the position to verify against the match position
(\(param\ (\d+)\))
\ =\ \[
# type and class wont contain quotes
type='([^']++)'
,\ class='([^']++)'
# match any non-quote, then lazily keep going if needed
,\ value='([^']++.*?)'
# sqltype is always alphanumeric
,\ sqltype='cf_sql_[a-z]+'
\]
# lookahead to check for end or next param
(?=$|\ ,\ \(param\ \d+\)\ =\ \[)
</cfsavecontent>
<cfset FoundParams = new Regex(ParamRx).match
( text = cfcatch.params
, returntype = 'full'
)/>
<cfset LastParamPos = cfcatch.sql.lastIndexOf('(param ') + 7 />
<cfset LastParam = ListFirst( Mid(cfcatch.sql,LastParamPos,3) , ')' ) />
<cfif LastParam NEQ ArrayLen(FoundParams) >
<cfset ProblemsDetected = true />
<cfelse>
<cfset ProblemsDetected = false />
<cfloop index="i" from=1 to=#ArrayLen(FoundParams)# >
<cfif i NEQ FoundParams[i].Groups[2] >
<cfset ProblemsDetected = true />
</cfif>
</cfloop>
</cfif>
<cfif ProblemsDetected>
<big>Something went wrong!</big>
<cfelse>
<big>All seems fine</big>
</cfif>
<cfdump var=#FoundParams# />
This will actually work if you embed an entire param inside the value of another param. It fails if you try two (or more), but at least the checks should detect this failure.
Here's what the dump output should look like:
Hopefully everything here makes sense - let me know if you have any questions.
I would probably use a dedicated parser for that, but here's an example on how to do it with two Patterns and nested loops:
// the input String
String input = "(param 1) = " +
"[type='IN', class='java.lang.Integer', value='47', sqltype='cf_sql_integer'] , " +
"(param 2) = " +
"[type='IN', class='java.lang.String', value='asf , O'Reilly, really?', " +
"sqltype='cf_sql_varchar'] , " +
"(param 3) = " +
"[type='IN', class='java.lang.String', value='Th[is]is Ev'ery'thing That , " +
"[]can break it ', sqltype= ', sqltype='cf_sql_varchar']";
// the Pattern defining the round-bracket expression and the following
// square-bracket list. Both values within the brackets are grouped for back-reference
// note that what prevents the 3rd case from breaking is that the closing square bracket
// is expected to be either followed by optional space + comma, or end of input
Pattern outer = Pattern.compile("\\((.+?)\\)\\s?\\=\\s?\\[(.+?)\\](\\s?,|$)");
// the Pattern defining the key-value pairs within the square-bracket groups
// note that both key and value are grouped for back-reference
Pattern inner = Pattern.compile("(.+?)\\s?\\=\\s?'(.+?)'\\s?,\\s?");
Matcher outerMatcher = outer.matcher(input);
// iterating over the outer Pattern (type x) = [myKey = myValue, ad lib.], or end of input
while (outerMatcher.find()) {
System.out.println(outerMatcher.group(1));
Matcher innerMatcher = inner.matcher(outerMatcher.group(2));
// iterating over the inner Pattern myKey = myValue
while (innerMatcher.find()) {
System.out.println("\t" + innerMatcher.group(1) + " --> " + innerMatcher.group(2));
}
}
Output:
param 1
type --> IN
class --> java.lang.Integer
value --> 47
param 2
type --> IN
class --> java.lang.String
value --> asf , O'Reilly, really?
param 3
type --> IN
class --> java.lang.String
value --> Th[is]is Ev'ery'thing That , []can break it
