<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: URL Validation in Ruby/Rails</title>
	<atom:link href="http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/feed/" rel="self" type="application/rss+xml" />
	<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/</link>
	<description>Software Engineering, Ruby, Rails, Games &#38; Art. All the food groups for a balanced diet.</description>
	<pubDate>Sun, 07 Sep 2008 15:56:55 +0000</pubDate>
	<generator>http://wordpress.org/?v=MU</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Dean</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-625</link>
		<dc:creator>Dean</dc:creator>
		<pubDate>Wed, 16 Jul 2008 01:32:47 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-625</guid>
		<description>Excellent.  Thanks for the info!</description>
		<content:encoded><![CDATA[<p>Excellent.  Thanks for the info!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JR</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-621</link>
		<dc:creator>JR</dc:creator>
		<pubDate>Thu, 10 Apr 2008 04:13:36 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-621</guid>
		<description>Very informative.</description>
		<content:encoded><![CDATA[<p>Very informative.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: alan</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-620</link>
		<dc:creator>alan</dc:creator>
		<pubDate>Thu, 27 Mar 2008 23:26:29 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-620</guid>
		<description>in case it's hard to see...

  PORT = /(([:]\d+)?)/
  DOMAIN = /([a-z0-9\-]+\.?)*([a-z0-9]{2,})\.[a-z]{2,}/
  NUMERIC_IP = /(?&#62;(?:1?\d?\d&#124;2[0-4]\d&#124;25[0-5])\.){3}(?:1?\d?\d&#124;2[0-4]\d&#124;25[0-5])(?:\/(?:[12]?\d&#124;3[012])&#124;-(?&#62;(?:1?\d?\d&#124;2[0-4]\d&#124;25[0-5])\.){3}(?:1?\d?\d&#124;2[0-4]\d&#124;25[0-5]))?/

  validates_format_of :name, :with =&#62; /^((localhost)&#124;#{DOMAIN}&#124;#{NUMERIC_IP})#{PORT}$/</description>
		<content:encoded><![CDATA[<p>in case it&#8217;s hard to see&#8230;</p>
<p>  PORT = /(([:]\d+)?)/<br />
  DOMAIN = /([a-z0-9\-]+\.?)*([a-z0-9]{2,})\.[a-z]{2,}/<br />
  NUMERIC_IP = /(?&gt;(?:1?\d?\d|2[0-4]\d|25[0-5])\.){3}(?:1?\d?\d|2[0-4]\d|25[0-5])(?:\/(?:[12]?\d|3[012])|-(?&gt;(?:1?\d?\d|2[0-4]\d|25[0-5])\.){3}(?:1?\d?\d|2[0-4]\d|25[0-5]))?/</p>
<p>  validates_format_of :name, :with =&gt; /^((localhost)|#{DOMAIN}|#{NUMERIC_IP})#{PORT}$/</p>
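<p>For anyone who wants to try these patterns outside a model, here is a minimal sketch in plain Ruby. It assumes the three patterns above are copied unchanged into local variables; the variable names and the sample host strings are made up for illustration.</p>

```ruby
# The three building blocks from the comment, held in local variables.
port       = /(([:]\d+)?)/
domain     = /([a-z0-9\-]+\.?)*([a-z0-9]{2,})\.[a-z]{2,}/
numeric_ip = /(?>(?:1?\d?\d|2[0-4]\d|25[0-5])\.){3}(?:1?\d?\d|2[0-4]\d|25[0-5])(?:\/(?:[12]?\d|3[012])|-(?>(?:1?\d?\d|2[0-4]\d|25[0-5])\.){3}(?:1?\d?\d|2[0-4]\d|25[0-5]))?/

# Interpolating a Regexp into another Regexp embeds it as its own group,
# so the alternation below stays confined to the outer parentheses.
host_re = /^((localhost)|#{domain}|#{numeric_ip})#{port}$/

# Hypothetical sample inputs, not taken from the original post.
samples = ["localhost:3000", "example.com", "192.168.1.10:8080", "bad host!"]

samples.each do |host|
  puts "#{host.inspect} => #{host.match?(host_re)}"
end
```

<p>The domain pattern has no case-insensitive flag, so in this sketch an upper-case host name would be rejected unless the input is downcased first.</p>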
]]></content:encoded>
	</item>
	<item>
		<title>By: alan</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-619</link>
		<dc:creator>alan</dc:creator>
		<pubDate>Thu, 27 Mar 2008 23:25:32 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-619</guid>
		<description>Check out this model; towards the bottom there is a pretty good regex. It still needs to be extended to recognise URLs with http&#124;https, but that is very easy to do.

http://sample.caboo.se/weed2/app/models/domain.rb</description>
		<content:encoded><![CDATA[<p>Check out this model; towards the bottom there is a pretty good regex. It still needs to be extended to recognise URLs with http|https, but that is very easy to do.</p>
<p><a href="http://sample.caboo.se/weed2/app/models/domain.rb" rel="nofollow">http://sample.caboo.se/weed2/app/models/domain.rb</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jacques</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-618</link>
		<dc:creator>jacques</dc:creator>
		<pubDate>Tue, 11 Mar 2008 20:58:54 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-618</guid>
		<description>Hi, in a description text I have to find and replace URLs...
How can I search for a regex in a string? What should the pattern look like?

Thank you</description>
		<content:encoded><![CDATA[<p>Hi, in a description text I have to find and replace URLs&#8230;<br />
How can I search for a regex in a string? What should the pattern look like?</p>
<p>Thank you</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: here it is</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-617</link>
		<dc:creator>here it is</dc:creator>
		<pubDate>Fri, 07 Mar 2008 02:32:06 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-617</guid>
		<description>When I said domain names up to 5 characters I meant domain extensions, like .info, .com, .org, .tv etc</description>
		<content:encoded><![CDATA[<p>When I said domain names up to 5 characters I meant domain extensions, like .info, .com, .org, .tv etc</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: here it is</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-616</link>
		<dc:creator>here it is</dc:creator>
		<pubDate>Fri, 07 Mar 2008 02:31:07 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-616</guid>
		<description>This is the best regex I've found so far

/^(http&#124;https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix


It handles http and https, domain names, port numbers, and domain names up to 5 characters. Even URLs like the one Tom Harrison said could not be parsed are matched correctly by the above regex.

If you want to try it yourself, go into console mode:

ruby ./script/console

url = "http://www.target.com/gp/detail.html/602-4045909-4263801?ASIN=B000NPCK3W&#38;AFID=Froogle&#38;LNM=B000NPCK3W&#124;Lexmark_AllInOne_Printer_with_Scanner_and_Copier__X1240&#38;ci_src=14110944&#38;ci_sku=B000NPCK3W&#38;ref=tgt_adv_XSG10001"

reg = /^(http&#124;https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix

reg.match(url) ? true : false


you'll see it'll return true</description>
		<content:encoded><![CDATA[<p>This is the best regex I&#8217;ve found so far</p>
<p>/^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix</p>
<p>It handles http and https, domain names, port numbers, and domain names up to 5 characters. Even URLs like the one Tom Harrison said could not be parsed are matched correctly by the above regex.</p>
<p>If you want to try it yourself, go into console mode:</p>
<p>ruby ./script/console</p>
<p>url = "http://www.target.com/gp/detail.html/602-4045909-4263801?ASIN=B000NPCK3W&amp;AFID=Froogle&amp;LNM=B000NPCK3W|Lexmark_AllInOne_Printer_with_Scanner_and_Copier__X1240&amp;ci_src=14110944&amp;ci_sku=B000NPCK3W&amp;ref=tgt_adv_XSG10001"</p>
<p>reg = /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix</p>
<p>reg.match(url) ? true : false</p>
<p>you&#8217;ll see it&#8217;ll return true</p>
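<p>To see where the pattern draws the line, here is a small sketch that runs it against a handful of made-up URLs (the sample strings and variable names are assumptions for illustration, not part of the original test).</p>

```ruby
# The pattern from the comment, unchanged.
url_re = /^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/ix

# Hypothetical sample inputs.
samples = [
  "http://www.example.com",
  "https://example.org:8080/path?q=1",
  "ftp://example.com",
  "http://192.168.0.1/",
  "http://localhost:3000"
]

samples.each do |url|
  puts "#{url} => #{url.match?(url_re)}"
end
```

<p>In this sketch the first two samples match and the last three do not: the pattern accepts only http and https, and it requires the host to end in a dot followed by 2 to 5 letters, which a dotted-quad address or a bare localhost does not have. Also note that in Ruby ^ and $ anchor at line boundaries, so \A and \z are the stricter choice when the whole string must be a URL.</p>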
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom Harrison</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-615</link>
		<dc:creator>Tom Harrison</dc:creator>
		<pubDate>Wed, 23 Jan 2008 17:38:44 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-615</guid>
		<description>Didn't find any alternatives: the problem in the URL above is the vertical bar.  If I escape it as %7C, URI is happy as a clam.

So reading the URI specification, it does appear that this character falls into a class of characters that should not be used in URLs (or should be escaped if they are).  So I would be fine with Ruby's picky URI class, except that in Rails, URLs are sometimes generated with ids in [123] square brackets, which are also not allowed by the spec, but which URI seems fine with.

So anyway, if anyone runs into this, just use gsub to replace any characters like caret, backtick, tilde and possibly others with their CGI-encoded variants.</description>
		<content:encoded><![CDATA[<p>Didn&#8217;t find any alternatives: the problem in the URL above is the vertical bar.  If I escape it as %7C, URI is happy as a clam.</p>
<p>So reading the URI specification, it does appear that this character falls into a class of characters that should not be used in URLs (or should be escaped if they are).  So I would be fine with Ruby&#8217;s picky URI class, except that in Rails, URLs are sometimes generated with ids in [123] square brackets, which are also not allowed by the spec, but which URI seems fine with.</p>
<p>So anyway, if anyone runs into this, just use gsub to replace any characters like caret, backtick, tilde and possibly others with their CGI-encoded variants.</p>
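<p>A minimal sketch of that gsub approach follows. The helper name escape_unwise, the exact character list, and the example URL are assumptions for illustration; extend the character class with whatever else URI rejects for you.</p>

```ruby
require "uri"

# Percent-encode a handful of characters that URI may reject.
# The character list here is an assumption: vertical bar, caret,
# backtick, curly braces and backslash.
def escape_unwise(url)
  url.gsub(/[|^`{}\\]/) { |ch| format("%%%02X", ch.ord) }
end

# Hypothetical example URL containing a vertical bar and a caret.
raw  = "http://www.example.com/item?name=A|B^C"
safe = escape_unwise(raw)

puts safe
puts URI.parse(safe).host
```

<p>Only the listed characters are touched, so existing percent-escapes and the query separators in the URL are left as they are.</p>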
]]></content:encoded>
	</item>
	<item>
		<title>By: Tom Harrison</title>
		<link>http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-614</link>
		<dc:creator>Tom Harrison</dc:creator>
		<pubDate>Wed, 23 Jan 2008 17:03:38 +0000</pubDate>
		<guid isPermaLink="false">http://actsasblog.wordpress.com/2006/10/16/url-validation-in-rubyrails/#comment-614</guid>
		<description>Ah Google.  It led me here, but I have found that URI is very, very picky about URLs.

For example, this one from target.com cannot be parsed:

http://www.target.com/gp/detail.html/602-4045909-4263801?ASIN=B000NPCK3W&#38;AFID=Froogle&#38;LNM=B000NPCK3W&#124;Lexmark_AllInOne_Printer_with_Scanner_and_Copier__X1240&#38;ci_src=14110944&#38;ci_sku=B000NPCK3W&#38;ref=tgt_adv_XSG10001

I think it is the vertical bar in the URL, but we have found numerous other characters (e.g. caret) that URI won't accept.

Seeking alternatives...</description>
		<content:encoded><![CDATA[<p>Ah Google.  It led me here, but I have found that URI is very, very picky about URLs.</p>
<p>For example, this one from target.com cannot be parsed:</p>
<p><a href="http://www.target.com/gp/detail.html/602-4045909-4263801?ASIN=B000NPCK3W&amp;AFID=Froogle&amp;LNM=B000NPCK3W" rel="nofollow">http://www.target.com/gp/detail.html/602-4045909-4263801?ASIN=B000NPCK3W&amp;AFID=Froogle&amp;LNM=B000NPCK3W</a>|Lexmark_AllInOne_Printer_with_Scanner_and_Copier__X1240&amp;ci_src=14110944&amp;ci_sku=B000NPCK3W&amp;ref=tgt_adv_XSG10001</p>
<p>I think it is the vertical bar in the URL, but we have found numerous other characters (e.g. caret) that URI won&#8217;t accept.</p>
<p>Seeking alternatives&#8230;</p>
]]></content:encoded>
	</item>
</channel>
</rss>
