<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Wikipedia Scraper</title>
	<atom:link href="http://www.nickycakes.com/wikipedia-scraper/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.nickycakes.com/wikipedia-scraper/</link>
	<description>Confessions of a Reformed Blackhat</description>
	<lastBuildDate>Fri, 10 Feb 2012 17:34:28 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Steve Lownds</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-121946</link>
		<dc:creator>Steve Lownds</dc:creator>
		<pubDate>Mon, 02 May 2011 17:17:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-121946</guid>
		<description>Oh yeah! Oh well, still a handy little script - ones I have used in the past stopped working a while ago.</description>
		<content:encoded><![CDATA[<p>Oh yeah! Oh well, still a handy little script &#8211; ones I have used in the past stopped working a while ago.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nickycakes</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-120618</link>
		<dc:creator>nickycakes</dc:creator>
		<pubDate>Wed, 27 Apr 2011 22:48:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-120618</guid>
		<description>to be fair, i posted that several years ago</description>
		<content:encoded><![CDATA[<p>to be fair, i posted that several years ago</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Steve Lownds</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-120530</link>
		<dc:creator>Steve Lownds</dc:creator>
		<pubDate>Wed, 27 Apr 2011 17:13:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-120530</guid>
		<description>Thanks for the script - very useful. The script can be a little picky about capital letters: because it constructs the exact Wikipedia URL from the topic you provide, the capitalisation has to be exact or the lookup will fail. I solved this by pointing my script at the search function instead. This also gets around the problem of topics with more than one word - Wikipedia URLs use underscores ( _ ) rather than spaces, so a multi-word topic wouldn&#039;t return a result.

It&#039;s easy and just requires one line to be changed: 

change: 
$target = &quot;http://en.wikipedia.org/wiki/&quot;.urlencode($topic);

to:
$target=&quot;http://en.wikipedia.org/w/index.php?title=Special:Search&amp;search=&quot;.urlencode($topic).&quot;&amp;ns0=1&amp;redirs=0&quot;;

I also put a character limit in to prevent very small paragraphs being returned:

change:
if ($paragraph){

to:
if ($paragraph AND strlen($paragraph)&gt;80){</description>
		<content:encoded><![CDATA[<p>Thanks for the script &#8211; very useful. The script can be a little picky about capital letters: because it constructs the exact Wikipedia URL from the topic you provide, the capitalisation has to be exact or the lookup will fail. I solved this by pointing my script at the search function instead. This also gets around the problem of topics with more than one word &#8211; Wikipedia URLs use underscores ( _ ) rather than spaces, so a multi-word topic wouldn&#8217;t return a result.</p>
<p>It&#8217;s easy and just requires one line to be changed: </p>
<p>change:<br />
$target = "http://en.wikipedia.org/wiki/".urlencode($topic);</p>
<p>to:<br />
$target="http://en.wikipedia.org/w/index.php?title=Special:Search&amp;search=".urlencode($topic)."&amp;ns0=1&amp;redirs=0";</p>
<p>I also put a character limit in to prevent very small paragraphs being returned:</p>
<p>change:<br />
if ($paragraph){</p>
<p>to:<br />
if ($paragraph AND strlen($paragraph)&gt;80){</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Daniel</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-25972</link>
		<dc:creator>Daniel</dc:creator>
		<pubDate>Fri, 23 Apr 2010 22:21:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-25972</guid>
		<description>Hello!

I am using your wonderful script, but it only works offline. When I use it on my live webserver, nothing is displayed!

Are there any settings or modules I need to install on the webserver to get it working?

Please help, I must get this Wikipedia scraping working.

Kind regards and thanks for a wonderful article!

/Daniel</description>
		<content:encoded><![CDATA[<p>Hello!</p>
<p>I am using your wonderful script, but it only works offline. When I use it on my live webserver, nothing is displayed!</p>
<p>Are there any settings or modules I need to install on the webserver to get it working?</p>
<p>Please help, I must get this Wikipedia scraping working.</p>
<p>Kind regards and thanks for a wonderful article!</p>
<p>/Daniel</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Joe</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-4894</link>
		<dc:creator>Joe</dc:creator>
		<pubDate>Thu, 15 Jan 2009 19:18:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-4894</guid>
		<description>Thx for the script - it works great! But how can you limit the amount of data returned, i.e. stop after about 3 paragraphs, or limit the number of characters returned?</description>
		<content:encoded><![CDATA[<p>Thx for the script &#8211; it works great! But how can you limit the amount of data returned, i.e. stop after about 3 paragraphs, or limit the number of characters returned?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: digga</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-4166</link>
		<dc:creator>digga</dc:creator>
		<pubDate>Thu, 04 Sep 2008 02:04:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-4166</guid>
		<description>u the man yet again ..thnx nickycakes.</description>
		<content:encoded><![CDATA[<p>u the man yet again ..thnx nickycakes.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nickycakes</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-3970</link>
		<dc:creator>nickycakes</dc:creator>
		<pubDate>Mon, 04 Aug 2008 20:47:12 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-3970</guid>
		<description>=))</description>
		<content:encoded><![CDATA[<p>=))</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: steve</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-3969</link>
		<dc:creator>steve</dc:creator>
		<pubDate>Mon, 04 Aug 2008 19:52:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-3969</guid>
		<description>sweet.  worked.  thanks, script works charmingly.</description>
		<content:encoded><![CDATA[<p>sweet.  worked.  thanks, script works charmingly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nickycakes</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-3964</link>
		<dc:creator>nickycakes</dc:creator>
		<pubDate>Mon, 04 Aug 2008 15:58:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-3964</guid>
		<description>that&#039;s a matter of encoding.  if you&#039;re in firefox and go to view-&gt;character encoding and select utf8 or whatever, it should work fine.</description>
		<content:encoded><![CDATA[<p>that&#8217;s a matter of encoding.  if you&#8217;re in firefox and go to view->character encoding and select utf8 or whatever, it should work fine.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: steve</title>
		<link>http://www.nickycakes.com/wikipedia-scraper/comment-page-1/#comment-3963</link>
		<dc:creator>steve</dc:creator>
		<pubDate>Mon, 04 Aug 2008 15:34:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.nickycakes.com/wikipedia-scraper/#comment-3963</guid>
		<description>seems like pretty much all the results have a bunch of special chars that don&#039;t want to render.  any ideas?</description>
		<content:encoded><![CDATA[<p>seems like pretty much all the results have a bunch of special chars that don&#8217;t want to render.  any ideas?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

