
Adding a local search engine to your website not only serves your visitors; it also provides you with a tremendous amount of valuable data. You will learn what your visitors are looking for, how they search for it, which searches are the most popular, etc. You can then put this data to work by optimizing your pages, writing new content, etc.

I used to use phpDig a lot, but it hasn’t been updated since 2005, so I needed to find an alternative. I came across Sphider, a free PHP crawler, so we’re going to try it out today.

Installing Sphider

This is a quick and dirty guide to installing Sphider.

  1. Grab yourself a copy of Sphider.
  2. Extract the content from the archive and upload it to your web server. For my part, I’ve uploaded it into http://www.mydomain.com/search
  3. Unless you want to use an existing MySQL database, create a new one.
  4. From your Sphider directory, edit settings/database.php and set the connection parameters for your database. Save the file and exit the editor. Upload the file to your web server.
  5. Again from the Sphider directory, edit admin/auth.php and set the username and password you want to use to access the administration interface. Save the file, exit the editor, and upload the file to your web server.
  6. Open your browser and point it to http://www.yourdomain.com/search/admin/install.php (or whatever directory you installed Sphider into).
  7. The database tables should have been created successfully at this point. If not, verify your connection settings in settings/database.php
  8. Click on the admin.php link; it will take you to the administration interface.
  9. Under the Site tab, click on Add Site and enter the information for the web site you wish to index.
  10. Click on the Reindex All link. This may consume a lot of resources, so make sure you don’t get your hosting account suspended.
  11. Once the indexing process is completed, open your browser at http://www.yourdomain.com/search/search.php and try searching your website. I was impressed: Sphider seems pretty fast, considering I had over 1000 pages to index.
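
If you prefer the command line for step 3, the database can be created with the mysql client. This is only a minimal sketch: the function name sphider_db_sql, the database name, the user name, and the password are all placeholders made up for this example, and it assumes a reasonably recent MySQL or MariaDB server and an administrative account.

```shell
#!/bin/sh
# Print the SQL needed to create a database and a dedicated user for Sphider.
# All three arguments are placeholders chosen for this example:
#   $1 = database name, $2 = user name, $3 = password
# Values are inserted as-is, so avoid quotes and backticks in them.
sphider_db_sql() {
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$1"
    printf "CREATE USER IF NOT EXISTS '%s'@'localhost' IDENTIFIED BY '%s';\n" "$2" "$3"
    printf "GRANT ALL PRIVILEGES ON \`%s\`.* TO '%s'@'localhost';\n" "$1" "$2"
}

# Example usage (review the output first, then pipe it into the client):
#   sphider_db_sql sphider sphider_user 'change-me' | mysql -u root -p
```

Whatever names you pick here are the ones to enter in settings/database.php in step 4.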

So that’s pretty much it: you now have a local search engine on your website. Here are a few tips to make it better:

Disallow Indexing Of Unwanted Directories

Create a robots.txt file at the root of your website and include the following content to disallow spidering of unnecessary folders:

User-agent: *
Disallow: /admin
Disallow: /go
Disallow: /oa
Disallow: /search
Disallow: /visit
Disallow: /feed
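
Before reindexing, it can help to check which paths these rules actually cover. The function below is a rough sketch, not a full robots.txt parser: robots_disallows is a name invented for this example, it only looks at the "User-agent: *" group, it expects a space after each colon, and it does plain prefix matching with no wildcard support.

```shell
#!/bin/sh
# Return 0 if the given URL path is disallowed for "User-agent: *".
#   $1 = path to a local copy of robots.txt, $2 = URL path to test
# Simplified on purpose: prefix matching only, no wildcards, no Allow lines.
robots_disallows() {
    awk -v path="$2" '
        # Track whether we are inside the catch-all user-agent group.
        tolower($1) == "user-agent:" { in_all = ($2 == "*"); next }
        # A rule matches when the tested path starts with the rule value.
        in_all && tolower($1) == "disallow:" && $2 != "" && index(path, $2) == 1 { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example usage:
#   robots_disallows robots.txt /search/search.php && echo "blocked"
```

Keep in mind that robots.txt rules are prefixes: Disallow: /go also matches a page such as /gopher.html. Add a trailing slash (Disallow: /go/) if you only mean the directory.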

Set Up A Scheduled Task To Reindex Your Website Automatically

Use cron (on Linux-based servers) to schedule a reindexing task. If you’re running a dedicated server and your control panel does not let you manage cron tasks, create an executable file named sphider.sh in /etc/cron.daily (on Debian-based systems, drop the .sh extension, because run-parts skips file names containing a dot) and insert the following content:

#!/bin/sh
/path/to/php /path/to/sphider/admin/spider.php -all >> /dev/null

This will reindex all websites every day. Of course, you could choose a different indexing cycle and different spidering options.

If you are using cPanel, here’s how to set up a cron job to execute spider.php once a day at 1am:

[Screenshot: sphider-cron-cpanel.jpg, cPanel cron job settings]
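
For reference, a daily run at 1am corresponds to the five crontab fields 0 1 * * * (minute, hour, day, month, weekday), which are the same fields cPanel asks for. The helper below is a small sketch that builds such a line; sphider_cron_line is a made-up name, and the PHP and Sphider paths are the same placeholders used in the script above.

```shell
#!/bin/sh
# Print a crontab line that runs Sphider's spider.php on a given schedule.
#   $1 = five-field cron schedule, $2 = path to php, $3 = Sphider directory
# Both stdout and stderr are discarded so cron does not mail the output.
sphider_cron_line() {
    printf '%s %s %s/admin/spider.php -all >/dev/null 2>&1\n' "$1" "$2" "$3"
}

# Example usage: append a daily 1am job to the current user's crontab.
#   ( crontab -l 2>/dev/null; sphider_cron_line '0 1 * * *' /path/to/php /path/to/sphider ) | crontab -
# A weekly run on Sunday at 3am would use '0 3 * * 0' instead.
```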

Change The Default Search Page

Instead of using http://www.yourdomain.com/search/search.php, you can make the search page the default page. Simply rename /search/search.php to /search/index.php and replace all occurrences of search.php with index.php in /search/templates/search_form.php.

Of course, if you use a template other than the standard one, you’ll have to modify it too.
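
If you have shell access, the rename and the search-and-replace can be done in one go. This is a sketch under a few assumptions: make_search_default is a name invented for this example, the template path is the one given above, and a backup copy of the template is kept with a .bak suffix.

```shell
#!/bin/sh
# Make the search page the default page of the Sphider directory.
#   $1 = Sphider install directory (for example the /search folder)
make_search_default() {
    dir=$1
    tpl="$dir/templates/search_form.php"   # template path as given in the article

    # Stop early if either file is missing.
    [ -f "$dir/search.php" ] && [ -f "$tpl" ] || return 1

    # Keep a backup of the template, then rename the search page.
    cp "$tpl" "$tpl.bak" || return 1
    mv "$dir/search.php" "$dir/index.php" || return 1

    # Rewrite the template from the backup, replacing every search.php.
    sed 's/search\.php/index.php/g' "$tpl.bak" > "$tpl"
}

# Example usage:
#   make_search_default /path/to/sphider
```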

Posted by Stephane on Saturday, March 1st, 2008