Sphider – search service for HTML pages

So I have this site (my catalog of services), which has HTML and PHP in it (see my earlier posts).
Now I need some way to search this site.
The idea is to create a MySQL DB with data from the site, and an interface for the users to search it.

1. What is needed
MySQL, PHP and a text editor.
Also a DB admin tool, for example phpMyAdmin (it can be easily installed using # yum install phpmyadmin*, and its configuration is in /etc/phpMyAdmin).
2. phpMyAdmin
This is not strictly necessary, since I have found out that I can do all of this from the CLI.
# whereis phpMyAdmin
phpMyAdmin: /etc/phpMyAdmin /usr/share/phpMyAdmin
Since my Apache has a root dir in /var/www/html, and this must not change, I will do the following :
# ln -s /usr/share/phpMyAdmin /var/www/html/phpmyadmin
# service httpd restart
And now I can access phpMyAdmin at http://localhost/phpmyadmin
Note : the username is root, and the password is the same as for the MySQL root user.
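The symlink step can be made safe to repeat. This is only a minimal sketch: link_into_docroot is a helper name I made up for illustration, and the paths in the usage line are the ones from my setup above.

```shell
#!/bin/sh
# Hypothetical helper: link a directory into the web root only if nothing
# with that name exists there yet.
link_into_docroot() {
    src="$1"     # e.g. /usr/share/phpMyAdmin
    dest="$2"    # e.g. /var/www/html/phpmyadmin
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    if [ -L "$dest" ] || [ -e "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    ln -s "$src" "$dest" && echo "linked $dest -> $src"
}

# Usage (as root), followed by the Apache restart:
# link_into_docroot /usr/share/phpMyAdmin /var/www/html/phpmyadmin && service httpd restart
```

Running it a second time only reports that the link is already there, so it can sit in a setup script without harm.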
3. MySQL DB (good link)
What needs to be done, after installing MySQL server :
# service mysqld start
Make a new pass for MySQL/root :
# mysqladmin -u root password 'new-pass'
Enter MySQL :
# mysql -u root -p
Create DB for my site :
mysql> create database katalog2;
Test it :
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| katalog2           |
| mysql              |
| test               |
+--------------------+
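The same DB creation can be done without entering the interactive mysql prompt. A minimal sketch, assuming the stock mysql client: create_db_sql is a hypothetical helper of mine, and IF NOT EXISTS is my addition so that the command can be run twice without an error.

```shell
#!/bin/sh
# Hypothetical helper: print the SQL that creates a database unless it
# already exists. The backticks quote the database name for MySQL.
create_db_sql() {
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$1"
}

# Usage: -p prompts for the root password, -e runs one statement and exits.
# mysql -u root -p -e "$(create_db_sql katalog2)"
# mysql -u root -p -e 'SHOW DATABASES;'
```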
4. Connect phpMyAdmin to MySQL DB
Go to : and enter necessary data (localhost, root pass, etc)
Log in as root/MySQL root pass.
From here we can make tables in our new DB (which can be done also by hand from CLI, which I prefer).

OPTION – Sphider search engine (link)
Sphider is software that indexes the site and stores the results in a MySQL DB, which the search then queries.
It is old software, but on an internal network it is quite OK.
# unzip sphider-1.3.6.zip
Put the unzipped folder in /var/www/html.
I have a ready DB : katalog2
In settings directory, edit database.php file and change $database, $mysql_user, $mysql_password and $mysql_host to correct values (in my case : katalog2, root and new-pass).
Go to web browser and open file localhost/Sphider/admin/install.php. You get the text :
In admin directory, edit auth.php to change the administrator user name and password (default values are ‘admin’ and ‘admin’).
Open localhost/Sphider/admin/admin.php in browser and start indexing.
Note : always put “/” at the URL end, so that the indexing may be done as it should!
At the bottom of the page the text “Indexing completed” will appear, and that is that.
When something changes on the site, just reindex it.
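Reindexing can probably also be started from the shell instead of the browser. The sketch below rests on assumptions I have not verified on this install: that admin/spider.php has the command-line mode described in the Sphider 1.3.x documentation, that its -all option reindexes every site already in the DB, and that it has to be run from the admin directory because of relative includes. reindex_all and SPHIDER_DIR are names I made up.

```shell
#!/bin/sh
# ASSUMPTION: Sphider is installed here and ships admin/spider.php with a
# command-line mode in which -all reindexes everything already in the DB.
SPHIDER_DIR="${SPHIDER_DIR:-/var/www/html/Sphider}"

reindex_all() {
    if [ ! -f "$SPHIDER_DIR/admin/spider.php" ]; then
        echo "spider.php not found under $SPHIDER_DIR/admin" >&2
        return 1
    fi
    # Run from the admin directory (assumed to be needed for relative includes).
    ( cd "$SPHIDER_DIR/admin" && php spider.php -all )
}

# Usage: call reindex_all by hand after the site changes, or from a cron job.
```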

5. Connecting Sphider to the site starting page
The search page is (in my case) /var/www/html/Sphider/search.php, and a link to it should go on the start page of my service catalog site (in my case : /var/www/html/katalog2/index.html).
I insert this link by putting the following text in index.html :

How to make Sphider NOT index some pages
In the “Advanced options” section, under “URL must not include”, put a list (one per line) of Perl regular expressions, in the form : */part-of-the-file-name/
For example :

How to keep Sphider (or some other search engine) from returning “Untitled documents” as results
In the head of the HTML document you need to put a “title” tag :
For this to work, Sphider needs to reindex the site.
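Before reindexing, it helps to know which pages still have no title. A minimal sketch, assuming GNU grep (the -L option lists files that do NOT match) and that the pages end in .html. pages_without_title is a hypothetical helper name.

```shell
#!/bin/sh
# List HTML files under a directory that contain no <title> tag.
# grep -L prints the names of files WITHOUT a match; -i ignores case.
pages_without_title() {
    find "$1" -type f -name '*.html' -exec grep -L -i '<title' {} +
}

# Usage:
# pages_without_title /var/www/html/katalog2
```

Every file it prints is a candidate for an “Untitled document” entry in the search results.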

Note 1 : if you use a free version of Sphider, all changes to the configuration can only be made from the CLI, in the file ~Sphider/settings/conf.php
Note 2 : if Sphider does not index the whole site, here is what should be done :
a) in ~Sphider/settings/conf.php put :
// Min words per page required for indexing
$min_words_per_page = 1;
// Words shorter than this will not be indexed
$min_word_length = 1;
b) Set the option “Indexing options”/“To depth” to “4” instead of “Full”
Note 3 : do not put “localhost” as a server name, unless it is ONLY for testing!! Put the AD name (if everything is in an AD) or the IP address.
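The two settings from step a) can also be changed from the shell instead of a text editor. A minimal sketch, assuming each setting sits on its own line in the form $name = value; as shown above. set_conf_value is a hypothetical helper, it only handles simple values like these, and it keeps a .bak copy of the file.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a "$name = value;" line in conf.php.
# The previous version of the file is kept as conf.php.bak.
set_conf_value() {
    file="$1"; name="$2"; value="$3"
    cp "$file" "$file.bak" &&
    sed "s/^\\(\\\$$name[[:space:]]*=[[:space:]]*\\)[^;]*;/\\1$value;/" "$file.bak" > "$file"
}

# Usage (the path is the one from my install):
# set_conf_value /var/www/html/Sphider/settings/conf.php min_words_per_page 1
# set_conf_value /var/www/html/Sphider/settings/conf.php min_word_length 1
```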
Note 4 : in ~Sphider/settings/conf.php “0” is NO, and “1” is YES
Note 5 : how to put a link to HomePage at the top of the Sphider search result page :
Change the file : /var/www/html/Sphider/templates/standard/search_results.html
And add a link to the file start :
(mine is in color, so a bit more complicated)
