Monday, February 18, 2008

Limitations on Search Databases

In automating search queries so they can be fed in from a search box, I have run into some specific limitations so far. Vendors' products tend to fall into two categories:

In the first category, the search query is built into the URL that is used for searching (or it can be harvested from a persistent link to the exact search, if one is provided). In that case we can simply pull out the parts of the URL that surround the search terms. For example, running a search for "information technology" in the library catalog here at Ohio University:

Here the first part of the search is:
string1 = http://alice.library.ohiou.edu/search/X?SEARCH=

The search string is:
foo = information technology

The last part of the search is:
string2 = &l=&m=&b=&searchscope=7&SORT=D&p=&Da=&Db=

To automate the search you just need to add the pieces together:
query = string1 + foo + string2
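
As a rough sketch of the concatenation above, here is how it might look in Python. The two fixed strings are copied from this post; the function name build_query is just an illustrative name, and the URL-encoding step is my assumption (a browser form would normally encode the space in "information technology" as "+", but I have not confirmed how every catalog treats it).

```python
from urllib.parse import quote_plus

# Fixed pieces copied from the catalog URL in this post.
STRING1 = "http://alice.library.ohiou.edu/search/X?SEARCH="
STRING2 = "&l=&m=&b=&searchscope=7&SORT=D&p=&Da=&Db="


def build_query(foo: str) -> str:
    """Return string1 + encoded search terms + string2."""
    # quote_plus turns spaces into '+' and escapes characters such as '&'
    # so the search terms cannot break the rest of the URL.
    return STRING1 + quote_plus(foo) + STRING2


if __name__ == "__main__":
    print(build_query("information technology"))
```

Encoding the terms matters mostly when a user types something like "R&D", where a raw "&" would otherwise be read as the start of the next URL parameter.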

Then there is one more piece we usually need to handle, which is authentication. Each library tends to have its own string for this, so the final query actually ends up being:

final = authentication + query

To test at this point, you can just copy the value of "final" into a web browser and see whether it returns a search. If you get results, it works!
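
A minimal Python sketch of the "final = authentication + query" step follows. The authentication string shown here is a made-up placeholder (proxy.example.edu is not a real library address), and build_final is a hypothetical helper name; substitute whatever prefix your own library uses.

```python
# Hypothetical authentication prefix; replace it with your library's own string.
AUTHENTICATION = "https://proxy.example.edu/login?url="


def build_final(query: str, authentication: str = AUTHENTICATION) -> str:
    """Return authentication + query, in that order."""
    return authentication + query


# The query assembled from the catalog pieces shown earlier in this post.
query = (
    "http://alice.library.ohiou.edu/search/X?SEARCH=information+technology"
    "&l=&m=&b=&searchscope=7&SORT=D&p=&Da=&Db="
)
final = build_final(query)

if __name__ == "__main__":
    # Paste the printed URL into a browser to check that it returns a search.
    print(final)
```

Keeping the authentication string as a separate argument means the same query can be reused by another library simply by passing in a different prefix.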

Then there is the second category, which is the type of query that depends on a persistent session ID. At this point I do not have a solution for running searches against these databases. There are several databases that I offer to my engineers which can only be searched by running queries through Metalib, so that is what I offer at the moment.
