Pre-defined URLs search script?

4 replies
I would like to take a pre-defined list of URLs and make them searchable by keyword(s).

In other words:

Using an HTML page...

Type a "keyword" in the search box and press Find; the program searches through a set of existing http:// addresses for that keyword or phrase, then returns the matches as a clickable list on an HTML results page.

Could this be done via a PHP script, a search engine API, or some other interactive tool?

If anyone could point me in the right direction or knows of such an animal (script), that would be greatly appreciated!

Thank you!
Charles
#predefined #script #search #urls
  • sautaja
    Is that predefined list of URLs provided by Google or by you? If you are building the list yourself, one option is to write a script that scrapes the content of those URLs (a sort of mini crawler) on a schedule, daily for example, and stores it in a database table with fields like url, page_content, page_title, date, etc. Open source databases like MySQL and PostgreSQL let you search for the records that match a given keyword or pattern.
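A minimal sketch of the scrape-and-store idea described above, written in Python with the standard-library `sqlite3` module standing in for MySQL or PostgreSQL. The table name `pages`, the helper names, and the sample HTML are assumptions made for illustration; only the column names come from the post.

```python
import sqlite3
from datetime import date
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collects the <title> and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def store_page(conn, url, html):
    """Parses one page and saves it, replacing any earlier copy of the URL."""
    parser = PageText()
    parser.feed(html)
    parser.close()
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, page_title, page_content, date) "
        "VALUES (?, ?, ?, ?)",
        (url, parser.title.strip(), " ".join(parser.chunks), date.today().isoformat()),
    )


def search(conn, keyword):
    """Returns (url, title) rows whose title or content contains the keyword."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT url, page_title FROM pages "
        "WHERE page_content LIKE ? OR page_title LIKE ? ORDER BY url",
        (pattern, pattern),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url TEXT PRIMARY KEY, page_title TEXT, page_content TEXT, date TEXT)"
)
# Hypothetical pages; a real crawler would download the HTML for each URL instead.
store_page(conn, "http://example.com/a",
           "<html><head><title>Red Widgets</title></head><body><p>We sell red widgets.</p></body></html>")
store_page(conn, "http://example.com/b",
           "<html><head><title>Blue</title></head><body><p>Only blue widgets here.</p></body></html>")
print(search(conn, "red widgets"))
```

In a real crawler the HTML would come from something like `urllib.request.urlopen(url).read()`, run on whatever schedule suits the list.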

    Full-text search in a database is relatively slow compared to a dedicated full-text search engine. If performance is the main concern, you might want to look into software such as Apache Lucene; you will then have to explore concepts like stemming, indexing, and stopwords.
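To make the terms stemming, indexing, and stopwords concrete, here is a toy inverted index in Python. It is not how Lucene works internally; the stopword list, the one-rule stemmer, and the sample pages are all simplifying assumptions.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; real engines ship much longer ones.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "we", "is", "for", "to"}


def stem(word):
    """Very crude stemmer: strips one trailing 's' (illustration only)."""
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def tokenize(text):
    """Lowercases, splits into words, drops stopwords, and stems the rest."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(w) for w in words if w not in STOPWORDS]


def build_index(pages):
    """Maps each term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def lookup(index, query):
    """Returns the URLs that contain every term of the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return []
    return sorted(set.intersection(*(index.get(t, set()) for t in terms)))


# Hypothetical page texts keyed by URL.
pages = {
    "http://example.com/a": "We sell red widgets and blue widgets",
    "http://example.com/b": "The blue widget store",
    "http://example.com/c": "Red paint",
}
index = build_index(pages)
print(lookup(index, "red widgets"))
```

Because the index is built once, each lookup only touches the sets for the query terms instead of scanning every page, which is the main reason dedicated engines outperform a plain pattern scan.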
  • Nochek
    Where are these URLs? Are they stored in a database, on a static page on your server, or generated dynamically on the fly?

    You've got multiple options. One is jQuery's .find() method. If the URLs are already on the page, you can use it to search the elements in the DOM and collect the matches, and since it runs client-side you can show results without refreshing the page.

    If the URLs are stored in a database, you can use the LIKE command in SQL statements to find matching URLs. Here's a clip I use for people searching for domain names on my network:
    Code:
    $query = "SELECT * FROM `ud_Domains` WHERE `Domain_Name` LIKE '%term%' AND `User_ID`=0"; // $query is a placeholder name; 'term' stands for the search keyword
    If they are generated dynamically, you could still use the jQuery method, or you could filter the list in PHP when someone does a search, but that would require more processing overhead than is really called for.
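When the search term comes from a user, the LIKE pattern is better passed as a bound parameter than pasted into the SQL string. The sketch below shows one way to do that in Python with `sqlite3`; the table and column names follow the query above, while the sample rows and function names are made up for illustration.

```python
import sqlite3


def like_pattern(term):
    """Escapes the LIKE wildcards in user input and wraps it in %...%."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def find_domains(conn, term):
    """Returns matching domain names for User_ID 0, using a bound parameter."""
    rows = conn.execute(
        "SELECT Domain_Name FROM ud_Domains "
        "WHERE Domain_Name LIKE ? ESCAPE '\\' AND User_ID = 0 "
        "ORDER BY Domain_Name",
        (like_pattern(term),),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ud_Domains (Domain_Name TEXT, User_ID INTEGER)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO ud_Domains VALUES (?, ?)",
    [
        ("red-widgets.com", 0),
        ("blue_widgets.com", 0),
        ("bluewidgets.com", 0),
        ("redwidgets.net", 5),
    ],
)
print(find_domains(conn, "red"))
print(find_domains(conn, "blue_"))
```

Escaping matters because `_` and `%` are wildcards in LIKE: without it, a search for `blue_` would also match `bluewidgets.com`.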

    Hope that Helps!
    • Brandon Tanner
      Not sure if it's exactly what you're looking for, but you can create a custom Google search engine that can search within specific URLs...

      Google Custom Search - Site search and more
      • realworldincome
        @sautaja
        @Nochek
        @Brandon Tanner

        Thanks to each of you for replying and offering suggestions!

        It looks like I should explain this better:


        - Let's say I have a list of 350 URLs to websites that are known to contain, let's say, Widgets

        - Not every site has every kind of widget made, but some may contain some blue or yellow widgets

        - Another set of sites in that list of URLs may contain orange and red widgets, and a few blue ones

        - So the end user wants to find the sites that contain only red widgets

        - Ideally he would go to a search box and type in 'red widgets'

        - After a bit he would be presented with a list of URLs that relate only to red widgets

        - From there he can select each URL in turn and drill down on his own to find exactly which red widget he is looking for

        - So it would be like him doing a Google search for 'red widgets', except that the search would be limited to those 350 URLs.

        The current list is in .csv format, but it can be converted to HTML or whatever works.
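The scenario above can be sketched in a few lines of Python: read the URLs from the CSV, check each site's stored text for every word of the query, and render the hits as a clickable HTML list. This assumes the page text has already been fetched and saved somewhere; the `page_text` dictionary, the sample URLs, and the function names are hypothetical.

```python
import csv
import io
from html import escape


def load_urls(csv_file):
    """Reads URLs from the first column of a CSV file object, skipping blank rows."""
    return [row[0].strip() for row in csv.reader(csv_file) if row and row[0].strip()]


def matching_urls(urls, page_text, query):
    """Returns the URLs whose stored text contains every word of the query."""
    words = query.lower().split()
    hits = []
    for url in urls:
        text = page_text.get(url, "").lower()
        if words and all(word in text for word in words):
            hits.append(url)
    return hits


def results_html(hits):
    """Renders the hits as a clickable HTML list."""
    items = "\n".join(
        f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>' for u in hits
    )
    return f"<ul>\n{items}\n</ul>"


# Hypothetical CSV content and pre-fetched page text.
csv_data = io.StringIO(
    "http://example.com/a\nhttp://example.com/b\n\nhttp://example.com/c\n"
)
page_text = {
    "http://example.com/a": "Red widgets and yellow widgets",
    "http://example.com/b": "Blue widgets only",
    "http://example.com/c": "Orange and red widgets, a few blue ones",
}
urls = load_urls(csv_data)
hits = matching_urls(urls, page_text, "red widgets")
print(results_html(hits))
```

This uses plain substring matching, so "red" would also match "hundred"; with 350 sites that may be acceptable, but a word-based index would be more precise.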

        There is no database set up, although this may be one of the needed components as it appears it would have to be dynamically generated at some level.

        Again, ideally, the Search box would be placed in an app (in this case, Android) --- and the Results of his search would show up in the same app.

        I'm not the expert here, but it would appear that those URLs and the process to search through each one would have to be contained within an interactive database somewhere (perhaps on a webhost server).
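One possible shape for the server side, sketched in Python with only the standard library: a small HTTP endpoint that the app could call with the search phrase and that answers with JSON. The `/search` path, the `q` parameter, the port, and the `PAGES` data are all assumptions for illustration, not part of any existing script.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical pre-built data: URL -> text fetched earlier from that site.
PAGES = {
    "http://example.com/a": "Red widgets and yellow widgets",
    "http://example.com/b": "Blue widgets only",
    "http://example.com/c": "Orange and red widgets, a few blue ones",
}


def search_json(query):
    """Returns a JSON string listing the URLs whose text contains every query word."""
    words = query.lower().split()
    hits = sorted(
        url
        for url, text in PAGES.items()
        if words and all(w in text.lower() for w in words)
    )
    return json.dumps({"query": query, "results": hits})


class SearchHandler(BaseHTTPRequestHandler):
    """Answers GET /search?q=... with the JSON produced by search_json()."""

    def do_GET(self):
        parsed = urlparse(self.path)
        if parsed.path != "/search":
            self.send_error(404)
            return
        query = parse_qs(parsed.query).get("q", [""])[0]
        body = search_json(query).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Starts the server and blocks; call this from a separate entry point."""
    HTTPServer(("", port), SearchHandler).serve_forever()
```

Calling `serve()` would start the endpoint locally, and a request such as `/search?q=red+widgets` would return the matching URLs for the app to display. A production setup would sit behind a proper web server and read from the database rather than an in-memory dictionary.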

        I'm not sure what kind of load this would place on a server running, say, MySQL, especially since the app could be used by a large number of people over time.

        I did notice in my research that you can customize some search engines, but a large number of requests per month could become prohibitively expensive.

        I also looked around some PHP script sites and, unless I overlooked it, was surprised that such an animal doesn't already exist.

        Hopefully this provides a clearer vision of what I would like to do, if possible.

        Thanks again for any ideas or suggestions,
        Charles