Which language do I need to use for processing 500 GB of data?

2 replies
I have a server filled with 500 GB of docx, pptx, HTML, and PDF files, etc. I want to create a search engine where all these files will be crawled and indexed! I would like to know which programming language I should rely on. I am an expert PHP developer! I found PHP works well, but it takes a lot of time to crawl and process each file! Any solutions or suggestions? Do I need to code in Python, Perl, C, C++, Java, or Ruby on Rails?
  • RichBeck
    riksworld,

    It depends on which OS you are using...

    The language (C++, Python, etc.) doesn't matter... The vendor implementation does... For example, Borland C++ was faster than Microsoft C++ back in the day...

    If I were tearing through a bunch of data, I would create some tests and run benchmarks with a few different vendor implementations... That way you know how each will perform on the work you actually want it to do.
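A minimal sketch of that kind of benchmark, written in Python purely for illustration. The names `benchmark`, `count_words`, and `estimate_hours` are hypothetical, and `count_words` is only a stand-in for whatever real extraction step you are testing. The idea is to time the same task over a small sample of files in each candidate implementation, then extrapolate the measured throughput to the full 500 GB.

```python
import time
from pathlib import Path


def benchmark(process, paths, repeats=3):
    """Return the best observed throughput of `process` in bytes per second."""
    total_bytes = sum(Path(p).stat().st_size for p in paths)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for p in paths:
            process(p)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very small samples.
    return total_bytes / best if best > 0 else float("inf")


def count_words(path):
    """Hypothetical stand-in for real text extraction: count whitespace-separated tokens."""
    with open(path, "rb") as f:
        return len(f.read().split())


def estimate_hours(bytes_per_second, total_bytes=500 * 10**9):
    """Extrapolate a measured throughput to the whole data set (500 GB by default)."""
    return total_bytes / bytes_per_second / 3600
```

For example, a measured throughput of 50 MB/s works out to roughly 2.8 hours for 500 GB, while 5 MB/s works out to roughly 28 hours. Run the same sample through each candidate and compare the numbers rather than guessing.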

    What I would look at...

    PureBasic (many OSes)
    PowerBasic (Windows and DOS)

    Both are fast... and easy to use...

    God Bless,

    Rich Beck BCIP, MCSD, MCIS
  • mojojuju
    Have you considered using Solr?
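As a rough sketch of what that could look like: Solr can accept binary documents (PDF, docx, and so on) through its extracting request handler at `/update/extract`, which has to be enabled in the Solr configuration. The snippet below is in Python for illustration only. The host, port, core name `docs`, and the helper names `build_extract_url` and `index_file` are all assumptions, not anything from this thread.

```python
import urllib.parse
import urllib.request


def build_extract_url(base_url, core, doc_id, commit=False):
    """Build the URL for Solr's extracting request handler (/update/extract)."""
    params = {"literal.id": doc_id}  # literal.id sets the document's unique key
    if commit:
        params["commit"] = "true"
    query = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/solr/{core}/update/extract?{query}"


def index_file(path, base_url="http://localhost:8983", core="docs", commit=False):
    """POST one file to Solr for text extraction and indexing.

    Assumes a running Solr instance with the extraction handler enabled;
    base_url and core are hypothetical defaults.
    """
    url = build_extract_url(base_url, core, doc_id=path, commit=commit)
    with open(path, "rb") as f:
        body = f.read()
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With this many files it is usually better to leave `commit` off for each upload and commit once at the end, since committing after every document slows indexing considerably.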