| | #1 |
| www.chandan.in Join Date: Oct 2008
Posts: 7
Thanks: 7
Thanked 0 Times in 0 Posts
|
How can I split a run-together string into words? If I give "buynow" it should give "buy now"; if I give "worldtraveltour" it should give "world travel tour" (or even other combinations such as world, travel, rave, our, tour); if I give "domainsitea" it should give "domain site a", and so on. Are any dictionary tools or class files available for this task? Thanks. |
| | |
| | |
| | #2 |
| Business Pro War Room Member Join Date: Jul 2009 Location: Scarborough
Posts: 169
Thanks: 2
Thanked 11 Times in 10 Posts
| |
|
*NEW* Wordpress Auction Theme - your own Flippa or eBay website in minutes! Wordpress Directory Script - creating a directory website is easy! Wordpress Shopping Cart - setup your own online store or Amazon affiliate store! Wordpress Classifieds Theme *NEW* Wordpress Coupon Websites - Earn BIG affiliate commissions with a coupon code website. | |
| | |
| | #3 |
| www.chandan.in Join Date: Oct 2008
Posts: 7
Thanks: 7
Thanked 0 Times in 0 Posts
|
Thanks. Actually the input is random and can be anything, so the explode function doesn't fit. I only gave buynow and worldtraveltour as examples, but the input could just as well be a junk name like ksadas, which should be split into words such as sad and das too. |
| | |
| | |
| | #4 |
| Business Pro War Room Member Join Date: Jul 2009 Location: Scarborough
Posts: 169
Thanks: 2
Thanked 11 Times in 10 Posts
|
How would it know which words to split? You can add the words you want to an array and then just check the array; if a word is found, split on it. |
|
| | |
| | #5 |
| Senior Warrior Member War Room Member Join Date: Apr 2006 Location: Tucson, AZ, USA.
Posts: 1,025
Thanks: 120
Thanked 158 Times in 115 Posts
|
No, explode isn't going to help because you don't know where the words are divided. That's the whole point. The only way to do this is with a dictionary lookup, as the OP implied. I don't know of any existing classes that do this. It wouldn't be too hard to write one if you had a good dictionary, but the tricky part would be making it quick and efficient. (Obviously, Google is very good at it.) Steve |
| Executive I.T. consulting for small/medium business Website development | PHP - MySQL - JavaScript expert programming Software requirements analysis | Specification writing Project management | Vendor relationship management | |
| | |
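A minimal sketch of the dictionary-lookup approach Steve describes in post #5, written in Python for brevity (the same idea carries over to PHP). The word list `WORDS` and the function name `split_words` are hypothetical; a real dictionary would be loaded from a file or database. Storing the words in a set keeps each lookup close to constant time, and caching the result for each start position avoids re-solving the same suffix.

```python
from functools import lru_cache

# Hypothetical toy dictionary; a real one would be loaded from a word list.
WORDS = frozenset({"a", "buy", "now", "world", "travel", "tour", "domain", "site"})
MAX_LEN = max(len(w) for w in WORDS)  # no need to test longer candidates


def split_words(text):
    """Return one list of dictionary words that spells `text`, or None."""
    text = text.lower()

    @lru_cache(maxsize=None)
    def solve(start):
        # Reached the end of the string: nothing left to split.
        if start == len(text):
            return ()
        # Try the longest candidate first, then shorter ones.
        for end in range(min(len(text), start + MAX_LEN), start, -1):
            piece = text[start:end]
            if piece in WORDS:
                rest = solve(end)
                if rest is not None:
                    return (piece,) + rest
        return None  # no dictionary word fits at this position

    result = solve(0)
    return None if result is None else list(result)


print(split_words("buynow"))
print(split_words("worldtraveltour"))
print(split_words("ksadas"))
```

This returns only one split per input and gives up (returns `None`) when any part of the string is not covered by the dictionary, so junk input like ksadas needs a more forgiving variant.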
| | #6 |
| Business Pro War Room Member Join Date: Jul 2009 Location: Scarborough
Posts: 169
Thanks: 2
Thanked 11 Times in 10 Posts
|
Steve, you misunderstand. If you already have an array of words, you can use it to check whether a word exists within the string and then extract it using explode. Either that or you can do it manually, but I know which one I would prefer... |
|
| | |
| | #7 | |
| www.chandan.in Join Date: Oct 2008
Posts: 7
Thanks: 7
Thanked 0 Times in 0 Posts
| Quote:
because it would be too lengthy to put all the dictionary words in an array
| |
| | ||
| | |
| | #8 | |
| Senior Warrior Member War Room Member Join Date: Apr 2006 Location: Tucson, AZ, USA.
Posts: 1,025
Thanks: 120
Thanked 158 Times in 115 Posts
| Quote:
If you have a dedicated server with plenty of RAM, you could possibly write a C application taking this approach. Or you could virtualize the array. Or you could pre-load only a subset of the most common words in the dictionary, then do a database lookup as a last resort. As I indicated in my first post, the tricky part is to do it quickly and efficiently. Steve | |
| | |
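One way to picture the "pre-load the common words, fall back to a database" idea from post #8 is the sketch below. It is an illustration only: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL, and the class name `TieredDictionary`, the table name `words`, and the sample words are all assumptions.

```python
import sqlite3


class TieredDictionary:
    """Check a small in-memory set first, then fall back to a database."""

    def __init__(self, common_words, connection):
        self.common = set(common_words)  # hot words kept in RAM
        self.connection = connection     # full dictionary lives here

    def __contains__(self, word):
        if word in self.common:
            return True
        # Last resort: an indexed lookup in the database table.
        row = self.connection.execute(
            "SELECT 1 FROM words WHERE word = ? LIMIT 1", (word,)
        ).fetchone()
        return row is not None


def build_demo_dictionary():
    """Create a throwaway in-memory database with a few sample words."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO words VALUES (?)",
        [("whois",), ("domain",), ("travel",)],
    )
    return TieredDictionary({"buy", "now", "a", "the"}, conn)


dictionary = build_demo_dictionary()
print("buy" in dictionary)    # answered from the in-memory set
print("whois" in dictionary)  # answered by the database
print("zzz" in dictionary)    # found in neither tier
```

Because the object supports `in`, it can be dropped into any splitting routine that only tests membership.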
| | #9 |
| Lisa Dozois War Room Member Join Date: Jan 2006 Location: Florida, USA.
Posts: 612
Thanks: 85
Thanked 221 Times in 110 Posts
|
Sometimes us programmers are guilty of trying to provide a solution to a problem we don't fully understand. Chandan, you told us WHAT you want to do, but not WHY you want to do it. If we understand why you are trying to do this, maybe a clear solution will pop up. |
|
-- Lisa G
| |
| | |
| | #10 | |
| Business Pro War Room Member Join Date: Jul 2009 Location: Scarborough
Posts: 169
Thanks: 2
Thanked 11 Times in 10 Posts
| Quote:
| |
|
| | |
| | #11 | |
| www.chandan.in Join Date: Oct 2008
Posts: 7
Thanks: 7
Thanked 0 Times in 0 Posts
| Quote:
For example, when a user is searching the whois of a domain, or doing a simple name search.
| |
| | ||
| | |
| | #12 |
| Lisa Dozois War Room Member Join Date: Jan 2006 Location: Florida, USA.
Posts: 612
Thanks: 85
Thanked 221 Times in 110 Posts
|
I would start here: Eight word lists to help you creating the perfect word game : Emanuele Feronato

Grab those word lists and build a MySQL table. Since you aren't looking for anagrams (that is, you don't want to find characters in random order, just in linear order), you need to iterate through the string one character at a time, concatenating the next character as you go. So you take the string and look up the first character. If a word is found, you push it onto an array.

Here's a matrix of character positions for the 11-character string isthisright:

1  1,2  1,2,3  1,2,3,4  1,2,3,4,5  1,2,3,4,5,6  1,2,3,4,5,6,7  1,2,3,4,5,6,7,8  1,2,3,4,5,6,7,8,9  1,2,3,4,5,6,7,8,9,10  1,2,3,4,5,6,7,8,9,10,11
2  2,3  2,3,4  2,3,4,5  2,3,4,5,6  2,3,4,5,6,7  2,3,4,5,6,7,8  2,3,4,5,6,7,8,9  2,3,4,5,6,7,8,9,10  2,3,4,5,6,7,8,9,10,11
3  3,4  3,4,5  3,4,5,6  3,4,5,6,7  3,4,5,6,7,8  3,4,5,6,7,8,9  3,4,5,6,7,8,9,10  3,4,5,6,7,8,9,10,11
...

Continue through every starting position until you have tested all the substrings against your word list. I think this is the correct progression order, but someone feel free to chime in if I got it wrong.

Let's test isthisright (* = found word):

1 = I*
1,2 = IS*
1,2,3 = IST
1,2,3,4 = ISTH
1,2,3,4,5 = ISTHI
1,2,3,4,5,6 = ISTHIS (ISTHIS is NOT a word; you already found IS, and THIS will come later in the progression)
1,2,3,4,5,6,7 = ISTHISR
1,2,3,4,5,6,7,8 = ISTHISRI
1,2,3,4,5,6,7,8,9 = ISTHISRIG
1,2,3,4,5,6,7,8,9,10 = ISTHISRIGH
1,2,3,4,5,6,7,8,9,10,11 = ISTHISRIGHT
2 = S
2,3 = ST
2,3,4 = STH
...

Continue through the matrix and you'll eventually find all the words. |
|
-- Lisa G
| |
| | |
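The position matrix in post #12 amounts to testing every contiguous substring against the word list. A rough Python sketch is below; the function name `find_words` and the sample word list are made up for illustration, and a plain set stands in for the MySQL table. For an 11-character string there are 11 × 12 / 2 = 66 substrings to test.

```python
def find_words(text, words):
    """Return (start, end, word) for every substring of `text` found in `words`.

    Positions are 1-based and inclusive, matching the matrix in the post.
    """
    text = text.lower()
    found = []
    for start in range(len(text)):
        # Grow the candidate one character at a time from this start position.
        for end in range(start + 1, len(text) + 1):
            candidate = text[start:end]
            if candidate in words:
                found.append((start + 1, end, candidate))
    return found


# Hypothetical sample word list; a real one would come from the dictionary table.
WORDS = {"i", "is", "this", "his", "hi", "right", "rig"}

for start, end, word in find_words("isthisright", WORDS):
    print(start, end, word)
```

Note that this only lists candidate words and their positions. Choosing which non-overlapping words actually make up the split is a separate step.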
| | #13 |
| HyperActive Warrior War Room Member Join Date: Oct 2002
Posts: 360
Thanks: 112
Thanked 48 Times in 39 Posts
|
Whatever solution you use, be careful with situations like: wordsexpress, wordsexchange. Going character by character to find dictionary words might give you some unexpected/undesired results. Even Google makes mistakes when analyzing/splitting these kinds of strings into words... and that was (I don't know if it still is) one of the reasons many domains were flagged as adult domains. Carlos |
| | |
| | #14 | |
| Lisa Dozois War Room Member Join Date: Jan 2006 Location: Florida, USA.
Posts: 612
Thanks: 85
Thanked 221 Times in 110 Posts
| Quote:
| |
|
-- Lisa G
| ||
| | |
| | #15 | |
| Lisa Dozois War Room Member Join Date: Jan 2006 Location: Florida, USA.
Posts: 612
Thanks: 85
Thanked 221 Times in 110 Posts
| Quote:
** WARNING ** This link leads to a dirty word list that you may find offensive. Its intended use is to build a dirty word filter and not to cater to anyone's prurient interests. If dirty words offend you, don't click. http://drupal.org/files/issues/dirtywords.txt | |
|
-- Lisa G
| ||
| | |
| | #16 | |
| HyperActive Warrior War Room Member Join Date: Oct 2002
Posts: 360
Thanks: 112
Thanked 48 Times in 39 Posts
| Quote:
- wordsexpress should be split as: words express
- wordsexchange should be split as: words exchange
Hmmm... but then who can guarantee, to me or anyone else, that this is in fact the correct split? Maybe the domain owner really registered "word sex press" or "word sex change". In other words, there will always be domain strings that can be split in several ways with very different meanings. Developing an algorithm to deal with these (and many other) situations can be very complex if it needs to be somewhat "perfect" when splitting domain strings into words, not to mention if it also needs to be optimized for speed. Carlos | |
| | |
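Carlos's point in post #16 is easy to reproduce: a splitter that returns every valid segmentation will list both readings, and nothing in the string itself says which one the owner meant. The Python sketch below is illustrative only; `all_splits` and the word list are hypothetical names, and the recursion can grow exponentially on long inputs.

```python
def all_splits(text, words):
    """Return every way to write `text` as a sequence of words from `words`."""
    if not text:
        return [[]]  # one way to split the empty string: no words at all
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in words:
            # Combine this first word with every split of the remainder.
            for tail in all_splits(text[end:], words):
                results.append([head] + tail)
    return results


# Hypothetical word list chosen to show the ambiguity.
WORDS = {"word", "words", "sex", "express", "press", "exchange", "change"}

for split in all_splits("wordsexpress", WORDS):
    print(" ".join(split))
```

Preferring the split with the fewest words (for example `min(all_splits(text, WORDS), key=len)`) happens to pick "words express" here, but that is only a heuristic and will not always match the intended meaning.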
| | #17 |
| Advanced Warrior War Room Member Join Date: Jun 2009 Location: Chesterton, IN
Posts: 923
Thanks: 129
Thanked 193 Times in 153 Posts
|
Interesting... Here are some dictionaries for you: Kevin's Word List Page. Now find a good one and loop through the words, counting characters and picking out words from your concatenated string. In PHP, use strpos() to grab your first word, mark the position of the next character to start a new loop at, and check what is left over... then print these or assign them to your array and discard the garbage. Of course, you have to deal with someone trying xapplesandoranges. So for every first-word loop that finds an initial match, you have to run inner pattern matching until you run out of characters or dictionary words. Then move on to your next potential phrase, all while checking the phrases found so far. I think there are close to 3/4 of a million words in the English language, not counting slang... not sure how many words are in any of those dictionaries. Of course you would want to include a thesaurus so you can have related phrases sent back also (jokes). Yeah, that would take some thought on how to optimize... The system would need to "learn" somehow, recording common phrases so that it becomes faster over time and uses the dictionary less and less. Might make for an interesting project. We'll develop it on your servers though, since it may take hours to run, killing everything else while it runs. LOL, good luck |
| Webmaster Services List Your Wealth Building Systems and Services for Free Insanity is doing the same thing over and over and expecting a different result ~ Einstein Insanity is doing the same thing over and over and never getting the same results ~ Ken | |
| | |
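For the xapplesandoranges case raised in post #17, one possible approach (not necessarily what the poster had in mind) is to allow characters to be skipped as garbage and pick the split that skips as few as possible. The Python sketch below works backwards over the string; `split_with_garbage` and the word list are hypothetical names used only for this example.

```python
def split_with_garbage(text, words):
    """Split `text` into dictionary words, skipping as few characters as possible.

    Returns (skipped_count, tokens); skipped characters are simply dropped.
    Ties on skipped characters are broken in favour of fewer words.
    """
    n = len(text)
    # best[i] = (skipped characters, word count, tokens) for the suffix text[i:]
    best = [None] * (n + 1)
    best[n] = (0, 0, [])
    for i in range(n - 1, -1, -1):
        # Option 1: treat text[i] as garbage and skip it.
        skipped, count, tokens = best[i + 1]
        choice = (skipped + 1, count, tokens)
        # Option 2: start a dictionary word at position i.
        for j in range(i + 1, n + 1):
            if text[i:j] in words:
                skipped, count, tokens = best[j]
                candidate = (skipped, count + 1, [text[i:j]] + tokens)
                if candidate[:2] < choice[:2]:
                    choice = candidate
        best[i] = choice
    return best[0][0], best[0][2]


# Hypothetical word list for the example from the post.
WORDS = {"a", "an", "and", "apple", "apples", "or", "orange", "oranges", "range"}

print(split_with_garbage("xapplesandoranges", WORDS))
```

With this word list the leading x is the only character dropped. A larger dictionary will produce more ties and more surprising splits, so some kind of ranking (word frequency, for instance) would probably be needed in practice.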
| | #18 |
| Advanced Warrior War Room Member |
Hi, in my programming experience the logic can be very simple, apart from some special cases. Even a sentence can be split into words. Before that, we should do a structural analysis of the dictionary to be used. Once the permutation-and-combination procedure works for a simple word split, it can work for a sentence or even a paragraph, whatever the user inputs. It is of course hard to structure, but not too hard. Satya Das |
| ARTICLES :- Article Writing & Article Marketing WEBSITE :- Minisite Developer & Designer AFFILIATE MARKETING :- Make Money as an Affiliate FORUM :- Make Money Online From Home | |
| | |
| Tags |
| splitting, words |