WarriorForum - Internet Marketing Forums > The Warrior Forum > Adsense / PPC / SEO Discussion Forum
10-16-2010, 02:10 AM   #1
seduce
Data Mining SEO Risk Advice

Quick Q

I want to create a restaurant-classifieds style site similar to delivery.com, and I want its listings to be as comprehensive as theirs. If I were to scrape their site (and other sites too), rip the listings and auto-add the restaurants, what risks would I be facing?

I know Google hates duplicate content, but even if I contact the businesses individually, the content will inevitably be the same anyhow.

Ideas?
10-16-2010, 02:45 AM   #2
orvn
Re: Data Mining SEO Risk Advice

This will work because you're not duplicating large amounts of text. You won't be penalized for duplicate content unless you copy something of real substance.

That being said, some points I would note:

1. Don't duplicate descriptions, reviews or comments. Wait until your site gathers its own comments/reviews (or make some, if you're sneaky).

2. So you write a script that extracts information about establishments from a couple of large directory sites (based on their layout) and populates your own database.
Do you realize that if your server does this in bulk, you may use a great deal of the target site's bandwidth, and they may penalize you directly somehow or file a complaint? If you're flagged as malicious, Google won't want to touch you with a ten-foot pole.
To get around this, write a time delay into your extraction script or run the whole thing from one or more remote servers (it's kind of black hat, I know).

3. When you design your template page, make sure it looks nothing like the pages you're pulling the data from. Rename the fields a little, and rework the design and style so it has some originality. Google shouldn't care about this too much, but it's worth considering all the same.
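
For the time delay mentioned in point 2, here is a minimal sketch in Python. It is only one way to do it, not a finished scraper. The function names (fetch_page, crawl_with_delay), the 5-second default and the "listing-bot/0.1" user agent are all made up for illustration, and it only downloads pages; pulling the fields out of the HTML is a separate step.

```python
import time
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def fetch_page(url: str, timeout: float = 10.0) -> str:
    """Download one page and return its body as text."""
    # The User-Agent string is a placeholder, not a real bot name.
    request = urllib.request.Request(url, headers={"User-Agent": "listing-bot/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl_with_delay(
    urls: Iterable[str],
    delay_seconds: float = 5.0,  # assumed value; tune it to the target site
    fetch: Callable[[str], str] = fetch_page,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Optional[str]]:
    """Fetch each URL in turn, pausing between consecutive requests.

    Returns a dict mapping each URL to its page text, or to None if the
    fetch failed.
    """
    pages: Dict[str, Optional[str]] = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # the time delay: never request back-to-back
        try:
            pages[url] = fetch(url)
        except OSError:
            # URLError and socket timeouts are OSError subclasses;
            # record the failure and carry on with the next URL.
            pages[url] = None
    return pages
```

You would call it with your own list of listing URLs, e.g. crawl_with_delay(my_urls, delay_seconds=10). The fetch and sleep parameters are only there so you can swap in fakes and try the loop without touching a real site.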

Good luck. What you propose is a massive undertaking.

✄- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Orun Kabir Programming, security, design, writing, semantic web.
SEO Jedi.

Tags
advice, data, mining, risk, seo
All times are GMT -6. The time now is 11:07 PM.