Unread 2nd Aug 2018, 10:19 PM   #1
New Warrior Member
Join Date: 2018
Posts: 0
Thanks: 0
Thanked 0 Times in 0 Posts
AI-powered Web article data extraction

Usually, to extract data from web pages, one has to develop a scraper, and each website requires a scraper tailored to it. If the design of the website changes, the scraper stops working and has to be fixed manually or, sometimes, rebuilt from scratch.
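To make the fragility concrete, here is a minimal sketch of a site-specific scraper, assuming a page that marks its title with `<h1 class="post-title">`. The class name, the two sample layouts, and the helper names are all made up for illustration; only the Python standard library is used.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Site-specific scraper: grabs the text inside <h1 class="post-title">."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        # The rule is hard-wired to one site's markup (hypothetical class name).
        if tag == "h1" and dict(attrs).get("class") == "post-title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and self.title is None:
            self.title = data.strip()


def scrape_title(html):
    parser = TitleScraper()
    parser.feed(html)
    return parser.title


# Two invented versions of the same page, before and after a redesign.
OLD_LAYOUT = '<html><body><h1 class="post-title">Hello</h1></body></html>'
NEW_LAYOUT = '<html><body><div class="headline"><h2>Hello</h2></div></body></html>'

print(scrape_title(OLD_LAYOUT))  # the title is found
print(scrape_title(NEW_LAYOUT))  # None: the hard-wired rule no longer matches
```

The content is identical in both layouts, yet the scraper silently returns nothing after the redesign, which is why such rules have to be maintained per site.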

Artificial intelligence, and machine vision in particular, has high potential to solve this problem by enabling a universal scraper that extracts data from any page, independently of its design, and can be trained by example.

The first step in this direction is a service that provides an API which takes any Web article URL as input and extracts only what's important on the page: title, headline, authors, dates, images (with corresponding captions), quotes, tags, and, of course, the text. The service also classifies the extracted content and extracts key phrases from the text.
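A rough sketch of what calling such an API could look like from the client side. The post gives neither the endpoint nor the response format, so everything here is an assumption: the `https://api.example.com/extract` URL, the `url` query parameter, the JSON field names, and the sample response are placeholders that mirror the fields listed above, not the real service.

```python
import json
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Hypothetical endpoint: the real one is not given in the post.
API_ENDPOINT = "https://api.example.com/extract"


@dataclass
class Article:
    # Field names are assumed; they follow the list of extracted items above.
    title: str = ""
    headline: str = ""
    authors: list = field(default_factory=list)
    dates: list = field(default_factory=list)
    images: list = field(default_factory=list)  # e.g. {"url": ..., "caption": ...}
    quotes: list = field(default_factory=list)
    tags: list = field(default_factory=list)
    text: str = ""


def build_request_url(article_url):
    """Build the GET URL for one article (the parameter name is an assumption)."""
    return API_ENDPOINT + "?" + urlencode({"url": article_url})


def parse_article(payload):
    """Turn a JSON response body into an Article, ignoring unknown keys."""
    data = json.loads(payload)
    known = {k: data[k] for k in Article.__dataclass_fields__ if k in data}
    return Article(**known)


# Invented response body, only to show the shape this sketch expects.
SAMPLE_RESPONSE = json.dumps({
    "title": "Example article",
    "authors": ["A. Writer"],
    "images": [{"url": "https://example.com/a.png", "caption": "A caption"}],
    "text": "Body of the article.",
    "extra": "keys the client does not know are dropped",
})

article = parse_article(SAMPLE_RESPONSE)
print(build_request_url("https://example.com/post"))
print(article.title, article.authors, article.images[0]["caption"])
```

The point of the sketch is that the client code is the same for every site: only the article URL changes, and no per-site selectors are needed.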

(Disclaimer: I work for
RudyWurlitzer
Unread 15th Nov 2018, 05:30 PM   #2
New Warrior Member
War Room Member
visionbuilder
Join Date: 2007
Location: Steilacoom, WA, USA.
Posts: 16
Thanks: 7
Thanked 10 Times in 6 Posts
Re: AI-powered Web article data extraction

I like what I'm reading. What is the "next step"? I suppose you will make a special offer on your API core, since you are here in the Warrior Forum.

What's invisible, unspoken, and enigmatic when present... but its scarceness makes creation blind, foolish, and base?


aipowered, article, artificial intelligence, data, data extraction, extraction, scraper, web

All times are GMT -6.