Technical Resources -- Knowlesys

Resources

Free Software

Search & Replace Master
	A tool for searching and replacing text in text files using simple wildcard patterns

Glyph Font Viewer
	A viewer for icon (glyph) fonts

Free Rename Master
	A tool for batch-renaming files using simple wildcard patterns

Definitions

Screen Scraping
v.
The act of capturing data from a system or program by snooping the contents of some display that is not actually intended for data transport or inspection by programs. Around 1980 this term referred to tricks like reading the display memory of a smart terminal through its auxiliary port. Nowadays it often refers to parsing the HTML in generated web pages with programs designed to mine out particular patterns of content. In either guise screen-scraping is an ugly, ad-hoc, last-resort technique that is very likely to break on even minor changes to the format of the data being snooped.
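As a rough illustration of the HTML-parsing sense of the term, here is a minimal sketch using only Python's standard `html.parser` module. The sample markup, the `price` class name, and the `scrape_prices` helper are assumptions made up for this example; they do not come from any real site or from Knowlesys software.

```python
from html.parser import HTMLParser


class PriceScraper(HTMLParser):
    """Collects the text inside every <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price:
            self.prices.append(data.strip())


def scrape_prices(html):
    """Hypothetical helper: return all price strings found in the page."""
    parser = PriceScraper()
    parser.feed(html)
    parser.close()
    return parser.prices


# Made-up page fragment standing in for a generated web page.
SAMPLE_PAGE = """
<ul>
  <li>Widget <span class="price">$9.99</span></li>
  <li>Gadget <span class="price">$24.50</span></li>
</ul>
"""

# The same page after a minor redesign that renames the CSS class.
REDESIGNED_PAGE = SAMPLE_PAGE.replace('class="price"', 'class="cost"')

print(scrape_prices(SAMPLE_PAGE))
print(scrape_prices(REDESIGNED_PAGE))
```

The second call finds nothing, which shows the fragility the definition warns about: the scraper depends on presentation details that the page author is free to change at any time.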

Deep Web/Hidden Web
n.
The Deep Web (or Hidden Web) comprises all information that resides in autonomous databases behind portals and information providers' web front-ends. Web pages in the Deep Web are dynamically generated in response to a query through a web site's search form and often contain rich content. A recent study has estimated the size of the Deep Web to be more than 500 billion pages, whereas the size of the "crawlable" web is only 1% of the Deep Web (i.e., less than 5 billion pages). Even those web sites with some static links that are "crawlable" by a search engine often have much more information available only through a query interface. Unlocking this vast deep web content presents a major research challenge.
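To make the idea of pages that exist only as answers to form queries more concrete, the sketch below enumerates the URLs a GET search form could produce. The form address `http://example.com/search` and the field names `city` and `type` are placeholders assumed for illustration; nothing is fetched over the network.

```python
from itertools import product
from urllib.parse import urlencode


def enumerate_query_urls(form_action, field_options):
    """Yield one URL per combination of form field values.

    form_action   -- address the (hypothetical) search form submits to
    field_options -- dict mapping each field name to its candidate values
    """
    names = list(field_options)
    for values in product(*(field_options[name] for name in names)):
        yield form_action + "?" + urlencode(dict(zip(names, values)))


# Assumed form with two fields; a real form would need to be inspected first.
urls = list(enumerate_query_urls(
    "http://example.com/search",
    {"city": ["Beijing", "Shanghai"], "type": ["rent", "buy"]},
))

for url in urls:
    print(url)
```

None of these result pages is reachable by following static links, so a link-following crawler never sees them; a program has to fill in the form fields itself, and the number of combinations grows quickly with each additional field.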

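Vertical Search
Vertical search is, in essence, a simplified integration of the ways in which vertical portals provide information.
A general-purpose (horizontal) search engine searches at the level of web pages, whereas vertical search works at the level of individual data items, so its granularity is finer and its precision higher. Vertical search serves a specific function; for example, a user searching for rental or home-purchase listings is performing a vertical search. Reprocessing the information is critical, whether the data is structured or unstructured. Content for vertical search comes from: (A) the portal site's own resources; (B) resources supplied by industry users through an open interface; (C) resources published by ordinary users; (D) resources crawled from industry users.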
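The difference between page-level and data-item-level search can be sketched in a few lines. The `Listing` record, its fields, the sample data, and the `search_listings` function are all assumptions invented for this illustration; the `source` field simply mirrors the four content sources named above.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One data item, as opposed to a whole web page."""
    city: str
    kind: str      # "rent" or "sale"
    bedrooms: int
    price: int     # monthly rent or sale price, in an assumed common unit
    source: str    # "portal", "api", "user", or "crawl"


# Made-up records, one per content source.
LISTINGS = [
    Listing("Beijing", "rent", 2, 6500, "portal"),
    Listing("Beijing", "rent", 1, 4200, "user"),
    Listing("Shanghai", "rent", 2, 5000, "api"),
    Listing("Beijing", "sale", 3, 4_000_000, "crawl"),
]


def search_listings(listings, city, kind, max_price):
    """Return the items whose fields satisfy every constraint."""
    return [
        item for item in listings
        if item.city == city and item.kind == kind and item.price <= max_price
    ]


matches = search_listings(LISTINGS, city="Beijing", kind="rent", max_price=5000)
for item in matches:
    print(item)
```

A page-level keyword search for "Beijing rent" could at best return pages that mention both words. Because the data has been reprocessed into fields, the query here can also constrain the price, which is the finer granularity the text describes.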

Partner Links

Articles on Web Data Extraction

北京亚库
	Telephone QQ supplier

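If you have a website, we would be glad to exchange links with you :-) Please send us your site's information in the same form as ours: Title: 乐思软件 - Professional provider of web data extraction services and software; URL: http://www.knowlesys.com; Description: Professional software for web information collection and integration, including web data extraction, website content extraction, and online news collection.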