cse-sql-api


CSE-SQL-API (Cross Search Engine SQL-API) is an SQL-like programmatic interface for various search services.

(Updated on Jan 11, 2011)

General description

CSE-SQL-API is a software module that retrieves search results from various search services without using their APIs. It provides a common SQL interface for Java development, so applications can access search services the same way they access databases.
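To illustrate what "accessing a search service like a database" could look like, here is a minimal sketch. All names in it (`SearchRow`, `SearchResultSet`, `StubSearchStatement`, the `FROM web WHERE keyword = '...'` query form) are hypothetical and are not taken from CSE-SQL-API itself. The stub returns canned rows instead of contacting any service; it only mirrors the java.sql pattern of statement, result set, and row-by-row iteration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: these types are NOT the actual CSE-SQL-API classes.

/** One search hit, analogous to a row in a java.sql.ResultSet. */
final class SearchRow {
    final String title;
    final String url;
    SearchRow(String title, String url) { this.title = title; this.url = url; }
}

/** Cursor over search hits, used like java.sql.ResultSet: while (rs.next()) ... */
final class SearchResultSet {
    private final Iterator<SearchRow> it;
    private SearchRow current;
    SearchResultSet(List<SearchRow> rows) { this.it = rows.iterator(); }

    boolean next() {
        if (!it.hasNext()) return false;
        current = it.next();
        return true;
    }

    String getString(String column) {
        if (current == null) throw new IllegalStateException("call next() first");
        switch (column) {
            case "title": return current.title;
            case "url": return current.url;
            default: throw new IllegalArgumentException("unknown column: " + column);
        }
    }
}

/** Stub statement: reads the keyword from an assumed query form, returns canned rows. */
final class StubSearchStatement {
    // Assumed query form: SELECT title, url FROM web WHERE keyword = '<words>'
    SearchResultSet executeQuery(String sql) {
        int start = sql.indexOf('\'');
        int end = sql.lastIndexOf('\'');
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("no keyword literal in: " + sql);
        }
        String keyword = sql.substring(start + 1, end);
        List<SearchRow> rows = new ArrayList<>();
        // A real accessor would fetch and parse a search result page here.
        rows.add(new SearchRow("Result 1 for " + keyword, "http://example.com/1"));
        rows.add(new SearchRow("Result 2 for " + keyword, "http://example.com/2"));
        return new SearchResultSet(rows);
    }
}

class SqlLikeSearchDemo {
    /** Runs the query and collects the "title" column of every row. */
    static List<String> titles(String sql) {
        SearchResultSet rs = new StubSearchStatement().executeQuery(sql);
        List<String> out = new ArrayList<>();
        while (rs.next()) out.add(rs.getString("title"));
        return out;
    }

    public static void main(String[] args) {
        for (String t : titles("SELECT title, url FROM web WHERE keyword = 'java sql'")) {
            System.out.println(t);
        }
    }
}
```

The calling code in `titles` has the same shape as a JDBC loop, so an application written this way could treat database rows and search hits uniformly.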

What you can do

  • Build your application that accesses search services ---> Using Google accessor / ACM accessor
  • Create search service accessors ---> Create custom accessor

Search services already implemented

CSE-SQL-API currently ships with two submodules for accessing the search services listed below. It also defines a general accessor interface that can be implemented for other services at low cost.

General web search

  • Google (http://google.co.jp) ---> Google accessor's page. No longer available, due to a possible conflict with the company's terms of service.

Academic paper search

  • ACM Portal (portal.acm.org) ---> ACM accessor

Search services that could be implemented

  • Bing
  • Yahoo! Search
  • IEEE Xplore
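Adding an accessor for another service could be sketched as follows. This is a minimal illustration under stated assumptions: `SearchServiceAccessor`, `ExampleAccessor`, and `AccessorRegistry` are hypothetical names, and the real interface defined by CSE-SQL-API may look quite different. The example accessor returns canned URLs rather than fetching any page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the actual CSE-SQL-API accessor interface.

/** Assumed contract that every search service accessor implements. */
interface SearchServiceAccessor {
    String serviceName();
    List<String> search(String keyword, int maxResults);
}

/** Stand-in for a new accessor; returns canned URLs instead of fetching pages. */
final class ExampleAccessor implements SearchServiceAccessor {
    public String serviceName() { return "example"; }

    public List<String> search(String keyword, int maxResults) {
        List<String> urls = new ArrayList<>();
        for (int i = 1; i <= maxResults; i++) {
            urls.add("http://example.com/" + keyword.replace(' ', '+') + "/" + i);
        }
        return urls;
    }
}

/** Looks up accessors by service name, so callers never depend on a concrete class. */
final class AccessorRegistry {
    private final Map<String, SearchServiceAccessor> accessors = new LinkedHashMap<>();

    void register(SearchServiceAccessor accessor) {
        accessors.put(accessor.serviceName(), accessor);
    }

    SearchServiceAccessor get(String name) {
        SearchServiceAccessor accessor = accessors.get(name);
        if (accessor == null) throw new IllegalArgumentException("no accessor for: " + name);
        return accessor;
    }
}

class AccessorDemo {
    public static void main(String[] args) {
        AccessorRegistry registry = new AccessorRegistry();
        registry.register(new ExampleAccessor());
        System.out.println(registry.get("example").search("sql api", 2));
    }
}
```

Keeping the contract this small is one way to make new accessors cheap to write: each service only has to supply its own fetching and parsing logic behind the same two methods.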

Install/How to start

See Install/How to start

Advantages

  1. Enables building applications that seamlessly access both DBMSs and search services - CSE-SQL-API can be operated in almost the same way as java.sql.
  2. No need to use each search engine's API - CSE-SQL-API doesn't require a user account for each search service's API (Google AJAX API, Yahoo! Search API, etc.), since CSE-SQL-API does NOT access each search service's API but parses the HTML of search result pages.

    http://cse-sql-api.googlecode.com/files/SystemDiagram_FocusOnApplication_CSE-SQL-API.jpg
    Figure 1. An application can iteratively access databases and search services through the same interface. As mentioned earlier, the Google.com accessor is no longer available, due to a possible conflict with the company's terms of service.
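The idea of parsing the HTML of a result page, rather than calling an API, can be sketched as below. This is only an illustration: the markup (`<a class="result" href="...">`) is invented for the example and does not match any real search service, and `ResultPageParser` is a hypothetical name, not a CSE-SQL-API class. The sketch uses a regular expression from the Java standard library; whether CSE-SQL-API parses pages this way is not stated here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: invented markup, not a real search result page.
class ResultPageParser {
    // Assumed markup for one hit: <a class="result" href="URL">TITLE</a>
    private static final Pattern HIT =
        Pattern.compile("<a\\s+class=\"result\"\\s+href=\"([^\"]+)\">([^<]*)</a>");

    /** Returns one {url, title} pair per hit found in the page. */
    static List<String[]> parse(String html) {
        List<String[]> hits = new ArrayList<>();
        Matcher m = HIT.matcher(html);
        while (m.find()) {
            hits.add(new String[] { m.group(1), m.group(2).trim() });
        }
        return hits;
    }

    public static void main(String[] args) {
        String page =
            "<div><a class=\"result\" href=\"http://example.com/a\">First hit</a></div>"
          + "<div><a class=\"result\" href=\"http://example.com/b\"> Second hit </a></div>"
          + "<div><a class=\"ad\" href=\"http://example.com/ad\">Sponsored</a></div>";
        // Only the two anchors with class "result" are reported; the "ad" link is skipped.
        for (String[] hit : parse(page)) {
            System.out.println(hit[1] + " -> " + hit[0]);
        }
    }
}
```

A parser like this depends entirely on the page's markup, so any change to the result page layout can break it without warning.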

Similar approaches

|Name|Language intended for app|License|Google.com|Bing|Yahoo! Search|Other services|
|:-------|:----------------------------|:----------|:-------------|:-------|:----------------|:-----------------|
|同時検索エンジン (simultaneous search engine), http://dooji.org/|(Unknown)|(Unknown)|API|API| | |
|Search Engine Wrapper (by TAN Yee Fan, Ph.D. student at National University of Singapore)|Java|GPL|API|API|API|Yahoo! Search BOSS API, Twitter Search API|
|Argos|(Unknown; download not accessible)|Apache|API|API|API|(All via API) MSN, Technorati, Feedster, Del.icio.us, Blogdigger|
|CSE-SQL-API|Java|GPL|W/o API|Implementation is possible|Implementation is possible|ACM Portal w/o API|

Software design

System diagram

http://cse-sql-api.googlecode.com/files/ModuleDiagram_CSE-SQL-API.jpg

The two rounded rectangles represent the CSE-SQL-API modules. The rectangle on the right represents the search service accessors; accessors for google.com and portal.acm.org have already been implemented, and anyone interested can implement accessors for other search services.

Technical limitation on google.com accessor

(As of Jan 11, 2011) Because of a technical restriction that stems from the structure of the search result page on google.com, the google.com accessor actually accesses google.co.jp, which (still) allows the way CSE-SQL-API collects search results. Unit testing showed that 93% of the results from the google.com accessor (which accesses google.co.jp) correspond to what google.com returns in a web browser.
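A correspondence figure like the one above could be computed along these lines. The exact measure used in the project's unit tests is not described here, so this sketch assumes one plausible definition: the fraction of URLs returned by the accessor that also appear in the list obtained from the browser. `ResultOverlap` and the sample URLs are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: one assumed way to measure result correspondence.
class ResultOverlap {
    /** Fraction (0.0 to 1.0) of accessor URLs that also appear in the browser list. */
    static double overlap(List<String> fromAccessor, List<String> fromBrowser) {
        if (fromAccessor.isEmpty()) return 0.0;
        Set<String> reference = new HashSet<>(fromBrowser);
        int matched = 0;
        for (String url : fromAccessor) {
            if (reference.contains(url)) matched++;
        }
        return (double) matched / fromAccessor.size();
    }

    public static void main(String[] args) {
        List<String> accessor = Arrays.asList(
            "http://example.com/1", "http://example.com/2",
            "http://example.com/3", "http://example.com/4");
        List<String> browser = Arrays.asList(
            "http://example.com/1", "http://example.com/2",
            "http://example.com/3", "http://example.com/9");
        // Three of the four accessor URLs also appear in the browser list.
        System.out.println(overlap(accessor, browser)); // prints 0.75
    }
}
```

This definition ignores ranking order; a stricter test could also compare the position of each URL in the two lists.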

Project Information

Labels:
API SearchEngine Yahoo ACM IEEE Java Bing