A new site called Open Test Search has created demos of several enterprise search engines and allows you to test them out online. Have a look at http://www.opentestsearch.com/ .
China is definitely going its own way when it comes to search. Data from the Chinese research firm iresearch.cn already showed that the local search engine Baidu took 73% of China’s search engine revenue in Q4 2010.
Today the plot thickened when the state-run China Mobile and the state-run news agency Xinhua together launched Panguso, a new Chinese internet-scale search engine.
This comes only days after the Chinese government agency “State Administration for Industry and Commerce” opened what appears to be China’s first anti-monopoly investigation into Baidu’s business practices, following a request from the Chinese encyclopedia website Hudong.com. Baidu may have lost the government’s favor.
Google has released a new version of the Search Appliance with a “Cloud Connect” feature that enables unified search of your Google Docs, Google Sites and Twitter from the GSA. This can then be merged with information about the people in the organization taken from LDAP/Microsoft AD, and with more traditional sources like files and email.
The Google Sites function is especially interesting because it allows you to make vertical search engines using the Google index. For example, one can create a collection of blogs and industry websites and see the results in the GSA, all without having to crawl them yourself. One could probably also add the whole Google index as a Google Site and display Google results together with your own. Great for searching for technical documentation that may have newer versions on the web.
Read more at http://googleenterprise.blogspot.com/2010/10/new-google-search-appliance-bridge-to.html
I have been a fan of Exalead for a long time. They have great technology that is normally not available in web search, like support for queries with regular expressions, wildcards, phonetic search and proximity search. See http://www.exalead.com/search/web/search-syntax/ for the full list.
Exalead is one of the few organizations that could compete with Google if they wanted to. They were recently acquired by Dassault Systèmes. Let’s cross our fingers in the hope that this means they will have the funding to make a run at the global search market. Even if they only managed to grab a 1% market share, they would have 1.5% of Google’s money. Which is a lot.
Unfortunately there are few public statistics about the size of their market share, but they appear to be big in France.
Read more about what is happening to Exalead here: http://blog.exalead.com/2010/07/26/exalead-at-the-forefront-of-search-and-innovation-in-europe/
Captchas generally (but not always) solve the problem of comment spam and other spam. But this comes at a price. Users with low vision and other disabilities find solving captchas hard, and blind users can’t solve them unless you provide an alternative audio captcha. Why, even Seth hates it!
I am not sure even humans can decide with 100% assurance that this is spam. I have been fighting this for a while. I recently came across this thesis written by Ben O’Connor: http://maths.dur.ac.uk/Ug/projects/library/CM3/000424248r.pdf (short version: http://www.fmnetwork.org.uk/files/spam.pdf ). We are looking into implementing it. If we have any luck I will post an update here.
Those who have been in the search industry a while probably remember AlltheWeb, the internet arm of the Norwegian enterprise search company Fast Search & Transfer. AlltheWeb never really took off, but did give Google a run for its money. Since 2004 it has been owned by Yahoo, but has had some kind of independent life of its own, apparently using the Yahoo index but displaying different search results and offering some more tools.
Today it appears that this is coming to an end. All searches on AlltheWeb.com are now being redirected to search.yahoo.com .
I think it is sad to see the last remainder of this internet pioneer disappear.
From time to time I see someone proclaiming their “new” search engine. For me, working with search technology, it is interesting to know whether this is a really new search engine, based on their own technology, or just a metasearch of Google/Yahoo/Bing.
To test for this you can search for “your ip” in the search engine. The search results will then include pages that show the IP address of the visitor. On a search engine result page, the address shown is the IP address of the crawler bot that fetched the page.
Lo and behold, the IP belongs to Google Inc: http://whois.domaintools.com/66.249.71.77 . Meaning this is a metasearch of Google. Normally not so interesting for me.
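If you run this test often, the lookup step can be roughly automated. Below is a minimal sketch in Python, not a finished tool: it pulls IPv4 addresses out of a result snippet and compares them against a small hand-maintained table of crawler networks. The table, the function names and the single Google range in it (66.249.64.0/19, the block the address above falls into) are my own assumptions for illustration. Check any match with whois before drawing conclusions.

```python
import ipaddress
import re

# Assumption: a hand-maintained table of crawler networks. The Google block
# below contains 66.249.71.77; extend and verify it yourself with whois.
KNOWN_CRAWLER_NETWORKS = {
    "Google": [ipaddress.ip_network("66.249.64.0/19")],
}

# Loose pattern for dotted-quad candidates; real validation happens below.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_ips(snippet):
    """Return the valid IPv4 addresses mentioned in a result snippet."""
    found = []
    for candidate in IPV4_PATTERN.findall(snippet):
        try:
            found.append(ipaddress.ip_address(candidate))
        except ValueError:
            # Looked like an address but is not one (for example 999.1.1.1).
            continue
    return found


def guess_index_owner(snippet):
    """Return the name of the first known crawler owner whose network
    contains an address in the snippet, or None if nothing matches."""
    for address in find_ips(snippet):
        for owner, networks in KNOWN_CRAWLER_NETWORKS.items():
            if any(address in network for network in networks):
                return owner
    return None
```

You would paste in the snippet text from a “your ip” result and call `guess_index_owner(snippet)`. A result of `None` only means the address is not in the table, not that the engine has its own index.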
From time to time I meet developers who are contemplating writing their own search engine from scratch.
At least, writing a successful web search engine is hard. It is like doing many startups at once. There are currently startups working on labeling spam, on data clusters, on cloud storage, on distributed search and on hardware monitoring.
As a search company you will have to do all of these parts, and preferably be as good as the big players. Anna Patterson has written a good article about this. It is from 2004, but I still feel it is relevant.
Anna Patterson, Why Writing Your Own Search Engine Is Hard:
http://queue.acm.org/detail.cfm?id=988407
There is also a thread on this at Sirdf: http://www.sirdf.com/forum/viewtopic.php?t=5
Bjørn Olstad, CEO at Fast, posted yesterday that the next version of Fast ESP will not run on Linux or Unix:
http://blogs.msdn.com/b/enterprisesearch/archive/2010/02/04/innovation-on-linux-and-unix.aspx .
This leaves a lot of users without an upgrade path, and can be a great opportunity for the competition.