Search Engine Archives

January 25, 2007

The new askx.com beta site

The new beta site http://www.askx.com/ is pretty impressive overall: fast results, an AJAX-y UI, thumbnail previews, faceted search, vertical search, and so on.  Only "Save to MyStuff" is a bit confusing at first.  After adding a few items to MyStuff, I expected to be able to see all my stuff by clicking some tab, but it turns out I have to click one of the items that I added to MyStuff.  Not very intuitive.  The space utilization is not very efficient either: the main search result column in the middle is narrow, and often the two side columns are half empty.

February 11, 2007

Customized search box and live search macro

Windows Live Search macros let you build your own customized search engines. You can share your macros in the public directories or put them on your own websites or blogs as a search box. For example, if you have multiple websites, say two technical blogs, one personal blog, two bookmarking feeds, and so on, you can build a federated search engine that covers all your sites and feeds. Put it on your personal website or blog so that your readers and fans can search within just your sites and feeds. Isn’t that cool? You can also build search engines that cover your favorite entertainment websites or technical reference sites, and so on.
Here is how I created mine (which you can see on the right sidebar of my blog). Go to http://search.live.com/macros, click “Get started”, choose “Advanced”, then type in the following macro. Replace the sample URLs with your own, otherwise you are sending traffic to me :)
(
site:stanblog.jojoyao.com OR
site:stanyrocks.spaces.live.com OR
site:clipmarks.com/clipper/stanyao/
)
OR
(
link:stanblog.jojoyao.com OR
link:stanyrocks.spaces.live.com
)
The above macro searches all your sites plus other sites that link to yours (so you can find out what others are saying about your sites).  Try a few test searches with some keywords to verify your macro, then name it (we will use the name later) and save it. Next, go to http://search.live.com/siteowner?mkt=en-us, click “Advanced Search Box – Get Started”, uncheck “Site search”, check “Search macro”, type in the name of the macro we just saved and a more descriptive display name, and we are about done. After clicking “Next” you will see the search box code that you can add to your site.
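If you maintain many sites, typing the macro by hand gets tedious. Here is a minimal sketch in C# that assembles the same kind of macro text from two lists of hosts. The names MacroBuilder and Build are made up for this illustration (they are not part of any Live Search API), and I am assuming the macro editor accepts the query on a single line just as it accepts the multi-line version above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of Live Search: it only builds the macro text.
public static class MacroBuilder
{
    // Builds "(site:a OR site:b) OR (link:c OR link:d)".
    // A group is left out entirely when its list is empty.
    public static string Build(IEnumerable<string> sites, IEnumerable<string> linkTargets)
    {
        var groups = new List<string>();
        AddGroup(groups, "site:", sites);
        AddGroup(groups, "link:", linkTargets);
        return string.Join(" OR ", groups.ToArray());
    }

    // Prefixes every host with the operator and wraps the terms in parentheses.
    private static void AddGroup(List<string> groups, string op, IEnumerable<string> hosts)
    {
        string[] terms = hosts.Select(h => op + h).ToArray();
        if (terms.Length > 0)
        {
            groups.Add("(" + string.Join(" OR ", terms) + ")");
        }
    }
}
```

Calling MacroBuilder.Build with the three site entries and the two link entries from my macro gives one line of text that you can paste into the "Advanced" editor.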

February 18, 2007

Build your own video search on Yahoo Pipes

Yahoo Pipes is a recently released, innovative product. The name explains itself: it's the web counterpart of UNIX pipes. UNIX pipes help connect multiple command line tools to produce a new customized tool, while Yahoo Pipes helps connect the countless web services out there and greatly boosts their power. By wiring up services on the web, you can create amazing new products of your own.

The pipe network consists of modules (which do processing such as filtering and sorting) and pipes (which direct the data flow). RSS and text flow through the pipes and produce the final result in RSS format. Here is an easy example to show how it works. Let's build a video search that covers YouTube and Soapbox and sorts the results based on rating and play count.

We need two user inputs: a keyword and a minimum required rating. The Soapbox search is the easier one. Do a search on Soapbox and copy the RSS URL (notice the parameters on the URL). Create a URLBuilder in Pipes and paste the URL into the Base field; it will automatically parse the parameters and fill them in below the Base. Then fetch the results with a Fetch module and direct them into a filter on the rating. Finally, sort on the rating column and send the result into the Pipe Output module. YouTube is a little trickier, because they don't support parameters on the RSS URL, but they do have a tag-based URL, explained here. Fortunately Yahoo Pipes has a String Concatenate module, which can be used to join the three pieces: "feed://www.youtube.com/rss/tag/", the user's search keyword, and ".rss". Connect the resulting URL to the Fetch module too and we are done. One small problem, though: because the YouTube feed doesn't supply rating information, the sorting only affects the result items from Soapbox. Now try this new video search here.
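For readers who think better in code than in wires, here is a rough C# sketch of what the pipe does after the feeds are fetched. It is not Yahoo Pipes code. VideoItem, VideoPipe, and the method names are invented for this illustration, escaping the keyword is my own addition, and letting unrated items pass the filter and sort to the end is my assumption about how the YouTube items end up in the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical feed item; Rating is null when the feed supplies none (as with YouTube).
public class VideoItem
{
    public string Title;
    public string Source;
    public double? Rating;
}

public static class VideoPipe
{
    // String Concatenate step: base URL + keyword + ".rss".
    // Escaping the keyword is an assumption, not something the pipe does explicitly.
    public static string BuildYouTubeFeedUrl(string keyword)
    {
        return "feed://www.youtube.com/rss/tag/" + Uri.EscapeDataString(keyword) + ".rss";
    }

    // Filter step, then Sort step: drop rated items below minRating and
    // order by rating, highest first. Unrated items are kept and, because
    // OrderByDescending is a stable sort, stay in feed order at the end.
    public static List<VideoItem> FilterAndSort(IEnumerable<VideoItem> items, double minRating)
    {
        return items
            .Where(v => !v.Rating.HasValue || v.Rating.Value >= minRating)
            .OrderByDescending(v => v.Rating ?? double.MinValue)
            .ToList();
    }
}
```

The sort leaves the unrated items untouched relative to each other, which is the same limitation the pipe has: only the Soapbox results are really ranked.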

I suggest that Yahoo Pipes also include the following features:

  1. When several sources are Union'ed, their items are listed in the order of the sources.  So if the sources at the front have pages of items, it's very difficult to surface the items from the source at the end.  It'd be helpful to have a Shuffle module that shuffles the items from multiple sources when doing the Union.  Ideally some sort of weighting function could be specified to differentiate the relative importance of each source.
  2. When clicking on a pipe, highlight the two modules it connects.
  3. When clicking on a module, highlight its directly connected modules.
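To make the first suggestion concrete, here is a small C# sketch of one possible weighting scheme. Note that it is a deterministic weighted round-robin rather than a random shuffle, which I picked only because it is easy to reason about. WeightedUnion and Interleave are hypothetical names, and nothing like this exists in Yahoo Pipes as far as I know.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a weighted Union; not a Yahoo Pipes module.
public static class WeightedUnion
{
    // Interleaves the sources round by round. In each round, source i
    // contributes up to weights[i] of its remaining items, in original order.
    public static List<T> Interleave<T>(IList<IList<T>> sources, IList<int> weights)
    {
        if (sources.Count != weights.Count)
            throw new ArgumentException("One weight is required per source.");
        foreach (int w in weights)
        {
            if (w < 1) throw new ArgumentException("Weights must be at least 1.");
        }

        var result = new List<T>();
        var next = new int[sources.Count];   // index of the next unread item per source
        bool tookAny = true;
        while (tookAny)
        {
            tookAny = false;
            for (int s = 0; s < sources.Count; s++)
            {
                for (int k = 0; k < weights[s] && next[s] < sources[s].Count; k++)
                {
                    result.Add(sources[s][next[s]++]);
                    tookAny = true;
                }
            }
        }
        return result;
    }
}
```

With weights 2 and 1, the first source gets two slots for every one slot of the second source, so items from the last source start showing up on the first page instead of after everything else.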

March 1, 2007

Google custom search engine

Custom search is another interesting tool from Google.  Compared with Live Search macros and Yahoo Pipes, I think letting the public contribute good sites to cover, along with a handy "marker" tool, is a big plus.  This helps take advantage of the community.  But some of the more powerful operators (like those in Live Search macros and Yahoo Pipes) are missing.  Another idea I have is to provide integration with bookmarking tools like del.icio.us, so that all the sites I bookmark are added automatically to the custom search site list.  Here is a prototype of the site extraction part.  Two birds with one stone.  Hey, I just created one of these custom searches covering my sites.  Try it on the right sidebar.  Here is some interesting discussion from Matt Cutts.

March 22, 2007

Microsoft's Vista-Live Strategy Already Impacting Google

Windows Vista is finally out and along with that Microsoft seems to have kick-started its Vista-Live joint initiative. This initiative aims to push Microsoft's new web properties in tandem with their dominant Windows operating system - and so become a leader in the web industry as well. Basically this means that Microsoft makes its Windows Live web properties the default in Windows Vista PCs, where possible - for example Live Search is the default search engine in IE7 on new Vista machines.

 

Measuring Vista's impact on Microsoft web properties

[Charts: Live.com, MSN.com]

Measuring Vista's impact on Google web properties

[Chart]

As a final note, Alexa competitor Compete's results are in parallel with ours. According to Compete, these giants all have an increasing trend in page views, but Google's slope is apparently lower than those of MSN and Live.

Users of Vista or IE7 can not only install the Google toolbar with a single click when they go to the Google homepage, but can also easily add Google as the default search provider without even visiting the Google homepage or installing the Google toolbar. As shown in the picture in this article, click the little drop-down right next to the search button and choose "Find More Providers...", where Google is listed. Simply click it to add Google, and you can also set it as the default.

In terms of search relevancy, I still think Google is better, especially when I look for coding-related reference materials. Google is faster at indexing a newly created web page and adjusts its rankings more frequently and more sensitively. Live Search, however, is better in the overall search experience. For example, the color theme looks more comfortable. The "related searches" on the right side of the result page are very helpful. The image search is very nice, with its infinite scrolling bar and scratch pad. In the academic search, users can adjust the level of detail in the snippets. The 3D map is also impressive. (Try adding 3D maps to your Movable Type blog using my plug-in.) At this moment, for average users, slightly weaker relevancy might not be that obvious, while the overall search experience and the richness of the features might be more attractive.

It will be a very tough job to compete with Google in the search engine market. They quietly accumulated 5+ years of experience before they became the hot company, and search is their major business. Brand recognition is also on Google's side. Keep improving, Live Search! Competition makes the industry advance more rapidly. If you are interested, you can check my Internet Dashboard page, where I've aggregated some real-time search engine comparison diagrams, and I'll keep adding more related charts and diagrams there.

April 17, 2007

Use Windows Desktop Search API Inside of Managed Code

Starting with Windows Desktop Search (WDS) 3.0, a new helper API, ISearchQueryHelper, was added to simplify the construction of the index OLE DB connection string and of search queries. The search service is implemented, and its API exposed, as COM objects. Fortunately, there is a way for managed code to invoke WDS too.
Download the Windows Search SDK. After unzipping the package, you will find the search interop assembly (Microsoft.Search.Interop.dll) in the "Managed" folder. Add it to the references of your project and put “using Microsoft.Search.Interop; ” at the top of your C# code. Now you are ready to write C# code that makes use of the WDS API. Some of the unmanaged interfaces and their managed counterparts are the following:
Unmanaged                Managed
ISearchManager           CSearchManager
ISearchCatalogManager    CSearchCatalogManager
ISearchQueryHelper       CSearchQueryHelper
 
The following sample code constructs a search query using ISearchQueryHelper to display the sender, recipients, and summary of the first 100 emails that contain a particular keyword (“share” in this example).
// Requires these directives at the top of the file:
//   using System;
//   using System.Data.OleDb;
//   using Microsoft.Search.Interop;
// textBox1 is assumed to be a multiline TextBox on the form.
const int maxResults = 100;    // maximum number of emails to list
const int maxSumLength = 100;  // maximum number of summary characters to show

// Set up the catalog and the search query helper
CSearchManager srchMngr = new CSearchManager();
CSearchCatalogManager srchCatMngr = srchMngr.GetCatalog("SystemIndex");
CSearchQueryHelper srchQueryHelper = srchCatMngr.GetQueryHelper();

// Assemble the query: select sender, recipients, and summary; restrict to emails
srchQueryHelper.QuerySelectColumns = "System.Message.FromAddress, System.Message.ToAddress, System.Search.AutoSummary";
srchQueryHelper.QueryWhereRestrictions = "AND CONTAINS(System.Kind, '\"email\"')";
string sqlQuery = srchQueryHelper.GenerateSQLFromUserQuery("share");

// Open the OLE DB connection and execute the query. The using blocks
// close the reader and the connection even if an exception is thrown.
using (OleDbConnection conn = new OleDbConnection(srchQueryHelper.ConnectionString))
{
    conn.Open();
    using (OleDbCommand cmd = new OleDbCommand(sqlQuery, conn))
    using (OleDbDataReader srchResult = cmd.ExecuteReader())
    {
        // Process the search result and send it to the output
        for (int i = 0; (i < maxResults) && srchResult.Read(); ++i)
        {
            string fromAddr = "";
            string toAddr = "";
            string summary = "";

            // The address columns are multi-valued and come back as string arrays
            if (!srchResult.IsDBNull(0))
            {
                fromAddr = "From: " + String.Join(",", (string[])srchResult.GetValue(0));
            }
            if (!srchResult.IsDBNull(1))
            {
                toAddr = "To: " + String.Join(",", (string[])srchResult.GetValue(1));
            }
            if (!srchResult.IsDBNull(2))
            {
                summary = srchResult.GetString(2);
            }

            textBox1.AppendText("(" + i + ") ");
            textBox1.AppendText(fromAddr + " " + toAddr + " ");
            textBox1.AppendText("Content: " + (summary.Length <= maxSumLength ? summary : summary.Substring(0, maxSumLength)));
            textBox1.AppendText(Environment.NewLine);
        }
    }
}
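The formatting step inside the loop is the only part of the sample that does not need the search index, so it can be pulled out and tried on its own. Here is a small sketch of that idea. ResultFormatter and FormatRow are names I made up for this example; they are not part of the WDS API. A null argument stands in for a column that came back as DBNull.

```csharp
using System;

// Hypothetical helper that mirrors the output formatting of the sample loop.
public static class ResultFormatter
{
    // Builds one output line: "(index) From: ... To: ... Content: ...".
    // from, to, and summary may be null when the column had no value.
    public static string FormatRow(int index, string[] from, string[] to, string summary, int maxSumLength)
    {
        string fromAddr = (from == null) ? "" : "From: " + String.Join(",", from);
        string toAddr = (to == null) ? "" : "To: " + String.Join(",", to);

        // Truncate long summaries to maxSumLength characters
        string text = summary ?? "";
        if (text.Length > maxSumLength)
        {
            text = text.Substring(0, maxSumLength);
        }

        return "(" + index + ") " + fromAddr + " " + toAddr + " Content: " + text;
    }
}
```

Inside the loop you would then pass the values read from srchResult to FormatRow and append the returned string, followed by Environment.NewLine, to textBox1.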
           

April 29, 2007

Usability, Usability, Usability

Let me begin with a few of the many good and bad examples of usability.  In Google search, when you click the "search" button after typing in the keywords, a tip appears at the top of the search results saying "Tip: Save time by hitting the return key instead of clicking on 'search' ".  This approach has a light footprint yet is effective.  The users see the tip right after they did a "stupid" thing.  They have the motivation to try the tip right on the spot and find that it really helps.  This whole process of stupidity->gotcha->cool happens in one smooth flow, and I believe the user will remember the tip for good.

In Live Search, when users type in a non-English keyword, very few of the snippets/summaries for the non-English sites are shown in the site's own language, which is not very friendly to international users.  Those who type in a non-English keyword most likely know that language well enough to read the results in that language.  Showing the summary in the website's native language is a very natural thing to do and is what users expect.  Even if the search engine targets the NA market, there are a large number of people from abroad in NA, so some degree of international language support makes sense.

The importance of usability is elevated to an even higher level in the Web 2.0 era.  The center of computing is shifting from computers to human users.  After many years of advancement, computers are beginning to fade into the background as a commodity.  The real power lies in organizing and combining the exploding amount of information that is generated by people and eventually used by people.  However, people are "lazy", "impatient", and "greedy".  Keeping the human users satisfied becomes more and more critical to the success of software and service providers.  The business model and the vehicle for delivering software and services in the Web 2.0 era are so unique that they contribute to "spoiling" the users too.  The barrier to entering the Web 2.0 industry is so low that a few good programmers with a good idea can build a very successful website.  The services of a huge number of those websites are very innovative, powerful, and, more importantly, free most of the time thanks to the online ads business model.  The release and delivery of newer and nicer online services has very little impact on the users, with no hassle of installation, upgrading, or troubleshooting on the client side.  With all these things together, the users end up being the spoiled king!  They have so many choices, and the choices are free, so they can switch to any better product without much difficulty.  Now it's time for software and service providers to worry about the user experience their products bring to the customers.

One big source of resistance to achieving good usability is the software developers themselves.  Most of them are passionate about technology and about tackling challenging problems, while feeling bored by non-technical stuff.  Being customer obsessed and having the mindset of building software for humans instead of for machines is a much appreciated quality in an excellent software developer.  The new ways of organizing emails in Gmail, the search engine keyword spelling correction feature, and so on all came from developers who took this step.

About Search Engine

This page contains an archive of all entries posted to Stanley Yao's Blog in the Search Engine category. They are listed from oldest to newest.
