| With thousands of servers and billions of web pages, the | | | | the subject is unfamiliar. Similarly, the concept-based |
| possibility of individually sifting through the WWW | | | | search of Excite (in which, instead of matching individual |
| is nil. The search engine gods cull the information | | | | words, the engine groups the words that you enter |
| you need from the Internet, from tracking down | | | | and attempts to determine their |
| an elusive expert for communication to presenting | | | | meaning) is a difficult task and yields inconsistent |
| the most unconventional views on the planet. | | | | results. |
| Name it and click it. Beyond all the hype created | | | | |
| about the web heavens they rule, let's attempt to | | | | |
| keep the argument balanced. From Google to | | | | |
| Voice of the Shuttle (for humanities research), | | | | Besides, who reviews or evaluates these sites for |
| these ubiquitous gods that enrich the net can be | | | | quality or authority? They are simply compiled by |
| unfair, and they do have pitfalls. And considering the | | | | a computer program. These active search engines |
| rate at which the Internet continues to grow, the | | | | rely on computerized retrieval mechanisms called |
| problems of these gods will only be | | | | "spiders", "crawlers", or "robots" to visit Web |
| exacerbated. | | | | sites on a regular basis and retrieve relevant |
| | | | keywords to index and store in a searchable |
| | | | database. This huge database often yields |
| | | | unmanageable, overly comprehensive |
| Primarily, what you need to digest is the fact that | | | | results, results whose relevance is determined by |
| search engines fall short of Mandrake's magic | | | | their computers. The irrelevant sites (high |
| mechanism! They simply don't create URLs out of | | | | percentage of noise, as it's called), questionable |
| thin air but instead send their spiders crawling | | | | ranking mechanisms and poor quality control may |
| across those sites that have offered prayers | | | | be the result of too little human involvement to weed |
| (and expensive offerings!) to them for | | | | out junk. Thought human intervention would solve |
| consideration. Even when sites like Google claim to | | | | all problems? Read on. |
| have a massive 3 billion web pages in their | | | | |
| databases, a large portion of the web nation is | | | | |
| invisible to these spiders. To think, they are simply | | | | |
| ignorant of the Invisible Web! This invisible web | | | | From the very first search engine, Yahoo, to |
| holds the content that normal search engines can't | | | | about.com, Snap.com, Magellan, NetGuide, Go |
| index because the information on many web sites | | | | Network, LookSmart, NBCi and Starting Point, all |
| is in databases that are only searchable within that | | | | subject directories index and review documents |
| site. Sites like The Internet Movie Database, | | | | under categories, making them more |
| IncyWincy (the invisible web search engine) and | | | | manageable. Unlike active search engines, these |
| The Complete Planet, which cover this area, are | | | | passive or human-selected search engines |
| perhaps the only way you can access content | | | | don't roam the web directly and are human |
| from that portion of the Internet invisible to the | | | | controlled, relying on individual submissions. They are perhaps |
| search gods. Here, you don't perform a direct | | | | the easiest to use in town, but the indexing |
| content search but search for the resources that | | | | structure of these search engines covers only a small |
| may access the content. (Meaning: be sure to | | | | portion of the actual number of WWW sites and |
| set aside considerable time for digging.) | | | | thus is certainly not your best bet if you intend to research |
| | | | specific, narrow or complex topics. Subject |
| | | | designations may be arbitrary, confusing or wrong. |
| | | | A search looks for matches only in the |
| None of the search engines indexes everything on | | | | descriptions submitted. Directories never contain the full text of |
| the Web (I mean none). Tried finding research literature | | | | the web pages they link to; you can only search what |
| on popular search engines? From AltaVista to Yahoo, they will | | | | you see: titles, descriptions, subject categories, |
| list thousands of sources on education, human | | | | etc. The labor-intensive human process limits database |
| resource development, etc., but mostly from | | | | currency, size, rate of growth and timeliness. You |
| magazines, newspapers, and various organizations' | | | | may have to branch through the categories |
| own Web pages, rather than from research | | | | repeatedly before arriving at the right page. They |
| journals and dissertations, the main sources of | | | | may be several months behind the times because |
| research literature. That's because most of the | | | | of the need for human organization. Try looking |
| journals and dissertations are not yet available | | | | for some obscure topic: chances are the people |
| publicly on the Web. Thought they'd get you all | | | | who maintain the directory have excluded |
| that's hosted on the web? Think again. | | | | those pages. Obviously, machines can blindly count |
| | | | keywords but they can't make common-sense |
| | | | judgements as humans can. But then why do |
| | | | human-edited directories respond with all this |
| The Web is huge and growing exponentially. | | | | junk?! |
| Simple searches, using a single word or phrase, will | | | | |
| often yield thousands of "hits", most of which will | | | | |
| be irrelevant. A layman going to the Internet | | | | |
| for a piece of info has to deal with a more severe | | | | And now, about those meta search engines. A |
| issue: too much information! And if you don't | | | | comprehensive search on the entire WWW using |
| learn how to control the information overload | | | | The Big Hub, Dogpile, Highway61, Internet Sleuth |
| from the websites returned by a search, | | | | or Savvysearch, covering as many documents |
| roll out the red carpet for some frustration. A | | | | as possible, may sound as good an idea as one-stop |
| very common problem results from sites that | | | | shopping. Meta search engines do not create |
| have a lot of pages with similar content. For example, if | | | | their own databases. They rely on existing active |
| a discussion thread (in a forum) goes on for a | | | | and passive search engine indexes to retrieve |
| hundred posts, there will be a hundred pages all | | | | search results. And the very fact that they |
| with similar titles, each containing a wee bit of | | | | access multiple keyword indexes lengthens their |
| information. Now, instead of just one link, all | | | | response time. It sure does save you time by |
| hundred of those darn pages will crop up in your | | | | searching several search engines at once, but at |
| search results, crowding out other relevant sites. | | | | the expense of redundant, unwanted and |
| Regardless of all the sophistication technology has | | | | overwhelming results and, worse, important |
| brought in, many well thought-out search phrases | | | | misses. The default search mode differs from |
| produce list after list of irrelevant web pages. The | | | | search site to search site, so the same search is |
| typical search still requires sifting through dirt to | | | | not always appropriate in different search engine |
| find the gold. If you are not specific enough, you | | | | software. The quality and size of the databases |
| may get too many irrelevant hits. | | | | vary widely. |
| | | | |
| | | | |
| | | | |
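The redundancy of meta search results and the crowding-out by near-identical pages described above can both be illustrated with a small merging routine. What follows is only a minimal sketch, not how any of the named services work: the function names, the `max_per_host` cap and the example URLs are all assumptions made for illustration. It interleaves the ranked lists returned by several engines, drops links that differ only by a `www.` prefix or a trailing slash, and limits how many pages a single site may contribute.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Return (host, key) so that near-identical links compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return host, key


def merge_results(result_lists, max_per_host=2):
    """Interleave ranked URL lists, skipping duplicates and crowded hosts."""
    seen = set()      # normalized keys already emitted
    per_host = {}     # host -> number of links emitted so far
    merged = []
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        # Take the rank-th hit from every engine before moving down a rank.
        for results in result_lists:
            if rank >= len(results):
                continue
            url = results[rank]
            host, key = normalize(url)
            if key in seen:
                continue  # the same page was already returned by another engine
            if per_host.get(host, 0) >= max_per_host:
                continue  # this site has already had its share of slots
            seen.add(key)
            per_host[host] = per_host.get(host, 0) + 1
            merged.append(url)
    return merged


# Hypothetical result lists from two engines.
engine_a = [
    "http://www.example.com/sari/",
    "http://forum.example.org/thread/1",
    "http://forum.example.org/thread/2",
    "http://forum.example.org/thread/3",
]
engine_b = [
    "http://example.com/sari",
    "http://forum.example.org/thread/1",
    "http://textiles.example.net/drape",
]
print(merge_results([engine_a, engine_b]))
```

With the default cap of two pages per host, the third forum page is dropped and the duplicate listings from the second engine are skipped. Raising `max_per_host` trades tidiness for completeness.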
| As said, these search engines do not actually | | | | Weighted Search Engines like Ask Jeeves and |
| search the web directly but the database on their centralized | | | | RagingSearch allow the user to type queries in |
| server instead. And unless this database is | | | | plain English without advanced searching |
| updated continually to index modified, moved, | | | | knowledge, again at the expense of inaccurate |
| deleted or renamed documents, you will land | | | | and undetailed searching. Review or ranking |
| yourself amidst broken links and stale copies of | | | | sources like Argus Clearinghouse ( (eblast.com) |
| web pages. So if they inadequately handle | | | | and Librarian's Index to the Internet (lii.org) |
| dynamic web pages whose content changes | | | | evaluate website quality from sources they find |
| frequently, chances are the information they | | | | or accept submissions from, but cover a minimal |
| reference will quickly go out of date. After they | | | | number of sites. |
| wage their never-ending war with over-zealous | | | | |
| promoters (spamdexers, rather), where do they | | | | |
| have time to keep their databases current and | | | | |
| their search algorithms tuned? No surprise if a | | | | As a webmaster, your site registration with the |
| perfectly worthwhile site goes unlisted! | | | | biggest billboards in Times Square can get you |
| | | | closer to bingo! for the searcher. Those who didn't |
| | | | even know you existed before are in your living |
| | | | room in New York time! |
| Similarly, many of the Web search engines are | | | | |
| undergoing rapid development and are not well | | | | |
| documented. You will have only an approximate | | | | |
| idea of how they work, and unknown | | | | Your URL registration is a no-brainer, considering |
| shortcomings may cause them to miss desired | | | | the traffic it sends flocking to your site. |
| information. Not to mention that, amongst the | | | | It is certainly a quick and inexpensive method, yet it is |
| first-class information, the web also houses false, | | | | only a component of the overall marketing |
| misleading, deceptive and dressed-up information | | | | strategy that in itself offers no guarantees, no |
| actually produced by charlatans. The Web itself is | | | | instant results and demands continued effort from |
| unstable, and tomorrow they may not find you | | | | the webmaster. Commerce rules the web. As |
| the site they found you today. Well, if you could | | | | a notable Internet caveman put it, "Web |
| predict them, they would not be gods, would they?! | | | | publishers also find dealing with search engines to |
| The syntax (word order and punctuation) for | | | | be a frustrating pursuit. Everybody wants their |
| various types of complex searches varies somewhat | | | | pages to be easy for the world to find, but |
| from search engine to search engine, and small | | | | getting your site listed can be tough. Search sites |
| errors in the syntax can seriously compromise | | | | may take a long time to list your site, may never |
| the search. For instance, try the same phrase | | | | list it at all, and may drop it after a few months |
| search on different search engines and you'll know | | | | for no reason. If you resubmit often, as it is very |
| what I mean. Novices, read this line: using search | | | | tempting to do, you may even be branded a |
| engines does involve a learning curve. Many | | | | spamdexer and barred from a search site. And as |
| beginning Internet users, because of these | | | | for trying to get a good ranking, forget it! You |
| disadvantages, become discouraged and | | | | have to keep up with all the arcane and |
| frustrated. As a journalist put it, "Not showing | | | | ever-changing rules of a dozen different search |
| favoritism to its business clients is certainly a rare | | | | engines, and adjust the keywords on your pages |
| virtue in these times." Search engines have | | | | just so...all the while fighting against the very |
| increasingly turned to two significant revenue | | | | plausible theory that in fact none of this stuff |
| streams. Paid placement: In addition to the main | | | | matters, and the search sites assign rankings at |
| editorial-driven search results, the search engines | | | | random or by whim. |
| display a second - and sometimes third - listing | | | | |
| that's usually commercial in nature. The more you | | | | |
| pay, the higher you'll appear in the search results. | | | | |
| Paid inclusion: An advertiser or content partner | | | | "To make the best use of Web search |
| pays the search engine to crawl its site and | | | | engines--to find what you need and avoid an |
| include the results in the main editorial listing. | | | | avalanche of irrelevant hits-- pick search engines |
| So? You are more likely to be in the hit list, but then again, | | | | that are well suited to your needs. And lest you'd |
| no guarantees. Of course, those refusing to | | | | want to cry "Ye immortal gods! where in the |
| favor certain devotees are industry leaders like | | | | world are we?", spend a few hours becoming |
| Google, which publishes paid listings but clearly | | | | moderately proficient with each. Each works |
| marks them as 'Sponsored Links.' | | | | somewhat differently, most importantly in respect |
| | | | to how you broaden or narrow a search. |
| | | | |
| | | | |
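Since so much depends on how you broaden or narrow a search, a toy example may help. The sketch below is an assumption-laden illustration, not the syntax of any real engine: it builds a tiny keyword index over three made-up pages and shows that requiring all words (AND) narrows the hit list while accepting any word (OR) broadens it. The page names, texts and function names are invented for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search_all(index, words):
    """AND search: a page must contain every word (narrows the results)."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()


def search_any(index, words):
    """OR search: a page may contain any of the words (broadens the results)."""
    hits = set()
    for word in words:
        hits |= index.get(word.lower(), set())
    return hits


# Hypothetical pages standing in for a search engine's database.
pages = {
    "drape.html": "how to drape a sari",
    "resort.html": "sari health resort in indonesia",
    "energy.html": "regional energy cooperation initiative",
}
index = build_index(pages)

print(sorted(search_all(index, ["sari"])))            # one word
print(sorted(search_all(index, ["sari", "drape"])))   # narrowed with AND
print(sorted(search_any(index, ["sari", "energy"])))  # broadened with OR
```

Real engines differ in how they spell these operators and in which one they apply by default, which is exactly why the same query can behave differently from site to site.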
| The possibility of these 'for-profit' search gods | | | | |
| (which haven't yet made much profit) taking | | | | Finding the appropriate search engine for your |
| fees to skew their searches can't be ruled out. | | | | particular information need can be frustrating. To |
| But as a searcher, the hit list you are provided | | | | effectively use these search engines, it is |
| with by the engine should obviously be ranked in | | | | important to understand what they are, how they |
| order of relevancy and interest. Search command | | | | work, and how they differ. For example, while using a |
| languages can often be complex and confusing, | | | | meta search engine, remember that each engine |
| and the ranking algorithm is unique to each god, | | | | has its own methods of displaying and ranking |
| based on the number of occurrences of the | | | | results. Remember, search strategies affect the |
| search phrase in a page, whether it appears in the page | | | | results. If the user is unaware of basic search |
| title, or in a heading, or the URL itself, or the | | | | strategies, results may be spotty. |
| meta tags, etc., or on a weighted average of a | | | | |
| number of these relevance scores. For example, Google | | | | |
| uses its patented PageRank™ and ranks the | | | | |
| importance of search results by examining the | | | | Quoting Charlie Morris (the former editor of The |
| links that lead to a specific site. The more links | | | | Web Developer's Journal): "Search engines and |
| that lead to a site, the higher the site is ranked. | | | | directories survive, and indeed flourish, because |
| Pop on popularity! | | | | they're all we've got. If you want to use the |
| | | | wealth of information that is the Web, you've got |
| | | | to be able to find what you want, and search |
| | | | engines and directories are the only way to do |
| Alta Vista, HotBot, Lycos, Infoseek and MSN | | | | that. Getting good search results is a matter of |
| Search use keyword indexes, which give fast access to | | | | chance. Depending on what you're searching for, |
| millions of documents. The lack of an index | | | | you may get a meaty list of good resources, or |
| structure and the poor accuracy of estimates of the size of the | | | | you may get page after page of irrelevant drivel. |
| WWW will not make searching any easier. A large | | | | By laboriously refining your search, and using |
| number of sites are indexed. Keyword searching can | | | | several different search engines and directories |
| be difficult to get right. In reality, however, the | | | | (and especially by using appropriate specialty |
| prevalence of a certain keyword is not always in | | | | directories), you can usually find what you need in |
| proportion to the relevance of a page. Take this | | | | the end." |
| example. A search on sari, the national costume | | | | |
| of India, in a popular search engine returned | | | | |
| among its top sites the following links: | | | | |
| | | | Search engines are very useful, no doubt. Right |
| - of the Scottish Crop Research Institute | | | | from getting a quick view of a topic to finding |
| | | | expert contact info...verily certain issues lie in their |
| - a health resort in Indonesia | | | | lap. Now the very reason we bother about these |
| | | | search engines so much is because they're all |
| - The South Asia Regional Initiative for Energy | | | | we've got! Though there sure is a lot of room for |
| Cooperation and Development | | | | improvement, the need of the hour is to not get |
| | | | caught in the middle of the road. By simply |
| | | | understanding what, how and where to seek, |
| | | | you'd spare yourself the fate of chanting that old |
| Pretty useful sites for someone very much | | | | Jewish proverb "If God lived on earth, people |
| interested in knowing how to drape or the | | | | would break his windows." |
| tradition of the sari?! (Well, no prayer goes | | | | |
| unanswered...whether you like the answer or not!) | | | | |
| By using keywords to determine how each page | | | | |
| will be ranked in search results and not simply | | | | Happy searching! Liji is a postgraduate in Software |
| counting the number of instances of a word on a | | | | Science, with a flair for writing on anything under |
| page, search engines are attempting to make the | | | | the sun. She puts her dexterity to work, writing |
| rankings better by assigning more weight to | | | | technical articles in her areas of interest which |
| things like titles, subheadings, and so on. Now, | | | | include Internet programming, web design and |
| unless you have a clear idea of what you're | | | | development, ecommerce and other related |
| looking for, it may be difficult or impossible to use | | | | issues. |
| a keyword search, especially if the vocabulary of | | | | |
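The ranking signals this article describes (how often the search phrase occurs, whether it appears in the title, a heading or the URL, and how many links lead to a page) can be combined in a few lines. This is a rough sketch under stated assumptions, not any engine's actual algorithm and certainly not Google's PageRank: the weights, the `link_weight` parameter, the `Page` fields and the example pages are all hypothetical, and it uses a simple weighted sum plus a raw inbound-link count.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    title: str
    headings: str
    body: str


# Hypothetical weights: a hit in the title counts more than one in the body.
WEIGHTS = {"title": 3.0, "headings": 2.0, "url": 2.0, "body": 1.0}


def relevance(page, phrase):
    """Weighted count of how often the phrase occurs in each part of a page."""
    phrase = phrase.lower()
    return (
        WEIGHTS["title"] * page.title.lower().count(phrase)
        + WEIGHTS["headings"] * page.headings.lower().count(phrase)
        + WEIGHTS["url"] * page.url.lower().count(phrase)
        + WEIGHTS["body"] * page.body.lower().count(phrase)
    )


def inbound_link_counts(links):
    """Count links pointing at each URL, ignoring links from a page to itself."""
    counts = {}
    for source, target in links:
        if source != target:
            counts[target] = counts.get(target, 0) + 1
    return counts


def rank(pages, links, phrase, link_weight=0.5):
    """Order matching pages by keyword relevance plus a popularity bonus."""
    inbound = inbound_link_counts(links)
    scored = []
    for page in pages:
        score = relevance(page, phrase)
        if score > 0:  # pages that never mention the phrase are left out
            score += link_weight * inbound.get(page.url, 0)
            scored.append((score, page.url))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]


# Hypothetical pages and links.
pages = [
    Page("http://a.example/sari", "Sari draping guide", "How to drape a sari",
         "The sari is a garment. A sari can be draped in many ways."),
    Page("http://b.example/resort", "Sari resort", "", "A health resort in Indonesia"),
    Page("http://c.example/energy", "Energy", "", "Regional cooperation"),
]
links = [
    ("http://c.example/energy", "http://b.example/resort"),
    ("http://a.example/sari", "http://b.example/resort"),
]
print(rank(pages, links, "sari"))
print(rank(pages, links, "sari", link_weight=4.0))
```

With a small `link_weight` the draping guide comes first; with a large one the resort page, which merely has more links pointing at it, overtakes it. That is the "popularity" effect in miniature.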