Looking for specific information among the thousands of documents on the
Web can be overwhelming unless you have a search strategy. (It will also
help to have a few hints and tips and to know a few good secrets.)
Four basic ways to find information on the Web are:
- use starting point documents;
- search by content using subject trees and search engines;
- search by geographic location of Web sites; and
- use Web guides.
Starting Points Pages
A starting points page is usually designed with the information-searching
needs of a Web beginner in mind, providing interesting and useful URLs to
explore. It doesn't try to list every available resource on a subject, but,
instead, provides a few entry points for the beginning of your exploration.
The following sites have pages that are good starting points:
- 100 Hot Spots contains the 100 top Web sites.
- What's New maintains a list of new Web sites.
- NCSA's Starting Points for Internet Exploration contains hyperlinks to
many common Internet-based information resources.
Content-Oriented Search
A content-oriented search is useful for identifying links to sites
containing information about a specific topic. A
content-oriented search can be conducted using a subject tree or a
search engine.
A subject tree is an alphabetical list of Web resources, usually organized
with major headings leading to subtopics leading to
hyperlinks. A subject tree is similar to a library subject card catalog, but
many have the disadvantage of not being comprehensive and not conforming
to established library subject classification. Some of the more
comprehensive subject trees are:
- The WWW Virtual Library is a distributed subject catalog
whose different subjects are maintained by people at many sites.
- EINet GALAXY was developed by the Enterprise Integration Network (EINet),
a division of Microelectronics and Computer Technology Corporation.
- Yahoo offers the feature of building your own personalized subject tree.
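The heading-to-subtopic-to-hyperlink layout of a subject tree can be pictured as a nested structure. The following is only a minimal sketch: the headings, the example URLs, and the browse function are all hypothetical and do not describe how any of the sites above are built.

```python
# Hypothetical subject tree: major headings -> subtopics -> hyperlinks.
# All headings and URLs here are made up for illustration.
SUBJECT_TREE = {
    "Arts": {
        "Music": ["http://music.example/"],
    },
    "Science": {
        "Astronomy": ["http://astro.example/"],
        "Biology": ["http://bio.example/"],
    },
}

def browse(tree, *path):
    """Follow a path of headings and list what is found at that level."""
    node = tree
    for heading in path:
        node = node[heading]
    if isinstance(node, dict):
        return sorted(node)   # still at a heading level: list its subtopics
    return list(node)         # at the bottom level: list the hyperlinks

print(browse(SUBJECT_TREE))                          # major headings
print(browse(SUBJECT_TREE, "Science"))               # subtopics
print(browse(SUBJECT_TREE, "Science", "Astronomy"))  # hyperlinks
```

Each call narrows the view by one level, which mirrors clicking from a major heading down to the links themselves.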
Programs called robots, spiders, or wanderers "crawl" the Web, looking for
new URLs and adding them to a database. The database can then be searched
using a search engine.
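The crawl-then-search idea can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular robot works: the PAGES dictionary is a hypothetical in-memory stand-in for the Web (a real robot would fetch pages over the network), and the crawl and search_titles functions are invented names.

```python
from collections import deque

# Hypothetical in-memory "Web": each URL maps to (title, linked URLs).
PAGES = {
    "http://a.example/": ("Home", ["http://a.example/news", "http://b.example/"]),
    "http://a.example/news": ("News", ["http://a.example/"]),
    "http://b.example/": ("Gopher guide", []),
}

def crawl(start_url, pages):
    """Follow links breadth-first and record each new URL with its title."""
    database = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in database or url not in pages:
            continue  # already recorded, or the page cannot be reached
        title, links = pages[url]
        database[url] = title
        queue.extend(links)
    return database

def search_titles(database, word):
    """Return URLs whose recorded title contains the word (case-insensitive)."""
    word = word.lower()
    return sorted(url for url, title in database.items() if word in title.lower())

db = crawl("http://a.example/", PAGES)
print(search_titles(db, "gopher"))
```

This sketch searches titles only, which is one of the limitations a real search engine may have.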
The following are tips for conducting a successful search:
- Understand the limitations of the search engine. Some search engines
search only the titles of documents for search words; others search part of
the document's text; others search just the URL text.
- Start with a fairly specific list of search words in the text box.
Type the most important word first because some search engines give greater
weight to the first word in the query. Some search engines provide special
search operators (or search options) which help to confine the search to
relevant documents. Check your spelling and don't worry about capitalization.
If your search retrieves too many documents, narrow the search by typing
more search words. Reduce the number of search words if the search retrieves
too few documents. Do not use commonly occurring words (e.g., and,
the, a, an) or Web terms (e.g., http) among your search words.
- View the list of retrieved documents. In most lists the first
document will be the one with the highest match score, indicating
the most frequent usage of your search words.
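A match score based on word frequency, as described in the tips above, can be sketched as follows. This is a simplified assumption about how scoring might work, not the method of any specific search engine. The stop-word list, the sample documents, and the function names are all hypothetical.

```python
# Words the tips above suggest leaving out of a query.
STOP_WORDS = {"and", "the", "a", "an", "http"}

def query_terms(query):
    """Lowercase the query and drop common words and Web terms."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]

def match_score(text, terms):
    """Count how many times the search words occur in the text."""
    words = text.lower().split()
    return sum(words.count(t) for t in terms)

def rank(documents, query):
    """List matching documents, highest match score first."""
    terms = query_terms(query)
    scored = [(match_score(text, terms), name) for name, text in documents.items()]
    matches = [(score, name) for score, name in scored if score > 0]
    return [name for score, name in sorted(matches, key=lambda p: (-p[0], p[1]))]

# Hypothetical documents, for illustration only.
DOCS = {
    "doc1": "gopher menus and gopher servers",
    "doc2": "the web and a gopher gateway",
    "doc3": "ftp file archives",
}

print(rank(DOCS, "the Gopher servers"))
```

Adding a search word that appears in only one document narrows the list, and removing words widens it again.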
You will notice that many search engines also contain subject trees. Findspot
gives information and tips on how to conduct searches in several of the
following search engines:
Here are a few other sites that contain many different search engines:
Netscape provides easy access to several good search engines (Infoseek,
Lycos, and WebCrawler) and to lists of search engines (CUI W3 and CUSI)
with its Net Search button.
Geographic-Oriented Search
A geographic-oriented search works well when looking for sites in a
specific geographic area. Map or directory resources can be used to find
Web sites based on location.
Guide-Oriented Search
Guide-oriented searches are especially helpful for finding large
collections of subject-related links and often suggest new and unusual
Web pages.
When Things Go Wrong
What to do when:
- Your search produces no results.
  - Make sure your spelling is right.
  - If you use logical operators (Boolean operators), check your syntax.
  - Try to be less specific in your query.
  - Try another search engine.
- Your search produces too many results.
  - Try to be more specific.
  - Identify common words that are important to your search.
  - Try to think of words that uniquely identify what you're looking for.
- You're having problems with the server.
  The server might return an error message (or simply refuse the
  connection) if it's too busy or temporarily down.
  - No answer, Timed out, Too busy
    - Try again after a couple of minutes.
    - Wait until a less busy time of the day.
    - Avoid prime-time hours.
    - Try a mirror site (if any).
  - Error 404, Page/File not found
    There are many reasons for this to happen. The page may no longer
    exist, or the URL may have changed or may simply be invalid. Check the
    following and try again:
    - Double-check your URL and make sure it's entered correctly.
      Check the lower/upper case of each character.
    - Check that any non-alphanumeric symbols are correct.
    - If your URL has more than one directory level, try to move up in the
      tree (remove the last level and try again). Do so until you are at
      the top level.
  - Permission denied
    The site may be denying public access or may be configured so that only
    restricted access is allowed. Try again later. Sometimes access is
    restricted only during certain periods of the day.
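The advice for a "not found" error (remove the last directory level and try again until you reach the top level) can be sketched as a small helper. The function name parent_urls and the example URL are hypothetical. The sketch only builds the list of URLs to try and does not contact any server.

```python
from urllib.parse import urlsplit, urlunsplit

def parent_urls(url):
    """Yield shorter and shorter URLs by removing the last path level."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    while segments:
        segments.pop()                      # drop the last level
        path = "/" + "/".join(segments)
        if segments:
            path += "/"                     # keep directory-style trailing slash
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))

for candidate in parent_urls("http://www.example.edu/docs/guides/search.html"):
    print(candidate)
```

Each URL printed is one level higher in the tree than the one before, ending at the top level of the site.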
Accessing Other Internet Resources
Although the number of Web documents is increasing very quickly, they make
up only part of the publicly accessible files on the Internet. In the past,
several different tools (e.g., FTP, Gopher, WAIS) were used to access these
non-Web resources, but today graphical browsers can access them directly or
indirectly through gateways. A gateway is a Web page that serves as an
intermediary between Web browsers and Internet resources that aren't directly
accessible from the Web.
You can access each of the following with a graphical browser:
FTP File Archives - FTP stands for File Transfer Protocol. It can be used
to transfer files (upload or download) between Internet computers. A computer
on the Net that can be accessed using FTP is known as an FTP site. To access
an FTP site, use a URL that begins with ftp://. See CUinfo List of FTP
Sites for links to many FTP sites.
Gopher - Gopher sites are databases of information organized into
easy-to-navigate menu systems. To access a Gopher site, use a URL that begins
with gopher://, such as gopher://riceinfo.rice.edu:70/11/Subject.
You may see a list of
Veronica or Jughead servers at the top of your Gopher menu. Veronica is a
search engine designed to find resources in Gopherspace; it scans an index of
titles of Gopher directories and files for the search word entered. Jughead
is essentially the same, except that it searches directory titles only.
WAIS (Wide Area Information Server) - WAIS is a resource discovery
tool designed for retrieving documents from full-text databases. You can
access WAIS databases through a gateway to a specific WAIS database, or through
a gateway to the WAIS Directory of Servers. When conducting a search, WAIS
searches the entire content of documents found in its database. It does not
employ the Boolean operators (and, or, and not). One example of a gateway is
the WAIS Gateway.
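The start of a URL (ftp://, gopher://, and so on) tells the browser which kind of resource it is dealing with. As a rough illustration, the Gopher URL quoted above can be split into its parts with Python's standard urllib.parse module. The describe function and its field names are hypothetical.

```python
from urllib.parse import urlsplit

def describe(url):
    """Split a URL into the access tool, host, port, and path it names."""
    parts = urlsplit(url)
    return {
        "tool": parts.scheme,    # e.g. "ftp", "gopher", "http"
        "host": parts.hostname,
        "port": parts.port,      # None when the URL gives no port
        "path": parts.path,
    }

info = describe("gopher://riceinfo.rice.edu:70/11/Subject")
print(info)
```

Here the tool is gopher, the host is riceinfo.rice.edu, the port is 70, and the path is /11/Subject.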
Summary
Each of these methods has a particular advantage, although they are
often more effective when used in combination. Use the following
integrated strategies or guidelines to minimize search time and
maximize search results:
- search broadly to determine the breadth of information available on a
subject;
- look for pages with collections of links to the subject;
- locate sites related to the subject; and,
- navigate all the links you find to identify additional resources on
the subject.
EXERCISES
1. Access the sites listed under Starting Points,
Content-Oriented Searches, Geographic-Oriented Searches, and
Guide-Oriented Searches. Write an evaluation of at least one
site from each category including its organization, content, and under what
conditions you think you might use the site.
2. Compare and contrast the use of at least two different search
engines (number of hits, ease of use, type of documents obtained)
by entering the same search words in both.
3. Conduct a search on a search engine, describing the search words you
initially entered and the process you went through (adding more search words,
using special search operators, etc.) to find documents which accurately
describe your topic.
