
Friday, February 10, 2012
ISSUE 33
Keeping up with search engines.
Belinda Weaver belinda@journoz.com
Where are we with search engines? In 2002, AllTheWeb (www.alltheweb.com) briefly pushed well past the two billion page mark and edged slightly ahead of the long-time search engine leader, Google (www.google.com). It did not last. Google has now passed three billion pages and begun searching new file formats, such as Microsoft Word files.
New tools have come along. Teoma (www.teoma.com), a Google-like engine, is still quite new, while an older engine, HotBot (www.hotbot.com), has reinvented itself as a meta-search tool. Many search engine databases are also being refreshed more quickly.
Specialty search engines
More and more specialty engines are popping up. Many do a better job of mining the information they index than a general search engine such as Google. For example, there are search tools for images, multimedia, downloadable files and programs that you can FTP to your PC or Mac. There are tools for searching news headlines or for tracking down PDF files. There are tools such as Google Groups (http://groups.google.com) for reading newsgroup postings and tools such as BoardReader (www.boardreader.com) for finding postings to Internet message boards.
In a world where search engines develop all the time, where new features are unveiled constantly, where the number of pages indexed grows seemingly exponentially – how do you stay on top of searching?
Interestingly, the very tool that makes you feel overloaded – the Internet – also helps you cope with information overload by linking you to filtered news relevant to the work you have to do. Many people doing jobs similar to your own have taken on the role of keeping like-minded people informed.
Staying on top of searching
One of the best people to get acquainted with is Gary Price. Price, co-author with Chris Sherman of a book called The Invisible Web, maintains several useful Web sites, including:
- DirectSearch (www.freepint.com/gary/direct.htm), a gateway to the invisible Web.
- List of Lists (www.specialissues.com/lol), which provides ‘best of’ lists and rankings.
- The Resource Shelf (http://www.resourceshelf.com/), Price’s personal weblog for information professionals, which announces all kinds of new tools and services. You can get weekly highlights via email, though it is probably easier to skim the blog itself and follow interesting links from there.
Regular newsletters such as the monthly Internet Resources Newsletter (www.hw.ac.uk/libWWW/irn/), the fortnightly FreePint (www.freepint.com), or the weekly Scout Report (http://scout.cs.wisc.edu/) are also useful sources of search engine and other news.
Search engine reviews and critiques
The key site for keeping on top of search engines is Search Engine Watch (www.searchenginewatch.com/). The site ranks search engines and also provides comparison tables and feature charts. Each year it runs polls that declare the best overall tool, the best image searcher, the best meta-search tool and so on. At the site, you can also subscribe to a daily newsletter called SearchDay (www.searchenginewatch.com/searchday), written by Chris Sherman.
Greg Notess’s site, Search Engine Showdown (www.notess.com), is also a good source of news, tips and information about search tools, particularly things like search syntax.
Tara Calishain, weekly publisher of the ResearchBuzz (www.researchbuzz.com) newsletter, discusses tools from the point of view of the user trying to make sense of them. Her take on things is often more helpful than a straight description of features and options would be.
On the technical side
Sites like CNet (www.cnet.com) announce technology news that includes changes and improvements in the commercial search engines. Even the mainstream press is starting to report interesting items of search engine news. Last year, the Boston Globe interviewed Eric Schmidt, CEO of Google (www.google.com), who talked up the company’s plans for Google search. Basically, Schmidt’s (or rather Google’s) aim is to index everything – to include within Google search not just Web sites, PDFs, image files, newsgroups and so on, but also the content of proprietary databases such as LexisNexis. Obviously that kind of premium content would still be chargeable, but being able to search within such databases with one search tool would be extremely handy and quick.
Many search engines are good at keeping users informed. Look for news or ‘what’s new’ sections at search engine sites and visit their advanced search features sections. Here you will often find improvements – PDF or image searching, Boolean operators, language tools – that actually make searching a lot easier.
Google has a section specifically devoted to innovations still in beta testing. Google Labs (http://labs.google.com/) is where you can try out new features, such as the recent viewer that replaced the traditional list display of results with a moving slide show of sites.
Web search programs on your PC
If you are interested in search tools that reside on your computer, such as WebFerret, sites like AgentLand (www.agentland.com) and BotSpot (www.botspot.com) are the places to keep tabs on them. You can download software from these sites and then customise the tools to match your needs. Both sites include programmable agents that you can design yourself, but that may be too complicated for most people.
Where else can you get news? At the risk of sounding low-tech, I think journals in your discipline could be a fruitful source of Web news. Maybe they will not deal intensely with search engines, but they may still come up with tools that are relevant to the work you do.
Web subject directories
Which brings me to subject pages. These are also called directories, and are often more useful than search engines because they do not respond blindly to words typed in, but gather and organise material on given topics.
New ones emerge all the time, but a good watching point is Pinakes (www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html). It lists the top 40 subject-specific pages online – in areas as diverse as social sciences, maritime studies and biotechnology. Pinakes also points to multi-subject directories and helps you keep tabs on a whole range of new search services that may deliver more than your search engine can.
Keep tabs on any tools that promise to open up the invisible Web. The most interesting and useful information is in there, rather than on the open Web indexed by search engines. Big as they are, the search engine databases cover less than 20% of the material actually online, which is why good invisible Web tools are so crucial.
About the author
Belinda Weaver maintains the Guide to Internet information sources for Australian journalists (http://journoz.com/) and its companion blog, journoz: updates for Australian journalists (www.journoz.com/weblog/). Since 2000, she has written a Web advice column, FindIT, for the Brisbane Courier-Mail. She also writes a monthly column, Weaver's Web, for inCite (www.alia.org.au/incite/). Catch the Wave, her book about finding high quality information sources on the Internet, will be published in May 2003 by RMIT Publishing (www.rmitpublishing.com.au/). The electronic version is at www.informit.com.au/library/. She contributed the chapter 'The computer as an essential tool' to Journalism: investigation and research, edited by Stephen Tanner, published by Pearson Education, 2003.
scip.online, issue 33, June 10, 2003.