A search engine does not search the web to find a match; it searches its own database of information about Web pages that it has collected, indexed, and stored. Search engines all have three major pieces: the crawler, the index, and the runtime system or query processor. The process begins with the crawler.
The crawler is a specialized software program that hops from link to link on the World Wide Web, scarfing up the pages it finds and sending them back to be indexed. This is very similar to Matthew Gray’s earliest search engine, which searched and indexed entire files on the Internet, not just the titles. The crawler sends its data back to a massive database called the index. The runtime system or query processor is the user interface you see on a search engine’s Web site, where you type in your search words.
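To make the crawl concrete, here is a minimal sketch of a link-following crawler in Python. The seed URL, page limit, and politeness delay are illustrative assumptions, not any real engine's configuration, and a production crawler would also respect robots.txt, deduplicate content, and distribute the work across many machines.

```python
import time
import urllib.parse
import urllib.request
from collections import deque
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collects the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url, max_pages=10, delay=1.0):
    """Breadth-first crawl: fetch a page, extract its links, queue them."""
    queue = deque([seed_url])
    seen = {seed_url}
    pages = {}                      # URL -> raw HTML, a stand-in for "send to the indexer"
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except Exception:
            continue                # skip pages that fail to fetch
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            absolute = urllib.parse.urljoin(url, link)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
        time.sleep(delay)           # be polite: don't hammer any one server
    return pages
```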
There are three critical pieces of search, and all three must scale to the size and continued growth of the Web: they must crawl, they must index, and they must serve results. This is no small task: by most accounts, Google alone has more than 175,000 computers dedicated to the job. That's more computers than existed on Earth in the early 1970s!
The Web is huge; it is so big that it is hard to get an accurate count of Web pages. In January of 2004, the Web was estimated to contain over 10 billion pages; with a world population of roughly 6.4 billion, that is almost two pages per person. In 2003, Google reported that it served 250 million searches per day.
Most people use search engines in their daily lives to find information on the Web. The most recognizable part of a search engine is the query interface. This is the home page displayed when you visit a major search engine such as Google, Yahoo!, or MSN.
The query interface is the only part of a search engine that the user ever sees. Every other part of the search engine works behind the scenes, out of view of the people who use it every day. That doesn't mean the back end is unimportant, however; in fact, it is the most important part of the search engine.
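To illustrate what that back end does, here is a minimal sketch of the other two pieces: building an inverted index from crawled pages and answering a query against it. The tokenizer and the AND semantics (a page must contain every query word) are simplifying assumptions; real engines add ranking, stemming, and far more.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Inverted index: map each word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Query processor: return URLs containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = index.get(words[0], set()).copy()
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Usage: index a toy corpus and run a query against it.
pages = {
    "http://example.com/a": "search engines crawl the web",
    "http://example.com/b": "the web is huge",
}
index = build_index(pages)
print(search(index, "web crawl"))   # {'http://example.com/a'}
```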