[contents] Search Engines: Information Retrieval in Practice: Croft, Metzler, Strohman

Search Engines and Information Retrieval
Architecture of a Search Engine
Crawls and Feeds
Processing Text
Ranking with Indexes
Queries and Interfaces
Retrieval Models

Search Engines and Information Retrieval
1. What is Information Retrieval?
2. The Big Issues
3. Search Engines
4. Search Engineers
Architecture of a Search Engine
1. what is an architecture?
2. basic building blocks
3. breaking it down
  1. text acquisition
  2. text transformation
  3. index creation
  4. user interaction
  5. ranking
  6. evaluation
4. how does it really work?
Crawls and Feeds
1. deciding what to search
2. crawling the web
  1. retrieving web pages
  2. the web crawler
  3. freshness
  4. focused crawling
  5. deep web
  6. sitemaps
  7. distributed crawling
3. crawling documents and email
4. document feeds
5. the conversion problem
  1. character encodings
6. storing the documents
  1. using a database system
  2. random access
  3. compression and large files
  4. update
  5. bigtable
7. detecting duplicates
8. removing noise
Processing Text
1. from words to terms
2. text statistics
3. document parsing
4. document structure and markup
5. link analysis
6. information extraction
7. internationalization
Ranking with Indexes
1. overview
2. abstract model of ranking
3. inverted indexes
4. compression
5. auxiliary structures
6. index construction
7. query processing
Queries and Interfaces
1. information needs and queries
2. query transformation and refinement
3. showing the results
  1. result pages and snippets
  2. advertising and search
  3. clustering the results
4. cross-language search
Retrieval Models
1. overview of retrieval models
  1. boolean retrieval
  2. the vector space model
2. probabilistic models
  1. information retrieval as classification
  2. the bm25 ranking algorithm
3. ranking based on language models
  1. query likelihood ranking
  2. relevance models and pseudo-relevance feedback
4. complex queries and combining evidence
5. web search
6. machine learning and information retrieval
7. application-based models
