1. Search Engines and Information Retrieval
  2. Architecture of a Search Engine
  3. Crawls and Feeds
  4. Processing Text
  5. Ranking with Indexes
  6. Queries and Interfaces
  7. Retrieval Models

  1. Search Engines and Information Retrieval
    1. What is Information Retrieval?
    2. The Big Issues
    3. Search Engines
    4. Search Engineers
  2. Architecture of a Search Engine
    1. what is an architecture?
    2. basic building blocks
    3. breaking it down
      1. text acquisition
      2. text transformation
      3. index creation
      4. user interaction
      5. ranking
      6. evaluation
    4. how does it really work?
  3. Crawls and Feeds
    1. deciding what to search
    2. crawling the web
      1. retrieving web pages
      2. the web crawler
      3. freshness
      4. focused crawling
      5. deep web
      6. sitemaps
      7. distributed crawling
    3. crawling documents and email
    4. document feeds
    5. the conversion problem
      1. character encodings
    6. storing the documents
      1. using a database system
      2. random access
      3. compression and large files
      4. update
      5. Bigtable
    7. detecting duplicates
    8. removing noise
  4. Processing Text
    1. from words to terms
    2. text statistics
    3. document parsing
    4. document structure and markup
    5. link analysis
    6. information extraction
    7. internationalization
  5. Ranking with Indexes
    1. overview
    2. abstract model of ranking
    3. inverted indexes
    4. compression
    5. auxiliary structures
    6. index construction
    7. query processing
  6. Queries and Interfaces
    1. information needs and queries
    2. query transformation and refinement
    3. showing the results
      1. result pages and snippets
      2. advertising and search
      3. clustering the results
    4. cross-language search
  7. Retrieval Models
    1. overview of retrieval models
      1. boolean retrieval
      2. the vector space model
    2. probabilistic models
      1. information retrieval as classification
      2. the BM25 ranking algorithm
    3. ranking based on language models
      1. query likelihood ranking
      2. relevance models and pseudo-relevance feedback
    4. complex queries and combining evidence
    5. web search
    6. machine learning and information retrieval
    7. application-based models