Web Systems and Algorithms

Course Title: Web Systems and Algorithms
Full Marks: 45 + 30
Course No: C.Sc. 559
Pass Marks: 22.5 + 15
Nature of the Course: Theory + Lab
Credit Hrs: 3

Course Description:

This course covers Internet systems research, including the intelligent web, search engine architecture and algorithms, information retrieval, crawling, text analysis, personalization and context, collaborative environments, and the semantic web.

Course Contents:

Unit 1: Introduction (4 hrs)

Examples of intelligent web applications; Basic elements of intelligent applications; What applications can benefit from intelligence?; How can I build intelligence in my own application?; Machine learning, data mining, and all that; Eight fallacies of intelligent applications

Unit 2: Searching (8 hrs)

Searching with Lucene; Why search beyond indexing?; Improving search results based on link analysis; Improving search results based on user clicks; Ranking Word, PDF, and other documents without links; Large-scale implementation issues; Is what you got what you want? Precision and recall
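
As an illustration of the precision and recall topic in this unit, a minimal Python sketch follows. The function name and the document IDs are hypothetical and chosen only for the example: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the results of a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs for one query.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```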

Unit 3: Creating Suggestions and Recommendations (7 hrs)

An online music store: the basic concepts; How do recommendation engines work?; Recommending friends, articles, and news stories; Recommending movies on a site; Large-scale implementation and evaluation issues
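
One common way a recommendation engine can work is user-based collaborative filtering: find users with similar ratings and suggest items they rated that the target user has not seen. The sketch below assumes cosine similarity over rating dictionaries; the user names, item names, and ratings are hypothetical, and it is not meant as the method used by any particular textbook or library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical music-store ratings on a 1-5 scale.
ratings = {
    "ann": {"song_a": 5, "song_b": 4},
    "bob": {"song_a": 5, "song_b": 4, "song_c": 5},
    "cat": {"song_d": 5},
}
print(recommend("ann", ratings, top_n=1))  # ['song_c']
```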

Unit 4: Clustering: Grouping Things Together (7 hrs)

The need for clustering; An overview of clustering algorithms; Link-based algorithms; The k-means algorithm; Robust Clustering Using Links (ROCK); DBSCAN; Clustering issues in very large datasets
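
The k-means algorithm listed in this unit can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: centroids are seeded with the first k points (practical implementations usually randomize the seeds), distance is Euclidean, and the sample points are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Return (centroids, clusters) after at most `iterations` rounds."""
    # Assumption: seed centroids with the first k points for determinism.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```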

Unit 5: Classification: Placing Things Where They Belong (7 hrs)

The need for classification; An overview of classifiers; Automatic categorization of emails and spam filtering; Fraud detection with neural networks; Are your results credible?; Classification with very large datasets

Unit 6: Combining Classifiers (6 hrs)

Credit worthiness: a case study for combining classifiers; Credit evaluation with a single classifier; Comparing multiple classifiers on the same data; Bagging: bootstrap aggregating; Boosting: an iterative improvement approach
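
Bagging (bootstrap aggregating) trains several copies of a base classifier on bootstrap samples, drawn with replacement from the training data, and combines them by majority vote. The sketch below is illustrative only: the 1-nearest-neighbour base learner, the function names, and the credit scores and labels are assumptions made for the example.

```python
import random
from collections import Counter


def train_1nn(sample):
    """Base learner: 1-nearest-neighbour over (score, label) pairs."""
    def predict(x):
        return min(sample, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def bagging(data, n_models=11, seed=0):
    """Train n_models base learners on bootstrap samples; vote at predict time."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement.
        bootstrap = [rng.choice(data) for _ in data]
        models.append(train_1nn(bootstrap))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]  # majority label

    return predict


# Hypothetical (credit score, label) training pairs.
data = [
    (300, "bad"), (350, "bad"), (400, "bad"), (420, "bad"),
    (700, "good"), (720, "good"), (760, "good"), (800, "good"),
]
classify = bagging(data)
print(classify(320), classify(780))
```

An odd number of models avoids tied votes in a two-class problem.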

Unit 7: Semantic Web (6 hrs)

Building models; Calculating with knowledge; Exchanging information; Semantic web technologies; Introduction to Resource Description Framework (RDF) and Web Ontology Language (OWL)
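
RDF describes resources as subject-predicate-object triples. As a rough illustration, the Python sketch below stores triples as plain tuples and answers simple pattern queries; the `ex:` names and the `match` helper are hypothetical and stand in for a real RDF store and query language.

```python
# Hypothetical triples; the "ex:" identifiers are made up for illustration.
triples = {
    ("ex:Course559", "rdf:type", "ex:Course"),
    ("ex:Course559", "ex:title", "Web Systems and Algorithms"),
    ("ex:Course559", "ex:covers", "ex:SemanticWeb"),
    ("ex:SemanticWeb", "rdf:type", "ex:Topic"),
}


def match(pattern, store):
    """Return the triples matching a (s, p, o) pattern; None is a wildcard."""
    return sorted(
        t for t in store
        if all(want is None or want == have for want, have in zip(pattern, t))
    )


# What does the course cover?
print(match(("ex:Course559", "ex:covers", None), triples))
# Which resources have a declared type?
print(match((None, "rdf:type", None), triples))
```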

References:

  1. Algorithms of the Intelligent Web, Haralambos Marmanis and Dmitry Babenko, Manning Publications, 2009
  2. Foundations of Semantic Web Technologies, Pascal Hitzler, Markus Krötzsch, and Sebastian Rudolph, CRC Press/Chapman and Hall, 2009