Syllabus
- Introduction: Randomized algorithms, Universal Hash Family, Probabilistic Algorithm Analysis, Approximation Algorithms, $\epsilon$-Approximation Schemes.
- Sketching and Streaming: Extremely Small-Space Data Structures, CountMin Sketch, Count Sketch, Turnstile Streaming, AMS Sketch, Graph Sketching, Graph Connectivity
- Map-Reduce: Map-Reduce Algorithms in Constrains Settings such as small memory, few machines, few rounds, and small total work, Efficient Parallel Algorithms
- External memory and cache-obliviousness: Minimizing I/O for large datasets, Algorithms and data structures such as B-trees, Buffer trees, Multiway merge sort
Learning Materials
-
Scribing of Lectures Algorithms for Big Data By Prof. Jelani Nelson, Harvard University
-
Algorithmic Techniques for Big Data Analysis By Prof. Barna Saha, University of Minnesota Twin Cities
-
https://www.sketchingbigdata.org/
Grading Policy
Quiz
- There will be 3 to 4 Quizzes
- Dates and Syllabus will be announced in Classroom
Review Project
- Goal is to write a review on a specific topic of “Algorithms for Big Data”
- The review should be comprehensive between 3000-4000 words. Do not write review only on one/two papers.
- Review should be submitted with Single Column, double spacing pdf generated with LaTeX (use any standard template e.g., ACM, IEEE, Elsevier etc.)
You may work in a group of 2/3 person. Expectation of larger group will be higher.
Important Dates:
- December 10, 2020: Groups Identified
- December 25, 2020: Abstract Submission
- January 20, 2021: Submission
- January 30, 2021: First Report
- February 5, 2021: Final Submission
Plagiarism tolerance is 7% from single source and 15% cumulative, anything more leads to lower or no marks at all.
Presentation + Viva
- Between Jan 18-30, 2021
- Exact dates will be announced latter