CSL7030: Algorithms for Big Data (winter 2020)
- Credits L-T-P [C]: 2-0-0 [2]
- Where: Google Meet https://meet.google.com/rni-yevq-yhi
Syllabus
- Introduction: Randomized algorithms, Universal Hash Family, Probabilistic Algorithm Analysis, Approximation Algorithms, $\epsilon$-Approximation Schemes.
- Sketching and Streaming: Extremely Small-Space Data Structures, CountMin Sketch, Count Sketch, Turnstile Streaming, AMS Sketch, Graph Sketching, Graph Connectivity
- Map-Reduce: Map-Reduce Algorithms in Constrains Settings such as small memory, few machines, few rounds, and small total work, Efficient Parallel Algorithms
- External memory and cache-obliviousness: Minimizing I/O for large datasets, Algorithms and data structures such as B-trees, Buffer trees, Multiway merge sort
Learning Materials
-
Scribing of Lectures Algorithms for Big Data By Prof. Jelani Nelson, Harvard University
-
Algorithmic Techniques for Big Data Analysis By Prof. Barna Saha, University of Minnesota Twin Cities
Grading Policy
- Quiz: 20% (https://www.classmarker.com/)
- Review Project: 20%
- Presentation + Viva: 20% (dates to be announced)
- Major: 40%
Quiz
- There will be 3 to 4 Quizzes
- Dates and Syllabus will be announced in Classroom
Review Project
- Goal is to write a review on a specific topic of “Algorithms for Big Data”
- The review should be comprehensive between 3000-4000 words. Do not write review only on one/two papers.
- Review should be submitted with Single Column, double spacing pdf generated with LaTeX (use any standard template e.g., ACM, IEEE, Elsevier etc.)
Important Dates:
- December 10, 2020: Groups Identified
- December 25, 2020: Abstract Submission
- January 20, 2021: Submission
- January 30, 2021: First Report
- February 5, 2021: Final Submission
Presentation + Viva
- Between Jan 18-30, 2021
- Exact dates will be announced latter