# Data Structures I

The essential class about construction of efficient data structures. Search trees, heaps, hashing. Worst-case, average-case and amortized complexity of data structures. Self-adjusting data structures. Behavior and analysis of data structures on systems with memory hierarchy. The lecture builds on the lectures Algorithms and Data Structures I and II and Programming I and II from the bachelor study.

## Lecture

Slides from lectures and their compact version are regularly updated.

### Past lectures

1. lecture on 4.10.
• Introduction
• Amortized complexity: Binary counter, dynamic array, BB[alpha]-trees
• Resources: [14, lecture 5]
2. lecture on 11.10.
• BB[alpha]-trees
• Splay trees
• (a,b)-trees (also called B-trees): definition, height
• Resources: Wikipedia (Splay tree, B-tree), [2, chapters 12, 18], [4, chapters 10, 12, 15]
3. lecture on 18.10.
• (a,b)-trees (also called B-trees): basic operations (find, insert, delete), parallel access, A-sort, equivalence to Red-black trees
• Regular heaps: basic operations (insert, delete), decrease
• Resources: Wikipedia (B-tree, Red-black trees, Regular heaps, binomial heaps, Fibonacci heaps), [2, chapters 6, 13, 18, 19], [4, chapters 7, 10, 15]
4. lecture on 25.10.
• Binomial and Fibonacci heaps: basic operations (insert, delete), decrease
• Resources: Wikipedia binomial heaps, Fibonacci heaps), [2, chapters 6, 13, 18, 19], [4, chapters 7, 10, 15]
5. lecture on 1.11.
• Fibonacci heaps: analysis
• Cache-oblivious algorithms: motivation, scanning, binomial heap, binary search
• Resources: Wikipedia binomial heaps, Fibonacci heaps), [2, chapters 6, 13, 18, 19], [4, chapters 7, 10, 15]
no lecture on 8.11.
6. lecture on 15.11.
• Cache-oblivious algorithms: mergesort, matrix transposition
7. lecture on 22.11.
• Cache-oblivious algorithms: van Emde Boas layout, Sleator & Tarjan theorem
8. lecture on 29.11.
• Universal hashing, multiply-mod-prime
9. lecture on 6.12.
• Tabular hashing, separate chaining
10. lecture on 13.12.
• Longest chain in separate chaining, linear probing, Cuckoo hashing
11. lecture on 20.12.
• Analysis of Cuckoo hashing, combination of linear probing and multiply-shift

## Literature

1. A. Koubková, V. Koubek: Datové struktury I. MATFYZPRESS, Praha 2011.
2. T. H. Cormen, C.E. Leiserson, R. L. Rivest, C. Stein: Introduction to Algorithms. Third edition, MIT Press, 2009. On-line and many copies in our library.
3. K. Mehlhorn: Data Structures and Algorithms I: Sorting and Searching. Springer-Verlag, Berlin, 1984. Available in our library.
4. D. P. Mehta, S. Sahni eds.: Handbook of Data Structures and Applications. Chapman & Hall/CRC, Computer and Information Series, 2005, Accessible on-line from faculty IP addresses.
5. D.D. Sleator, R.E. Tarjan: Self-Adjusting Binary Search Trees. J. ACM 32 (3): 652–686, 1985.
6. E. Demaine: Cache-Oblivious Algorithms and Data Structures. 2002.
7. R. Pagh: Cuckoo Hashing for Undergraduates. Lecture note, 2006.
8. M. Thorup: High Speed Hashing for Integers and Strings. lecture notes, 2014.
9. M. Thorup: String hashing for linear probing (Sections 5.1-5.4). In Proc. 20th SODA, 655-664, 2009.
10. D.E. Knuth: The Art of Computer Programming: Volume 3: Sorting and Searching, 2nd edition, Addison-Wesley, 1998
11. M. de Berg, O. Cheong, M. van Kreveld, M. Overmars, Computational Geometry: Algorithms and Applications, Springer, 3 rd edition, 2008. The second edition is on-line.
12. J. Matoušek, J. Vondrák: The probabilistic method. Lecture notes, 2008. online
13. Erik Demaine, and Srinivas Devadas: Introduction to Algorithms (6.006). MIT OpenCourseWare, Fall 2011. Video lectures
14. Erik Demaine, Srinivas Devadas, and Nancy Lynch: Design and Analysis of Algorithms (6.046J). MIT OpenCourseWare, Spring 2015. Video lectures
15. Eric Grimson, and John Guttag: Introduction to Computer Science and Programming (6.00). MIT OpenCourseWare, Fall 2008. Video lectures
16. Donald E. Knuth: The art of computer programming. Volume 3, Sorting and searching. Addison-Wesley, 1973. Available in our library.
17. K. Mehlhorn: Data structures and algorithms 3, Multi-dimensional searching and computational geometry. Springer, 1984. Available in our library.
18. Martin Mareš, Tomáš Valla: Průvodce labyrintem algoritmů. CZ.NIC, 2017. On-line and in our library.
Useful links: Catalogue of our library, access to electronic versions of some librarian's books and list of other databazes.
CzechEnglish
(a,b)-tree: definition, insert, delete13
(a,b)-tree: join, split, amortized, parallel1
A-sort1
Red-black tree12,3
Splay tree3,4,5
Complete d-ary heap12
Binomial heap12,4
Lazy binomial heap1
Fibonacci heap12,4
Cache-oblivious algorithms6,4,5
Hashish with separate chaining110
Hashish with linear probing110
Cuckoo hashing7
Universal hashing8
Computational geometry4,11

A summary of the course was written by Vladimír Čunát.

### Prerequisites, Expected knowledge

Algorithms and data structures
Complexity, amortized analysis, O-notation, sorting, binary heap, binary search tree, AVL-tree, hashing with chaining, breadth-first search (BFS), depth-first search (DFS), topological sort, median finding, dynamic programming
[13, lectures 1-9, 14, 15] [14, lectures 2, 6, 7] [2, chapters 1-13, 15-17, 22-24], [16], [3]
Probability
Random variable, uniform distribution, independence, linearity of expectation, expected value, variance, Chernoff bound
[12, chapters 1-5,7]
Programming skills
[15]
Programming in C
Banahan, M.; Brady, D.; Doran, M.: The C Book. Addison-Wesley, 1991. online
C reference
Youtube tutorials; e.g. C Programming Tutorial, C From Beginner To Expert Programming Tutorial
You can find many books about C language in our library. Visit the catalogue or ask personally in the library.

## Examination

Student must fulfill the following two conditions to pass this subject.
• Four homeworks will be assigned during the semester. For each homework you can get up to 100 points. You need 320 points to pass.
• Pass the exam (will be specified in the end of the semester). Use SIS to register in an exam term.

### Examination questions

• Describe operations Insert and Delete in a dynamic array and analyze their complexity.
• Write the definition of BB[alpha]-tree. Describe its operations and analyze complexity.
• Write the definition of splay tree. Describe its operations and analyze complexity.
• Write the definitions of (a,b)-tree and red-black tree and compare them. Describe operations and analyze complexity. Describe applications of (a,b)-tree in parallel programming.
• Write the definition of (a,b)-tree. Describe and analyze the algorithm A-sort.
• Write the definition of BB[alpha]-tree. Describe its operations and analyze complexity.
• Write the definition of regular and binomial heaps. Describe their operations and analyze complexity.
• Write the definition of a Fibonnaci heap. Describe its operations and analyze complexity.
• Write algorithms for cache-oblivious matrix transpositions. Analyze their time and I/O complexity.
• Compare different representations of static sets in cache-aware and cache-oblivious models: sorted array, binary heap, (a,b)-tree and binary search tree in Emde-Boas representation.
• Write and prove Sleator-Tarjan theorem comparing LRU and optimal page replacement methods.
• Write the definition of universal and k-independent hashing systems. Describe a multiply-mod-prime hashing system and analyze its independence and universality.
• Write the definition of universal and k-independent hashing systems. Describe a tabular hashing system and analyze its independence.
• Write the definition of universal and k-independent hashing systems. Write the definition of a hashing with separate chaining. Describe its operations and analyze complexity.
• Write the definition of universal and k-independent hashing systems. Write the definition of a hashing with linear probing. Describe its operations and analyze complexity.
• Write the definition of universal and k-independent hashing systems. Write the definition of a Cuckoo hashing. Describe its operations and analyze complexity.
• Write the definition a range tree. Describe its operations and analyze complexity. Describe the dynamic version of a range tree and analyze complexity.
• Write the definition a range tree. Describe its operations and analyze complexity. Describe the fractional cascading and its usage in a range tree. Analyze complexity of operations in a range tree with fractional cascading.