Data Structures I

The essential class about construction of efficient data structures. Search trees, heaps, hashing. Worst-case, average-case and amortized complexity of data structures. Self-adjusting data structures. Behavior and analysis of data structures on systems with memory hierarchy. The lecture builds on the lectures Algorithms and Data Structures I and II and Programming I and II from the bachelor study.

Basic information

Colleagues

Lecture

Slides from lectures and their compact version are regularly updated.

Past lectures

1. lecture on 4.10.
  • Introduction
  • Amortized complexity: Binary counter, dynamic array, BB[alpha]-trees
  • Resources: [14, lecture 5]
2. lecture on 11.10.
  • BB[alpha]-trees
  • Splay trees
  • (a,b)-trees (also called B-trees): definition, height
  • Resources: Wikipedia (Splay tree, B-tree), [2, chapters 12, 18], [4, chapters 10, 12, 15]
3. lecture on 18.10.
  • (a,b)-trees (also called B-trees): basic operations (find, insert, delete), parallel access, A-sort, equivalence to Red-black trees
  • Regular heaps: basic operations (insert, delete), decrease
  • Resources: Wikipedia (B-tree, Red-black trees, Regular heaps, binomial heaps, Fibonacci heaps), [2, chapters 6, 13, 18, 19], [4, chapters 7, 10, 15]
4. lecture on 25.10.
  • Binomial and Fibonacci heaps: basic operations (insert, delete), decrease
  • Resources: Wikipedia binomial heaps, Fibonacci heaps), [2, chapters 6, 13, 18, 19], [4, chapters 7, 10, 15]
5. lecture on 1.11.
  • Fibonacci heaps: analysis
  • Cache-oblivious algorithms: motivation, scanning, binomial heap, binary search
  • Resources: Wikipedia binomial heaps, Fibonacci heaps), [2, chapters 6, 13, 18, 19], [4, chapters 7, 10, 15]
no lecture on 8.11.
6. lecture on 15.11.
  • Cache-oblivious algorithms: mergesort, matrix transposition
7. lecture on 22.11.
  • Cache-oblivious algorithms: van Emde Boas layout, Sleator & Tarjan theorem
8. lecture on 29.11.
  • Universal hashing, multiply-mod-prime
9. lecture on 6.12.
  • Tabular hashing, separate chaining
10. lecture on 13.12.
  • Longest chain in separate chaining, linear probing, Cuckoo hashing
11. lecture on 20.12.
  • Analysis of Cuckoo hashing, combination of linear probing and multiply-shift

Literature

  1. A. Koubková, V. Koubek: Datové struktury I. MATFYZPRESS, Praha 2011.
  2. T. H. Cormen, C.E. Leiserson, R. L. Rivest, C. Stein: Introduction to Algorithms. Third edition, MIT Press, 2009. On-line and many copies in our library.
  3. K. Mehlhorn: Data Structures and Algorithms I: Sorting and Searching. Springer-Verlag, Berlin, 1984. Available in our library.
  4. D. P. Mehta, S. Sahni eds.: Handbook of Data Structures and Applications. Chapman & Hall/CRC, Computer and Information Series, 2005, Accessible on-line from faculty IP addresses.
  5. D.D. Sleator, R.E. Tarjan: Self-Adjusting Binary Search Trees. J. ACM 32 (3): 652–686, 1985.
  6. E. Demaine: Cache-Oblivious Algorithms and Data Structures. 2002.
  7. R. Pagh: Cuckoo Hashing for Undergraduates. Lecture note, 2006.
  8. M. Thorup: High Speed Hashing for Integers and Strings. lecture notes, 2014.
  9. M. Thorup: String hashing for linear probing (Sections 5.1-5.4). In Proc. 20th SODA, 655-664, 2009.
  10. D.E. Knuth: The Art of Computer Programming: Volume 3: Sorting and Searching, 2nd edition, Addison-Wesley, 1998
  11. M. de Berg, O. Cheong, M. van Kreveld, M. Overmars, Computational Geometry: Algorithms and Applications, Springer, 3 rd edition, 2008. The second edition is on-line.
  12. J. Matoušek, J. Vondrák: The probabilistic method. Lecture notes, 2008. online
  13. Erik Demaine, and Srinivas Devadas: Introduction to Algorithms (6.006). MIT OpenCourseWare, Fall 2011. Video lectures
  14. Erik Demaine, Srinivas Devadas, and Nancy Lynch: Design and Analysis of Algorithms (6.046J). MIT OpenCourseWare, Spring 2015. Video lectures
  15. Eric Grimson, and John Guttag: Introduction to Computer Science and Programming (6.00). MIT OpenCourseWare, Fall 2008. Video lectures
  16. Donald E. Knuth: The art of computer programming. Volume 3, Sorting and searching. Addison-Wesley, 1973. Available in our library.
  17. K. Mehlhorn: Data structures and algorithms 3, Multi-dimensional searching and computational geometry. Springer, 1984. Available in our library.
  18. Martin Mareš, Tomáš Valla: Průvodce labyrintem algoritmů. CZ.NIC, 2017. On-line and in our library.
Useful links: Catalogue of our library, access to electronic versions of some librarian's books and list of other databazes.
CzechEnglish
(a,b)-tree: definition, insert, delete13
(a,b)-tree: join, split, amortized, parallel1
A-sort1
Red-black tree12,3
Splay tree3,4,5
Complete d-ary heap12
Binomial heap12,4
Lazy binomial heap1
Fibonacci heap12,4
Cache-oblivious algorithms6,4,5
Hashish with separate chaining110
Hashish with linear probing110
Cuckoo hashing7
Universal hashing8
Computational geometry4,11

A summary of the course was written by Vladimír Čunát.

Prerequisites, Expected knowledge

Algorithms and data structures
Complexity, amortized analysis, O-notation, sorting, binary heap, binary search tree, AVL-tree, hashing with chaining, breadth-first search (BFS), depth-first search (DFS), topological sort, median finding, dynamic programming
[13, lectures 1-9, 14, 15] [14, lectures 2, 6, 7] [2, chapters 1-13, 15-17, 22-24], [16], [3]
Probability
Random variable, uniform distribution, independence, linearity of expectation, expected value, variance, Chernoff bound
[12, chapters 1-5,7]
Programming skills
[15]
Programming in C
Banahan, M.; Brady, D.; Doran, M.: The C Book. Addison-Wesley, 1991. online
C reference
Youtube tutorials; e.g. C Programming Tutorial, C From Beginner To Expert Programming Tutorial
You can find many books about C language in our library. Visit the catalogue or ask personally in the library.

Examination

Student must fulfill the following two conditions to pass this subject.
  • Four homeworks will be assigned during the semester. For each homework you can get up to 100 points. You need 320 points to pass.
  • Pass the exam (will be specified in the end of the semester). Use SIS to register in an exam term.

Semester assignments

  1. Splay tree
  2. Fibonnaci heap
  3. Matrix transposition
  4. Hashing
  5. (a,b)-tree

Examination questions

  • Describe operations Insert and Delete in a dynamic array and analyze their complexity.
  • Write the definition of BB[alpha]-tree. Describe its operations and analyze complexity.
  • Write the definition of splay tree. Describe its operations and analyze complexity.
  • Write the definitions of (a,b)-tree and red-black tree and compare them. Describe operations and analyze complexity. Describe applications of (a,b)-tree in parallel programming.
  • Write the definition of (a,b)-tree. Describe and analyze the algorithm A-sort.
  • Write the definition of BB[alpha]-tree. Describe its operations and analyze complexity.
  • Write the definition of regular and binomial heaps. Describe their operations and analyze complexity.
  • Write the definition of a Fibonnaci heap. Describe its operations and analyze complexity.
  • Write algorithms for cache-oblivious matrix transpositions. Analyze their time and I/O complexity.
  • Compare different representations of static sets in cache-aware and cache-oblivious models: sorted array, binary heap, (a,b)-tree and binary search tree in Emde-Boas representation.
  • Write and prove Sleator-Tarjan theorem comparing LRU and optimal page replacement methods.
  • Write the definition of universal and k-independent hashing systems. Describe a multiply-mod-prime hashing system and analyze its independence and universality.
  • Write the definition of universal and k-independent hashing systems. Describe a tabular hashing system and analyze its independence.
  • Write the definition of universal and k-independent hashing systems. Write the definition of a hashing with separate chaining. Describe its operations and analyze complexity.
  • Write the definition of universal and k-independent hashing systems. Write the definition of a hashing with linear probing. Describe its operations and analyze complexity.
  • Write the definition of universal and k-independent hashing systems. Write the definition of a Cuckoo hashing. Describe its operations and analyze complexity.
  • Write the definition a range tree. Describe its operations and analyze complexity. Describe the dynamic version of a range tree and analyze complexity.
  • Write the definition a range tree. Describe its operations and analyze complexity. Describe the fractional cascading and its usage in a range tree. Analyze complexity of operations in a range tree with fractional cascading.