Data Structures I

The essential class about construction of efficient data structures. Search trees, heaps, hashing. Worst-case, average-case and amortized complexity of data structures. Self-adjusting data structures. Behavior and analysis of data structures on systems with memory hierarchy. The lecture builds on the lectures Algorithms and Data Structures I and II and Programming I and II from the bachelor study.

Basic information

  • Name: Data Structures I
  • Code: NTIN066
  • Hours per week, examination: winter semester 2/1 C+Ex
  • Language: English (Czech course)
  • Schedule
  • Year: 2015/16 (most recent year here

Colleagues

Lecture

Slides from lectures and their compact version are regularly updated. Last change on 12.1. at 15:00.

Past lectures

1. lecture on 6.10.
  • Introduction
  • (a,b)-tree: insert, delete, join, split, ord, parallel access

2. lecture on 13.10.
  • A-sort: algorithm, complexity
  • Red-black tree: equivalence to (2,4)-tree, insert
  • Amortized analysis: introduction, incrementing binary counter
  • Splay tree: operation splay and its amortized complexity

3. lecture on 20.10.
  • Splay tree: operations insert and delete
  • d-ary heap: representation in an array, insert, delete-min, decrease-key and build
  • Binomial heap: binomial tree, insert, delete-min and decrease-key, amortized complexity
  • Lazy binomial heap: insert and delete-min

4. lecture on 27.10.
  • Lazy binomial heap: decrease-key, amortized complexity
  • Fibonacci heap: insert, delete-min and decrease-key, amortized complexity

5. lecture on 3.11.
  • Applications of heaps in Dijkstra's algorithms
  • Cache-oblivious algorithms: scanning, mergesort, van Emde Boas tree

6. lecture on 10.11.
  • Cache-oblivious algorithms: van Emde Boas tree, matrix transposition, Sleator & Tarjan

no lecture on 17.11. (holiday)

7. lecture on 24.11.
  • Hash tables - introduction
  • Hashing with separate chaining: Expected complexity of Find, an upper bound on the length of the longest chain

8. lecture on 1.12.
  • Hashing with separate chaining: amortized complexity
  • Linear probing: Expected complexity of Insert

9. lecture on 8.12.
  • Cuckoo hashing

10. lecture on 15.12.
  • Universal hashing: introduction, multiply-mod-prime, multiply-shift

11. lecture on 22.12.
  • Universal hashing: hashing vectors and string
  • Geometry: Introduction
  • d-dimensional range trees: O(n log^{d-1} n) space, build in O(n log^d n), query in O(k + log^d n)

12. lecture on 5.1.
  • d-dimensional range trees: build in O(n log^{d-1} n), query in O(k + log^d n)
  • Layered range trees and fractional cascading: query in O(k + log^{d-1} n)
  • Dynamization using BB[alpha]-trees: Insert and delete in O(log^d n) while query in O(k + log^d n)

13. lecture on 12.1.
  • Segment trees, Interval trees and Priority search trees

Plan of the lectures

Trees
  • (a,b)-trees
  • MFR-strategy for lists, Splay trees
  • other solutions: AVL trees, red-black trees, BB-alpha trees
Heaps
  • Complete d-ary heaps
  • Binomial heaps - amortized and worst-case complexity
  • Fibonacci heaps
Techniques for memory hierarchy
  • I/O model, cache-oblivious analysis, LRU-strategy for on-line paging
  • Examples: matrix transposition and multiplication, van Emde Boas tree layout
Hashing
  • Collisions and their resolution, analysis for uniformly distributed data
  • Selecting a hash function: universal hashing, good hash functions
  • Cuckoo hashing
Multidimensional data structures
  • KD trees
  • Range trees

Literature

  1. A. Koubková, V. Koubek: Datové struktury I. MATFYZPRESS, Praha 2011.
  2. T. H. Cormen, C.E. Leiserson, R. L. Rivest, C. Stein: Introduction to Algorithms. Third edition, MIT Press, 2009. On-line
  3. K. Mehlhorn: Data Structures and Algorithms I: Sorting and Searching. Springer-Verlag, Berlin, 1984
  4. D. P. Mehta, S. Sahni eds.: Handbook of Data Structures and Applications. Chapman & Hall/CRC, Computer and Information Series, 2005, Accessible on-line from faculty IP addresses.

  5. D.D. Sleator, R.E. Tarjan: Self-Adjusting Binary Search Trees. J. ACM 32 (3): 652–686, 1985.
  6. E. Demaine: Cache-Oblivious Algorithms and Data Structures. 2002.
  7. R. Pagh: Cuckoo Hashing for Undergraduates. Lecture note, 2006.
  8. M. Thorup: High Speed Hashing for Integers and Strings. lecture notes, 2014.
  9. M. Thorup: String hashing for linear probing (Sections 5.1-5.4). In Proc. 20th SODA, 655-664, 2009.

  10. D.E. Knuth: The Art of Computer Programming: Volume 3: Sorting and Searching, 2nd edition, Addison-Wesley, 1998
  11. M. de Berg, O. Cheong, M. van Kreveld, M. Overmars, Computational Geometry: Algorithms and Applications, Springer, 3 rd edition, 2008. The second edition is on-line.
CzechEnglish
(a,b)-tree: definition, insert, delete13
(a,b)-tree: join, split, amortized, parallel1
A-sort1
Red-black tree12,3
Splay tree3,4,5
Complete d-ary heap12
Binomial heap12,4
Lazy binomial heap1
Fibonacci heap12,4
Cache-oblivious algorithms6,4,5
Hashish with separate chaining110
Hashish with linear probing110
Cuckoo hashing7
Universal hashing8
Computational geometry4,11

A summary of the course was written by Vladimír Čunát.

Examination

Student must fulfill the following two conditions to pass this subject.
  • Successfully work out four out of five semester assignments.
  • Pass the exam (will be specified in the end of the semester). Use SIS to register in an exam term.

Semester assignments

  1. Sorting big data
  2. (a,b)-trees
  3. Splay trees
  4. Cache-oblivious matrix transposition
  5. Hashing

Pre-term

The topic of the last lectures is range trees. I plan to update my slides to cover the knowledge expected for the exam during the Christmas holidays. Meantime, I recommend the following sources.