Data Structures

Created: December 17, 2024

Linear Data Structures

Array

Fixed-size, contiguous memory
[1][2][3][4][5]

Access: O(1)
Insert/Delete at end: O(1)
Insert/Delete at start/middle: O(n)
Search (unsorted): O(n)
Search (sorted): O(log n) with binary search

Linked List

Nodes with data + pointer to next
[1]→[2]→[3]→[4]→null

Access: O(n)
Insert/Delete at start: O(1)
Insert/Delete at end: O(n) (O(1) if tail pointer)
Insert/Delete in middle: O(n) to find, O(1) to modify
Search: O(n)

Variants:
- Singly linked (→)
- Doubly linked (←→)
- Circular (last points to first)

Stack (LIFO - Last In First Out)

    ┌───┐
    │ 3 │ ← top
    ├───┤
    │ 2 │
    ├───┤
    │ 1 │
    └───┘

Operations:
- push(x): Add to top - O(1)
- pop(): Remove from top - O(1)
- peek(): View top - O(1)

Use cases: Function calls, undo/redo, parentheses matching

Queue (FIFO - First In First Out)

Front                    Back
  ↓                        ↓
[1]→[2]→[3]→[4]
  ↑             ↑
dequeue      enqueue

Operations:
- enqueue(x): Add to back - O(1)
- dequeue(): Remove from front - O(1)
- peek(): View front - O(1)

Use cases: Task scheduling, BFS, request handling

Deque (Double-Ended Queue)

Can add/remove from both ends
[1]⟷[2]⟷[3]⟷[4]
 ↑              ↑
Both ends     Both ends

All operations: O(1)

Tree Structures

Binary Tree

      1
     / \
    2   3
   / \
  4   5

Each node has ≤2 children
Height: h (levels)
Max nodes: 2^h - 1

Binary Search Tree (BST)

      5
     / \
    3   8
   / \   \
  1   4   9

Left < Parent < Right

Search/Insert/Delete: O(log n) average, O(n) worst
Balanced BST: O(log n) guaranteed

Heap (Binary Heap)

Max-Heap:
      9
     / \
    7   5
   / \
  3   1

Parent ≥ Children (max-heap)
Parent ≤ Children (min-heap)

Insert: O(log n)
Extract max/min: O(log n)
Peek max/min: O(1)

Use cases: Priority queues, heap sort

Trie (Prefix Tree)

       root
      /  |  \
     c   d   t
    /    |    \
   a     o     o
  /      |      \
 t       g       p
(cat)  (dog)   (top)

Insert/Search: O(m) where m = string length
Space: O(ALPHABET_SIZE × m × n)

Use cases: Autocomplete, spell check, IP routing

Hash-Based Structures

Hash Table / Hash Map

Key → Hash Function → Index → Value

hash("alice") → 3 → [3] = {age: 30}

Average case:
- Insert: O(1)
- Delete: O(1)
- Search: O(1)

Worst case (collisions): O(n)

Collision handling:
- Chaining (linked lists)
- Open addressing (probing)

Hash Set

Same as hash table but only stores keys (no values)

Contains/Add/Remove: O(1) average

Use cases: Unique elements, membership testing

Graph Structures

Graph (General)

Adjacency List:
A → [B, C]
B → [A, D]
C → [A, D]
D → [B, C]

Adjacency Matrix:
    A B C D
A [ 0 1 1 0 ]
B [ 1 0 0 1 ]
C [ 1 0 0 1 ]
D [ 0 1 1 0 ]

Types:
- Directed vs Undirected
- Weighted vs Unweighted
- Cyclic vs Acyclic (DAG)

Traversal:
- BFS: O(V + E) using queue
- DFS: O(V + E) using stack/recursion

Advanced Structures

Log-structured merge-tree (LSM tree)

The log-structured merge-tree (also known as LSM tree, or LSMT) is a data structure with performance characteristics that make it attractive for providing indexed access to files with high insert volume, such as transactional log data.

AVL Tree (Self-Balancing BST)

Guarantees O(log n) for all operations
Height difference between subtrees ≤ 1
Rebalances via rotations

Red-Black Tree

Another self-balancing BST
Slightly more relaxed than AVL
Used in: Java TreeMap, C++ map

B-Tree

Multi-way tree (more than 2 children)
Used in: Databases, filesystems
Optimized for disk reads

Segment Tree

For range queries (sum, min, max)
Build: O(n)
Query: O(log n)
Update: O(log n)

Disjoint Set (Union-Find)

Tracks set membership
Union: O(α(n)) ≈ O(1)
Find: O(α(n)) ≈ O(1)
α(n) = inverse Ackermann (extremely slow growing)

Use cases: Connected components, Kruskal's algorithm

Time Complexity Summary

Structure	Access	Search	Insert	Delete
Array	O(1)	O(n)	O(n)	O(n)
Linked List	O(n)	O(n)	O(1)*	O(1)*
Stack	O(n)	O(n)	O(1)	O(1)
Queue	O(n)	O(n)	O(1)	O(1)
Hash Table	-	O(1)†	O(1)†	O(1)†
BST	O(log n)‡	O(log n)‡	O(log n)‡	O(log n)‡
Heap	-	O(n)	O(log n)	O(log n)
Trie	-	O(m)	O(m)	O(m)

*At known position †Average case ‡Balanced tree

Space Complexity

Structure	Space
Array	O(n)
Linked List	O(n)
Hash Table	O(n)
BST	O(n)
Graph (Adj List)	O(V + E)
Graph (Adj Matrix)	O(V²)

Choosing a Data Structure

Need fast access by index? → Array

Need fast insertion/deletion at ends? → Linked List, Stack, Queue

Need fast key-value lookup? → Hash Table

Need sorted order? → BST (balanced)

Need priority? → Heap

Need to track relationships? → Graph

Need prefix matching? → Trie

Need range queries? → Segment Tree

Need to group elements? → Disjoint Set