Python Data Structures

Python Data Structures

Category: Programming Updated: 2026-02-15

Python provides a rich set of built-in data structures that are fundamental to writing efficient code. Understanding their characteristics, performance, and use cases is essential for every Python developer.


Quick Reference

Data Structure Ordered Mutable Duplicates Indexed Syntax
List Yes Yes Yes Yes [1, 2, 3]
Tuple Yes No Yes Yes (1, 2, 3)
Set No Yes No No {1, 2, 3}
Frozenset No No No No frozenset([1, 2, 3])
Dict Yes (3.7+) Yes Keys: No Keys {'a': 1, 'b': 2}
Deque Yes Yes Yes Yes deque([1, 2, 3])
NamedTuple Yes No Yes Yes Point(x=1, y=2)
Dataclass Yes @dataclass

List

The most versatile data structure in Python. A dynamic, ordered, mutable sequence.

Characteristics

  • Ordered: Elements maintain insertion order
  • Mutable: Can modify elements after creation
  • Allows duplicates: Same value can appear multiple times
  • Indexed: Access by position with list[i]
  • Dynamic: Grows/shrinks automatically

Operations & Complexity

Operation Example Time Complexity
Access by index lst[i] O(1)
Append to end lst.append(x) O(1) amortized
Insert at position lst.insert(i, x) O(n)
Delete by index del lst[i] O(n)
Pop from end lst.pop() O(1)
Pop from start lst.pop(0) O(n)
Search (in) x in lst O(n)
Slice lst[1:3] O(k)
Sort in-place lst.sort() O(n log n)
Reverse lst.reverse() O(n)
Extend lst.extend(other) O(k)
Copy lst.copy() O(n)

Common Usage

# Creation
numbers = [1, 2, 3, 4, 5]
mixed = [1, "hello", 3.14, True]
empty = []

# Access
first = numbers[0]          # 1
last = numbers[-1]          # 5
sublist = numbers[1:4]      # [2, 3, 4]

# Modification
numbers[0] = 10             # [10, 2, 3, 4, 5]
numbers.append(6)           # [10, 2, 3, 4, 5, 6]
numbers.insert(0, 0)        # [0, 10, 2, 3, 4, 5, 6]
numbers.remove(10)          # [0, 2, 3, 4, 5, 6]
numbers.pop()               # [0, 2, 3, 4, 5]
del numbers[0]              # [2, 3, 4, 5]

# Common operations
numbers.sort()              # In-place sort
sorted_nums = sorted(numbers)  # Returns new sorted list
numbers.reverse()           # In-place reverse
reversed_nums = list(reversed(numbers))  # Returns iterator

# List comprehension
squares = [x**2 for x in range(10)]  # [0, 1, 4, 9, 16, ...]
evens = [x for x in numbers if x % 2 == 0]

# Check membership
if 3 in numbers:
    print("Found!")

# Concatenation
combined = [1, 2] + [3, 4]  # [1, 2, 3, 4]
numbers.extend([6, 7, 8])   # More efficient than +=

# Get length
length = len(numbers)

# Clear all elements
numbers.clear()

When to Use

  • General-purpose sequential data
  • Need random access by index
  • Frequent appends to the end
  • Order matters
  • Need mutable collection

Tuple

An immutable, ordered sequence. Once created, cannot be modified.

Characteristics

  • Immutable: Cannot change after creation
  • Ordered: Maintains insertion order
  • Allows duplicates: Same values allowed
  • Indexed: Access by position
  • Hashable: Can be dict keys or set elements (if all elements are hashable)
  • Memory efficient: Uses less memory than lists

Operations & Complexity

Operation Example Time Complexity
Access by index tpl[i] O(1)
Search x in tpl O(n)
Count occurrences tpl.count(x) O(n)
Find index tpl.index(x) O(n)
Slice tpl[1:3] O(k)

Common Usage

# Creation
point = (3, 5)
single = (42,)              # Note the comma for single element
empty = ()
colors = ("red", "green", "blue")

# Unpacking
x, y = point                # x=3, y=5
first, *rest = colors       # first="red", rest=["green", "blue"]

# Access
first = colors[0]           # "red"
last = colors[-1]           # "blue"

# Cannot modify
# colors[0] = "yellow"      # TypeError!

# Concatenation (creates new tuple)
combined = (1, 2) + (3, 4)  # (1, 2, 3, 4)
repeated = (1, 2) * 3       # (1, 2, 1, 2, 1, 2)

# Methods
count = colors.count("red")      # 1
index = colors.index("green")    # 1

# Convert to list and back
as_list = list(colors)
as_tuple = tuple(as_list)

# Nested tuples
matrix = ((1, 2), (3, 4), (5, 6))

When to Use

  • Data should not change (immutable)
  • Return multiple values from function
  • Use as dictionary keys
  • Slightly faster and more memory-efficient than lists
  • Protect data from accidental modification
  • Represent fixed structures (coordinates, RGB values, etc.)

Set

An unordered collection of unique elements. Provides fast membership testing and set operations.

Characteristics

  • Unordered: No guaranteed order (implementation-dependent)
  • No duplicates: Automatically removes duplicates
  • Mutable: Can add/remove elements
  • Elements must be hashable: Cannot contain lists or dicts
  • Fast membership testing: O(1) average case

Operations & Complexity

Operation Example Time Complexity
Add element s.add(x) O(1) average
Remove element s.remove(x) O(1) average
Discard element s.discard(x) O(1) average
Membership test x in s O(1) average
Union s1 \| s2 O(len(s1) + len(s2))
Intersection s1 & s2 O(min(len(s1), len(s2)))
Difference s1 - s2 O(len(s1))
Symmetric difference s1 ^ s2 O(len(s1) + len(s2))
Subset test s1 <= s2 O(len(s1))

Common Usage

# Creation
numbers = {1, 2, 3, 4, 5}
empty = set()               # Note: {} creates dict, not set
from_list = set([1, 2, 2, 3, 3, 3])  # {1, 2, 3}

# Add/remove
numbers.add(6)              # {1, 2, 3, 4, 5, 6}
numbers.remove(3)           # {1, 2, 4, 5, 6} - raises KeyError if not found
numbers.discard(10)         # No error if not found
popped = numbers.pop()      # Remove and return arbitrary element

# Membership testing (fast!)
if 5 in numbers:
    print("Found!")

# Set operations
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b               # {1, 2, 3, 4, 5, 6}
intersection = a & b        # {3, 4}
difference = a - b          # {1, 2}
symmetric_diff = a ^ b      # {1, 2, 5, 6}

# Method alternatives
union = a.union(b)
intersection = a.intersection(b)
difference = a.difference(b)
sym_diff = a.symmetric_difference(b)

# Subset/superset
is_subset = a <= b          # False
is_superset = a >= b        # False
is_disjoint = a.isdisjoint({7, 8, 9})  # True

# Update operations (in-place)
a.update(b)                 # a |= b
a.intersection_update(b)    # a &= b
a.difference_update(b)      # a -= b

# Remove duplicates from list
unique = list(set([1, 2, 2, 3, 3, 3]))  # [1, 2, 3]

# Set comprehension
squares = {x**2 for x in range(10)}

When to Use

  • Need to enforce uniqueness
  • Fast membership testing required
  • Mathematical set operations needed
  • Remove duplicates from sequence
  • No need for ordering

Dictionary (dict)

A mutable mapping of keys to values. Hash table implementation provides fast lookups.

Characteristics

  • Key-value pairs: Maps keys to values
  • Keys must be hashable: Immutable types (str, int, tuple, etc.)
  • No duplicate keys: Each key appears once
  • Ordered (Python 3.7+): Maintains insertion order
  • Fast lookups: O(1) average case

Operations & Complexity

Operation Example Time Complexity
Access by key d[key] O(1) average
Set value d[key] = value O(1) average
Delete key del d[key] O(1) average
Membership test key in d O(1) average
Get with default d.get(key, default) O(1) average
Keys/values/items d.keys(), d.values(), d.items() O(1) to return view
Iterate for k in d: O(n)
Copy d.copy() O(n)

Common Usage

# Creation
person = {"name": "Alice", "age": 30, "city": "NYC"}
empty = {}
from_pairs = dict([("a", 1), ("b", 2)])
from_kwargs = dict(name="Bob", age=25)

# Access
name = person["name"]       # "Alice"
age = person.get("age")     # 30
height = person.get("height", 0)  # 0 (default)

# Modification
person["age"] = 31          # Update
person["country"] = "USA"   # Add new key
del person["city"]          # Remove key
person.pop("country")       # Remove and return value

# Membership
if "name" in person:        # Check if key exists
    print("Name found")

# Iteration
for key in person:
    print(key, person[key])

for key, value in person.items():
    print(f"{key}: {value}")

for value in person.values():
    print(value)

# Dictionary comprehension
squares = {x: x**2 for x in range(10)}
filtered = {k: v for k, v in person.items() if v != None}

# Merging (Python 3.9+)
merged = {**person, **{"job": "Engineer"}}
person |= {"salary": 100000}  # Update operator

# Common patterns
word_count = {}
for word in words:
    word_count[word] = word_count.get(word, 0) + 1

# Or using defaultdict
from collections import defaultdict
word_count = defaultdict(int)
for word in words:
    word_count[word] += 1

# Get all keys/values as lists
keys = list(person.keys())
values = list(person.values())

# Clear all items
person.clear()

# Set default if key missing
person.setdefault("score", 0)  # Returns 0, adds if not exists

When to Use

  • Need to look up values by key
  • Key-value associations
  • Counting/frequency tables
  • Caching/memoization
  • Configuration data
  • JSON-like data structures

Deque (Double-Ended Queue)

A list-like container optimized for fast appends and pops from both ends.

Import

from collections import deque

Characteristics

  • Double-ended: Efficient operations at both ends
  • Thread-safe: Atomic append and pop operations
  • Bounded: Can set maximum length
  • Not indexed: Slower than list for random access

Operations & Complexity

Operation Example Time Complexity
Append right dq.append(x) O(1)
Append left dq.appendleft(x) O(1)
Pop right dq.pop() O(1)
Pop left dq.popleft() O(1)
Extend right dq.extend(iter) O(k)
Extend left dq.extendleft(iter) O(k)
Rotate dq.rotate(n) O(k)
Access by index dq[i] O(n)
Remove dq.remove(x) O(n)

Common Usage

from collections import deque

# Creation
dq = deque([1, 2, 3])
bounded = deque(maxlen=5)   # Max 5 elements
empty = deque()

# Add/remove from both ends
dq.append(4)                # Right: [1, 2, 3, 4]
dq.appendleft(0)            # Left: [0, 1, 2, 3, 4]
dq.pop()                    # Right: [0, 1, 2, 3]
dq.popleft()                # Left: [1, 2, 3]

# Extend
dq.extend([4, 5])           # [1, 2, 3, 4, 5]
dq.extendleft([0, -1])      # [-1, 0, 1, 2, 3, 4, 5]

# Rotate
dq.rotate(2)                # Rotate right 2: [4, 5, -1, 0, 1, 2, 3]
dq.rotate(-2)               # Rotate left 2: [-1, 0, 1, 2, 3, 4, 5]

# Bounded deque (LRU-like)
recent = deque(maxlen=3)
recent.append(1)            # [1]
recent.append(2)            # [1, 2]
recent.append(3)            # [1, 2, 3]
recent.append(4)            # [2, 3, 4] - oldest popped automatically

# Convert to/from list
as_list = list(dq)
from_list = deque([1, 2, 3])

When to Use

  • Queue implementation (BFS, task scheduling)
  • Stack implementation (when you also need queue features)
  • Sliding window algorithms
  • Recent history tracking (with maxlen)
  • Palindrome checking
  • Undo/redo functionality

Named Tuple

Tuple subclass with named fields. Provides attribute-style access to tuple elements.

Import

from collections import namedtuple

Characteristics

  • Immutable: Like regular tuples
  • Named fields: Access by name or index
  • Memory efficient: More efficient than classes
  • Self-documenting: Field names make code clearer

Common Usage

from collections import namedtuple

# Define a named tuple type
Point = namedtuple('Point', ['x', 'y'])
Person = namedtuple('Person', 'name age city')  # Space-separated

# Create instances
p = Point(3, 5)
person = Person("Alice", 30, "NYC")

# Access by name
print(p.x, p.y)             # 3 5
print(person.name)          # Alice

# Access by index (still works)
print(p[0], p[1])           # 3 5

# Unpack
x, y = p
name, age, city = person

# Convert to dict
person_dict = person._asdict()  # OrderedDict

# Replace (creates new tuple)
older = person._replace(age=31)

# Field names
print(Point._fields)        # ('x', 'y')

# Create from iterable
values = [10, 20]
point = Point(*values)

# Default values (Python 3.7+)
from collections import namedtuple
Node = namedtuple('Node', ['value', 'left', 'right'], defaults=[None, None])
leaf = Node(5)              # left and right default to None

When to Use

  • Simple data classes (before Python 3.7 dataclasses)
  • Return multiple values from functions (more readable than tuple)
  • Replace dictionaries when field names are known
  • CSV or database row representation
  • Lightweight alternative to classes

Dataclass

A decorator that automatically generates boilerplate for classes (Python 3.7+).

Import

from dataclasses import dataclass

Characteristics

  • Mutable (by default, can be frozen)
  • Type hints: Enforces type annotations
  • Less boilerplate: Auto-generates __init__, __repr__, etc.
  • More features than namedtuple: Default values, inheritance, post-init

Common Usage

from dataclasses import dataclass, field
from typing import List

# Basic dataclass
@dataclass
class Point:
    x: float
    y: float

p = Point(3.0, 5.0)
print(p)                    # Point(x=3.0, y=5.0)
p.x = 10                    # Mutable

# With defaults
@dataclass
class Person:
    name: str
    age: int = 0
    city: str = "Unknown"

person = Person("Alice")    # age=0, city="Unknown"

# Frozen (immutable)
@dataclass(frozen=True)
class ImmutablePoint:
    x: float
    y: float

ip = ImmutablePoint(1, 2)
# ip.x = 5                  # Error: frozen

# Field with default factory
@dataclass
class Team:
    name: str
    members: List[str] = field(default_factory=list)

t1 = Team("A")
t2 = Team("B")
t1.members.append("Alice")  # Only affects t1

# Custom initialization
@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)

    def __post_init__(self):
        self.area = self.width * self.height

# Comparison
@dataclass(order=True)
class Student:
    score: int
    name: str

s1 = Student(85, "Alice")
s2 = Student(90, "Bob")
print(s1 < s2)              # True (compares by score first)

When to Use

  • Data-holding classes with minimal logic
  • Need mutability (unlike namedtuple)
  • Want type hints and IDE support
  • Classes with many fields (less boilerplate)
  • Python 3.7+

Frozenset

An immutable version of set. Cannot be modified after creation.

Common Usage

# Creation
fs = frozenset([1, 2, 3, 4])
empty = frozenset()

# Can be used as dict key or set element
cache = {frozenset([1, 2]): "result"}
set_of_sets = {frozenset([1, 2]), frozenset([3, 4])}

# All set operations work
a = frozenset([1, 2, 3])
b = frozenset([3, 4, 5])
union = a | b               # frozenset({1, 2, 3, 4, 5})

# Cannot modify
# fs.add(5)                 # Error!

When to Use

  • Need set operations but data should be immutable
  • Use sets as dictionary keys
  • Use sets as elements of other sets

Comparison Table: Time Complexity

Operation List Tuple Set Dict Deque
Access by index O(1) O(1) O(n)
Search O(n) O(n) O(1) O(1) O(n)
Insert/Delete start O(n) O(1)
Insert/Delete end O(1) O(1) O(1) O(1)
Insert/Delete middle O(n) O(n)

Choosing the Right Data Structure

Need Use
Ordered sequence, frequent index access List
Immutable sequence Tuple
Unique elements, fast membership test Set
Key-value lookup Dict
Fast operations at both ends Deque
Named fields, immutable NamedTuple
Data class with type hints Dataclass
Immutable set for dict keys Frozenset

Common Patterns

Remove duplicates preserving order

# Using dict (Python 3.7+)
unique = list(dict.fromkeys(items))

# Using set (loses order)
unique = list(set(items))

Count occurrences

from collections import Counter
counts = Counter(['a', 'b', 'a', 'c', 'b', 'a'])
# Counter({'a': 3, 'b': 2, 'c': 1})

Group by key

from collections import defaultdict
groups = defaultdict(list)
for item in items:
    groups[item.category].append(item)

LRU Cache

from collections import OrderedDict
cache = OrderedDict()
MAX_SIZE = 100

def get(key):
    if key in cache:
        cache.move_to_end(key)
        return cache[key]
    return None

def put(key, value):
    cache[key] = value
    cache.move_to_end(key)
    if len(cache) > MAX_SIZE:
        cache.popitem(last=False)