Data Structures in Python

2.6. Data Structures in Python#

In programming, a data structure is a specialized format for organizing, processing, retrieving, and storing data. Data structures are fundamental building blocks that let you manage data efficiently. Choosing the right data structure for a given task is crucial for writing performant, readable, and maintainable code.

Python offers several powerful built-in data structures, each with unique properties and use cases. Understanding their characteristics is key to leveraging Python’s full potential.

2.6.1. Lists#

A list is Python’s most versatile and widely used ordered collection of items. Lists are created by enclosing a comma-separated sequence of items within square brackets [].

  • Definition/Purpose: A sequence of items, similar to an array in other languages, but with greater flexibility.

  • Key Characteristics:

    • Ordered: Items are stored in a defined sequence based on their insertion order. This order will not change unless explicitly modified.

    • Mutable: You can change, add, or remove items after the list has been created. This “changeability” is a primary distinction from tuples.

    • Allows Duplicates: Can contain multiple items with the same value.

    • Indexed: Items can be accessed using an integer index, starting from 0 for the first element. Negative indexing allows access from the end (-1 for the last element).

    • Slicing: Supports slicing to extract sub-sequences of elements.

    • Heterogeneous: Can contain items of different data types within the same list (integers, strings, floats, booleans, other lists, tuples, dictionaries, etc.).

    • Dynamic Size: Can grow or shrink as needed.

  • Syntax: [item1, item2, item3, ...]

# Creating a list
empty_list = []
my_numbers = [10, 20, 30, 40, 50]
mixed_list = ["apple", 1, True, 3.14, [6, 7]]

print(f"Empty list: {empty_list}")
print(f"List of numbers: {my_numbers}")
print(f"Mixed list: {mixed_list}")
Empty list: []
List of numbers: [10, 20, 30, 40, 50]
Mixed list: ['apple', 1, True, 3.14, [6, 7]]
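The indexing, negative indexing, slicing, and mutability described above can be illustrated with a minimal sketch. The list contents and variable names here are chosen purely for illustration.

```python
# Indexing and slicing a list (values chosen for illustration)
letters = ["a", "b", "c", "d", "e"]

first = letters[0]           # "a": indices start at 0
last = letters[-1]           # "e": negative indices count from the end
middle = letters[1:4]        # ["b", "c", "d"]: the stop index is excluded
every_other = letters[::2]   # ["a", "c", "e"]: a step of 2 skips every second item

# Mutability: assigning to an index changes the list in place
letters[0] = "z"

print(first, last, middle, every_other)
print(letters)  # ['z', 'b', 'c', 'd', 'e']
```

Note that slicing returns a new list, so middle and every_other are unaffected by the later assignment to letters[0].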

2.6.1.1. Common List Methods#

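Python’s lists come with a rich set of built-in methods for manipulation. Remember that the mutating methods (append(), extend(), insert(), remove(), clear(), sort(), reverse()) modify the list in place and return None rather than a new list; pop() also modifies the list but returns the removed item.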

dir(list)
['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

2.6.1.1.1. append(item): Adds a single item to the end of the list.#

fruits = ["apple", "banana", "cherry"]
print(f"Initial list: {fruits}")

# append()
fruits.append("date")
print(f"After append('date'): {fruits}") 
Initial list: ['apple', 'banana', 'cherry']
After append('date'): ['apple', 'banana', 'cherry', 'date']

2.6.1.1.2. extend(iterable): Adds all items from an iterable (like another list) to the end of the list.#

# extend()
print(f"Initial list: {fruits}")
fruits.extend(["elderberry", "fig"])
print(f"After extend(['elderberry', 'fig']): {fruits}")
Initial list: ['apple', 'banana', 'cherry', 'date']
After extend(['elderberry', 'fig']): ['apple', 'banana', 'cherry', 'date', 'elderberry', 'fig']

2.6.1.1.3. insert(index, item): Inserts an item at a specific index.#

# insert
print(f"Initial list: {fruits}")
fruits.insert(1,'grapes')
print(f"After insert(1,'grapes'): {fruits}")
Initial list: ['apple', 'banana', 'cherry', 'date', 'elderberry', 'fig']
After insert(1,'grapes'): ['apple', 'grapes', 'banana', 'cherry', 'date', 'elderberry', 'fig']

2.6.1.1.4. remove(value): Removes the first occurrence of a specified value. Raises a ValueError if the value is not found.#

# remove()
print(f"Initial list: {fruits}")
fruits.remove("banana")
print(f"After remove('banana'): {fruits}")
Initial list: ['apple', 'grapes', 'banana', 'cherry', 'date', 'elderberry', 'fig']
After remove('banana'): ['apple', 'grapes', 'cherry', 'date', 'elderberry', 'fig']

2.6.1.1.5. clear(): Removes all items from the list, making it empty.#

print(f"Initial list: {fruits}")
fruits.clear()
print(f'After clear(): {fruits}')
Initial list: ['apple', 'grapes', 'cherry', 'date', 'elderberry', 'fig']
After clear(): []

2.6.1.1.6. pop([index]): Removes the item at the given index and returns it. If no index is specified, it removes and returns the last item.#

# pop()
fruits = ["apple", "banana", "cherry"]
print(f"Initial list: {fruits}")
last_fruit = fruits.pop() 
first_fruit = fruits.pop(0) 
print(f"Popped last: {last_fruit}, Popped first: {first_fruit}, List: {fruits}")
Initial list: ['apple', 'banana', 'cherry']
Popped last: cherry, Popped first: apple, List: ['banana']

2.6.1.1.7. index(value): Returns the index of the first occurrence of a value. Raises a ValueError if the value is not found.#

# index() and count()
numbers = [1, 5, 8, 5, 2, 5, 7, 7]
print(f"Index of first '5': {numbers.index(5)}") 
Index of first '5': 1

2.6.1.1.8. count(value): Returns the number of times a value appears in the list.#

print(f"Count of '5': {numbers.count(5)}")
Count of '5': 3

2.6.1.1.9. sort(): Sorts the list’s items in-place in ascending order. You can use reverse=True for descending order.#

# sort()
numbers = [1, 5, 8, 5, 2, 5, 7, 7]
print(f"Initial list of numbers: {numbers}")
numbers.sort()
print(f"After sort(): {numbers}") 
Initial list of numbers: [1, 5, 8, 5, 2, 5, 7, 7]
After sort(): [1, 2, 5, 5, 5, 7, 7, 8]
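As a small sketch of the reverse=True option mentioned in the heading, and of the difference between sort() and the built-in sorted() function, consider the following. The word list and variable names are illustrative assumptions.

```python
numbers = [1, 5, 8, 5, 2]

# sort() works in place and returns None, so its result should not be assigned
result = numbers.sort(reverse=True)
print(result)    # None
print(numbers)   # [8, 5, 5, 2, 1]

# sorted() leaves the original untouched and returns a new list;
# key=len sorts by string length instead of alphabetically
words = ["banana", "fig", "apple"]
by_length = sorted(words, key=len)
print(words)      # ['banana', 'fig', 'apple']
print(by_length)  # ['fig', 'apple', 'banana']
```

A common mistake is writing numbers = numbers.sort(), which replaces the list with None.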

2.6.1.1.10. reverse(): Reverses the order of the list’s items in-place.#

# reverse()
numbers = [1, 5, 8, 5, 2, 5, 7, 7]
print(f"Initial list of numbers: {numbers}")
numbers.reverse()
print(f"After reverse(): {numbers}") 
Initial list of numbers: [1, 5, 8, 5, 2, 5, 7, 7]
After reverse(): [7, 7, 5, 2, 5, 8, 5, 1]

2.6.1.1.11. copy(): Returns a shallow copy of the list. This is important to avoid modifying the original list unintentionally.#

# copy()
# A shallow copy creates a new list, but references to nested objects are shared.
original = [[1, 2], 3, 4]
copied = original.copy()
copied[0][0] = 99 # This will change the original list too, as the inner list is shared!
print(f"Original after modifying copy: {original}") 
print(f"Copied list: {copied}") 
Original after modifying copy: [[99, 2], 3, 4]
Copied list: [[99, 2], 3, 4]
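When nested objects must not be shared, one option is copy.deepcopy() from the standard library, which copies the nested objects recursively. The sketch below reuses the same data as the example above.

```python
import copy

original = [[1, 2], 3, 4]

# deepcopy() duplicates the inner list as well, so nothing is shared
deep = copy.deepcopy(original)
deep[0][0] = 99

print(f"Original after modifying deep copy: {original}")  # [[1, 2], 3, 4]
print(f"Deep copy: {deep}")                               # [[99, 2], 3, 4]
```

Deep copies cost more time and memory than shallow ones, so they are usually reserved for cases where nested data really is modified independently.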

2.6.1.1.12. Using the del statement#

The del keyword is a statement, not a list method (which is why it’s not called like del()). It is a general-purpose statement in Python used to delete objects, including items from a list at a specified index or a slice.

  • Syntax: del list[index] or del list[start:end]

  • Behavior: It removes the item(s) from the list at the specified position(s) and does not return the removed item(s).

# del statement (removes by index)
print(f"Using 'del' statement:")
colors = ['red', 'green', 'blue', 'yellow']
print(f"Original list: {colors}")
del colors[1]
print(f"After 'del colors[1]': {colors}") 
del colors[1:3]
print(f"After 'del colors[1:3]': {colors}") 
Using 'del' statement:
Original list: ['red', 'green', 'blue', 'yellow']
After 'del colors[1]': ['red', 'blue', 'yellow']
After 'del colors[1:3]': ['red']

2.6.1.2. Using Lists as Stacks#

A Stack is a data structure that follows the LIFO (Last-In, First-Out) principle. The last item added is the first one to be removed. You can easily implement a stack using a list.

  • Pushing an item: Use append() to add an item to the top of the stack.

  • Popping an item: Use pop() to remove and return the item from the top of the stack.

# Create a stack of tasks
task_stack = []

# Push items onto the stack (append)
task_stack.append("Task 1: Prepare slides")
task_stack.append("Task 2: Write code")
task_stack.append("Task 3: Review documentation")

print(f"Initial stack: {task_stack}")
Initial stack: ['Task 1: Prepare slides', 'Task 2: Write code', 'Task 3: Review documentation']
# Pop items from the stack (the last one added is removed first)
last_task = task_stack.pop()
print(f"Popped task: '{last_task}'") 
current_task = task_stack.pop()
print(f"Popped task: '{current_task}'")

print(f"Stack after popping: {task_stack}") 
Popped task: 'Task 3: Review documentation'
Popped task: 'Task 2: Write code'
Stack after popping: ['Task 1: Prepare slides']

2.6.1.3. Using Lists as Queues#

A Queue is a data structure that follows the FIFO (First-In, First-Out) principle. The first item added is the first one to be removed.

  • Enqueuing (adding): Use append() to add an item to the end of the queue.

  • Dequeuing (removing): Use pop(0) to remove and return the item from the beginning of the queue.

Important Performance Note: For large lists, pop(0) is very inefficient. Removing the first element requires shifting all other elements one position to the left, which can be a slow operation (O(n) time complexity).

Recommendation: For a true and efficient queue, use collections.deque. It’s a double-ended queue designed for fast appends and pops from both ends.

from collections import deque

# Using deque for an efficient queue
customer_queue = deque(["Customer A", "Customer B", "Customer C"])

print(f"Initial queue: {customer_queue}")

# Enqueue an item (append to the right)
customer_queue.append("Customer D")
print(f"After adding 'Customer D': {customer_queue}")

# Dequeue an item (pop from the left)
next_customer = customer_queue.popleft()
print(f"Serving customer: '{next_customer}'")

print(f"Queue after serving: {customer_queue}")
Initial queue: deque(['Customer A', 'Customer B', 'Customer C'])
After adding 'Customer D': deque(['Customer A', 'Customer B', 'Customer C', 'Customer D'])
Serving customer: 'Customer A'
Queue after serving: deque(['Customer B', 'Customer C', 'Customer D'])
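For comparison, the list-based approach described at the start of this section (append() to enqueue, pop(0) to dequeue) can be sketched as follows. It gives the same FIFO behavior and is acceptable for small queues; the names are illustrative.

```python
# A plain list used as a queue (fine for small sizes, O(n) per dequeue)
list_queue = ["Customer A", "Customer B"]

list_queue.append("Customer C")   # enqueue at the end
served = list_queue.pop(0)        # dequeue from the front, shifting the rest left

print(f"Serving customer: '{served}'")       # Serving customer: 'Customer A'
print(f"Queue after serving: {list_queue}")  # ['Customer B', 'Customer C']
```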

2.6.1.4. List Comprehensions#

List comprehensions provide a concise and elegant way to create lists. They are a more readable and often faster alternative to using for loops with append().

  • Syntax: [expression for item in iterable if condition]

# Old way: Using a for loop
squares_old = []
for i in range(1, 6):
    squares_old.append(i**2)
print(f"Squares (old way): {squares_old}")

# New way: Using a list comprehension
squares_new = [i**2 for i in range(1, 6)]
print(f"Squares (new way): {squares_new}")

# List comprehension with a conditional (filtering)
even_squares = [i**2 for i in range(1, 11) if i % 2 == 0]
print(f"Even numbers squared: {even_squares}") 
Squares (old way): [1, 4, 9, 16, 25]
Squares (new way): [1, 4, 9, 16, 25]
Even numbers squared: [4, 16, 36, 64, 100]

2.6.1.5. Nested List Comprehensions#

You can also use list comprehensions to create nested lists, often replacing nested loops. This is useful for creating matrices or grids.

# Create a 3x3 matrix (list of lists)
matrix = [[j for j in range(3)] for i in range(3)]
print(f"3x3 matrix: {matrix}")

# Flatten a nested list into a single list
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flattened_list = [num for sublist in nested_list for num in sublist]
print(f"Flattened list: {flattened_list}") 
3x3 matrix: [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
Flattened list: [1, 2, 3, 4, 5, 6, 7, 8, 9]
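Another common use of a nested comprehension is transposing a matrix, where the outer comprehension walks over column indices and the inner one collects that column from every row. This is a minimal sketch that assumes all rows have the same length.

```python
# Transpose a 2x3 matrix into a 3x2 matrix
grid = [[1, 2, 3],
        [4, 5, 6]]

# Outer loop: one new row per original column; inner loop: pick that column from each row
transposed = [[row[col] for row in grid] for col in range(len(grid[0]))]

print(f"Transposed: {transposed}")  # [[1, 4], [2, 5], [3, 6]]
```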

2.6.2. Tuples#

A tuple is an ordered, immutable collection of items. Tuples are defined by enclosing comma-separated values within parentheses ().

  • Definition/Purpose: Often used for fixed collections of items where the content is not expected to change, such as coordinates, database records, or function arguments.

  • Key Characteristics:

    • Ordered: Items have a defined order, similar to lists.

    • Immutable: This is its key feature. Once a tuple is created, you cannot change its elements, add new ones, or remove existing ones.

    • Allows Duplicates: Can hold multiple items with the same value.

    • Indexed: Items can be accessed using an integer index.

    • Slicing: Supports slicing to extract sub-sequences.

    • Heterogeneous: Can contain items of different data types.

    • Fixed Size: Its size is determined at creation and cannot change.

  • Syntax: (item1, item2, item3, ...). For a single-item tuple, you must include a trailing comma: (item,)

# Creating a tuple
empty_tuple = ()
my_tuple = ("red", "green", "blue", 123,1, "green")
single_item_tuple = (42,) # Note the comma for single item tuple
another_tuple = "apple", "banana" # Parentheses are often optional during creation

print(f"Empty tuple: {empty_tuple}")
print(f"My tuple: {my_tuple}")
print(f"Single item tuple: {single_item_tuple}")
print(f"Another tuple: {another_tuple}")
Empty tuple: ()
My tuple: ('red', 'green', 'blue', 123, 1, 'green')
Single item tuple: (42,)
Another tuple: ('apple', 'banana')
# Accessing elements
print(f"Accessing Elements:")
print(f"First element (my_tuple[0]): {my_tuple[0]}")
print(f"Last element (my_tuple[-1]): {my_tuple[-1]}")
Accessing Elements:
First element (my_tuple[0]): red
Last element (my_tuple[-1]): green
# --- Attempting to modify (will cause a TypeError!) ---
print(f"Attempting to Modify:")
try:
    my_tuple[0] = "yellow"
except TypeError as e:
    print(f"Error! Tuples are immutable: {e}")
Attempting to Modify:
Error! Tuples are immutable: 'tuple' object does not support item assignment
# Slicing tuples (returns a new tuple)
print(f"Slicing Tuples:")
print(f"Sliced tuple (my_tuple[1:3]): {my_tuple[1:3]}") 
Slicing Tuples:
Sliced tuple (my_tuple[1:3]): ('green', 'blue')
# Tuple packing and unpacking
print(f"Tuple Packing and Unpacking:")
coordinates = (10, 20) # Packing
x, y = coordinates     # Unpacking
print(f"Coordinates: x={x}, y={y}")
Tuple Packing and Unpacking:
Coordinates: x=10, y=20
# Functions returning multiple values often return them as a tuple
def get_min_max(numbers):
    return min(numbers), max(numbers)

min_val, max_val = get_min_max([10, 5, 20, 1])
print(f"Min: {min_val}, Max: {max_val}")
Min: 1, Max: 20
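One subtlety of the immutability described above is that it applies to the tuple itself, not to the objects it holds. The following sketch, with illustrative names, shows that a list inside a tuple can still be changed, and that such a tuple is no longer hashable.

```python
# The tuple cannot be reassigned, but the list it contains is still mutable
record = ("Alice", [90, 85])
record[1].append(70)
print(record)  # ('Alice', [90, 85, 70])

# A tuple of immutable items is hashable, so it can serve as a dictionary key
point = (10, 20)
labels = {point: "start"}
print(labels[(10, 20)])  # start

# A tuple containing a list is not hashable
try:
    hash(record)
    record_is_hashable = True
except TypeError:
    record_is_hashable = False
print(f"record is hashable: {record_is_hashable}")  # record is hashable: False
```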

2.6.2.1. Common Tuple Methods#

dir(tuple)
['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'count',
 'index']

2.6.2.1.1. index(value): Returns the index of the first occurrence of a value. Raises a ValueError if the value is not found.#

my_tuple.index("green")
print(f'Index of green value in tuple is: {my_tuple.index("green")}') 
Index of green value in tuple is: 1

2.6.2.1.2. count(value): Returns the number of times a value appears in the tuple.#

my_tuple.count("green")
print(f'Count of green value in tuple is: {my_tuple.count("green")}') 
Count of green value in tuple is: 2

2.6.3. Sets#

A set is an unordered collection of unique items. Sets can be created using curly braces {} or the set() constructor.

  • Definition/Purpose: Primarily used for membership testing, removing duplicates from a sequence, and mathematical set operations (union, intersection, difference).

  • Key Characteristics:

    • Unordered: Items do not have a defined order and cannot be accessed by index or key. The order of elements when printed might vary.

    • Mutable: You can add or remove items from a set, but you cannot change individual elements in place (as elements are not indexed).

    • No Duplicates: Automatically removes duplicate elements. If you add an existing element, the set remains unchanged.

    • Unindexed: Does not support indexing or slicing.

    • Elements Must Be Hashable: Elements must be immutable and hashable (like numbers, strings, tuples containing only immutable types). Lists or dictionaries cannot be direct elements of a set.

    • Heterogeneous: Can contain items of different hashable data types within the same set (integers, strings, floats, booleans, tuples, frozensets, etc.).

    • Dynamic Size: Can grow or shrink.

  • Syntax: {item1, item2, ...} (use set() to create an empty set, as {} creates an empty dictionary).

# Creating a set with duplicate values
my_set = {"apple", "banana", "cherry", "apple", "banana"}
print(f"Set with duplicates removed: {my_set}")
Set with duplicates removed: {'apple', 'cherry', 'banana'}
# Creating an empty set
empty_set = set()
print(f"Empty set: {empty_set}")
Empty set: set()
# We can use set() to create a set from other data structures
fruits_list = ['banana', 'apple', 'cherry','banana', 'cherry'] 
fruits_list
['banana', 'apple', 'cherry', 'banana', 'cherry']
fruits_set = set(fruits_list)
fruits_set
{'apple', 'banana', 'cherry'}
# we can use len
len(my_set)
3
# Since sets are not indexed, use the in operator to check whether an element is present
# (very fast!)
'orange' in my_set
False
# we can use for loops to print each element
for fruits in my_set:
    print(fruits.upper())
APPLE
CHERRY
BANANA
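The hashability requirement listed among the key characteristics can be seen in a short sketch. The values are illustrative; the exact wording of the TypeError message varies between Python versions, so it is printed rather than assumed.

```python
# Hashable elements of different types can share one set
valid = {1, "two", 3.0, (4, 5)}
print(len(valid))  # 4

# A list is mutable and unhashable, so it cannot be a set element
try:
    invalid = {1, [2, 3]}
except TypeError as e:
    print(f"Cannot build the set: {e}")

# One workaround: convert inner lists to tuples before adding them
rows = [[1, 2], [3, 4], [1, 2]]
unique_rows = {tuple(row) for row in rows}
print(sorted(unique_rows))  # [(1, 2), (3, 4)]
```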

2.6.3.1. Common Set Methods#

dir(set)
['__and__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__iand__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__isub__',
 '__iter__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__rand__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__ror__',
 '__rsub__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__xor__',
 'add',
 'clear',
 'copy',
 'difference',
 'difference_update',
 'discard',
 'intersection',
 'intersection_update',
 'isdisjoint',
 'issubset',
 'issuperset',
 'pop',
 'remove',
 'symmetric_difference',
 'symmetric_difference_update',
 'union',
 'update']

2.6.3.1.1. add(value): This operation incorporates a single, new element into the set. If the element is already present, the set remains unchanged.#

# Adding elements
print(f"Initial set {my_set}:")
my_set.add("orange")
my_set.add("cherry") # Adding an existing element has no effect
print(f"Set after adding 'orange' and 'cherry': {my_set}")
Initial set {'apple', 'cherry', 'banana'}:
Set after adding 'orange' and 'cherry': {'apple', 'cherry', 'orange', 'banana'}

2.6.3.1.2. remove(value): This operation removes a specified element from the set. If the element is not found in the set, it raises a KeyError.#

# Removing elements
print(f"Initial set {my_set}:")
my_set.remove("banana") # Removes a specific element
print(f"Set after removing 'banana': {my_set}")
Initial set {'apple', 'cherry', 'orange', 'banana'}:
Set after removing 'banana': {'apple', 'cherry', 'orange'}
print(f"Initial set {my_set}:")
try:
    my_set.remove("grape")
except KeyError as e:
    print(f"Error! Element not found in the set: {e}")
Initial set {'apple', 'cherry', 'orange'}:
Error! Element not found in the set: 'grape'

2.6.3.1.3. discard(value): This operation removes a specified element from the set if it is present. Unlike remove, it does not raise an error if the element is not found.#

my_set = {"apple", "banana", "cherry", "apple", "banana"}
print(f"Initial set {my_set}:")
my_set.discard("grape") # Removes 'grape' if it exists, but does not raise an error if not found
print(f"Set after discarding 'grape': {my_set}")
Initial set {'apple', 'cherry', 'banana'}:
Set after discarding 'grape': {'apple', 'cherry', 'banana'}

2.6.3.1.4. pop(): This operation takes no arguments; it removes and returns an arbitrary element from the set. Since sets are unordered, there’s no way to predict which element will be removed. A KeyError is raised if the set is empty.#

# pop
my_set = {"apple", "banana", "cherry", "apple", "banana"}
print(f"Initial set {my_set}:")
popped_item = my_set.pop() # Removes and returns an arbitrary element
print(f"Popped item: {popped_item}, Set: {my_set}")
Initial set {'apple', 'cherry', 'banana'}:
Popped item: apple, Set: {'cherry', 'banana'}
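The empty-set case mentioned in the heading can be sketched as follows. The set contents and names are illustrative; checking the set's truthiness before calling pop() is one way to avoid the exception.

```python
pending = {"task"}

item = pending.pop()   # the only element, so the result is predictable here
print(f"Popped: {item}, remaining: {pending}")  # Popped: task, remaining: set()

# pop() on an empty set raises KeyError
try:
    pending.pop()
    raised = False
except KeyError:
    raised = True
print(f"KeyError raised on empty set: {raised}")  # KeyError raised on empty set: True

# An empty set is falsy, so this guard skips the call safely
if pending:
    pending.pop()
```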

2.6.3.1.5. clear(): This operation removes all elements from the set, resulting in an empty set.#

my_set = {"apple", "banana", "cherry", "apple", "banana"}
print(f"Initial set {my_set}:")
my_set.clear() # clear() empties the set in place and returns None
print(f"After clear(): {my_set}")
Initial set {'apple', 'cherry', 'banana'}:
After clear(): set()

2.6.3.1.6. copy(): This operation creates a new set that is an exact duplicate of the original set. Changes made to the new set will not affect the original.#

my_set = {"apple", "banana", "cherry", "apple", "banana"}
print(f"Initial set {my_set}:")
copied = my_set.copy()
print(f"Copied set: {copied}") 
Initial set {'apple', 'cherry', 'banana'}:
Copied set: {'apple', 'cherry', 'banana'}
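To make the independence of the copy concrete, the sketch below contrasts copy() with plain assignment, which only creates a second name for the same set. Names and values are illustrative, and sorted() is used so the printed order is predictable.

```python
original = {"apple", "banana"}

duplicate = original.copy()   # a new, independent set
alias = original              # just another name for the same set

duplicate.add("cherry")       # does not affect original
alias.add("date")             # does affect original

print(sorted(original))   # ['apple', 'banana', 'date']
print(sorted(duplicate))  # ['apple', 'banana', 'cherry']
print(alias is original)  # True
```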

2.6.3.1.7. update(): This operation adds all elements from one or more iterables (like another set, list, or tuple) into the original set. Duplicate elements are ignored.#

my_items = {"A", "B"}
new_items = ["B", "C", "D"] # Can update from a list
more_items = {"D", "E"}     # Can update from another set

print(f"My Items: {my_items}, New Items: {new_items}, More Items: {more_items}")

my_items.update(new_items, more_items)
print(f"After update(new_items, more_items): {my_items}") 
My Items: {'B', 'A'}, New Items: ['B', 'C', 'D'], More Items: {'E', 'D'}
After update(new_items, more_items): {'A', 'D', 'E', 'B', 'C'}

2.6.3.1.8. union(other_set) or |: This operation returns a new set containing all unique elements from both the original set and another specified set(s).#

# |= is the in-place form of union (equivalent to update())
my_items = {"A", "B"}
my_items |= {"C", "D"}
print(f"After |= {{'C', 'D'}}:\n{my_items}")
After |= {'C', 'D'}:
{'B', 'A', 'D', 'C'}
all_fruits = {"apple", "banana"}
tropical_fruits = {"banana", "mango"}
berries = {"strawberry", "blueberry"}

print(f"All Fruits: {all_fruits}, \nTropical Fruits: {tropical_fruits}, \nBerries: {berries}")

combined_fruits = all_fruits.union(tropical_fruits)
print(f"Union (all_fruits.union(tropical_fruits)): {combined_fruits}") 

all_three = all_fruits.union(tropical_fruits, berries)
print(f"Union (all_fruits.union(tropical_fruits, berries)): {all_three}")

union_operator = all_fruits | tropical_fruits
print(f"Union (all_fruits | tropical_fruits): {union_operator}")
All Fruits: {'apple', 'banana'}, 
Tropical Fruits: {'mango', 'banana'}, 
Berries: {'strawberry', 'blueberry'}
Union (all_fruits.union(tropical_fruits)): {'apple', 'mango', 'banana'}
Union (all_fruits.union(tropical_fruits, berries)): {'apple', 'banana', 'strawberry', 'mango', 'blueberry'}
Union (all_fruits | tropical_fruits): {'apple', 'mango', 'banana'}
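One practical difference between the method and the operator is worth a short sketch: union() accepts any iterable, while | requires both operands to be sets. The names below are illustrative.

```python
base = {"apple", "banana"}

# The method form accepts any iterable, such as a list
combined = base.union(["mango", "apple"])
print(sorted(combined))  # ['apple', 'banana', 'mango']

# The operator form only works between sets (or frozensets)
try:
    base | ["mango"]
    operator_failed = False
except TypeError:
    operator_failed = True
print(f"set | list raised TypeError: {operator_failed}")  # set | list raised TypeError: True

# Neither form modifies the original set
print(sorted(base))  # ['apple', 'banana']
```

The same distinction applies to the other set operations covered below, such as intersection() versus & and difference() versus -.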

2.6.3.1.9. difference(other_set) or -: This operation returns a new set containing all elements that are present in the first set but not in another specified set(s).#

# difference
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
set_c = {4, 7}
print(f"Set A: {set_a}, Set B: {set_b}, Set C: {set_c}")

diff_ab = set_a.difference(set_b)
print(f"Elements in A but not in B (A.difference(B)): {diff_ab}") 

diff_abc = set_a.difference(set_b, set_c)
print(f"Elements in A but not in B or C (A.difference(B, C)): {diff_abc}") 

diff_operator = set_a - set_b
print(f"Elements in A but not in B (A - B): {diff_operator}") 
Set A: {1, 2, 3, 4}, Set B: {3, 4, 5, 6}, Set C: {4, 7}
Elements in A but not in B (A.difference(B)): {1, 2}
Elements in A but not in B or C (A.difference(B, C)): {1, 2}
Elements in A but not in B (A - B): {1, 2}
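Unlike union, difference is not symmetric: the order of the operands matters. A minimal sketch using the same sets as above:

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}

a_minus_b = set_a - set_b   # elements only in A
b_minus_a = set_b - set_a   # elements only in B

print(sorted(a_minus_b))  # [1, 2]
print(sorted(b_minus_a))  # [5, 6]
```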

2.6.3.1.10. difference_update(other_set) or -=: This operation modifies the original set by removing all elements that are also present in another specified set(s). The -= operator can be used for the same purpose.#

# difference_update
set_a.difference_update(set_b)
print(f"Remove elements of B on A using A.difference_update(B): {set_a}") 

set_a -= set_b
print(f"Remove elements of set_b from set_a using operator -=: {set_a}") # {1, 2}
Remove elements of B on A using A.difference_update(B): {1, 2}
Remove elements of set_b from set_a using operator -=: {1, 2}

2.6.3.1.11. intersection(other_set) or &: This operation returns a new set containing only the elements that are common to both the original set and another specified set(s).#

# intersection
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
set_c = {4, 7}
print(f"Set A: {set_a}, Set B: {set_b}, Set C: {set_c}")

common_ab = set_a.intersection(set_b)
print(f"Common elements of A and B (A.intersection(B)): {common_ab}") # {3, 4}

common_abc = set_a.intersection(set_b, set_c)
print(f"Common elements of A, B, and C (A.intersection(B, C)): {common_abc}") # {4}

common_operator = set_a & set_b
print(f"Common elements of A and B (A & B): {common_operator}") # {3, 4}
Set A: {1, 2, 3, 4}, Set B: {3, 4, 5, 6}, Set C: {4, 7}
Common elements of A and B (A.intersection(B)): {3, 4}
Common elements of A, B, and C (A.intersection(B, C)): {4}
Common elements of A and B (A & B): {3, 4}

2.6.3.1.12. intersection_update(other_set) or &=: This operation modifies the original set to contain only the elements that are common to both the original set and another specified set(s).#

# intersection_update
set_a.intersection_update(set_b)
print(f"Update elements of A with common elements of A,B using A.intersection_update(B): {set_a}") # {3, 4}

set_a &= set_b
print(f"Update elements of A with common elements of A,B using operator &=: {set_a}") # {3, 4}
Update elements of A with common elements of A,B using A.intersection_update(B): {3, 4}
Update elements of A with common elements of A,B using operator &=: {3, 4}

2.6.3.1.13. isdisjoint(other_set): This operation checks if two sets have no elements in common. It returns True if they are disjoint, and False otherwise.#

# isdisjoint
set_evens = {2, 4, 6}
set_odds = {1, 3, 5}
set_primes = {2, 3, 5}

print(f"Set Evens: {set_evens}, \nSet Odds: {set_odds}, \nSet Primes: {set_primes}")

print(f"Are Evens and Odds disjoint? {set_evens.isdisjoint(set_odds)}") 
print(f"Are Evens and Primes disjoint? {set_evens.isdisjoint(set_primes)}")
Set Evens: {2, 4, 6}, 
Set Odds: {1, 3, 5}, 
Set Primes: {2, 3, 5}
Are Evens and Odds disjoint? True
Are Evens and Primes disjoint? False

2.6.3.1.14. issubset(other_set) or <=: This operation checks if all elements of one set are also present in another set. It returns True if the first set is a subset of the second, and False otherwise.#

# issubset
main_set = {1, 2, 3, 4, 5}
sub_set = {2, 3}
non_sub_set = {1, 6}

print(f"Main Set: {main_set}, \nSub Set: {sub_set}, \nNon Sub Set: {non_sub_set}")
print(f"Is {sub_set} a subset of {main_set}? {sub_set.issubset(main_set)}") 
print(f"Is {sub_set} <= {main_set}? {sub_set <= main_set}") 
print(f"Is {non_sub_set} a subset of {main_set}? {non_sub_set.issubset(main_set)}") 
Main Set: {1, 2, 3, 4, 5}, 
Sub Set: {2, 3}, 
Non Sub Set: {1, 6}
Is {2, 3} a subset of {1, 2, 3, 4, 5}? True
Is {2, 3} <= {1, 2, 3, 4, 5}? True
Is {1, 6} a subset of {1, 2, 3, 4, 5}? False
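Note that every set is a subset of itself, so <= returns True for equal sets. Python also provides the < operator for a proper subset, which additionally requires the two sets to differ. A short sketch with illustrative names:

```python
small = {1, 2}
same = {1, 2}
large = {1, 2, 3}

print(small <= same)   # True: equal sets are subsets of each other
print(small < same)    # False: a proper subset must be strictly smaller
print(small < large)   # True
```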

2.6.3.1.15. issuperset(other_set) or >=: This operation checks if one set contains all the elements of another set. It returns True if the first set is a superset of the second, and False otherwise.#

# issuperset
main_set = {1, 2, 3, 4, 5}
sub_set = {2, 3}

print(f"Main Set: {main_set}, \nSub Set: {sub_set}")

print(f"Is {main_set} a superset of {sub_set}? {main_set.issuperset(sub_set)}") 
print(f"Is {main_set} >= {sub_set}? {main_set >= sub_set}")
print(f"Is {sub_set} a superset of {main_set}? {sub_set.issuperset(main_set)}")
Main Set: {1, 2, 3, 4, 5}, 
Sub Set: {2, 3}
Is {1, 2, 3, 4, 5} a superset of {2, 3}? True
Is {1, 2, 3, 4, 5} >= {2, 3}? True
Is {2, 3} a superset of {1, 2, 3, 4, 5}? False

2.6.3.1.16. symmetric_difference(other_set) or ^: This operation returns a new set containing all elements that are in either of two sets, but not in their intersection (i.e., elements unique to each set).#

# symmetric_difference
set_x = {1, 2, 3, 4}
set_y = {3, 4, 5, 6}

print(f"Set X: {set_x}, \nSet Y: {set_y}")

sym_diff = set_x.symmetric_difference(set_y)
print(f"Symmetric difference (X.symmetric_difference(Y)): {sym_diff}") 
sym_diff_operator = set_x ^ set_y
print(f"Symmetric difference (X ^ Y): {sym_diff_operator}") 
my_data = {1, 2, 3}
new_data = {3, 4, 5}
print(f"My Data: {my_data}, \nNew Data: {new_data}")
Set X: {1, 2, 3, 4}, 
Set Y: {3, 4, 5, 6}
Symmetric difference (X.symmetric_difference(Y)): {1, 2, 5, 6}
Symmetric difference (X ^ Y): {1, 2, 5, 6}
My Data: {1, 2, 3}, 
New Data: {3, 4, 5}

2.6.3.1.17. symmetric_difference_update(other_set) or ^=: This operation modifies the original set to contain only the elements that are in either of the original set or another specified set, but not in their intersection.#

# symmetric_difference_update(other_set)
my_data = {1, 2, 3}
another_data = {2, 6}
my_data.symmetric_difference_update(another_data)
print(f"After symmetric_difference_update(another_data): {my_data}") 

# Using the operator
my_data = {1, 2, 3}
another_data = {2, 6}
my_data ^= another_data
print(f"After ^= another_data: {my_data}")
After symmetric_difference_update(another_data): {1, 3, 6}
After ^= another_data: {1, 3, 6}

2.6.3.2. frozenset#

A frozenset is an immutable version of a set. Once created, you cannot add or remove elements. This makes frozensets hashable, meaning they can be used as elements in other sets or as keys in dictionaries.

# Creating a frozenset
immutable_set = frozenset([1, 2, 3, 2])
print(f"Frozenset: {immutable_set}")
print(f"Type of frozenset: {type(immutable_set)}")

# Attempting to add/remove (will raise an AttributeError)
try:
    immutable_set.add(4)
except AttributeError as e:
    print(f"Error! Frozensets are immutable: {e}")

# Frozensets can be elements of a regular set
nested_set = {frozenset([1, 2]), frozenset([3, 4])}
print(f"Nested set with frozensets: {nested_set}")
Frozenset: frozenset({1, 2, 3})
Type of frozenset: <class 'frozenset'>
Error! Frozensets are immutable: 'frozenset' object has no attribute 'add'
Nested set with frozensets: {frozenset({3, 4}), frozenset({1, 2})}
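The other use mentioned above, a frozenset as a dictionary key, can be sketched as follows. The scenario (a distance between an unordered pair of cities) and all names are hypothetical, chosen only to show why an order-independent key is convenient.

```python
# Keys are unordered pairs, so ("A", "B") and ("B", "A") refer to the same entry
pair_distance = {frozenset(["A", "B"]): 5}

lookup = pair_distance[frozenset(["B", "A"])]
print(f"Distance between B and A: {lookup}")  # Distance between B and A: 5

# Non-mutating set operations still work and return a new frozenset
merged = frozenset([1, 2]) | frozenset([2, 3])
print(sorted(merged))  # [1, 2, 3]
print(type(merged))    # <class 'frozenset'>
```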

2.6.4. Dictionaries#

A dict (dictionary) is a collection of key-value pairs in which each unique key maps to a specific value. Dictionaries are typically created using curly braces {}, with each key separated from its value by a colon : and individual pairs separated by commas.

  • Definition/Purpose: Ideal for storing data where each piece of information is associated with a unique identifier (the key), allowing for very fast lookups.

  • Key Characteristics:

    • Ordered (since Python 3.7): Items are kept in the order they were inserted. Insertion order is guaranteed by the language from Python 3.7 onward; earlier versions made no such guarantee.

    • Mutable: You can add new key-value pairs, remove existing ones, and change the values associated with keys.

    • Keys are Unique: Each key must be unique within a dictionary. If you assign a new value to an existing key, it overwrites the old value.

    • Keys Must Be Hashable: Keys must be hashable, which in practice usually means an immutable type (e.g., strings, numbers, tuples containing only immutable types). Lists and dictionaries cannot be keys.

    • Values Can Be Anything: Values can be of any data type and can be duplicates.

    • Mapped: You access values using their associated keys, not a numerical index.

    • Dynamic Size: Can grow or shrink as needed.

  • Syntax: {key1: value1, key2: value2, ...}

# Creating a dictionary
empty_dict = {}
my_dict = {
    "name": "Alice",
    "age": 30,
    "city": "New York",
    "occupation": "Engineer",
    "skills": ["Python", "SQL", "ML"] # Value can be a list
}

print(f"Empty dictionary: {empty_dict}")
print(f"Person dictionary: {my_dict}")
Empty dictionary: {}
Person dictionary: {'name': 'Alice', 'age': 30, 'city': 'New York', 'occupation': 'Engineer', 'skills': ['Python', 'SQL', 'ML']}
# Checking for key existence
print(f"\nChecking Key Existence:")
print(f"Is 'name' in person? {'name' in my_dict}") # Output: True
print(f"Is 'salary' in person? {'salary' in my_dict}") # Output: False
Checking Key Existence:
Is 'name' in person? True
Is 'salary' in person? False
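
The key rules listed above can be illustrated with a short sketch. The names grid and bad_key are made up for this example; it uses tuples as keys, shows the TypeError raised for a list key, and overwrites the value of an existing key.

```python
# Tuples of immutable items are hashable, so they can serve as keys
grid = {(0, 0): "origin", (1, 2): "point A"}
print(f"Value at (1, 2): {grid[(1, 2)]}")

# Lists are mutable and unhashable, so using one as a key fails
bad_key = [1, 2]
try:
    grid[bad_key] = "point B"
except TypeError as e:
    print(f"Error! Unhashable key: {e}")

# Assigning to an existing key overwrites the old value
grid[(0, 0)] = "start"
print(f"Grid after overwrite: {grid}")
```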

2.6.4.1. Common Dictionary Methods#

dir(dict)
['__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__ior__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__or__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__ror__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

2.6.4.1.1. clear(): Removes all key-value pairs from the dictionary.#

# clear
print(f"Original dictionary: {my_dict}") 
my_dict.clear()
print(f"Dictionary after clear(): {my_dict}")  # {}
Original dictionary: {'name': 'Alice', 'age': 30, 'city': 'New York', 'occupation': 'Engineer', 'skills': ['Python', 'SQL', 'ML']}
Dictionary after clear(): {}

2.6.4.1.2. copy(): Returns a shallow copy of the dictionary. This means a new dictionary is created, but nested mutable objects (like lists or other dictionaries) within the original will still be referenced by the copy.#

# copy
original_dict = {'a': 1, 'b': [2, 3]}
copied_dict = original_dict.copy()
print(f"Original dictionary: {original_dict}") 
print(f"Copied dictionary: {copied_dict}")    
Original dictionary: {'a': 1, 'b': [2, 3]}
Copied dictionary: {'a': 1, 'b': [2, 3]}
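
The printout above looks identical for both dictionaries, so it does not yet show what "shallow" means. The following sketch (variable names are illustrative) mutates the nested list after copying and compares the result with copy.deepcopy() from the standard library, which copies nested objects as well.

```python
import copy

original = {'a': 1, 'b': [2, 3]}
shallow = original.copy()
deep = copy.deepcopy(original)

# Mutating the nested list through the original also shows up in the
# shallow copy, because both dictionaries reference the same list object
original['b'].append(4)

# Rebinding a top-level key only affects the dictionary it is done on
original['a'] = 100

print(f"Original: {original}")
print(f"Shallow copy: {shallow}")
print(f"Deep copy: {deep}")
print(f"Same nested list object? {shallow['b'] is original['b']}")
```

If the copy must be fully independent of the original, a deep copy is the safer choice, at the cost of copying every nested object.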

2.6.4.1.3. items(): Returns a new view of the dictionary’s key-value pairs as tuples ((key, value)). This view is dynamic, meaning it reflects changes made to the dictionary.#

person = {'name': 'David', 'age': 35, 'city': 'London'}
print(f"Person dictionary: {person}")

all_items = person.items()
print(f"All items (person.items()): {all_items}") 

# Iterate over items
print("Iterating over items:")
for key, value in all_items:
    print(f"{key}: {value}")

# The view is dynamic
person['occupation'] = 'Artist'
print(f"Items after adding 'occupation': {all_items}") 
Person dictionary: {'name': 'David', 'age': 35, 'city': 'London'}
All items (person.items()): dict_items([('name', 'David'), ('age', 35), ('city', 'London')])
Iterating over items:
name: David
age: 35
city: London
Items after adding 'occupation': dict_items([('name', 'David'), ('age', 35), ('city', 'London'), ('occupation', 'Artist')])

2.6.4.1.4. values(): Returns a new view of the dictionary’s values. This view is dynamic, reflecting changes made to the dictionary.#

all_values = person.values()
print(f"All values (person.values()): {all_values}") 

# Iterate over values
print("Iterating over values:")
for value in all_values:
    print(f"{value}")

# The view is dynamic
person['name'] = 'John'
print(f"Values after changing name to 'John': {all_values}") 
All values (person.values()): dict_values(['David', 35, 'London', 'Artist'])
Iterating over values:
David
35
London
Artist
Values after changing name to 'John': dict_values(['John', 35, 'London', 'Artist'])

2.6.4.1.5. keys(): Returns a new view of the dictionary’s keys. This view is dynamic, reflecting changes made to the dictionary.#

all_keys = person.keys()
print(f"All keys (person.keys()): {all_keys}") 

# Iterate over keys
print("Iterating over keys:")
for key in all_keys:
    print(f"{key}")

# The view is dynamic
person['Sex'] = 'Male'
print(f"Keys after adding 'Sex': {all_keys}") 
All keys (person.keys()): dict_keys(['name', 'age', 'city', 'occupation'])
Iterating over keys:
name
age
city
occupation
Keys after adding 'Sex': dict_keys(['name', 'age', 'city', 'occupation', 'Sex'])

2.6.4.1.6. fromkeys(iterable, value=None): Creates a new dictionary with keys from the iterable and values set to value. If value is not specified, it defaults to None.#

# from keys
keys = ['name', 'age', 'city']

# Create a dictionary with default value None
person_info = dict.fromkeys(keys)
print(f"Dictionary from keys (default None): {person_info}") # {'name': None, 'age': None, 'city': None}

# Create a dictionary with a specified default value
initial_scores = dict.fromkeys(['math', 'science', 'history'], 0)
print(f"Dictionary from keys (default 0): {initial_scores}") # {'math': 0, 'science': 0, 'history': 0}

# If the default value is a mutable object, all keys will point to the SAME object!
# Be careful with this:
mutable_default = dict.fromkeys(['item1', 'item2'], [])
mutable_default['item1'].append('added')
print(f"Mutable default issue: {mutable_default}") # {'item1': ['added'], 'item2': ['added']} (both modified!)
Dictionary from keys (default None): {'name': None, 'age': None, 'city': None}
Dictionary from keys (default 0): {'math': 0, 'science': 0, 'history': 0}
Mutable default issue: {'item1': ['added'], 'item2': ['added']}
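
One common way to avoid the shared-object pitfall shown above is a dictionary comprehension, which evaluates the value expression once per key. This is a minimal sketch; item_names and independent_lists are illustrative names.

```python
item_names = ['item1', 'item2']

# The expression [] is evaluated separately for every key,
# so each key gets its own list object
independent_lists = {name: [] for name in item_names}

independent_lists['item1'].append('added')
print(f"Independent lists: {independent_lists}")
print(f"Same object? {independent_lists['item1'] is independent_lists['item2']}")
```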

2.6.4.1.7. get(key, default_value=None): Returns the value for the specified key. If the key is not found, it returns default_value (which is None by default) instead of raising a KeyError.#

# get
student = {'name': 'Emily', 'grade': 'A', 'major': 'Computer Science'}
print(f"Student dictionary: {student}")

# Get an existing key
student_name = student.get('name')
print(f"Student name (get('name')): {student_name}") 

# Get a non-existing key (returns None by default)
student_email = student.get('email')
print(f"Student email (get('email')): {student_email}")

# Get a non-existing key with a custom default value
student_phone = student.get('phone', 'N/A')
print(f"Student phone (get('phone', 'N/A')): {student_phone}") 

# Direct access (will raise KeyError if key not found)
try:
    print(student['address'])
except KeyError as e:
    print(f"Error! Direct access to non-existing key: {e}")
Student dictionary: {'name': 'Emily', 'grade': 'A', 'major': 'Computer Science'}
Student name (get('name')): Emily
Student email (get('email')): None
Student phone (get('phone', 'N/A')): N/A
Error! Direct access to non-existing key: 'address'

2.6.4.1.8. pop(key, default_value_if_not_found): Removes the specified key from the dictionary and returns its corresponding value. If the key is not found, it returns default_value_if_not_found if provided; otherwise, it raises a KeyError.#

# pop
user_profile = {'username': 'coder123', 'email': 'coder@example.com', 'status': 'active'}
print(f"Initial profile: {user_profile}")

# Pop an existing key
email = user_profile.pop('email')
print(f"Popped email: '{email}', Profile: {user_profile}") 

# Try to pop a non-existing key without default (will error)
try:
    user_profile.pop('phone')
except KeyError as e:
    print(f"Error popping non-existing key without default: {e}")

# Pop a non-existing key with a default value
last_login = user_profile.pop('last_login', 'Never')
print(f"Popped 'last_login' (with default): '{last_login}', Profile: {user_profile}")
Initial profile: {'username': 'coder123', 'email': 'coder@example.com', 'status': 'active'}
Popped email: 'coder@example.com', Profile: {'username': 'coder123', 'status': 'active'}
Error popping non-existing key without default: 'phone'
Popped 'last_login' (with default): 'Never', Profile: {'username': 'coder123', 'status': 'active'}

2.6.4.1.9. popitem(): Removes and returns a (key, value) pair from the dictionary. Since Python 3.7, pairs are returned in last-in, first-out (LIFO) order, so the most recently inserted pair is removed; in earlier versions, an arbitrary pair was removed. Raises a KeyError if the dictionary is empty.#

# pop item 
settings = {'theme': 'dark', 'font_size': 14, 'notifications': True}
print(f"Initial settings: {settings}")

# Pop the last inserted item (Python 3.7+)
item1 = settings.popitem()
print(f"Popped item: {item1}, Remaining settings: {settings}") 

item2 = settings.popitem()
print(f"Popped item: {item2}, Remaining settings: {settings}") 

# Trying to pop from an empty dictionary
empty_settings = {}
try:
    empty_settings.popitem()
except KeyError as e:
    print(f"Error popping from empty dictionary: {e}")
Initial settings: {'theme': 'dark', 'font_size': 14, 'notifications': True}
Popped item: ('notifications', True), Remaining settings: {'theme': 'dark', 'font_size': 14}
Popped item: ('font_size', 14), Remaining settings: {'theme': 'dark'}
Error popping from empty dictionary: 'popitem(): dictionary is empty'

2.6.4.1.10. setdefault(key, default_value=None): Returns the value for the specified key. If the key is not found, it inserts the key with default_value (or None if not specified) into the dictionary and returns that default_value.#

# setdefault
data = {'count': 10, 'status': 'active'}
print(f"Initial data: {data}")

# Key exists: returns its value, dictionary unchanged
current_count = data.setdefault('count', 0)
print(f"Value for 'count': {current_count}, Data: {data}") 

# Key does not exist: inserts key with default value, returns default value
last_updated = data.setdefault('last_updated', '2025-01-01')
print(f"Value for 'last_updated': '{last_updated}', Data: {data}") 

# Key does not exist, no default value provided: inserts with None
category = data.setdefault('category')
print(f"Value for 'category': {category}, Data: {data}") 
Initial data: {'count': 10, 'status': 'active'}
Value for 'count': 10, Data: {'count': 10, 'status': 'active'}
Value for 'last_updated': '2025-01-01', Data: {'count': 10, 'status': 'active', 'last_updated': '2025-01-01'}
Value for 'category': None, Data: {'count': 10, 'status': 'active', 'last_updated': '2025-01-01', 'category': None}
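
Because setdefault() returns the stored value, it is often used to build a dictionary of lists in a single step. The sketch below groups words by their first letter; the words list and the words_by_letter name are made up for illustration.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
words_by_letter = {}

for word in words:
    # If the letter is new, insert an empty list; either way,
    # setdefault() returns the list stored under that letter
    words_by_letter.setdefault(word[0], []).append(word)

print(f"Words grouped by first letter: {words_by_letter}")
```

Note that the default argument (here, a new empty list) is created on every call, even when the key already exists, which is usually acceptable for small defaults.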

2.6.4.1.11. update([other_dict_or_iterable], **kwargs): Updates the dictionary with key-value pairs from other_dict_or_iterable (which can be another dictionary or an iterable of key-value pairs) and from any keyword arguments, overwriting existing keys.#

# update
user_info = {'name': 'John', 'age': 25}
new_details = {'age': 26, 'city': 'New York'}
print(f"Initial user_info: {user_info}") 

# Update with another dictionary
user_info.update(new_details)
print(f"After update with new_details: {user_info}") 

# Update with an iterable of key-value pairs
more_details = [('email', 'john@example.com'), ('age', 27)]
user_info.update(more_details)
print(f"After update with iterable: {user_info}") 

# Update with keyword arguments
user_info.update(city='Boston', phone='555-1234')
print(f"After update with keyword args: {user_info}") 
Initial user_info: {'name': 'John', 'age': 25}
After update with new_details: {'name': 'John', 'age': 26, 'city': 'New York'}
After update with iterable: {'name': 'John', 'age': 27, 'city': 'New York', 'email': 'john@example.com'}
After update with keyword args: {'name': 'John', 'age': 27, 'city': 'Boston', 'email': 'john@example.com', 'phone': '555-1234'}
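
The dir(dict) listing earlier in this section includes __or__ and __ior__. These back the | and |= operators, available since Python 3.9, which merge dictionaries with the same "right-hand side wins" rule as update(). A minimal sketch with illustrative names:

```python
defaults = {'theme': 'light', 'font_size': 12}
overrides = {'font_size': 14, 'notifications': True}

# | builds a new dictionary and leaves both operands unchanged
merged = defaults | overrides
print(f"Merged: {merged}")
print(f"Defaults unchanged: {defaults}")

# |= updates the left-hand dictionary in place, like update()
in_place = dict(defaults)
in_place |= overrides
print(f"After |=: {in_place}")
```

Using | is convenient when the original dictionaries must stay untouched; update() and |= are the in-place alternatives.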

2.6.5. Arrays (from array module) and Numeric Data#

Lists are incredibly flexible because they can store heterogeneous data, but that flexibility comes at a memory cost. When you need to store a large sequence of items of the exact same primitive data type (like all integers or all floats) and care about memory efficiency, Python’s built-in array module can be used.

  • Definition/Purpose: Provides a space-efficient way to store arrays of basic numeric types. Less flexible than lists but more memory-efficient for homogeneous numerical data.

  • Key Characteristics:

    • Homogeneous: All elements must be of the same specified type (e.g., all signed integers, all floating-point numbers). This type is specified by a ‘type code’ during creation.

    • Mutable: Elements can be changed after creation.

    • Ordered & Indexed: Like lists, elements maintain order and are accessed by index.

    • Memory Efficient: Stores elements more compactly than a Python list, which stores full Python objects.

  • Syntax: from array import array, then array(typecode, [initial items])

    Common typecode examples:

    • 'i': signed integer (at least 2 bytes; 4 bytes on most current platforms)

    • 'f': float (4 bytes)

    • 'd': double (8 bytes)

Important Note: For serious numerical computing and data science, the NumPy ndarray (from the numpy library) is the de facto standard. It provides highly optimized array operations and a vast ecosystem of scientific functions, far beyond the basic array module.

from array import array

# Creating an array of signed integers ('i' typecode)
my_int_array = array('i', [30,60,10, 0, 20, 40, 50])
print(f"Integer array: {my_int_array}")
print(f"Type of the array: {type(my_int_array)}")
Integer array: array('i', [30, 60, 10, 0, 20, 40, 50])
Type of the array: <class 'array.array'>
# Creating an array of double-precision floats ('d' typecode)
my_float_array = array('d', [1.1, 2.2, 3.3])
print(f"Float array: {my_float_array}")
Float array: array('d', [1.1, 2.2, 3.3])
# Accessing and modifying elements works like a list
print(f"Third element (my_int_array[2]): {my_int_array[2]}")
my_int_array[0] = 5
print(f"Array after modification: {my_int_array}")
Third element (my_int_array[2]): 10
Array after modification: array('i', [5, 60, 10, 0, 20, 40, 50])
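
To make the memory-efficiency claim concrete, the sketch below compares a list and an array holding the same 1,000 integers using sys.getsizeof(). The exact byte counts depend on the Python version and platform, so no specific numbers are shown; the variable names are illustrative.

```python
import sys
from array import array

n = 1000
numbers_list = list(range(n))
numbers_array = array('i', range(n))

# For a list, getsizeof() counts only the list object and its table of
# references, not the int objects those references point to
list_bytes = sys.getsizeof(numbers_list)

# For an array, getsizeof() includes the buffer of raw C values
array_bytes = sys.getsizeof(numbers_array)

# Size of the raw numeric payload alone
payload_bytes = numbers_array.itemsize * len(numbers_array)

print(f"List container size:  {list_bytes} bytes (excluding the int objects)")
print(f"Array size:           {array_bytes} bytes")
print(f"Array payload only:   {payload_bytes} bytes")
```

On a typical 64-bit CPython build, the array comes out smaller even though the list figure leaves out the integer objects themselves.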

2.6.5.1. Array Methods and Attributes#

The array module provides an array.array object, which is a sequence that can store basic values like numbers (integers, floats) efficiently. All elements in an array.array must be of the same type.


2.6.5.1.1. append(x): Adds a single item x to the end of the array. The item must be of the array’s typecode.#

# append
my_int_array.append(60)
print(f"After append(60): {my_int_array}") # array('i', [5, 60, 10, 0, 20, 40, 50, 60])
After append(60): array('i', [5, 60, 10, 0, 20, 40, 50, 60])
# Attempting to add a different data type (will cause a TypeError!)
try:
    my_int_array.append(3.14) # Trying to add a float to an integer array
except TypeError as e:
    print(f"Error! Array requires a homogeneous type: {e}")
Error! Array requires a homogeneous type: 'float' object cannot be interpreted as an integer
# Attempting to add the wrong integer type (e.g., too large for 'i')
try:
    my_int_array.append(2**31) # Value too large for typical 'i' (signed 32-bit int)
except OverflowError as e:
    print(f"Error! Value too large for array type: {e}")
Error! Value too large for array type: signed integer is greater than maximum

2.6.5.1.2. count(x): Returns the number of times item x appears in the array.#

my_int_array.count(60)
print(f"Count the number of times 60 in array: {my_int_array.count(60)}")
Count the number of times 60 in array: 2

2.6.5.1.3. extend(iterable): Appends all items from an iterable to the end of the array. Each item in the iterable must be of the array’s typecode. The operation modifies the array in-place.#

# extend
my_int_array.extend([70, 80, 90])
print(f"After extend([70, 80, 90]): {my_int_array}")
After extend([70, 80, 90]): array('i', [5, 60, 10, 0, 20, 40, 50, 60, 70, 80, 90])

2.6.5.1.4. index(x): Returns the index of the first occurrence of item x in the array. Raises a ValueError if the item is not found.#

# index()
print(f"The index of first occurrence of 80 in the array: {my_int_array.index(80)}")
The index of first occurrence of 80 in the array: 9

2.6.5.1.5. insert(i, x): Inserts item x at the specified index i. The item must be of the array’s typecode. The operation modifies the array in-place.#

# insert
print(f'Original array: \n{my_int_array}')
my_int_array.insert(0, 5) # Insert 5 at the beginning
my_int_array.insert(5, 45) # Insert 45 at index 5
print(f"After insert(0, 5) and insert(5, 45): \n{my_int_array}") 
Original array: 
array('i', [5, 60, 10, 0, 20, 40, 50, 60, 70, 80, 90])
After insert(0, 5) and insert(5, 45): 
array('i', [5, 5, 60, 10, 0, 45, 20, 40, 50, 60, 70, 80, 90])

2.6.5.1.6. remove(x): Removes the first occurrence of item x from the array. Raises a ValueError if the item is not found. The operation modifies the array in-place.#

# remove 
my_int_array = array('i', [30,60,10, 0, 20, 40, 50])
print(f'Original array: \n{my_int_array}')
my_int_array.remove(30)
print(f"After remove(30): \n{my_int_array}")
Original array: 
array('i', [30, 60, 10, 0, 20, 40, 50])
After remove(30): 
array('i', [60, 10, 0, 20, 40, 50])

2.6.5.1.7. pop([i]): Removes the item at the given index i and returns it. If no index is specified, it removes and returns the last item. Raises an IndexError if the array is empty or the index is out of bounds.#

# pop 
my_int_array = array('i', [30,60,10, 0, 20, 40, 50])
print(f'Original array: \n{my_int_array}')
my_int_array.pop(2)
print(f"After pop(2): \n{my_int_array}")
Original array: 
array('i', [30, 60, 10, 0, 20, 40, 50])
After pop(2): 
array('i', [30, 60, 0, 20, 40, 50])

2.6.5.1.8. itemsize: An attribute that returns the size in bytes of one array item (e.g., 4 for a 'f' float, 8 for a 'd' double). This value is constant for a given array type.#

my_int_array = array('i', [30,60,10, 1, 20, 40, 50])
print(f"Itemsize of array with int : \n{my_int_array.itemsize}")
Itemsize of array with int : 
4
my_float_array = array('d', [1.1, 2.2, 3.3])
print(f"Itemsize of array with double : \n{my_float_array.itemsize}")
Itemsize of array with double : 
8

2.6.5.1.9. reverse(): Reverses the order of the array’s items in-place.#

# reverse
my_int_array = array('i', [30,60,10, 0, 20, 40, 50])
print(f'Original array: \n{my_int_array}')
my_int_array.reverse()
print(f"After reverse: \n{my_int_array}")
Original array: 
array('i', [30, 60, 10, 0, 20, 40, 50])
After reverse: 
array('i', [50, 40, 20, 0, 10, 60, 30])

2.6.5.1.10. tolist(): Converts the array into a regular Python list containing the same items.#

# tolist
my_int_array = array('i', [30,60,10, 0, 20, 40, 50])
print(f'Original array: \n{my_int_array}')

print(f"Converting array to list: \n{my_int_array.tolist()}")
Original array: 
array('i', [30, 60, 10, 0, 20, 40, 50])
Converting array to list: 
[30, 60, 10, 0, 20, 40, 50]

2.6.5.1.11. fromlist(list): Appends items from a standard Python list. Each item in the list must be of the array’s typecode. The operation modifies the array in-place.#

# fromlist
my_list = [30, 60, 10, 0, 20, 40, 50]
print(f'Original list: \n{my_list}')
my_array_int = array('i', [])
my_array_int.fromlist(my_list)
print(f"Converting list into array: \n{my_array_int}")
Original list: 
[30, 60, 10, 0, 20, 40, 50]
Converting list into array: 
array('i', [30, 60, 10, 0, 20, 40, 50])

2.6.5.2. Other Array Methods#

The array module also includes methods for more advanced and low-level data handling, involving direct memory access and binary file input/output. These specific functionalities, while powerful, are not within the general scope of this book.

  • fromunicode(s): Appends items from a Unicode string s. This method is only valid when the array’s typecode is 'u' (for Unicode characters). The operation modifies the array in-place.

  • frombytes(s): Appends items from a bytes object s. The bytes object is interpreted as an array of machine values (as if read from a binary file). The operation modifies the array in-place.

  • buffer_info(): Returns a tuple (address, length) providing the current memory address and the number of elements in the buffer used to hold the array’s contents. This is a low-level operation primarily for interfacing with C/C++ code.

  • byteswap(): Swaps the byte order of all items in the array. This is useful when reading data from files or network streams that use a different byte order (endianness). The operation modifies the array in-place.

  • fromfile(f, n): Reads n items (as machine values) from a file object f (which must be an open binary file) and appends them to the array. If fewer than n items are available, an EOFError is raised, but the items that were available are still appended. The operation modifies the array in-place.

  • tobytes(): Converts the array into a bytes object representing the array’s contents. The byte order is machine-dependent.

  • tofile(f): Writes all items from the array to a file object f (which must be an open binary file). The items are written as their machine values.

  • tounicode(): Converts the array into a Unicode string. This method is only valid when the array’s typecode is 'u' (for Unicode characters).

  • typecode: An attribute that returns the typecode character used to create the array (e.g., 'i', 'f', 'd'). This character defines the type of elements the array can hold.
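
Although these methods are outside the main scope here, a brief sketch may help show how a few of them fit together. It round-trips an array through tobytes() and frombytes(), and applies byteswap() twice to return to the original values. The readings values and variable names are illustrative only.

```python
from array import array

readings = array('i', [1, 256, 65536])

# tobytes() produces itemsize bytes per element, in machine byte order
raw = readings.tobytes()
print(f"Raw byte length: {len(raw)} (itemsize {readings.itemsize} x {len(readings)} items)")

# frombytes() appends the decoded values to an array of the same typecode
restored = array(readings.typecode)
restored.frombytes(raw)
print(f"Restored array: {restored}")

# byteswap() reverses the byte order of every item in place;
# applying it a second time restores the original values
swapped = array('i', readings.tolist())
swapped.byteswap()
print(f"After one byteswap: {swapped}")
swapped.byteswap()
print(f"After two byteswaps: {swapped}")
```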

2.6.6. Summary of Python’s Core Data Structures#

| Data Structure | Type | Mutability | Ordered | Allows Duplicates | Accessed By | Common Use Cases |
|---|---|---|---|---|---|---|
| List | Sequence | Mutable | ✅ Yes | ✅ Yes | Index (e.g., [0]) | General-purpose collection, dynamic arrays, ordered items. |
| Tuple | Sequence | Immutable | ✅ Yes | ✅ Yes | Index (e.g., [0]) | Fixed collections, function arguments/returns, dictionary keys. |
| Set | Unordered Collection | Mutable | ❌ No | ❌ No | N/A (no index) | Membership testing, removing duplicates, mathematical set ops. |
| Frozenset | Unordered Collection | Immutable | ❌ No | ❌ No | N/A (no index) | Immutable sets, elements of other sets, dictionary keys. |
| Dictionary | Mapping | Mutable | ✅ Yes | ❌ No (keys), ✅ Yes (values) | Key (e.g., ['key']) | Key-value storage, lookup tables, representing structured data. |
| Array (array module) | Sequence | Mutable | ✅ Yes | ✅ Yes | Index (e.g., [0]) | Memory-efficient storage for large, homogeneous numerical data. |

2.6.7. Project: Bookstore Management System#

Objective: Your task is to design and implement a simple command-line based system to manage the inventory and sales of a small bookstore. This project is a practical application of core Python concepts, including variables, control flow (if/else, for, while), functions, and most importantly, choosing and utilizing the appropriate built-in data structures for different types of information.

For each functional requirement below, you will need to decide which Python data structure (List, Tuple, Set, Dictionary, array.array) is best suited to store and manage the relevant information. Consider the characteristics of each data structure:

  • Does the order of items matter?

  • Do you need to modify (add, remove, change) items frequently?

  • Are duplicate items allowed or should they be automatically prevented?

  • How will you access individual pieces of information (by index, by a unique identifier)?

  • Do all items need to be of the same type, especially for numerical data?

Functional Requirements: Your system should be able to handle the following functionalities:

2.6.7.1. Book Inventory Management#

  • Data Storage: The system needs to keep a comprehensive record of every book in the bookstore. Each book will have a unique identifier (like a Book ID). For each book, you’ll need to store its title, author, genre, price, and the current quantity in stock.

  • Operations:

    • Add new books to the inventory.

    • Update the stock quantity of existing books.

    • Look up book details quickly using their unique identifier.

    • Display all books currently available, perhaps sorted alphabetically by title or author.

2.6.7.2. Sales Transaction Recording#

  • Data Storage: Every time a book is sold, the system must record the transaction. For each sale, you’ll need to store which book was sold (by its unique ID), the quantity purchased, and the total amount of the sale. It’s crucial to maintain a historical log of all sales in the order they occurred.

  • Operations:

    • Record a new sale, which also involves reducing the stock of the sold book.

    • Keep an unchangeable record of each transaction once it’s made.

2.6.7.3. Unique Category Tracking#

  • Data Storage: The bookstore wants to know all the distinct book genres they carry. The system should automatically collect and store only the unique genres of books currently in the inventory, without any duplicates.

  • Operations:

    • Automatically update the collection of unique genres when new books are added.

    • Display all available genres.

2.6.7.4. Daily Sales Performance Tracking#

  • Data Storage: The management wants to track the total number of book units sold each day for the past week (e.g., the last 7 days). This data consists purely of numerical values and needs to be stored efficiently, allowing for easy updates as new daily figures become available.

  • Operations:

    • Update the daily sales quantity, shifting older data out as new data comes in.

    • Calculate the sum of sales quantities over the tracked period.

2.6.7.5. Customer Service Queue#

  • Data Storage: Implement a simple system to manage customers waiting for assistance. Customers should be served in the exact order they arrived.

  • Operations:

    • Add new customers to the end of the waiting line.

    • Serve the next customer from the front of the line.

    • View the current list of customers in the queue.

2.6.7.6. Technical Requirements#

  • Main Menu: Create a text-based menu that allows users to select different operations (add book, view inventory, record sale, generate report, manage queue, exit). Use a while loop to keep the program running until the user chooses to exit.

  • User Interaction: Utilize the input() function to get user choices and data, and the print() function to display information and feedback.

  • Control Flow: Make extensive use of if, elif, and else statements for menu navigation, input validation, and decision-making within functions.

  • Iteration: Use for loops to iterate over collections (e.g., displaying inventory, processing sales).

  • Modularity: Break down your program into well-defined functions, each responsible for a specific task (e.g., add_book(), record_sale(), generate_report()).

  • Optional Enhancements (for extra challenge):

    • Implement basic error handling using try-except blocks for invalid numerical inputs.

    • Use lambda expressions when sorting lists of dictionaries/tuples for reports.

2.6.7.7. Testing Your Implementation#

After you implement each function and the overall system, remember to test your code.

  • Manual Tests: Run your program and manually try all the menu options. Check the output carefully to ensure it’s what you expect.

  • Automated Tests (Recommended): Following the principles discussed, create a separate Python file (e.g., test_bookstore.py) and write small, focused test functions using assert statements to verify the behavior of your individual functions (e.g., test add_book for adding unique/duplicate IDs, test record_sale for stock updates and error conditions). Use pytest to run your tests.
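
As a rough idea of what such a test file might look like, here is a sketch. It assumes you factor the stock-update logic into an input-free helper; apply_sale() and make_inventory() are hypothetical names used only for this illustration and are not part of the project specification. Functions named test_* are discovered automatically when you run pytest on the file.

```python
# test_bookstore.py (illustrative sketch)

def apply_sale(inventory, book_id, qty):
    """Hypothetical input-free helper: reduce stock and return the sale total."""
    if book_id not in inventory:
        raise KeyError(book_id)
    book = inventory[book_id]
    if qty <= 0 or qty > book['stock']:
        raise ValueError("Invalid quantity")
    book['stock'] -= qty
    return qty * book['price']


def make_inventory():
    """Build a fresh one-book inventory so tests do not affect each other."""
    return {'B001': {'title': 'Sample Book', 'price': 10.0, 'stock': 3}}


def test_apply_sale_reduces_stock():
    inventory = make_inventory()
    total = apply_sale(inventory, 'B001', 2)
    assert total == 20.0
    assert inventory['B001']['stock'] == 1


def test_apply_sale_rejects_excess_quantity():
    inventory = make_inventory()
    try:
        apply_sale(inventory, 'B001', 5)
    except ValueError:
        pass  # expected: more units requested than in stock
    else:
        raise AssertionError("Expected ValueError for excess quantity")
    # Stock must be unchanged after a rejected sale
    assert inventory['B001']['stock'] == 3
```

Keeping input() and print() calls out of the functions under test, as in this sketch, tends to make them much easier to verify automatically.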

This project will provide hands-on experience in selecting and applying the most suitable Python data structures for various programming scenarios!

2.6.7.7.1. Solution#

from array import array
from collections import deque

# --- Global Data Structures ---
# Dictionary: Stores book details (key: book_id, value: dict of details)
# Each book's details include title, author, genre, price, and stock.
books_inventory = {}

# List: Stores transaction records. Each record is a tuple for immutability.
# Format: (book_id, quantity_sold, total_price_for_transaction, current_timestamp)
sales_transactions = []

# Set: Stores unique genres encountered across all books.
unique_genres = set()

# Array: Stores daily total sales quantities (e.g., total items sold for the last 7 days).
# Using 'i' for signed integers, assuming quantities won't exceed standard integer limits.
# Initialized to zeros for 7 days.
daily_sales_quantity = array('i', [0] * 7)

# Deque: Simulates a customer service queue (First-In, First-Out)
customer_service_queue = deque()

# --- Helper Functions ---

def display_menu():
    """Displays the main menu options to the user."""
    print("\n--- Bookstore Management System ---")
    print("1. Add New Book")
    print("2. View All Books (Inventory)")
    print("3. Record a Sale")
    print("4. Generate Sales Report")
    print("5. Update Daily Sales Quantity (Array Demo)")
    print("6. Manage Customer Queue (Queue Demo)")
    print("7. Exit")
    print("-----------------------------------")

def add_book(inventory, genres_set):
    """
    Allows the user to add a new book to the inventory.
    Uses dictionary to store book details and set to track unique genres.
    """
    print("\n--- Add New Book ---")
    book_id = input("Enter Book ID (e.g., B001): ").upper()
    if book_id in inventory:
        print(f"Error: Book ID '{book_id}' already exists. Please choose a unique ID.")
        return

    title = input("Enter Title: ")
    author = input("Enter Author: ")
    genre = input("Enter Genre: ").strip().capitalize() # Clean and capitalize genre for consistency

    try:
        price = float(input("Enter Price: $"))
        stock = int(input("Enter Stock Quantity: "))
        if price <= 0 or stock < 0:
            raise ValueError("Price must be positive, stock non-negative.")
    except ValueError:
        print("Invalid input for price or stock. Please enter valid numbers.")
        return

    # Store book details. While genre is part of the dictionary value,
    # the genre itself is also added to a set for unique tracking.
    inventory[book_id] = {
        'title': title,
        'author': author,
        'genre': genre,
        'price': price,
        'stock': stock
    }
    genres_set.add(genre) # Add genre to the set of unique genres

    print(f"Book '{title}' (ID: {book_id}) added successfully!")

def view_inventory(inventory):
    """
    Displays all books currently in the inventory, sorted by title.
    Demonstrates iterating through a dictionary and using a lambda for sorting.
    """
    print("\n--- Current Book Inventory ---")
    if not inventory:
        print("Inventory is empty. Add some books first!")
        return

    # Sort books by title for better readability
    # uses a lambda expression as the sorting key
    sorted_books = sorted(inventory.items(), key=lambda item: item[1]['title'].lower())

    for book_id, details in sorted_books:
        print(f"ID: {book_id}")
        print(f"  Title: {details['title']}")
        print(f"  Author: {details['author']}")
        print(f"  Genre: {details['genre']}")
        print(f"  Price: ${details['price']:.2f}")
        print(f"  Stock: {details['stock']}")
        print("-" * 25)

def record_sale(inventory, transactions):
    """
    Records a book sale, updates inventory stock, and stores transaction details.
    Uses dictionary for inventory lookup, and list to store transaction tuples.
    """
    print("\n--- Record a Sale ---")
    book_id = input("Enter Book ID to purchase: ").upper()

    if book_id not in inventory:
        print("Error: Book not found in inventory. Please check the ID.")
        return

    book = inventory[book_id]
    print(f"Book selected: {book['title']} (Current Stock: {book['stock']})")

    try:
        qty = int(input("Enter quantity to purchase: "))
        if qty <= 0:
            raise ValueError("Quantity must be positive.")
        if qty > book['stock']:
            print(f"Error: Not enough stock. Only {book['stock']} units of '{book['title']}' available.")
            return
    except ValueError:
        print("Invalid quantity. Please enter a valid whole number.")
        return

    # Update stock in inventory (dictionary modification)
    book['stock'] -= qty
    total_price = qty * book['price']

    # Create a transaction record as a tuple (immutable)
    # Includes a timestamp for more realism
    import time
    current_timestamp = int(time.time()) # Unix timestamp (integer seconds since epoch)
    transaction_record = (book_id, qty, total_price, current_timestamp) # Tuple packing

    # Add the transaction to the list of all sales
    transactions.append(transaction_record)

    print(f"Sale recorded: {qty} x '{book['title']}' for ${total_price:.2f}")
    print(f"Remaining stock for '{book['title']}': {book['stock']}")

def generate_report(transactions, inventory, genres_set, daily_sales_array):
    """
    Generates a sales report, showing total revenue, top-selling books/genres,
    and unique genres in the system. Demonstrates set operations and array usage.
    """
    print("\n--- Sales Report ---")
    if not transactions:
        print("No sales recorded yet to generate a report.")
        return

    total_revenue = 0.0
    book_sales_count = {} # Dictionary to count units sold per book
    genre_sales_count = {} # Dictionary to count units sold per genre

    # Iterate through the transactions (a list of tuples)
    for transaction in transactions:
        # Tuple unpacking
        book_id, qty, total_sale_value, _ = transaction
        total_revenue += total_sale_value

        # Update sales count for book (dictionary .get() for safe updates)
        book_sales_count[book_id] = book_sales_count.get(book_id, 0) + qty

        # Get genre from inventory and update genre sales count
        if book_id in inventory: # Check if book still exists in inventory
            genre = inventory[book_id]['genre']
            genre_sales_count[genre] = genre_sales_count.get(genre, 0) + qty
        else:
            print(f"Warning: Book ID '{book_id}' from transaction not found in current inventory.")

    print(f"Total Revenue from All Sales: ${total_revenue:.2f}")

    print("\n--- Book Sales Summary ---")
    if not book_sales_count:
        print("No book sales to summarize.")
    else:
        # Sort books by quantity sold (descending) using a lambda function
        sorted_book_sales = sorted(book_sales_count.items(), key=lambda item: item[1], reverse=True)
        for book_id, count in sorted_book_sales:
            title = inventory.get(book_id, {}).get('title', 'Unknown Book (ID not found)')
            print(f"  '{title}' (ID: {book_id}): {count} units sold")

    print("\n--- Genre Sales Summary ---")
    if not genre_sales_count:
        print("No genre sales to summarize.")
    else:
        # Sort genres by quantity sold (descending)
        sorted_genre_sales = sorted(genre_sales_count.items(), key=lambda item: item[1], reverse=True)
        for genre, count in sorted_genre_sales:
            print(f"  {genre}: {count} units sold")

    print("\n--- All Unique Genres in System (Set Demo) ---")
    if genres_set:
        # Sets are unordered; sorted() accepts any iterable and returns a new sorted list, then join for display
        print(f"  Genres: {', '.join(sorted(genres_set))}")
    else:
        print("No genres added to the system yet.")

    print("\n--- Daily Sales Quantity (Array Demo) ---")
    # Show array contents and simple operations, demonstrating its homogeneous nature
    print(f"  Sales quantities for the last {len(daily_sales_array)} days: {daily_sales_array.tolist()}")
    print(f"  Total items sold in last {len(daily_sales_array)} days: {sum(daily_sales_array)} items")
    # Example of accessing an array element: the last element is the most recent day
    # (1 day ago), so the entry from 3 days ago sits at index len - 3
    three_days_ago_index = len(daily_sales_array) - 3
    print(f"  Sales quantity 3 days ago (index {three_days_ago_index}): {daily_sales_array[three_days_ago_index]} items")


def update_daily_sales_array(daily_sales_array):
    """
    Updates the daily sales quantity array.
    Demonstrates array modification and shifting elements.
    """
    print("\n--- Update Daily Sales Quantity ---")
    print(f"Current last {len(daily_sales_array)} days sales quantities: {daily_sales_array.tolist()}")
    try:
        new_sale_qty = int(input("Enter today's total sales quantity (integer): "))
        if new_sale_qty < 0:
            raise ValueError("Quantity cannot be negative")

        # Shift old data to the left and add new data to the right (like a moving window)
        # This loop demonstrates modifying an array's elements by index
        for i in range(len(daily_sales_array) - 1):
            daily_sales_array[i] = daily_sales_array[i+1] # Move value from right to left
        daily_sales_array[-1] = new_sale_qty # Place new quantity at the end

        print("Daily sales quantity updated successfully.")
    except ValueError as e:
        print(f"Invalid input: {e}. Please enter a non-negative integer.")

def manage_customer_queue(customer_queue):
    """
    Manages a simple customer service queue using collections.deque.
    Demonstrates FIFO principle (append for enqueue, popleft for dequeue).
    """
    print("\n--- Customer Service Queue ---")
    if not customer_queue:
        print("Customer queue is currently empty.")
    else:
        print(f"Customers waiting: {list(customer_queue)}") # Convert deque to list for easy printing

    while True:
        queue_action = input("Action (add/serve/view/back): ").lower().strip()
        if queue_action == 'add':
            new_customer_id = input("Enter new customer ID: ")
            customer_queue.append(new_customer_id) # Enqueue
            print(f"Customer '{new_customer_id}' added to queue.")
        elif queue_action == 'serve':
            if customer_queue:
                served_customer = customer_queue.popleft() # Dequeue
                print(f"Served customer: '{served_customer}'")
            else:
                print("Queue is empty. No customer to serve.")
        elif queue_action == 'view':
            if customer_queue:
                print(f"Current queue: {list(customer_queue)}")
            else:
                print("Queue is empty.")
        elif queue_action == 'back':
            print("Returning to main menu.")
            break
        else:
            print("Invalid action. Please choose 'add', 'serve', 'view', or 'back'.")

# --- Main Program Loop ---
def main():
    """
    The main function to run the bookstore management system.
    Initializes dummy data and manages the main menu loop using if/elif/else.
    """
    # Initial dummy data to start with (Dictionaries and initial Set population)
    books_inventory['P001'] = {'title': 'Python Crash Course', 'author': 'Eric Matthes', 'genre': 'Programming', 'price': 35.00, 'stock': 10}
    books_inventory['F001'] = {'title': 'The Lord of the Rings', 'author': 'J.R.R. Tolkien', 'genre': 'Fantasy', 'price': 25.50, 'stock': 15}
    books_inventory['S001'] = {'title': 'Cosmos', 'author': 'Carl Sagan', 'genre': 'Science', 'price': 20.00, 'stock': 5}
    books_inventory['P002'] = {'title': 'Clean Code', 'author': 'Robert C. Martin', 'genre': 'Programming', 'price': 40.00, 'stock': 7}

    # Populate unique_genres set from initial books
    for details in books_inventory.values():
        unique_genres.add(details['genre'])

    # Add some dummy sales transactions (List of Tuples)
    # (book_id, quantity, total_price, timestamp)
    sales_transactions.append(('P001', 2, 70.00, 1720000000)) # Example timestamp
    sales_transactions.append(('F001', 1, 25.50, 1720005000))
    sales_transactions.append(('P001', 1, 35.00, 1720010000))
    sales_transactions.append(('S001', 3, 60.00, 1720015000))

    # Add some dummy daily sales quantities for the array (last 7 days)
    # The array is mutable and will be updated directly.
    daily_sales_quantity[0] = 5  # 7 days ago
    daily_sales_quantity[1] = 8
    daily_sales_quantity[2] = 3
    daily_sales_quantity[3] = 12
    daily_sales_quantity[4] = 6
    daily_sales_quantity[5] = 9
    daily_sales_quantity[6] = 15 # Yesterday's sales quantity

    # Add some initial customers to the queue
    customer_service_queue.append("Alice's Order")
    customer_service_queue.append("Bob's Inquiry")

    while True:
        display_menu()
        choice = input("Enter your choice (1-7): ").strip()

        if choice == '1':
            add_book(books_inventory, unique_genres)
        elif choice == '2':
            view_inventory(books_inventory)
        elif choice == '3':
            record_sale(books_inventory, sales_transactions)
        elif choice == '4':
            generate_report(sales_transactions, books_inventory, unique_genres, daily_sales_quantity)
        elif choice == '5':
            update_daily_sales_array(daily_sales_quantity)
        elif choice == '6':
            manage_customer_queue(customer_service_queue)
        elif choice == '7':
            print("Exiting Bookstore Management System. Goodbye!")
            break # Exit the main loop
        else:
            print("Invalid choice. Please enter a number between 1 and 7.")

# This ensures the main() function runs only when the script is executed directly
if __name__ == "__main__":
    main()
--- Bookstore Management System ---
1. Add New Book
2. View All Books (Inventory)
3. Record a Sale
4. Generate Sales Report
5. Update Daily Sales Quantity (Array Demo)
6. Manage Customer Queue (Queue Demo)
7. Exit
-----------------------------------
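Because every menu action in the program above reads from `input()`, the sale and report logic cannot be exercised in an environment that has no way to prompt the user. The following is a minimal sketch, not part of the original program, of how the same dictionary, list, and tuple operations could be separated from the prompts. The helper names `apply_sale` and `summarize_sales` are hypothetical, and the sample data simply mirrors the shape of the program's inventory and transaction records.

```python
from collections import Counter


def apply_sale(inventory, transactions, book_id, qty, timestamp):
    """Record a sale without prompting and return the transaction tuple.

    Hypothetical helper: raises KeyError for an unknown book ID and
    ValueError for a non-positive or unavailable quantity.
    """
    book = inventory[book_id]  # KeyError if the ID is unknown
    if qty <= 0:
        raise ValueError("Quantity must be positive")
    if qty > book["stock"]:
        raise ValueError("Not enough stock")

    book["stock"] -= qty  # dictionary modification
    record = (book_id, qty, qty * book["price"], timestamp)  # tuple packing
    transactions.append(record)  # list of tuples
    return record


def summarize_sales(transactions):
    """Return (total_revenue, units_sold_per_book) for a list of sale tuples."""
    total_revenue = 0.0
    units_per_book = Counter()
    for book_id, qty, total_price, _ in transactions:  # tuple unpacking
        total_revenue += total_price
        units_per_book[book_id] += qty
    return total_revenue, units_per_book


# Sample data in the same shape as the program's inventory and transactions
inventory = {
    "P001": {"title": "Python Crash Course", "genre": "Programming", "price": 35.00, "stock": 10},
    "S001": {"title": "Cosmos", "genre": "Science", "price": 20.00, "stock": 5},
}
transactions = []

apply_sale(inventory, transactions, "P001", 2, timestamp=1720000000)
apply_sale(inventory, transactions, "S001", 3, timestamp=1720015000)

revenue, units = summarize_sales(transactions)
print(f"Total revenue: ${revenue:.2f}")
print(f"Units per book: {dict(units)}")
print(f"Remaining P001 stock: {inventory['P001']['stock']}")
```

With the logic in plain functions like these, the interactive functions would only need to collect input and call them, and the data-structure operations can be checked with fixed arguments instead of typed responses.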