A comprehensive guide to Python best practices

A guide of best practices for developing in Python. Inspired by Rui Maranhao’s gist.

In General

“Beautiful is better than ugly.” - PEP 20

General Development Guidelines

“Explicit is better than implicit” - PEP 20

Yes

def process_data(data, encoding='utf-8', timeout=30):
    """Process data with explicit parameters."""
    # Parameters are clear and visible
    return data.decode(encoding)

No

def process_data(data):
    """Process data with hidden defaults."""
    # Magic values hidden inside function
    return data.decode('utf-8')  # What encoding? Why?

“Readability counts.” - PEP 20

Yes

def calculate_total_price(items, tax_rate):
    subtotal = sum(item.price for item in items)
    tax = subtotal * tax_rate
    total = subtotal + tax
    return total

No

def calc(i, t):
    return sum(x.p for x in i) * (1 + t)

“Anybody can fix anything.”

Don’t create artificial barriers or “ownership” of code. If you see a bug or improvement opportunity in any part of the codebase, fix it.

This principle comes from Khan Academy’s development philosophy.

Yes

  • See a typo in someone else’s module? Fix it.
  • Found a bug in a different team’s code? Submit a fix.
  • Notice outdated documentation? Update it.

No

  • “That’s not my module, I won’t touch it.”
  • “I’ll just work around this bug in their code.”
  • Leaving broken code for the “owner” to fix.

Fix each issue (bad design, wrong decision, or poor code) as soon as it is discovered.

Yes

# You notice during code review that a function is doing too much.
# Refactor it immediately:

def process_user_data(user):
    validate_user(user)
    save_to_database(user)
    send_welcome_email(user)

No

# You notice the problem but add a TODO comment instead:

def process_user_data(user):
    # TODO: This function does too much, should refactor
    validate_user(user)
    save_to_database(user)
    send_welcome_email(user)
    # TODO will likely never be addressed

“Now is better than never.” - PEP 20

Don’t wait for the “perfect” solution. Ship working code, then iterate.

Yes

  • Implement a basic working feature, deploy it, gather feedback, improve it.
  • Write simple tests now rather than waiting to design the perfect test suite.

No

  • Endlessly debating the ideal architecture without writing code.
  • Waiting months to ship because you want every edge case handled.

Test ruthlessly. Write docs for new features.

Yes

def divide(a, b):
    """Divide a by b.

    :param a: Numerator
    :param b: Denominator (must be non-zero)
    :returns: Result of a / b
    :raises ValueError: If b is zero
    """
    if b == 0:
        raise ValueError("Cannot divide by zero")
    return a / b

# And corresponding tests:
import unittest

class TestDivide(unittest.TestCase):
    def test_divide_positive_numbers(self):
        self.assertEqual(divide(10, 2), 5)

    def test_divide_by_zero_raises_error(self):
        with self.assertRaises(ValueError):
            divide(10, 0)

No

def divide(a, b):
    return a / b  # No docs, no tests, will crash on zero

Even more important than Test-Driven Development: Human-Driven Development

Write code for humans first, machines second.

Yes

class ShoppingCart:
    def __init__(self):
        self.items = []

    def add_item(self, item):
        """Add an item to the cart."""
        self.items.append(item)

    def get_total(self):
        """Calculate total price of all items."""
        return sum(item.price for item in self.items)

No

class SC:
    def __init__(self):
        self.i = []

    def a(self, x):
        self.i.append(x)

    def t(self):
        return sum(z.p for z in self.i)

These guidelines may, and probably will, change.

Be flexible and open to improving practices as you learn and as the field evolves. What works today might not be the best approach tomorrow.

In Particular

Style

Follow PEP 8, when sensible.

Naming

  • Variables, functions, methods, packages, modules
    • lower_case_with_underscores
  • Classes and Exceptions
    • CapWords
  • Protected methods and internal functions
    • _single_leading_underscore(self, ...)
    • Note: Leading underscores help IDEs identify protected/private members and can trigger “unused” warnings for internal helpers
  • Private methods
    • __double_leading_underscore(self, ...)
  • Constants
    • ALL_CAPS_WITH_UNDERSCORES
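To make the difference between the single and double underscore concrete, here is a minimal sketch. The Account class and its attributes are hypothetical, invented only for this illustration. A single leading underscore is purely a convention, while a double leading underscore makes Python rename the attribute inside the class body (name mangling).

```python
class Account:
    """Hypothetical class used only to illustrate the naming rules."""

    def __init__(self, owner):
        self.owner = owner    # public attribute
        self._notes = []      # "protected": a convention, not enforced
        self.__pin = "0000"   # stored as _Account__pin (name mangling)

    def check_pin(self, pin):
        # Inside the class body, __pin resolves to _Account__pin
        return pin == self.__pin


account = Account("alice")
print(account._notes)              # [] (still reachable from outside)
print(account.check_pin("0000"))   # True
print(hasattr(account, "__pin"))   # False
print(account._Account__pin)       # 0000
```

Name mangling mainly avoids accidental clashes in subclasses; it does not make the attribute truly private.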

About underscore prefixes:

Using _ prefix for protected/private functions helps your IDE and linters understand your intent:

# mymodule.py

def _internal_helper(data):
    """Internal function - not part of public API."""
    return data.strip().lower()

def public_function(text):
    """Public API function."""
    return _internal_helper(text)

# IDE will warn that _internal_helper is "unused" if not called internally
# IDE will mark it as protected/internal in code navigation

Rationale: The underscore prefix is a convention that both Python and its tooling recognize:

  • IDEs like PyCharm and VS Code can flag unused internal helpers and warn when a protected member is accessed from outside its module or class
  • Python itself skips underscore-prefixed names in from module import * (unless the module lists them in __all__)
  • Code navigation can show different icons/colors to distinguish public from internal APIs

General Naming Guidelines

Avoid one-letter variables (esp. l, O, I).

Exception: In very short blocks, when the meaning is clearly visible from the immediate context

Fine

for e in elements:
    e.mutate()

Avoid redundant labeling.

Yes

import audio

core = audio.Core()
controller = audio.Controller()

No

import audio
from audio import *

core = audio.AudioCore()
controller = audio.AudioController()

Prefer “reverse notation”.

Yes

elements = ...
elements_active = ...
elements_defunct = ...

No

elements = ...
active_elements = ...
defunct_elements = ...

Avoid reusing the same variable name for different purposes.

Yes

# Each variable has a single, clear purpose
user_input = input("Enter a number: ")
user_number = int(user_input)
squared_number = user_number ** 2
print(f"Result: {squared_number}")

# Processing different types of data
raw_data = fetch_data_from_api()
processed_data = clean_data(raw_data)
validated_data = validate_data(processed_data)

No

# 'data' means something different each time
data = input("Enter a number: ")  # data is a string
data = int(data)                   # now data is an int
data = data ** 2                   # now data is the squared result
print(f"Result: {data}")           # confusing!

# Reusing 'result' for unrelated things
result = calculate_tax(price)
save_to_database(result)
result = send_email(user)  # result now means something completely different
result = validate_input(form)  # and now something else again

Rationale: Reusing variable names makes code harder to debug, understand, and maintain. Each variable should represent one concept throughout its scope.

Indentation

Up to you, but be consistent. Enough said.

Note, however, that a tab can occupy a different number of columns depending on your environment, whereas a space is always one column. When deciding how many spaces (or tabs) make up one indentation level, consistency throughout your code matters more than any specific tab stop value.

Equality checking

Avoid comparing to True, False, or None with ==. Rely on truthiness, or use is when you need an identity check.

Yes

if attr:
    print('True!')

if attr is True:
    print('True!')

if not attr:
    print('False!')

if attr is None:
    print('None')

No

if attr == True:
    print('True!')

if attr == False:
    print('False!')

if attr == None:
    print('None')
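One reason to prefer is over == for None is that == can be overridden by the object being compared. The sketch below uses a hypothetical AlwaysEqual class to show the difference, and a hypothetical describe helper to show that truthiness and is None answer different questions.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
print(obj == None)  # True, which is misleading
print(obj is None)  # False, identity checks cannot be overridden


def describe(value=None):
    """Distinguish a missing value from an empty or zero one."""
    if value is None:
        return "missing"
    if not value:
        return "empty or zero"
    return "present"


print(describe())     # missing
print(describe(0))    # empty or zero
print(describe([1]))  # present
```

Use if attr: when any falsy value should be treated the same way, and if attr is None: when None specifically means "not provided".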

List comprehensions

Use list comprehension when possible.

Yes

a = [3, 4, 5]
b = [i for i in a if i > 4]

# Or (filter in this case; map could be more appropriate in other cases).
# In Python 3, filter returns a lazy iterator, so wrap it in list().
b = list(filter(lambda x: x > 4, a))

No

a = [3, 4, 5]
b = []
for i in a:
    if i > 4:
        b.append(i)
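As a small sketch of how comprehensions relate to filter and map, the lines below build the same kinds of results in several ways. The variable names are illustrative only.

```python
a = [3, 4, 5]

# Filtering and transforming in a single comprehension
squares_over_four = [i * i for i in a if i > 4]
print(squares_over_four)  # [25]

# filter and map return lazy iterators in Python 3, not lists
lazy = filter(lambda x: x > 4, a)
print(list(lazy))  # [5]
doubled = list(map(lambda x: x * 2, a))
print(doubled)  # [6, 8, 10]

# A generator expression avoids building an intermediate list
total = sum(i * i for i in a)
print(total)  # 50
```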

Keyword with and files

The with statement ensures that clean-up code is executed. When opening a file, with will make sure that the file is closed after the with block.

Yes

with open('file.txt') as f:
    contents = f.read()

No

f = open('file.txt')
contents = f.read()  # if this raises, f.close() below is never reached
f.close()
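The following sketch shows the clean-up guarantee in action: the file is closed even though an exception is raised inside the with block. The temporary directory and the simulated RuntimeError are assumptions made only to keep the example self-contained.

```python
import tempfile
from pathlib import Path

# Create a throwaway file so the example does not depend on existing files
path = Path(tempfile.mkdtemp()) / "file.txt"
path.write_text("hello\n")

try:
    with open(path) as f:
        first_line = f.readline()
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(first_line.strip())  # hello
print(f.closed)            # True
```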

Imports

Import entire modules instead of individual symbols within a module. For example, for a top-level module canteen that has a file canteen/sessions.py:

Yes

import canteen
import canteen.sessions
from canteen import sessions

No

from canteen import get_user  # Symbol from canteen/__init__.py
from canteen.sessions import get_session  # Symbol from canteen/sessions.py

Exception: For third-party code where documentation explicitly says to import individual symbols.

Rationale: Avoids circular imports. See here.

Put all imports at the top of the file in three sections, each separated by a blank line, in this order:

  1. Standard library imports
  2. Third-party imports
  3. Local source tree imports

Rationale: Makes it clear where each module is coming from.
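A possible layout for the three sections is sketched below. The third-party and local imports are commented out so the sketch runs without extra installs; requests is only an illustrative third-party choice, and canteen is the example package used earlier in this section.

```python
# 1. Standard library imports
import json
import os
from pathlib import Path

# 2. Third-party imports (illustrative; commented out so this runs anywhere)
# import requests

# 3. Local source tree imports (the example package from this guide)
# import canteen
# from canteen import sessions

config_path = Path(os.getcwd()) / "settings.json"
print(json.dumps({"config": config_path.name}))  # {"config": "settings.json"}
```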

Documentation

Follow PEP 257’s docstring guidelines. reStructuredText and Sphinx can help to enforce these standards.

When possible, use one-line docstrings for obvious functions.

"""Return the pathname of ``foo``."""

Multiline docstrings should include

  • Summary line
  • Use case, if appropriate
  • Args
  • Return type and semantics, unless None is returned

"""Train a model to classify Foos and Bars.

Usage::

    >>> import klassify
    >>> data = [("green", "foo"), ("orange", "bar")]
    >>> classifier = klassify.train(data)

:param train_data: A list of tuples of the form ``(color, label)``.
:rtype: A :class:`Classifier <Classifier>`
"""

Notes

  • Use action words (“Return”) rather than descriptions (“Returns”).
  • Document __init__ methods in the docstring for the class.

class Person:
    """A simple representation of a human being.

    :param name: A string, the person's name.
    :param age: An int, the person's age.
    """
    def __init__(self, name, age):
        self.name = name
        self.age = age
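
Usage examples embedded in a docstring, like the klassify one earlier in this section, can double as lightweight tests through the standard doctest module. The sketch below is one way to do that; the pathname function and the /tmp prefix are hypothetical and exist only for illustration.

```python
import doctest


def pathname(name):
    """Return the pathname of ``name`` under ``/tmp``.

    Usage::

        >>> pathname("foo")
        '/tmp/foo'

    :param name: A string, the file name.
    :rtype: str
    """
    return f"/tmp/{name}"


# Collect and run the examples found in the docstring
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(pathname, "pathname", module=False,
                        globs={"pathname": pathname}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)  # 0 1
```

In a real module, running python -m doctest on the file is usually enough; the explicit finder and runner are used here only to keep the sketch self-contained.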
Alternative Docstring Formats

While the examples above use reStructuredText (reST) format, there are other popular docstring styles used in the Python community. Choose one and be consistent throughout your project.

NumPy/SciPy Style

Popular in scientific computing, more readable for complex functions with many parameters.

def calculate_statistics(data, weights=None, ddof=0):
    """
    Calculate mean and standard deviation of data.

    Parameters
    ----------
    data : array_like
        Input data array.
    weights : array_like, optional
        Weights for each value in data. Default is None.
    ddof : int, optional
        Delta degrees of freedom. Default is 0.

    Returns
    -------
    mean : float
        Arithmetic mean of the data.
    std : float
        Standard deviation of the data.

    Raises
    ------
    ValueError
        If data is empty.

    Examples
    --------
    >>> calculate_statistics([1, 2, 3, 4, 5])
    (3.0, 1.4142135623730951)
    """
    pass

Google Style

Clean and readable, popular in many open-source projects.

def calculate_statistics(data, weights=None, ddof=0):
    """Calculate mean and standard deviation of data.

    Args:
        data (array_like): Input data array.
        weights (array_like, optional): Weights for each value in data.
            Defaults to None.
        ddof (int, optional): Delta degrees of freedom. Defaults to 0.

    Returns:
        tuple: A tuple containing:
            - mean (float): Arithmetic mean of the data.
            - std (float): Standard deviation of the data.

    Raises:
        ValueError: If data is empty.

    Examples:
        >>> calculate_statistics([1, 2, 3, 4, 5])
        (3.0, 1.4142135623730951)
    """
    pass

Comparison

| Style | Best For | Tools |
|---|---|---|
| reStructuredText (reST) | General Python projects, Sphinx documentation | Sphinx, most IDEs |
| NumPy/SciPy | Scientific computing, data science projects | Sphinx with Napoleon extension |
| Google | Clean, readable docs; Google-style projects | Sphinx with Napoleon extension |

Documentation Generation Tools

Beyond Sphinx, several other tools can generate documentation from your docstrings:

| Tool | Features | Best For |
|---|---|---|
| pdoc | Minimal, clean HTML docs; auto-generates from docstrings; great for small projects | Quick documentation without configuration |
| MkDocs with mkdocstrings | Modern Material Design theme; Markdown-based; integrates docstrings | Beautiful, modern project documentation |
| Pydoc | Built-in Python tool; minimal setup; generates HTML or terminal docs | Simple projects; documentation without dependencies |
| Quartodoc | Bridges Quarto and Python; supports multiple docstring formats | Data science projects; integrating code with narratives |
| ReadTheDocs | Free hosting; integrates with Sphinx; automatic builds from GitHub | Open-source projects needing hosted documentation |
| Griffe | Modern Python doc parser; supports multiple formats; async-friendly | Projects requiring flexible docstring extraction |

Key Points

  • Be consistent: Pick one style and use it throughout your project
  • Tool support: Most documentation generators support multiple docstring formats
  • Team preference: Follow your team’s or project’s existing convention
  • Readability: NumPy and Google styles are often more readable for complex functions
  • Integration: Consider which tools integrate with your CI/CD and hosting platform

On comments

Use them sparingly. Prefer code readability to writing a lot of comments. Often, small methods are more effective than comments.

No

# If the sign is a stop sign
if sign.color == 'red' and sign.sides == 8:
    stop()

Yes

def is_stop_sign(sign):
    return sign.color == 'red' and sign.sides == 8

if is_stop_sign(sign):
    stop()

When you do write comments, remember: “Strunk and White apply.” - PEP 8

In summary:

Use clear, direct language and avoid unnecessary words to ensure a reader’s understanding, as recommended by Strunk and White.

Line lengths

Don’t stress over it. 80-100 characters is fine. We have wide screens nowadays.

Use parentheses for line continuations.

wiki = (
    "The Colt Python is a .357 Magnum caliber revolver formerly manufactured "
    "by Colt's Manufacturing Company of Hartford, Connecticut. It is sometimes "
    'referred to as a "Combat Magnum". It was first introduced in 1955, the '
    "same year as Smith & Wesson's M29 .44 Magnum."
)

String Formatting

Use f-strings (formatted string literals) for string formatting. They are more readable, concise, and faster than older methods.

Yes

name = "Alice"
age = 30
city = "Lisbon"

# Simple variable interpolation
message = f"Hello, {name}!"

# Expressions inside f-strings
info = f"{name} is {age} years old and lives in {city}."

# Formatting numbers
price = 19.99
quantity = 3
total = f"Total: €{price * quantity:.2f}"  # Total: €59.97

# Multi-line f-strings
report = (
    f"User Report:\n"
    f"  Name: {name}\n"
    f"  Age: {age}\n"
    f"  City: {city}"
)

No

name = "Alice"
age = 30
city = "Lisbon"

# Old-style % formatting (avoid)
message = "Hello, %s!" % name
info = "%s is %d years old and lives in %s." % (name, age, city)

# str.format() method (verbose)
message = "Hello, {}!".format(name)
info = "{} is {} years old and lives in {}.".format(name, age, city)

# String concatenation (error-prone and hard to read)
message = "Hello, " + name + "!"
info = name + " is " + str(age) + " years old and lives in " + city + "."

Rationale: F-strings (Python 3.6+) are faster, more readable, and less error-prone than % formatting or .format(). They allow expressions directly inside the string and make the code’s intent clearer.

Advanced f-string features

# Debugging with f-strings (Python 3.8+)
x = 10
y = 20
print(f"{x=}, {y=}, {x+y=}")  # x=10, y=20, x+y=30

# Calling functions
def get_status():
    return "active"

status_msg = f"System is {get_status()}"

Formatting numbers with f-strings

# Format a float to 2 decimal places
price = 19.98765
formatted_price = f"{price:.2f}"  # '19.99'

# Format an integer with leading zeros (e.g., pad to 4 digits)
order_number = 42
formatted_order = f"{order_number:04d}"  # '0042'

# Right-align a string with spaces using f-strings
text = "Python"
width = 12
right_aligned = f"{text:>{width}}"
print(f"'{right_aligned}'")  # Output: '      Python'

# Or, for left alignment (for comparison):
left_aligned = f"{text:<{width}}"
print(f"'{left_aligned}'")  # Output: 'Python      '

# Format a percentage with 1 decimal place
success_rate = 0.857
formatted_rate = f"{success_rate:.1%}"  # '85.7%'

# Format large numbers with commas
population = 1234567
formatted_population = f"{population:,}"  # '1,234,567'

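# Format large numbers with locale-aware grouping using `:n`
import locale

locale.setlocale(locale.LC_ALL, '')  # Set to user's default locale
large_number = 1234567
formatted_number = f"{large_number:n}"  # e.g., '1,234,567' or '1.234.567' depending on locale

# For floats, `:n` behaves like `:g` and switches to scientific notation
# for large values; locale.format_string keeps the decimals instead
formatted_float = locale.format_string("%.2f", 1234567.89, grouping=True)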

Type Hints

Use type hints to make your code more readable and maintainable. Type hints help IDEs provide better autocomplete, catch errors early, and serve as inline documentation.

Yes

def calculate_total(price: float, quantity: int) -> float:
    """Calculate total price for items."""
    return price * quantity

def greet(name: str) -> str:
    """Return a greeting message."""
    return f"Hello, {name}!"

def process_items(items: list[str]) -> dict[str, int]:
    """Count occurrences of each item."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

No

def calculate_total(price, quantity):
    """Calculate total price for items."""
    return price * quantity  # What types? Will this work with all inputs?

def greet(name):
    """Return a greeting message."""
    return f"Hello, {name}!"  # Is name always a string?

def process_items(items):
    """Count occurrences of each item."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts  # What does this return?

Rationale: Type hints improve code clarity, enable better IDE support (autocomplete, refactoring, error detection), and help catch bugs before runtime. Modern IDEs like VS Code and PyCharm use type hints to provide intelligent code completion and warnings.
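Keep in mind that type hints are metadata: the interpreter stores them but does not enforce them at call time. Catching mismatches before runtime requires a static checker (mypy is one common choice) or an IDE. A minimal sketch, reusing calculate_total from above:

```python
from typing import get_type_hints


def calculate_total(price: float, quantity: int) -> float:
    """Calculate total price for items."""
    return price * quantity


# The hints are available for tools to inspect...
hints = get_type_hints(calculate_total)
print(hints)
# {'price': <class 'float'>, 'quantity': <class 'int'>, 'return': <class 'float'>}

# ...but Python itself does not check them when the function is called.
repeated = calculate_total("ab", 2)  # a static checker would flag this call
print(repeated)  # abab
```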

Modern type hints (Python 3.9+)

Starting with Python 3.9, you can use built-in types directly instead of importing from typing. Python 3.10+ also introduces the | operator for union types.

Yes (Python 3.10+)

# Use built-in types with [] syntax (Python 3.9+)
def process_items(items: list[str]) -> dict[str, int]:
    """Count occurrences of each item."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

# Use | operator for unions (Python 3.10+)
def find_user(user_id: int) -> str | None:
    """Find user by ID, return None if not found."""
    if user_id > 0:
        return "Alice"
    return None

def process_id(id_value: int | str) -> str:
    """Process ID that can be int or string."""
    return str(id_value)

# Multiple union types
def parse_config(value: str | int | float | None) -> str:
    """Parse configuration value."""
    return str(value) if value is not None else ""

Older style (Python 3.5-3.8)

from typing import Optional, Union, List, Dict

# Had to import and use capitalized generic types
def process_items(items: List[str]) -> Dict[str, int]:
    """Count occurrences of each item."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

# Used Optional and Union from typing module
def find_user(user_id: int) -> Optional[str]:
    """Find user by ID, return None if not found."""
    if user_id > 0:
        return "Alice"
    return None

def process_id(id_value: Union[int, str]) -> str:
    """Process ID that can be int or string."""
    return str(id_value)

Note: If you’re using Python 3.10 or later, prefer the modern syntax with | and built-in types (list, dict, set, tuple). For Python 3.9, use built-in types with [] but continue using Optional and Union from typing. The old style with List, Dict, etc. from typing is now deprecated but still works.

More type hint examples

from typing import Any, Callable

# Class type hints
class User:
    def __init__(self, name: str, age: int) -> None:
        self.name: str = name
        self.age: int = age

    def get_info(self) -> dict[str, Any]:
        """Return user information."""
        return {"name": self.name, "age": self.age}

# Callable type hints
def apply_operation(x: int, operation: Callable[[int], int]) -> int:
    """Apply a function to a number."""
    return operation(x)

# Type aliases for complex types
UserId = int
UserData = dict[str, str | int]  # Python 3.10+
# Or for older versions: dict[str, Union[str, int]]

def get_user_data(user_id: UserId) -> UserData:
    """Get user data by ID."""
    return {"name": "Alice", "age": 30}

IDE and AI Assistant Benefits

With type hints, your IDE and AI coding assistants can:

  • Autocomplete: Suggest methods and attributes based on the type
  • Error detection: Warn you when passing wrong types before running code
  • Refactoring: Safely rename variables and functions across your codebase
  • Documentation: Show parameter types in function signatures without reading docs
  • AI assistance: GitHub Copilot, Codeium, and other AI assistants provide more accurate suggestions when they understand your types

# IDE knows 'result' is a float, suggests float methods
result: float = calculate_total(19.99, 3)
result.is_integer()  # IDE autocompletes this method

# IDE warns if you pass wrong types
calculate_total("19.99", "3")  # IDE shows warning: expected float and int

# AI assistants provide better completions with type hints
def process_users(users: list[dict[str, str]]) -> list[str]:
    # AI knows 'users' is a list of dicts, suggests appropriate operations
    # AI knows return type should be list of strings
    return [user["name"] for user in users]  # AI suggests this correctly

Note: AI coding assistants like GitHub Copilot use type hints to understand your code’s intent and provide more accurate, context-aware suggestions. Well-typed code gets better AI assistance.

Leverage the Standard Library

Python’s standard library is extensive and well-tested. Before writing custom solutions or installing third-party packages, check if the standard library already provides what you need. Familiarizing yourself with common standard library modules will make you a more effective Python programmer.

collections - Specialized Container Data Types

The collections module provides alternatives to built-in containers with additional functionality.

Counter - Count hashable objects

Yes

from collections import Counter

# Count occurrences in a list
words = ["apple", "banana", "apple", "orange", "banana", "apple"]
word_counts = Counter(words)
print(word_counts)  # Counter({'apple': 3, 'banana': 2, 'orange': 1})

# Get most common items
print(word_counts.most_common(2))  # [('apple', 3), ('banana', 2)]

# Add counts from another iterable
more_words = ["apple", "grape"]
word_counts.update(more_words)

No

# Manually counting (verbose and error-prone)
words = ["apple", "banana", "apple", "orange", "banana", "apple"]
word_counts = {}
for word in words:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1
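Counter objects also support arithmetic, which is another thing a plain dictionary does not give you. The sketch below uses made-up monday and tuesday data to show addition, subtraction, and the zero default for missing keys.

```python
from collections import Counter

monday = Counter(["apple", "banana", "apple"])
tuesday = Counter(["apple", "grape"])

# Adding counters sums the counts per key
combined = monday + tuesday
print(combined)  # Counter({'apple': 3, 'banana': 1, 'grape': 1})

# Subtracting keeps only positive counts
difference = monday - tuesday
print(difference)  # Counter({'apple': 1, 'banana': 1})

# Missing keys return 0 instead of raising KeyError
print(combined["kiwi"])  # 0
```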

defaultdict - Dictionary with default values

from collections import defaultdict

# Group items by category
items = [("fruit", "apple"), ("fruit", "banana"), ("vegetable", "carrot")]
grouped = defaultdict(list)
for category, item in items:
    grouped[category].append(item)
# {'fruit': ['apple', 'banana'], 'vegetable': ['carrot']}

functools - Higher-Order Functions

partial - Partial function application

Yes

from functools import partial

def power(base: int, exponent: int) -> int:
    """Raise base to the power of exponent."""
    return base ** exponent

# Create specialized functions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(5))   # 125

No

# Creating wrapper functions manually
def power(base: int, exponent: int) -> int:
    """Raise base to the power of exponent."""
    return base ** exponent

def square(base: int) -> int:
    return power(base, 2)

def cube(base: int) -> int:
    return power(base, 3)

lru_cache - Memoization decorator

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n: int) -> int:
    """Calculate nth Fibonacci number with caching."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Much faster for repeated calls
print(fibonacci(100))  # Computed instantly due to caching

itertools - Iterator Building Blocks

chain - Combine multiple iterables

from itertools import chain

# Combine multiple lists into one
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = list(chain(list1, list2))  # [1, 2, 3, 4, 5, 6]

combinations - All combinations of items

from itertools import combinations

# Get all 2-item combinations
items = ['A', 'B', 'C']
combos = list(combinations(items, 2))
# [('A', 'B'), ('A', 'C'), ('B', 'C')]

For more combinatorics utilities, refer to the official Python documentation.

groupby - Group consecutive items

from itertools import groupby

# Group consecutive items by a key
data = [('A', 1), ('A', 2), ('B', 1), ('B', 2), ('A', 3)]
for key, group in groupby(data, key=lambda x: x[0]):
    print(f"{key}: {list(group)}")
# A: [('A', 1), ('A', 2)]
# B: [('B', 1), ('B', 2)]
# A: [('A', 3)]
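Because groupby only groups consecutive items, 'A' appears twice in the output above. If the goal is one group per key, a common approach is to sort by the same key first. A minimal sketch using the same data:

```python
from itertools import groupby
from operator import itemgetter

data = [('A', 1), ('A', 2), ('B', 1), ('B', 2), ('A', 3)]
by_letter = itemgetter(0)

# Sorting first makes equal keys adjacent, so each key yields one group
grouped = {
    key: [number for _, number in group]
    for key, group in groupby(sorted(data, key=by_letter), key=by_letter)
}
print(grouped)  # {'A': [1, 2, 3], 'B': [1, 2]}
```

When the input is large or unsorted, collections.defaultdict(list) gives the same grouping without the sort.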

islice - Slice an iterator

from itertools import islice

# Get items from an iterator without loading all into memory
def generate_numbers():
    n = 0
    while True:
        yield n
        n += 1

# Get first 10 even numbers
evens = (x for x in generate_numbers() if x % 2 == 0)
first_ten_evens = list(islice(evens, 10))  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

pathlib - Object-Oriented File Paths

Yes

from pathlib import Path

# Modern, readable path operations
project_dir = Path("/home/user/project")
config_file = project_dir / "config" / "settings.json"

if config_file.exists():
    content = config_file.read_text()

# List all Python files
python_files = list(project_dir.glob("**/*.py"))

No

import os

# Old-style string manipulation
project_dir = "/home/user/project"
config_file = os.path.join(project_dir, "config", "settings.json")

if os.path.exists(config_file):
    with open(config_file, 'r') as f:
        content = f.read()

dataclasses - Reduce Boilerplate for Classes

Yes (Python 3.7+)

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float

    def distance_from_origin(self) -> float:
        return (self.x ** 2 + self.y ** 2) ** 0.5

p = Point(3.0, 4.0)
print(p)  # Point(x=3.0, y=4.0)
print(p.distance_from_origin())  # 5.0

No

class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x and self.y == other.y

    def distance_from_origin(self) -> float:
        return (self.x ** 2 + self.y ** 2) ** 0.5
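
Beyond generating __init__, __repr__, and __eq__, the decorator accepts options that cover other common boilerplate. The sketch below uses hypothetical Version and Release classes to show frozen instances, ordering, and a safe mutable default.

```python
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical immutable, comparable value object."""
    major: int
    minor: int = 0


@dataclass
class Release:
    """Hypothetical record with a mutable default created per instance."""
    version: Version
    tags: list = field(default_factory=list)


v1 = Version(1, 2)
v2 = Version(1, 10)
print(v1 < v2)              # True: fields are compared in order, like tuples
print(v1 == Version(1, 2))  # True

first = Release(v1)
second = Release(v2)
first.tags.append("stable")
print(second.tags)          # []: each instance gets its own list

try:
    v1.major = 2
except FrozenInstanceError:
    print("Version instances are read-only")
```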

Rationale: The standard library is well-documented, thoroughly tested, and requires no additional dependencies. Learning these modules will help you write more concise, efficient, and idiomatic Python code.

Modern Python Features

Python continues to evolve with each version, introducing new syntax and features that make code more readable and concise. Here are some significant additions from recent Python versions.

Assignment Expressions (Walrus Operator) - Python 3.8+

The walrus operator := allows you to assign values to variables as part of an expression. This is particularly useful in conditions and loops.

Yes

data = [1, 2, 3, 4, 5]

# Assign and use in one line
if (n := len(data)) > 10:
    print(f"List is too long ({n} elements, expected <= 10)")

# Avoid redundant function calls in while loops
# (file is an open text file; process is your own handler)
while (line := file.readline()) != "":
    process(line)

# List comprehensions with shared subexpression
results = [y for x in data if (y := x * 2) > 5]  # [6, 8, 10]

# Reuse expensive computation in comprehension
expensive_results = [result for x in data
                    if (result := expensive_function(x)) is not None]

No

# Without walrus operator - less concise
n = len(data)
if n > 10:
    print(f"List is too long ({n} elements, expected <= 10)")

# Redundant calls
line = file.readline()
while line != "":
    process(line)
    line = file.readline()

# Computing twice
results = [x * 2 for x in data if x * 2 > 5]

Rationale: The walrus operator reduces code duplication and makes intentions clearer, especially when you need to use a computed value multiple times.
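Two further situations where the operator tends to help are regular-expression matches and chunked reads. The following is a self-contained sketch; the log lines and the in-memory stream are made-up data for illustration.

```python
import io
import re

log_lines = ["INFO start", "ERROR disk full", "INFO done", "ERROR timeout"]

# Bind the match object and test it in the same expression
errors = []
for line in log_lines:
    if (match := re.match(r"ERROR (.+)", line)) is not None:
        errors.append(match.group(1))
print(errors)  # ['disk full', 'timeout']

# Read fixed-size chunks until read() returns an empty string
stream = io.StringIO("abcdefgh")
chunks = []
while chunk := stream.read(3):
    chunks.append(chunk)
print(chunks)  # ['abc', 'def', 'gh']
```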

Positional-Only and Keyword-Only Parameters - Python 3.8+

Use / to mark positional-only parameters (Python 3.8+) and * to mark keyword-only parameters (available since Python 3.0). Both make your API more explicit and help prevent misuse.

Yes

# Positional-only parameters (before /)
def greet(name, /, greeting="Hello"):
    """Name must be positional, greeting can be keyword."""
    return f"{greeting}, {name}!"

greet("Alice")  # OK
greet("Alice", greeting="Hi")  # OK
# greet(name="Alice")  # Error: name is positional-only

# Keyword-only parameters (after *)
def create_user(username, *, email, age):
    """Email and age must be passed as keywords."""
    return {"username": username, "email": email, "age": age}

create_user("alice", email="alice@example.com", age=30)  # OK
# create_user("alice", "alice@example.com", 30)  # Error: email and age must be keyword

# Combining both
def process(a, b, /, c, *, d, e):
    """a, b are positional-only; c can be either; d, e are keyword-only."""
    pass

No

# Ambiguous API without parameter restrictions
def greet(name, greeting="Hello"):
    return f"{greeting}, {name}!"

# Users might do this (unclear which is which):
greet(greeting="Alice", name="Hello")  # Confusing!

Rationale: Explicit parameter types prevent API misuse and make function signatures self-documenting.
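The restrictions are enforced at call time with a TypeError. The sketch below reuses greet and create_user from the examples above; call_and_report is a hypothetical helper added only so the failing calls can be shown without stopping the script.

```python
def greet(name, /, greeting="Hello"):
    """Name must be positional, greeting can be keyword."""
    return f"{greeting}, {name}!"


def create_user(username, *, email, age):
    """Email and age must be passed as keywords."""
    return {"username": username, "email": email, "age": age}


def call_and_report(func, *args, **kwargs):
    """Hypothetical helper: return the result, or the exception's type name."""
    try:
        return func(*args, **kwargs)
    except TypeError as exc:
        return type(exc).__name__


print(call_and_report(greet, "Alice"))       # Hello, Alice!
print(call_and_report(greet, name="Alice"))  # TypeError
print(call_and_report(create_user, "alice", "alice@example.com", 30))  # TypeError
```

A side benefit of positional-only parameters is that the parameter name can later be changed without breaking callers.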

Pattern Matching (Structural Pattern Matching) - Python 3.10+

The match/case statement provides powerful pattern matching similar to switch statements in other languages, but much more flexible.

Yes

# Simple value matching
def http_status(status: int) -> str:
    match status:
        case 200:
            return "OK"
        case 404:
            return "Not Found"
        case 500:
            return "Internal Server Error"
        case _:
            return "Unknown Status"

# Pattern matching with structures
def process_command(command: tuple) -> str:
    match command:
        case ("quit",):
            return "Exiting..."
        case ("load", filename):
            return f"Loading {filename}"
        case ("save", filename, fmt):  # fmt avoids shadowing the built-in format()
            return f"Saving {filename} as {fmt}"
        case _:
            return "Unknown command"

# Matching objects and types
def describe(obj):
    match obj:
        case int(x) if x > 0:
            return f"Positive integer: {x}"
        case int(x) if x < 0:
            return f"Negative integer: {x}"
        case str(s) if len(s) > 0:
            return f"Non-empty string: {s}"
        case list([]):
            return "Empty list"
        case list([first, *rest]):
            return f"List starting with {first}"
        case {"name": name, "age": age}:
            return f"Person: {name}, age {age}"
        case _:
            return "Something else"

No

# Long if-elif chains (less readable)
def http_status(status: int) -> str:
    if status == 200:
        return "OK"
    elif status == 404:
        return "Not Found"
    elif status == 500:
        return "Internal Server Error"
    else:
        return "Unknown Status"

def process_command(command: tuple) -> str:
    if len(command) == 1 and command[0] == "quit":
        return "Exiting..."
    elif len(command) == 2 and command[0] == "load":
        return f"Loading {command[1]}"
    elif len(command) == 3 and command[0] == "save":
        return f"Saving {command[1]} as {command[2]}"
    else:
        return "Unknown command"

Rationale: Pattern matching makes complex conditional logic more readable and maintainable, especially when dealing with structured data.

Dictionary Merge and Update Operators - Python 3.9+

Use | and |= operators for merging dictionaries.

Yes

# Merge dictionaries with | operator
defaults = {"color": "blue", "size": "medium"}
custom = {"size": "large", "style": "modern"}
config = defaults | custom  # {'color': 'blue', 'size': 'large', 'style': 'modern'}

# Update in place with |=
settings = {"theme": "dark"}
settings |= {"language": "en", "theme": "light"}
# settings is now {'theme': 'light', 'language': 'en'}

No

# Old way - more verbose
defaults = {"color": "blue", "size": "medium"}
custom = {"size": "large", "style": "modern"}
config = {**defaults, **custom}  # Works but less clear

# Or even older
config = defaults.copy()
config.update(custom)

Rationale: The | operator makes dictionary merging more intuitive and consistent with set operations. When both dictionaries contain the same key, the value from the right-hand operand wins.
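
Two details of these operators are easy to miss: operand order decides which value survives a collision, and |= is more permissive than | about what it accepts on the right. A short sketch (Python 3.9+), reusing the dictionaries from the example above:

```python
defaults = {"color": "blue", "size": "medium"}
custom = {"size": "large", "style": "modern"}

# Operand order decides which value survives a key collision
assert (defaults | custom)["size"] == "large"
assert (custom | defaults)["size"] == "medium"

# | builds a new dict; both inputs are left untouched
merged = defaults | custom
assert defaults == {"color": "blue", "size": "medium"}

# |= behaves like dict.update() and accepts any iterable of key/value pairs
settings = {"theme": "dark"}
settings |= [("language", "en")]

# The plain | operator does not accept a list of pairs
try:
    settings | [("theme", "light")]
except TypeError:
    binary_merge_rejects_list = True
else:
    binary_merge_rejects_list = False
```

If you need to merge an arbitrary iterable of pairs without mutating the original, wrapping it in dict(...) before using | is one straightforward option.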

Improved Error Messages - Python 3.10+

Python 3.10+ provides much more helpful error messages with better context and suggestions.

# Example: Better syntax error messages
# Python 3.10+ will point to the exact location and suggest fixes

# Missing colon
# if x > 5
#        ^
# SyntaxError: expected ':'

# Better name error suggestions
name = "Alice"
# print(nam)
# NameError: name 'nam' is not defined. Did you mean: 'name'?

# Better attribute errors
class User:
    def __init__(self):
        self.username = "alice"

user = User()
# print(user.usrname)
# AttributeError: 'User' object has no attribute 'usrname'. Did you mean: 'username'?

Parenthesized Context Managers - Python 3.10+

Write cleaner code when using multiple context managers.

Yes (Python 3.10+)

# Parentheses allow natural line breaks
with (
    open("input.txt") as infile,
    open("output.txt", "w") as outfile,
    open("log.txt", "w") as logfile,
):
    process_files(infile, outfile, logfile)

No (Python 3.9 and earlier)

# Awkward backslash continuation
with open("input.txt") as infile, \
     open("output.txt", "w") as outfile, \
     open("log.txt", "w") as logfile:
    process_files(infile, outfile, logfile)

Rationale: Parenthesized context managers improve readability when working with multiple resources.

Essential Python Idioms and Best Practices

The if __name__ == "__main__" Pattern

Use this idiom to make a Python file usable both as an importable module and as an executable script.

Yes

# mymodule.py
def calculate_sum(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def main():
    """Main entry point when run as script."""
    result = calculate_sum(5, 3)
    print(f"Result: {result}")

if __name__ == "__main__":
    main()

When imported:

# other_file.py
import mymodule

# Importing defines the module's functions but does not call main()
result = mymodule.calculate_sum(10, 20)

When run directly:

python mymodule.py  # Runs main() and prints "Result: 8"

No

# mymodule.py - Bad practice
def calculate_sum(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

# Code runs immediately when imported (bad!)
result = calculate_sum(5, 3)
print(f"Result: {result}")

Rationale: This pattern allows code reuse. The same file can be used as an importable module or run as a standalone script without executing unwanted side effects during import.
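
To see why the guard works, it can help to load the same file both ways and compare. The sketch below is only an illustration: it writes a throwaway mymodule.py to a temporary directory, and the ran_as_script flag is a hypothetical variable added to make the difference observable.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

SOURCE = '''
def calculate_sum(a, b):
    return a + b

ran_as_script = False
if __name__ == "__main__":
    ran_as_script = True
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mymodule.py"
    path.write_text(SOURCE)

    # Importing: __name__ is the module name, so the guarded block is skipped
    spec = importlib.util.spec_from_file_location("mymodule", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    imported_flag = module.ran_as_script

    # Running as a script: __name__ is "__main__", so the guarded block runs
    script_globals = runpy.run_path(str(path), run_name="__main__")
    script_flag = script_globals["ran_as_script"]
```

Here imported_flag ends up False and script_flag ends up True, even though both come from the same source file; the only difference is the value Python assigns to __name__.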

Logging vs Print

Use the logging module instead of print() for anything beyond simple scripts. Logging provides better control over output levels, formatting, and destinations.

Yes

import logging

# Configure logging at the start of your application
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

logger = logging.getLogger(__name__)

def process_data(data):
    # Pass arguments separately so the message is only formatted if it is emitted
    logger.debug("Processing data: %s", data)

    if not data:
        logger.warning("Empty data received")
        return None

    try:
        result = expensive_operation(data)
        logger.info("Successfully processed %d items", len(data))
        return result
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback
        logger.exception("Failed to process data")
        raise

# Benefits:
# - Can change log level without modifying code
# - Logs include timestamps and context
# - Can redirect to files, streams, or external services
# - Different loggers for different modules

No

def process_data(data):
    print(f"Processing data: {data}")  # Always prints, no control

    if not data:
        print("WARNING: Empty data received")  # No standard format
        return None

    try:
        result = expensive_operation(data)
        print(f"Processed {len(data)} items")  # Mixed with actual output
        return result
    except Exception as e:
        print(f"ERROR: {e}")  # No stack trace, hard to debug
        raise

Log Levels

import logging

# In production, set level to INFO or WARNING to reduce noise.
# Configure logging once, before the first message is logged.
logging.basicConfig(level=logging.WARNING)

logger = logging.getLogger(__name__)

# Use appropriate levels
logger.debug("Detailed information for debugging")
logger.info("General informational messages")
logger.warning("Warning messages for unexpected situations")
logger.error("Error messages for serious problems")
logger.critical("Critical messages for very serious errors")
# With level=WARNING, only the warning, error, and critical messages appear

Rationale: Logging is configurable, structured, and production-ready. It allows you to control verbosity without changing code and provides better context for debugging.
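
The claim that verbosity can change without touching the calling code can be checked directly. This is a small sketch, assuming a hypothetical logger name ("demo.app") and an in-memory stream in place of a real log file:

```python
import io
import logging

# "demo.app" is a hypothetical logger name used only for this sketch
logger = logging.getLogger("demo.app")
logger.propagate = False  # keep the demo output away from the root logger

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)


def do_work():
    logger.debug("starting work")
    logger.warning("disk space low")


logger.setLevel(logging.WARNING)
do_work()  # only the warning is recorded

logger.setLevel(logging.DEBUG)
do_work()  # same function, now the debug message is recorded too

captured = stream.getvalue().splitlines()
```

The first call contributes one line to captured and the second contributes two, although do_work() itself never changed. With print() the same switch would require editing or deleting statements.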

Efficient Collection Operations

Choose the right data structure for the job. Understanding performance characteristics prevents inefficient code. For more use cases, refer to this blog post.

Set Membership Testing

Yes

# Use sets for membership testing - O(1) average case
valid_users = {"alice", "bob", "charlie", "david"}  # set

if username in valid_users:  # Fast: O(1)
    grant_access()

# Converting list to set for multiple lookups
user_list = ["alice", "bob", "charlie", "david"]
user_set = set(user_list)  # Convert once

for login_attempt in login_attempts:
    if login_attempt in user_set:  # O(1) per check
        process_login(login_attempt)

No

# Using lists for membership testing - O(n) linear search
valid_users = ["alice", "bob", "charlie", "david"]  # list

if username in valid_users:  # Slow: O(n)
    grant_access()

# Repeatedly searching through lists
for login_attempt in login_attempts:
    if login_attempt in valid_users:  # O(n) per check - very slow!
        process_login(login_attempt)

Dictionary for Fast Lookups

Yes

# Use dicts for key-value lookups - O(1)
user_scores = {
    "alice": 95,
    "bob": 87,
    "charlie": 92
}

score = user_scores.get("alice", 0)  # Fast lookup with default

# Dict comprehension
squared = {x: x**2 for x in range(10)}
# {0: 0, 1: 1, 2: 4, 3: 9, ...}

No

# Using parallel lists - inefficient and error-prone
usernames = ["alice", "bob", "charlie"]
scores = [95, 87, 92]

# Linear search to find score - O(n)
def get_score(username):
    for i, name in enumerate(usernames):
        if name == username:
            return scores[i]
    return 0

Set Operations

# Set operations are fast and readable
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

# Union - all unique elements
union = set_a | set_b  # {1, 2, 3, 4, 5, 6, 7, 8}

# Intersection - common elements
intersection = set_a & set_b  # {4, 5}

# Difference - elements in a but not in b
difference = set_a - set_b  # {1, 2, 3}

# Symmetric difference - elements in either but not both
sym_diff = set_a ^ set_b  # {1, 2, 3, 6, 7, 8}

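# Remove duplicates from a list (note: a set does not preserve order)
unique_items = list(set(items_with_duplicates))
# To keep first-seen order instead: list(dict.fromkeys(items_with_duplicates))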

List vs Generator for Large Data

Yes

# Generator - memory efficient for large data
def read_large_file(filename):
    with open(filename) as f:
        for line in f:  # Processes one line at a time
            yield line.strip()

# Only loads one line at a time
for line in read_large_file("huge_file.txt"):
    process(line)

# Generator expression
sum_of_squares = sum(x**2 for x in range(1000000))  # Memory efficient

No

# List - loads everything into memory
def read_large_file(filename):
    with open(filename) as f:
        return [line.strip() for line in f]  # Loads entire file!

# Memory intensive
all_lines = read_large_file("huge_file.txt")
for line in all_lines:
    process(line)

# List comprehension - creates full list in memory
sum_of_squares = sum([x**2 for x in range(1000000)])  # Wasteful

Rationale: Choosing the right data structure can improve performance dramatically. Sets and dicts offer average-case O(1) lookups, while lists require O(n) searches. Generators keep memory use low on large datasets because they produce one item at a time.
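
The memory difference between a list and a generator is easy to observe. The sketch below is a rough illustration only: sys.getsizeof measures the container object itself, not the integers it references, and the exact byte counts vary between Python versions.

```python
import sys

n = 100_000

squares_list = [x * x for x in range(n)]   # all n results exist at once
squares_gen = (x * x for x in range(n))    # results are produced on demand

# getsizeof reports the container only, not the integers it references
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

# Both approaches give the same total
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)

# Trade-off: a generator can only be consumed once
leftover = list(squares_gen)  # empty, because sum() already exhausted it
```

The generator object stays small regardless of n, while the list grows with it. The price is that the generator cannot be indexed, measured with len(), or iterated a second time.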

Duck Typing and EAFP vs LBYL

Duck typing is a style of programming in which an object's suitability for a task is determined by its behavior (the methods and attributes it provides) rather than by its class or inheritance. The phrase “If it walks like a duck and quacks like a duck, it’s a duck” means that any object implementing the required methods can be used wherever those methods are expected, regardless of its actual type. This makes code more flexible and reusable, because it focuses on what an object can do, not what it is.

Python also favors “Easier to Ask for Forgiveness than Permission” (EAFP) over “Look Before You Leap” (LBYL): attempt the operation and handle the exception if it fails, instead of checking preconditions first.

EAFP - Pythonic Approach

Yes

# EAFP: Try the operation, handle exceptions if it fails
def get_user_age(user_dict):
    try:
        return user_dict["age"]
    except KeyError:
        return None

# Duck typing: "If it walks like a duck and quacks like a duck..."
def process_file(file_obj):
    # Don't check type - just use it like a file
    try:
        content = file_obj.read()
        return content.upper()
    except AttributeError:
        raise TypeError("Object must have a read() method")

# Works with any file-like object
from io import StringIO
with open("file.txt") as f:     # Real file, closed automatically
    process_file(f)
process_file(StringIO("test"))  # String buffer - also works!

# EAFP with dict access
config = {"timeout": 30, "retries": 3}
try:
    timeout = config["timeout"]
    retries = config["retries"]
except KeyError as e:
    raise ValueError(f"Missing required config: {e}")

No - LBYL (Look Before You Leap)

# LBYL: Check before you try (less Pythonic, race conditions)
def get_user_age(user_dict):
    if "age" in user_dict:  # Extra check
        return user_dict["age"]
    else:
        return None

# Type checking instead of duck typing (rigid)
import io

def process_file(file_obj):
    # TextIOWrapper is what open() returns in text mode; StringIO is a different class
    if not isinstance(file_obj, io.TextIOWrapper):  # Too restrictive!
        raise TypeError("Must be a file object")
    content = file_obj.read()
    return content.upper()
# Now StringIO won't work, even though it has read()!

# Multiple checks (verbose and slower)
config = {"timeout": 30, "retries": 3}
if "timeout" in config and "retries" in config:
    timeout = config["timeout"]
    retries = config["retries"]
else:
    raise ValueError("Missing required config")

More EAFP Examples

# Converting to int
# EAFP - Pythonic
try:
    value = int(user_input)
except ValueError:
    print("Invalid number")

# LBYL - Not Pythonic
if user_input.isdigit():
    value = int(user_input)
else:
    print("Invalid number")
# Problem: isdigit() rejects valid input such as "-5" or " 42 ",
# and accepts characters like "²" that int() cannot parse

# File operations
# EAFP - Pythonic
import json

try:
    with open("config.json") as f:
        config = json.load(f)
except FileNotFoundError:
    config = default_config()
except json.JSONDecodeError:
    print("Invalid JSON")

# LBYL - Less Pythonic
import os
if os.path.exists("config.json"):
    # Race condition: file could be deleted between check and open!
    with open("config.json") as f:
        config = json.load(f)
else:
    config = default_config()

Duck Typing Benefits

# Functions work with any compatible type
def print_all(items):
    """Works with lists, tuples, sets, generators, etc."""
    for item in items:  # Just needs to be iterable
        print(item)

print_all([1, 2, 3])           # list
print_all((1, 2, 3))           # tuple
print_all({1, 2, 3})           # set
print_all(range(1, 4))         # range object
print_all(x for x in [1,2,3])  # generator

# Custom class works too if it implements __iter__
class MyCollection:
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return iter(self.data)

print_all(MyCollection([1, 2, 3]))  # Works!

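Rationale: EAFP is the idiomatic Python style. It is often faster when failures are rare (one operation instead of a check plus the operation), although a raised exception costs more than a simple check, and it avoids the race conditions that a separate check can introduce. Duck typing makes code more flexible and reusable by focusing on behavior rather than type.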
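
If you want duck typing and static type checking together, typing.Protocol (Python 3.8+) is one option worth considering. The sketch below is an assumption-laden illustration rather than a recommendation from the original text: SupportsRead, FakeFile, and shout are hypothetical names invented for this example.

```python
from io import StringIO
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsRead(Protocol):
    """Anything with a read() method; no inheritance required."""

    def read(self) -> str: ...


class FakeFile:
    """Hypothetical class for this sketch; it inherits from no io type."""

    def read(self) -> str:
        return "hello"


def shout(source: SupportsRead) -> str:
    # A type checker verifies the argument structurally;
    # at runtime this is still plain duck typing
    return source.read().upper()


from_buffer = shout(StringIO("test"))
from_fake = shout(FakeFile())
```

Neither StringIO nor FakeFile declares any relationship to SupportsRead, yet both satisfy it. Note that isinstance() checks against a runtime_checkable protocol only confirm that the method exists, not that its signature matches.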



