---
title: "Chapter 5: Functions & Modules - Building Reusable Scientific Code"
subtitle: "COMP 536 | Python Fundamentals"
author: "Anna Rosen"
draft: false
execute:
  freeze: auto
  echo: true
  warning: true
  error: false
format:
  html:
    toc: true
    code-fold: true
    code-summary: "Show code"
---
Learning Objectives
By the end of this chapter, you will be able to:
Prerequisites Check
Before starting this chapter, verify you can:
Chapter Overview
Function A reusable block of code that performs a specific task, taking inputs (parameters) and optionally returning outputs. Functions encapsulate logic and create abstractions.
Functions are the fundamental building blocks of organized code. Without functions, you’d be copying and pasting the same code repeatedly, making bugs harder to fix and improvements impossible to maintain. But functions are more than just a way to avoid repetition—they’re how we create abstractions, manage complexity, and build reliable software. Whether you’re calculating statistical measures, simulating star cluster dynamics, or processing multi-wavelength observational data, every computational project starts with well-designed functions.
This chapter teaches you to think about functions as contracts between different parts of your code. When you write a function that calculates energy or performs numerical integration, you’re creating a promise: given valid input, the function will reliably return the correct output. This contract mindset helps you write functions that others (including future you) can trust and use effectively. You’ll learn how to choose between positional and keyword arguments, when to use default values, and how to design interfaces that are both flexible and clear.
Abstraction A simplified interface that hides complex implementation details, allowing users to work with concepts rather than low-level operations. Functions are the fundamental abstraction mechanism in programming.
We’ll explore Python’s scope rules, which determine where variables can be accessed, and learn how seemingly simple concepts like default arguments can create subtle bugs that have plagued even major scientific software packages. You’ll discover how Python’s flexible parameter system enables powerful interfaces, and how functional programming concepts prepare you for modern scientific computing frameworks. By the end, you’ll be organizing your code into modules that can be shared, tested, and maintained professionally—essential skills for collaborative scientific research.
In production code, Python modules should be imported at the top of files for clarity and efficiency. However, in these educational examples, we sometimes import inside functions for pedagogical clarity — to show exactly what each function needs. In your own code, always prefer top-level imports unless you have a specific reason to import locally (such as avoiding circular dependencies or optional dependencies).
Functions are how you turn a one-off script into a reusable scientific tool.
In this course, functions are not just “avoiding repetition.” They are how you:
- make your logic testable (so you trust results)
- isolate assumptions (units, tolerances, edge cases)
- reuse code safely across projects
- collaborate without breaking each other’s work
If you invest in this chapter, you’ll write less code—and get more correct results.
COMP 536 uses scripts + IPython (not Jupyter notebooks). Functions are the unit of reuse in script-based workflows: they make your code modular, debuggable, and reproducible.
This chapter is a reference. You are not expected to memorize every parameter rule on first read.
Come back to it when you:
- are copy/pasting the same code twice
- can’t explain what a block of code is supposed to do
- need to test a calculation in isolation
- are ready to organize a project into modules
5.1 Defining Functions: The Basics
A function encapsulates a piece of logic that transforms inputs into outputs. Think of a function as a machine: you feed it raw materials (inputs), it performs some process (the function body), and it produces a product (output). In scientific computing, a function might calculate statistical measures, integrate equations, or transform data — each function performs one clear task that can be tested and trusted.
Your First Function
Let’s start with something every scientist needs — calculating mean and standard deviation:
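The code cell itself isn't reproduced here; a minimal version consistent with the walkthrough that follows might look like this (the specific `measurements` values are hypothetical):

```python
def calculate_mean(values):
    """Calculate the arithmetic mean of a list of values."""
    return sum(values) / len(values)

# Hypothetical temperature readings in degC
measurements = [23.5, 24.1, 23.8, 24.3]
print(f"Mean temperature: {calculate_mean(measurements):.2f} degC")
```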
Let’s break down exactly how this works:
- `def` keyword: Tells Python we're defining a function.
- Function name (`calculate_mean`): Follows snake_case convention, describes what it does.
- Parameters (`values`): Variables that receive values when the function is called.
- Docstring: Brief description of what the function does (always include this!).
- Function body: Indented code that does the actual work.
- `return` statement: Sends a value back to whoever called the function.
When Python executes calculate_mean(measurements), it creates a temporary namespace where values = [23.5, 24.1, ...], runs the function body, and returns the result.
What will this code print?
def calculate_product(x, y):
product = x * y
# Oops, forgot the return statement!
result = calculate_product(3.0, 4.0)
print(f"Product: {result}")

It prints `Product: None`. The function calculates `product` but doesn't return it. Without an explicit `return` statement, Python functions return `None`. This is a common bug in scientific code!
To fix it:
def calculate_product(x, y):
product = x * y
return product  # Now it returns the value

Functions Without Return Values
Not all functions return values. Some perform actions like printing results or saving data:
Statistics for Temperature ( degC):
Mean: 20.920
Range: [19.800, 22.300]
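The cell that produced this output isn't shown; a sketch consistent with it (the readings are hypothetical) could be:

```python
def report_statistics(name, values, units=""):
    """Print summary statistics. Impure: its effect is the printout."""
    mean = sum(values) / len(values)
    print(f"Statistics for {name} ({units}):")
    print(f"  Mean: {mean:.3f}")
    print(f"  Range: [{min(values):.3f}, {max(values):.3f}]")

temps = [20.1, 19.8, 22.3, 21.5, 20.9]  # hypothetical readings in degC
report_statistics("Temperature", temps, units="degC")
```

Note that `report_statistics` returns `None`; its whole purpose is the side effect of printing.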
- A pure function returns a value and doesn’t modify outside state (easier to test, cache, parallelize, and reuse).
- An impure function has side effects (prints, writes files, mutates globals); that can be useful, but it’s harder to test.
In this chapter, kinetic_energy_cgs(...) is pure, while report_statistics(...) is intentionally impure because it prints.
Returning Multiple Values
Python functions can return multiple values using tuples — perfect for calculations that produce related results:
Mean: 9.950
Std Dev: 0.171
Range: [9.7, 10.2]
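A function along these lines (the sample data is hypothetical) returns several related results as a tuple, which the caller unpacks in one assignment:

```python
def summarize(values):
    """Return (mean, std, min, max) as a single tuple."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5  # population std
    return mean, std, min(values), max(values)

masses = [10.1, 9.9, 10.2, 9.7, 9.85, 9.95]  # hypothetical measurements
mean, std, lo, hi = summarize(masses)         # tuple unpacking
```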
Kinetic Energy in CGS Units
Now let’s calculate kinetic energy using CGS units (centimeters, grams, seconds):
Electron kinetic energy: 4.09e-11 ergs
In eV: 25.55 eV
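A sketch consistent with the output above, assuming an electron moving at 1% of light speed (the constant values are standard CGS figures):

```python
ELECTRON_MASS_G = 9.109e-28   # electron mass in grams
C_CMS = 2.998e10              # speed of light in cm/s
ERG_PER_EV = 1.602e-12        # ergs per electron-volt

def kinetic_energy_cgs(mass_g, velocity_cms):
    """Classical kinetic energy KE = (1/2) m v^2 in ergs."""
    return 0.5 * mass_g * velocity_cms**2

v = 0.01 * C_CMS  # 1% of light speed (assumption)
ke = kinetic_energy_cgs(ELECTRON_MASS_G, v)
print(f"Electron kinetic energy: {ke:.2e} ergs")
print(f"In eV: {ke / ERG_PER_EV:.2f} eV")
```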
The Design Process: From Problem to Function
Before writing any function, design it first. This prevents the common mistake of coding yourself into a corner:
Energy = 5000.0 ergs
Valid (positive)? True
Valid (in range)? True
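A sketch of the design-first idea, with hypothetical helper names and input values chosen to match the printed energy:

```python
def kinetic_energy_checked(mass_g, velocity_cms):
    """Kinetic energy in ergs, with an explicit precondition check."""
    if mass_g <= 0:
        raise ValueError(f"mass must be positive, got {mass_g}")
    return 0.5 * mass_g * velocity_cms**2

def is_valid_energy(energy_ergs, max_ergs=1e10):
    """Postcondition check: positive and within an expected range (assumed bound)."""
    return 0 < energy_ergs <= max_ergs

energy = kinetic_energy_checked(1.0, 100.0)   # hypothetical: 1 g at 100 cm/s
print(f"Energy = {energy} ergs")
print(f"Valid (positive)? {energy > 0}")
print(f"Valid (in range)? {is_valid_energy(energy)}")
```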
Between 1985 and 1987, the Therac-25 radiation therapy machine caused at least six accidents where patients received massive radiation overdoses — up to 100 times the intended dose. Three patients died directly from the overdoses, and three others suffered serious injuries including paralysis.
The root cause was a software function that lacked proper validation. The machine could be configured in two modes: low-power electron beam (direct) or high-power X-ray mode (with a metal target to convert electrons to X-rays). A race condition in the control software meant that if an operator typed too quickly when changing modes, the validation function would pass but leave the machine in a lethal state: high power WITHOUT the metal target in place, delivering concentrated electron beams directly to patients.
#| eval: false
# Simplified illustration of the type of bug:
def configure_beam(mode, power_level):
# FATAL FLAW: No validation that configuration is complete
# Race condition: if operator types fast, these can be inconsistent
set_mode(mode) # Could be "electron" mode
set_power(power_level) # Could be set to X-ray power level!
# No check that mode and power are compatible
# What it should have been:
def configure_beam_safe(mode, power_level):
# Validate configuration before allowing beam
if mode == "electron" and power_level > ELECTRON_MAX:
raise ValueError("Power too high for electron mode!")
if mode == "xray" and not target_in_place():
raise ValueError("X-ray mode requires target!")
# Only proceed if validation passes
set_mode(mode)
set_power(power_level)

The tragedy led to fundamental changes in medical device software. Today, every critical function must validate inputs, check system state, and verify outputs. The few lines of validation code you write aren't bureaucratic overhead—they're the difference between healing and harm. Every time you add validation to your functions, you're following safety practices written in the aftermath of preventable deaths.
[Source: Leveson, N.G. & Turner, C.S. (1993). “An Investigation of the Therac-25 Accidents”. IEEE Computer, 26(7), 18-41.]
Every well-designed function follows a contract pattern that applies across all programming:
CONTRACT PATTERN:
1. **Preconditions:** What must be true before calling
2. **Postconditions:** What will be true after calling
3. **Invariants:** What stays unchanged
4. **Side effects:** What else happens
Example for kinetic_energy_cgs():
- **Precondition:** mass > 0, velocity is numeric
- **Postcondition:** returns positive energy value
- **Invariant:** input values unchanged
- **Side effects:** none (pure function)
This pattern appears in:
- Database transactions (ACID properties)
- API design (REST contracts)
- Parallel computing (thread safety)
- Unit testing (test contracts)

Before you write a function, answer these:
- Inputs: What are the inputs? What are their units and valid ranges?
- Outputs: What do you return? In what units?
- Preconditions: What must be true before calling?
- Failure modes: What can go wrong (empty data, non-finite values, invalid ranges)?
- Side effects: Does it print, write files, mutate inputs, or depend on global state?
A good function makes assumptions explicit and failures loud.
5.2 Function Arguments In-Depth
Parameter A variable in a function definition that receives a value when the function is called.
Python provides flexible ways to handle function parameters. Understanding the distinction between positional arguments, keyword arguments, and default values is crucial for designing clear, flexible interfaces. Let’s explore this flexibility through progressive examples.
Argument The actual value passed to a function when calling it.
Positional vs Keyword Arguments
Correct escape velocity: 1.12e+06 cm/s = 11.2 km/s
Wrong (args reversed): 1.19e-13 cm/s
The wrong value is about 9.4e+18 times too small!
With keywords: 1.12e+06 cm/s
Default Arguments Make Functions Flexible
Default Argument A parameter value specified in the function definition that is used when no argument is provided for that parameter during the function call.
Default arguments let users omit parameters when standard values suffice:
At t=0: 1000000 atoms
After 5730 years: 500000 atoms
U-238 after 1 billion years: 857244 atoms
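A sketch consistent with the output above, using carbon-14's half-life as the default and a rounded U-238 half-life of 4.5 billion years (the function name is an assumption):

```python
def atoms_remaining(n0, time_years, half_life_years=5730.0):
    """Radioactive decay N = N0 * (1/2)^(t / t_half).

    The default half-life is carbon-14 (5730 years); pass
    half_life_years explicitly for other isotopes.
    """
    return n0 * 0.5 ** (time_years / half_life_years)

n0 = 1_000_000
print(f"At t=0: {atoms_remaining(n0, 0):.0f} atoms")
print(f"After 5730 years: {atoms_remaining(n0, 5730):.0f} atoms")
# Override the default for uranium-238 (~4.5e9 year half-life)
print(f"U-238 after 1 billion years: "
      f"{atoms_remaining(n0, 1e9, half_life_years=4.5e9):.0f} atoms")
```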
Use Positional Arguments When:
- The meaning is obvious from context (`power(base, exponent)`)
- There are only 1-2 required parameters
- The order is natural and memorable
Use Keyword Arguments When:
- There are many parameters (>3)
- Parameters are optional
- The meaning isn't obvious (`process(True, False, 5)` vs `process(verbose=True, cache=False, retries=5)`)
Use Default Arguments When:
- There’s a sensible standard value
- Most calls use the same value
- You want backward compatibility when adding features
Best Practice Progression:
- Start with required positional arguments
- Add defaults for optional behaviors
- Use keyword-only arguments for clarity (see Advanced section)
The Mutable Default Trap
Here’s a critical bug that has appeared in major scientific software:
Day 1: [23.5]
Day 2: [23.5, 24.1]
Same list? True
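A minimal demonstration of the trap (the function name and readings are hypothetical):

```python
def add_reading_buggy(value, readings=[]):   # BUG: one shared list, created once
    readings.append(value)
    return readings

day1 = add_reading_buggy(23.5)
day2 = add_reading_buggy(24.1)
print("Day 1:", day1)                # both names point at the SAME list
print("Day 2:", day2)
print("Same list?", day1 is day2)    # True
```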
This bug has appeared in:
- CERN analysis scripts (accumulated all runs’ data)
- NASA trajectory calculations (mixed mission parameters)
- Weather prediction models (combined different forecasts)
The symptom: data from previous runs mysteriously appears in new analyses.
The cause: Python evaluates default arguments once when the function is defined, not each time it’s called.
The fix: always use None as default for mutable arguments.
Here’s the correct pattern:
Day 1: [23.5]
Day 2: [24.1]
Combined: [23.5, 24.1]
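A sketch of the sentinel pattern consistent with the output above (names are hypothetical):

```python
def add_reading(value, readings=None):
    """Append value to readings; create a fresh list per call if omitted."""
    if readings is None:
        readings = []          # new list on every call
    readings.append(value)
    return readings

day1 = add_reading(23.5)
day2 = add_reading(24.1)
print("Day 1:", day1)          # [23.5]
print("Day 2:", day2)          # [24.1]
print("Combined:", day1 + day2)
```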
Sentinel Value A special value (like None) used to signal a particular condition, often used as a default for mutable arguments.
Variable Arguments (*args and **kwargs)
*args Collects extra positional arguments into a tuple.
**kwargs Collects extra keyword arguments into a dictionary.
{'n': 5, 'mean': 3.0, 'std': 1.4142135623730951, 'min': 1, 'max': 5}
Mean only: 2.47
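A `*args` function consistent with the dictionary output above (the function name is an assumption):

```python
def summarize_data(*values):
    """Accept any number of positional arguments; they arrive as a tuple."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5  # population std
    return {"n": n, "mean": mean, "std": std,
            "min": min(values), "max": max(values)}

print(summarize_data(1, 2, 3, 4, 5))
```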
=== Experiment: Test A ===
Configuration:
temperature_k: 293.15
pressure_dynes_cm2: 1013250.0
trials: 10
=== Experiment: Test B ===
Configuration:
temperature_k: 350
pressure_dynes_cm2: 2000000.0
trials: 50
catalyst: Platinum
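A `**kwargs` sketch consistent with the experiment output above; the default settings and the merge strategy are assumptions:

```python
def run_experiment(name, **config):
    """Collect arbitrary keyword settings into a dict and report them."""
    defaults = {"temperature_k": 293.15,
                "pressure_dynes_cm2": 1013250.0,
                "trials": 10}
    settings = {**defaults, **config}   # caller-supplied keywords override defaults
    print(f"=== Experiment: {name} ===")
    print("Configuration:")
    for key, value in settings.items():
        print(f"  {key}: {value}")
    return settings

a = run_experiment("Test A")
b = run_experiment("Test B", temperature_k=350, pressure_dynes_cm2=2e6,
                   trials=50, catalyst="Platinum")
```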
Parameter Ordering Rules
Python enforces specific rules about parameter order in function definitions. These rules exist to make functions clearer and prevent ambiguous calls. Let’s build up from simple to complex.
Basic Rule: Required Before Optional
The fundamental rule is that parameters with defaults must come after parameters without defaults:
15
25
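A minimal illustration of the rule (the function name, default, and call values are assumptions chosen to match the output above):

```python
def scale(value, factor=3):
    """'factor' has a default, so it must come after the required 'value'."""
    return value * factor

print(scale(5))        # uses the default factor=3 -> 15
print(scale(5, 5))     # explicit factor -> 25

# def bad(factor=3, value):  # SyntaxError: default before required parameter
```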
The Complete Parameter Order
Python supports different parameter types that must appear in a specific order. Here’s the complete hierarchy:
| Order | Type | Syntax Example | How to Call |
|---|---|---|---|
| 1 | Positional-only | `def f(x, y, /):` | Must use position: `f(1, 2)` |
| 2 | Standard | `def f(a, b):` | Position or keyword: `f(1, 2)` or `f(a=1, b=2)` |
| 3 | Default values | `def f(x=10):` | Optional: `f()` or `f(5)` |
| 4 | `*args` | `def f(*args):` | Collects extras: `f(1, 2, 3)` \(\to\) `args=(1, 2, 3)` |
| 5 | Keyword-only (required) | `def f(*, x):` | Must use name: `f(x=5)` (required) |
| 6 | Keyword-only with default | `def f(*, x=10):` | Optional keyword: `f()` or `f(x=5)` |
| 7 | `**kwargs` | `def f(**kwargs):` | Collects extra keywords: `f(a=1, b=2)` |
Positional-Only Parameter A parameter that can only be passed by position, not by name. Marked with / after such parameters (Python 3.8+).
You rarely need all types in one function. Here are practical examples showing common combinations:
Keyword-Only Parameter A parameter that can only be passed by name, not by position. Created by placing parameters after *args or a bare *.
3.33
3.3333
Processing 4 points
Normalize: False
Options: {'threshold': 0.5, 'verbose': True}
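A sketch of two common combinations, consistent with the output above (both function names are assumptions):

```python
def round_value(value, ndigits=2, /):
    """Everything before '/' is positional-only."""
    return round(value, ndigits)

def process_points(*points, normalize=False, **options):
    """'normalize' is keyword-only because it follows *points."""
    print(f"Processing {len(points)} points")
    print(f"Normalize: {normalize}")
    print(f"Options: {options}")

print(round_value(10 / 3))       # 3.33
print(round_value(10 / 3, 4))    # 3.3333
process_points((0, 0), (1, 1), (2, 4), (3, 9),
               threshold=0.5, verbose=True)
```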
Understanding Keyword-Only Parameters
Any parameter that comes AFTER *args (or a bare *) can ONLY be passed using its name. This feature makes function calls self-documenting:
If a parameter is a boolean (e.g., normalize=True) or has units/meaning that isn’t obvious at the call site, make it keyword-only.
Common Patterns You’ll Use
Average of 2, 4, 6, 8: 5.0
Plotting 3 points
color: red
linewidth: 2
marker: o
Understanding the Special Markers
/ Parameter Marker Forces all parameters before it to be positional-only. Parameters after / can be passed by position or keyword. Python 3.8+.
Python uses two special markers to control how arguments can be passed:
* Parameter Marker Forces all parameters after it to be keyword-only. Can be used alone (*, x) or with *args to collect positional arguments.
5.0
Configuring MyApp
Debug mode enabled
Which of these function definitions are valid, and how would you call each?
# Definition A
def func_a(x, y=1, *args, z):
pass
# Definition B
def func_b(x, *args, z, y=1):
pass
# Definition C
def func_c(x, y=1, z=2, *args):
pass
# Definition D
def func_d(x, /, y, *, z):
pass

A: VALID - `z` is keyword-only (required) because it comes after `*args`
func_a(1, 2, 3, 4, z=5) # x=1, y=2, args=(3,4), z=5
func_a(1, z=5) # x=1, y=1 (default), args=(), z=5
func_a(1, 2, 3)     # ERROR: missing required keyword argument 'z'

B: VALID - Both `z` (required) and `y` (optional) are keyword-only
func_b(1, 2, 3, z=5) # x=1, args=(2,3), z=5, y=1 (default)
func_b(1, z=5, y=4) # x=1, args=(), z=5, y=4
func_b(1, 2, 3)     # ERROR: missing required keyword 'z'

C: VALID - Standard pattern, `*args` collects extras at the end
func_c(1, 2, 3, 4, 5) # x=1, y=2, z=3, args=(4,5)
func_c(1) # x=1, y=1 (default), z=2 (default), args=()
func_c(1, 10)       # x=1, y=10, z=2 (default), args=()

D: VALID (Python 3.8+) - Mixed positioning rules
- `x` must be passed positionally (before `/`)
- `y` can be positional or keyword (between `/` and `*`)
- `z` must be passed as keyword (after `*`)
func_d(1, 2, z=3) # Valid: positional x, positional y, keyword z
func_d(1, y=2, z=3) # Valid: positional x, keyword y, keyword z
func_d(x=1, y=2, z=3) # ERROR: x must be positional (not keyword)
func_d(1, 2, 3) # ERROR: z must be keyword (not positional)Key Patterns to Remember:
- Parameters after `*args` or `*` are keyword-only
- Parameters before `/` are positional-only (Python 3.8+)
- Required parameters must come before optional ones in each section
5.3 Scope and Namespaces
Scope The region of a program where a variable is accessible.
Understanding scope—where variables can be accessed—is crucial for writing bug-free code. Python’s scope rules determine which variables are visible at any point in your program. Without understanding scope, you’ll encounter confusing bugs where variables don’t have the values you expect, or worse, where changing a variable in one place mysteriously affects code elsewhere.
Namespace A container that holds a set of identifiers and their associated objects.
The LEGB Rule
LEGB Local, Enclosing, Global, Built-in - Python’s scope resolution order.
Python resolves variable names using the LEGB rule, searching in this order:
- Local: Inside the current function
- Enclosing: In the enclosing function (for nested functions)
- Global: At the top level of the module
- Built-in: In the built-in namespace (print, len, etc.)
Inside relativistic: beta = 0.500, gamma = 1.155
Inside calculate: E = 9.45e-07 ergs
Global scope: c = 3.00e+10 cm/s
Final energy: 9.45e-07 ergs = 590120.65 eV
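A sketch consistent with the output above, showing each LEGB level in action (the function names and structure are assumptions; the physics is relativistic total energy \(E = \gamma m c^2\) for an electron at \(\beta = 0.5\)):

```python
C_CMS = 2.998e10   # Global scope: speed of light in cm/s

def total_energy(mass_g, beta):
    """Relativistic total energy E = gamma * m * c^2, in ergs."""
    def gamma_factor():
        # 'beta' is found in the Enclosing scope; 'C_CMS' in the Global scope;
        # built-in names like 'print' and 'len' are resolved last.
        return 1.0 / (1.0 - beta**2) ** 0.5

    gamma = gamma_factor()      # Local variable of total_energy
    return gamma * mass_g * C_CMS**2

E = total_energy(9.109e-28, 0.5)   # electron at beta = 0.5
```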
The UnboundLocalError Trap
Count: 2
Global counter: 2
UnboundLocalError
The error happens because Python sees you’re assigning to counter, assumes it’s local, but then can’t find a local value to increment.
Symptoms:
- Variable works fine when reading
- Crashes when trying to modify
- Error message mentions “referenced before assignment”
Fix: Either use global keyword or (better) pass the value explicitly.
Real disaster: A climate model that tried to update global temperature but created local variables instead, producing nonsense results for months before discovery.
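A minimal reproduction of the trap and the recommended fix (the counter scenario is hypothetical):

```python
counter = 0

def read_counter():
    return counter              # READING a global inside a function is fine

def increment_buggy():
    counter = counter + 1       # assignment makes 'counter' local -> UnboundLocalError

def increment_fixed(count):
    """Better than 'global': pass state in and return the new value."""
    return count + 1

counter = increment_fixed(increment_fixed(counter))
print("Count:", read_counter())       # 2
print("Global counter:", counter)     # 2

try:
    increment_buggy()
except UnboundLocalError as err:
    print("UnboundLocalError:", err)
```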
Closures: Functions That Remember
Closure A function that remembers variables from its enclosing scope even after that scope has finished.
Rectangle rule: 0.328350
Midpoint rule: 0.333325
Exact: 0.333333
Rectangle error: 0.004983
Midpoint error: 0.000008
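A closure-based sketch consistent with the numbers above: the inner rules remember the integrand `f` after the factory returns (the factory name is an assumption; the results match \(\int_0^1 x^2\,dx\) with `n=100`):

```python
def make_integrator(f):
    """Return integration rules that 'remember' f via a closure."""
    def rectangle(a, b, n=100):
        h = (b - a) / n
        return sum(f(a + i * h) for i in range(n)) * h          # left endpoints

    def midpoint(a, b, n=100):
        h = (b - a) / n
        return sum(f(a + (i + 0.5) * h) for i in range(n)) * h  # midpoints

    return rectangle, midpoint

rect, mid = make_integrator(lambda x: x**2)
print(f"Rectangle rule: {rect(0, 1):.6f}")   # 0.328350
print(f"Midpoint rule: {mid(0, 1):.6f}")     # 0.333325
```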
What happens if both inner and outer functions define the same variable name?
Inner sees: inner
Outer sees: outer
Each function sees its own local variable. The inner function’s x doesn’t affect the outer function’s x. They’re in different namespaces!
Output:
Inner sees: inner
Outer sees: outer

If the inner function didn't define its own x, it would see the outer's x due to the LEGB rule (finding it in the Enclosing scope).
Global variables violate the “principle of least surprise” in computing:
HIDDEN STATE ANTI-PATTERN:
- Function behavior depends on external state
- Can't understand function in isolation
- Testing requires global setup
- Parallel processing becomes impossible
Real disaster: Mars Climate Orbiter (1999)
- Global unit variable (metric vs imperial)
- One module changed it to imperial
- Another module read it expecting metric
- $327 million spacecraft crashed into Mars
Better pattern: Explicit State Passing
BAD: current_units = 'metric'; calculate()
GOOD: calculate(units='metric')
5.4 Functional Programming Elements
Side Effect Any state change that occurs beyond returning a value from a function, such as modifying global variables or printing output.
Pure Function A function that always returns the same output for the same input with no side effects.
Python supports functional programming—a style that treats computation as the evaluation of mathematical functions. These concepts are essential for modern scientific frameworks like JAX and lead to cleaner, more testable code.
Lambda Functions
Lambda An anonymous function defined inline using the lambda keyword.
Lambda functions are small, anonymous functions defined inline:
Traditional: 25
Lambda: 25
Sorted by x:
x=0.8, y=1.2
x=1.5, y=2.3
x=2.1, y=4.7
Sorted by relative error:
x=2.1, relative error=1.1%
x=1.5, relative error=4.3%
x=0.8, relative error=16.7%
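A sketch of both uses shown above: a lambda as a tiny named function, and a lambda as a `sorted` key (the point data is hypothetical):

```python
square = lambda x: x**2          # equivalent to a one-line def
print("Lambda:", square(5))      # 25

points = [(2.1, 4.7), (0.8, 1.2), (1.5, 2.3)]
for x, y in sorted(points, key=lambda p: p[0]):   # sort by x coordinate
    print(f"x={x}, y={y}")
```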
Map, Filter, and Reduce
These functional tools transform how you process data:
- List comprehension / generator expression (default)
- `map`/`filter` (fine, but can be harder to read)
- `reduce` (rare in Python; prefer `sum`, `min`, `max`, `any`, `all`)
Original: [981, 979, 983, 9900, 980, 982, 978]
Filtered: [981, 979, 983, 980, 982, 978]
Removed 1 outliers
Calibrated: ['983.0', '981.0', '985.0', '982.0', '984.0', '980.0']
Final mean: 982.5 cm/s^2
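A pipeline consistent with the output above; the outlier threshold and the +2.0 calibration offset are assumptions:

```python
from functools import reduce

readings = [981, 979, 983, 9900, 980, 982, 978]   # cm/s^2; one obvious outlier

# Filter: drop non-physical values (threshold is an assumption)
filtered = list(filter(lambda g: 900 < g < 1100, readings))

# Map: apply a hypothetical +2.0 calibration offset
calibrated = list(map(lambda g: g + 2.0, filtered))

# Reduce: sum the values (in real code, prefer the built-in sum)
total = reduce(lambda a, b: a + b, calibrated)
print(f"Final mean: {total / len(calibrated):.1f} cm/s^2")   # 982.5
```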
Rewrite this loop using functional programming:
#| eval: false
# Square all positive values
results = []
for x in data:
if x > 0:
results.append(x**2)

Two equivalent functional approaches:
#| eval: false
# Using filter and map
results = list(map(
lambda x: x**2,
filter(lambda x: x > 0, data)
))
# Using list comprehension (more Pythonic)
results = [x**2 for x in data if x > 0]

The list comprehension is generally preferred in Python for readability, but understanding the functional approach prepares you for libraries like JAX.
Functions as First-Class Objects
In Python, functions are objects you can pass around:
Grams: [100, 200, 300, 400]
Kilograms: [0.1, 0.2, 0.3, 0.4]
Dynes: [100000.0, 200000.0, 300000.0, 400000.0]
Newtons: [1.0, 2.0, 3.0, 4.0]
Original: [2, 4, 6, 8, 10]
Normalized: ['-1.41', '-0.71', '0.00', '0.71', '1.41']
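The core idea behind both outputs above is that a function can be passed as an argument like any other object; a minimal sketch (the helper names are assumptions):

```python
def convert_all(values, conversion):
    """Apply any conversion function to every value."""
    return [conversion(v) for v in values]

def grams_to_kg(m_g):
    return m_g / 1000.0

masses = [100, 200, 300, 400]
# grams_to_kg is passed as an object, not called here
print("Kilograms:", convert_all(masses, grams_to_kg))
```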
Functional programming isn’t just academic—it’s essential for modern scientific computing:
- JAX (Google’s NumPy replacement) requires pure functions for automatic differentiation
- Parallel processing works best with stateless functions
- Testing is trivial when functions have no side effects
- GPU computing maps naturally to functional operations
Example: In JAX, you can automatically differentiate through an entire physics simulation if it’s written functionally. This enables techniques like physics-informed neural networks that would be impossibly complex to implement manually.
5.5 Modules and Packages
Module A Python file containing definitions and statements that can be imported.
As your analysis grows from scripts to projects, organization becomes critical. Modules let you organize related functions together and reuse them across projects.
Package A directory containing multiple Python modules and an __init__.py file.
Creating Your First Module
You can create module files two ways:

1. Recommended: Use your text editor to create a new .py file
2. Optional: Programmatically write files (not recommended; disabled in the rendered book)
Save this as statistics_tools.py in your current directory:
#| eval: false
"""
statistics_tools.py - Basic statistical calculations.
This module provides fundamental statistical functions.
"""
import math
def mean(data):
"""Arithmetic mean (uses math.fsum for better numerical stability)."""
if len(data) == 0:
raise ValueError("mean() requires at least one value")
return math.fsum(data) / len(data)
def variance(data, ddof=0):
"""
Variance with optional degrees of freedom adjustment.
ddof=0 gives population variance (divide by n).
ddof=1 gives sample variance (divide by n-1).
"""
n = len(data)
if n - ddof <= 0:
raise ValueError(f"variance() requires n > ddof (n={n}, ddof={ddof})")
m = mean(data)
return math.fsum((x - m)**2 for x in data) / (n - ddof)
def std_dev(data, ddof=0):
"""Standard deviation corresponding to variance(ddof=...)."""
return math.sqrt(variance(data, ddof=ddof))
def std_error(data):
"""Standard error of the mean (uses sample std: ddof=1)."""
return std_dev(data, ddof=1) / math.sqrt(len(data))
if __name__ == "__main__":
test_data = [1, 2, 3, 4, 5]
print(f"Test mean: {mean(test_data)}")
print(f"Test sample std dev: {std_dev(test_data, ddof=1):.3f}")

Using Your Module
#| eval: false
# Different import methods
# Method 1: Import entire module
import statistics_tools
data = [10, 12, 11, 13, 10, 11, 12]
m = statistics_tools.mean(data)
s = statistics_tools.std_dev(data)
print(f"Method 1 - Mean: {m:.2f}, Std: {s:.2f}")
# Method 2: Import specific functions
from statistics_tools import mean, std_error
se = std_error(data)
print(f"Method 2 - Standard error: {se:.3f}")
# Method 3: Import with alias
import statistics_tools as stats
var = stats.variance(data)
print(f"Method 3 - Variance: {var:.3f}")

Reloading Modules with importlib.reload
If you edit statistics_tools.py (or any module) and your changes “don’t show up”, you’re probably seeing import caching: Python only imports a module once per session.
#| eval: false
import importlib
import statistics_tools
importlib.reload(statistics_tools)

Building a Physics Module
Let’s create a comprehensive module for physics calculations:
#| eval: false
"""
physics_cgs.py - Physics calculations in CGS units.
All calculations use CGS (centimeter-gram-second) units.
"""
# Physical constants in CGS
SPEED_OF_LIGHT = 2.998e10 # cm/s
PLANCK_CONSTANT = 6.626e-27 # erg*s
BOLTZMANN = 1.381e-16 # erg/K
ELECTRON_MASS = 9.109e-28 # g
PROTON_MASS = 1.673e-24 # g
GRAVITATIONAL_CONSTANT = 6.674e-8 # cm^3/(g*s^2)
def kinetic_energy(mass_g, velocity_cms):
"""KE = (1/2) m v^2 in ergs."""
return 0.5 * mass_g * velocity_cms**2
def momentum(mass_g, velocity_cms):
"""p = mv in g*cm/s."""
return mass_g * velocity_cms
def de_broglie_wavelength(mass_g, velocity_cms):
"""lambda = h/p in cm."""
p = momentum(mass_g, velocity_cms)
return PLANCK_CONSTANT / p if p != 0 else float('inf')
def thermal_velocity(temp_k, mass_g):
"""RMS thermal velocity in cm/s."""
import math
return math.sqrt(3 * BOLTZMANN * temp_k / mass_g)
def photon_energy(wavelength_cm):
"""E = hc/lambda in ergs."""
return PLANCK_CONSTANT * SPEED_OF_LIGHT / wavelength_cm
def gravitational_force(m1_g, m2_g, distance_cm):
"""F = Gm_1m_2/r^2 in dynes."""
if distance_cm == 0:
return float('inf')
return GRAVITATIONAL_CONSTANT * m1_g * m2_g / distance_cm**2
class Particle:
"""Preview of Chapter 6: a simple particle with physics methods."""
def __init__(self, mass_g, velocity_cms):
self.mass = mass_g
self.velocity = velocity_cms
def kinetic_energy(self):
return kinetic_energy(self.mass, self.velocity)
def wavelength(self):
return de_broglie_wavelength(self.mass, self.velocity)
if __name__ == "__main__":
# Tiny smoke test
v_electron = 0.01 * SPEED_OF_LIGHT # cm/s
ke = kinetic_energy(ELECTRON_MASS, v_electron)
wavelength = de_broglie_wavelength(ELECTRON_MASS, v_electron)
print(f"KE={ke:.2e} ergs, lambda={wavelength:.2e} cm")

After saving that as physics_cgs.py, you can import and use it:
#| eval: false
import physics_cgs
v_electron = 0.01 * physics_cgs.SPEED_OF_LIGHT # cm/s
ke = physics_cgs.kinetic_energy(physics_cgs.ELECTRON_MASS, v_electron)
wavelength = physics_cgs.de_broglie_wavelength(physics_cgs.ELECTRON_MASS, v_electron)
print(f"Electron at 1% c:")
print(f" Velocity = {v_electron:.2e} cm/s")
print(f" KE = {ke:.2e} ergs")
print(f"  de Broglie lambda = {wavelength:.2e} cm")

Import Best Practices
#| eval: false
# GOOD: Clear, explicit imports
import numpy as np
import matplotlib.pyplot as plt
from math import sin, cos, pi
# BAD: Wildcard imports pollute namespace
# from numpy import * # Adds 600+ names!
# from math import * # Conflicts with numpy!
# Example of namespace collision
import math
import cmath # Complex math module (introduced in Chapter 2)
# Clear which log we're using
real_log = math.log(10) # Natural log of real number
complex_log = cmath.log(-1) # Can handle complex numbers
print(f"math.log(10) = {real_log:.3f}")
print(f"cmath.log(-1) = {complex_log}") # Returns complex number
# If we had used 'from math import *':
# log(10)  # Which log? math or cmath? Unclear!

Why is from module import * dangerous in scientific code?
Wildcard imports are dangerous because:
- Name collisions: Multiple libraries have functions with same names (log, sqrt, sin)
- Hidden overwrites: Later imports silently replace earlier ones
- Unclear source: Can’t tell where a function comes from
- Namespace pollution: Hundreds of unwanted names
Real example that caused wrong results:
from numpy import * # Has log (natural log)
from math import * # Also has log, overwrites numpy's
from sympy import * # Has symbolic log, overwrites again
# Which log is this?
result = log(data)  # Could be any of the three!

Always use explicit imports for clarity and safety!
On September 14, 2015, at 09:50:45 UTC, the Laser Interferometer Gravitational-Wave Observatory (LIGO) detected humanity’s first gravitational wave signal, named GW150914. This discovery, which earned the 2017 Nobel Prize in Physics, was only possible because of meticulously modular software architecture.
The LIGO analysis pipeline consisted of hundreds of independent Python modules, each performing one specific task:
#| eval: false
# Simplified LIGO pipeline structure
def calibrate_strain(raw_data, calibration_data):
"""Convert detector output to strain."""
# Complex calibration involving transfer functions
def remove_lines(strain_data, line_frequencies):
"""Remove power line interference and harmonics."""
# Notch filters at 60Hz, 120Hz, etc.
def whiten_data(strain, psd):
"""Normalize frequency response."""
# Spectral whitening for uniform sensitivity
def matched_filter(data, template_bank):
"""Search for gravitational wave patterns."""
# Cross-correlation with theoretical waveforms

When the first signal appeared, this modular design enabled unprecedented rapid verification. Within minutes, automated pipelines had processed the data through dozens of independent checks. The modularity allowed the 1000+ member collaboration to:
- Test each processing step with simulated signals
- Run multiple independent analysis pipelines in parallel
- Swap alternative algorithms to verify robustness
- Complete peer review in record time
The discovery paper lists over 30 independent software modules, each thoroughly tested and documented. This modular approach—exactly what you’re learning in this chapter—enabled one of physics’ greatest discoveries!
On June 4, 1996, the maiden flight of Ariane 5 ended in spectacular failure just 37 seconds after liftoff. The rocket veered off course and self-destructed, destroying four Cluster satellites worth $370 million. This disaster teaches us a crucial lesson about function design and validation.
The root cause traced back to a single function in the inertial reference system, originally written for Ariane 4. This function converted a 64-bit floating-point horizontal velocity value to a 16-bit signed integer. In Ariane 4’s slower flight profile, this worked perfectly. But Ariane 5 was more powerful. At T+37 seconds, the horizontal velocity exceeded 32,767—the maximum value for a 16-bit signed integer.
The conversion function essentially looked like this conceptually:
#| eval: false
def convert_velocity(velocity_64bit):
# Direct conversion without validation
return int(velocity_64bit) # Assumes it fits in 16 bits!What they needed was:
#| eval: false
def convert_velocity_safe(velocity_64bit):
MAX_INT16 = 32767
MIN_INT16 = -32768
if velocity_64bit > MAX_INT16:
# Handle overflow appropriately
log_error(f"Velocity {velocity_64bit} exceeds 16-bit range")
return MAX_INT16 # Or raise exception with proper handling
elif velocity_64bit < MIN_INT16:
return MIN_INT16
return int(velocity_64bit)

The overflow caused an exception in both the primary and backup systems (which ran identical code). The main flight computer, suddenly receiving diagnostic data instead of attitude information, interpreted it as actual flight data and commanded a violent course correction. The resulting aerodynamic forces exceeded design limits, triggering automatic self-destruct.
The investigation revealed four critical lessons about function design:

1. Always validate input ranges, especially when converting between data types
2. Never assume code reuse is safe without checking new operating conditions
3. Design for graceful degradation rather than complete failure
4. Test with realistic data ranges that match actual operating conditions
Every validation function you write, every bounds check you add, every type conversion you verify—these aren’t bureaucratic overhead. They’re the difference between mission success and catastrophic failure. This disaster led directly to modern software engineering practices that require explicit contracts for all functions, comprehensive range checking, and formal verification of critical code paths.
5.6 Documentation and Testing
Good documentation and testing make your functions trustworthy and reusable. Professional scientific code requires both to ensure reproducibility and reliability.
Documentation Levels
Integral of x^2 from 0 to 2: 2.750
Exact answer: 2.667
Error: 0.083
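The code that produced this output is not shown above; here is a minimal sketch that reproduces it, assuming a trapezoidal-rule integrator with four subintervals. The function name `trapezoid_integrate` and the NumPy-based implementation are illustrative choices, and the docstring shows the NumPy documentation style discussed in this section:

```python
import numpy as np

def trapezoid_integrate(f, a, b, n=4):
    """
    Approximate the integral of f from a to b using the trapezoidal rule.

    Parameters
    ----------
    f : callable
        Function to integrate; must accept a NumPy array.
    a, b : float
        Lower and upper integration limits.
    n : int, optional
        Number of subintervals (default 4).

    Returns
    -------
    float
        Approximate value of the integral.
    """
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    # Trapezoidal rule: half-weight the endpoints, full weight interior points
    return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

result = trapezoid_integrate(lambda x: x**2, 0.0, 2.0, n=4)
exact = 8.0 / 3.0
print(f"Integral of x^2 from 0 to 2: {result:.3f}")  # 2.750
print(f"Exact answer: {exact:.3f}")                  # 2.667
print(f"Error: {result - exact:.3f}")                # 0.083
```

With only four subintervals the rule overestimates the convex integrand, which is exactly the 0.083 error shown above.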
Testing Your Functions
Test 1: Simple roots - PASSED
Test 2: No real roots - PASSED
Test 3: Repeated root - PASSED
All tests passed!
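The test run above could come from a simple test harness like the following sketch. The quadratic solver and the specific test cases are illustrative assumptions, chosen to match the three printed outcomes:

```python
import math

def solve_quadratic(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0, or None if there are none."""
    disc = b**2 - 4 * a * c
    if disc < 0:
        return None
    sqrt_d = math.sqrt(disc)
    return ((-b + sqrt_d) / (2 * a), (-b - sqrt_d) / (2 * a))

def test_solve_quadratic():
    # Test 1: x^2 - 5x + 6 = 0 has roots 3 and 2
    roots = solve_quadratic(1, -5, 6)
    assert roots is not None and abs(roots[0] - 3) < 1e-12 and abs(roots[1] - 2) < 1e-12
    print("Test 1: Simple roots - PASSED")
    # Test 2: x^2 + 1 = 0 has no real roots
    assert solve_quadratic(1, 0, 1) is None
    print("Test 2: No real roots - PASSED")
    # Test 3: x^2 - 2x + 1 = 0 has the repeated root 1
    roots = solve_quadratic(1, -2, 1)
    assert roots is not None and abs(roots[0] - roots[1]) < 1e-12
    print("Test 3: Repeated root - PASSED")
    print("All tests passed!")

test_solve_quadratic()
```

Plain `assert` statements like these are the seed of real test suites; frameworks such as pytest discover and run functions named `test_*` automatically.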
Introduction to Type Hints (Optional Enhancement)
Type hints are an optional Python feature introduced in Python 3.5. They document expected types but Python does not enforce them - they’re primarily for documentation and IDE support.
For this course: Type hints are not required but can help make your code clearer. Focus on writing correct functions first, then add type hints if desired.
Python supports type hints to document expected types:
Energy: 4.10e-09 ergs
Integer inputs still work: 5000.0 ergs
Mean: 3.0, Std: 1.41, Median: 3.0
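A sketch of type-hinted functions consistent with the outputs above. The specific inputs (an electron mass in grams, a velocity of 0.1c in cm/s, and the sample list) are illustrative assumptions; note that passing plain integers works fine because Python does not enforce the hints:

```python
from statistics import mean, median, pstdev

def kinetic_energy(mass: float, velocity: float) -> float:
    """Kinetic energy in ergs for mass in g and velocity in cm/s (CGS)."""
    return 0.5 * mass * velocity**2

def describe(values: list[float]) -> tuple[float, float, float]:
    """Return (mean, population std, median) of a list of numbers."""
    return mean(values), pstdev(values), median(values)

m_electron = 9.109e-28  # g (assumed example value)
v = 3.0e9               # cm/s, about 0.1c (assumed example value)
print(f"Energy: {kinetic_energy(m_electron, v):.2e} ergs")   # Energy: 4.10e-09 ergs

# Hints are documentation only: int arguments are not rejected
print(f"Integer inputs still work: {kinetic_energy(1, 100)} ergs")  # 5000.0 ergs

m, s, md = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"Mean: {m}, Std: {s:.2f}, Median: {md}")  # Mean: 3.0, Std: 1.41, Median: 3.0
```

The `list[float]` and `tuple[...]` hint syntax requires Python 3.9 or later; older code uses `typing.List` and `typing.Tuple`.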
This function has a subtle bug. Can you spot it?
```{python}
def find_peaks(data, threshold):
    """Find all peaks above threshold - HAS BUG."""
    peaks = []
    peak_indices = []
    for i in range(len(data)):
        if data[i] > threshold:
            # Check if it's a local maximum
            if i == 0 or data[i] > data[i-1]:
                if i == len(data)-1 or data[i] > data[i+1]:
                    peaks.append(data[i])
                    peak_indices.append(i)
    return peaks, peak_indices
```

Bug: The strict `>` comparisons miss flat-topped ("plateau") peaks where adjacent values are equal. A naive switch to `>=` can create a new problem: it may label every point on a plateau as a separate peak.
Fix:
```{python}
def find_peaks_fixed(data, threshold):
    """
    Find local maxima above threshold.

    Plateau rule (simple heuristic): for a flat-topped peak, return the center index.
    """
    peaks = []
    peak_indices = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] <= threshold:
            i += 1
            continue
        # Expand to full plateau [left, right] where values are equal
        left = i
        while left > 0 and data[left - 1] == data[i]:
            left -= 1
        right = i
        while right < n - 1 and data[right + 1] == data[i]:
            right += 1
        # Peak if plateau is higher than neighbors outside the plateau
        left_ok = (left == 0) or (data[left] > data[left - 1])
        right_ok = (right == n - 1) or (data[right] > data[right + 1])
        if left_ok and right_ok:
            idx = (left + right) // 2
            peaks.append(data[idx])
            peak_indices.append(idx)
        i = right + 1  # skip past the plateau
    return peaks, peak_indices
```

Peak definitions are domain-specific. This heuristic is a reasonable default, and it avoids marking every plateau point as a separate peak.
5.7 Performance Considerations
Understanding function overhead helps you write efficient code. Let’s measure and optimize!
Function Call Overhead
Function Call Analysis:
Simple function: 5.0 ms
Inline addition: 0.8 ms
Overhead factor: 6.0x
Complex function: 26.2 ms
Lesson: Overhead only matters for trivial operations!
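The measurement code behind these numbers is not shown; a sketch of how such timings might be made with `timeit` (the trivial `add` function is an assumed stand-in, and the absolute numbers will differ on your machine):

```python
import timeit

def add(a, b):
    """Trivial one-line function: call overhead dominates the actual work."""
    return a + b

n = 1_000_000
# Time one million calls to the function vs. the same addition inline
t_func = timeit.timeit("add(1, 2)", globals=globals(), number=n)
t_inline = timeit.timeit("1 + 2", number=n)

print(f"Simple function: {t_func * 1e3:.1f} ms")
print(f"Inline addition: {t_inline * 1e3:.1f} ms")
print(f"Overhead factor: {t_func / t_inline:.1f}x")
```

The inline version is faster mostly because each function call must create a stack frame and bind arguments; once the body does real work, that fixed cost becomes negligible.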
Vectorization (A Quick Preview)
Vectorization appears in this chapter's learning objectives; here's what it looks like in practice. The core idea: push work into NumPy's compiled loops instead of Python loops.
```{python}
#| eval: false
import numpy as np
import timeit

rng = np.random.default_rng(0)
n = 200_000
masses = rng.uniform(0.5, 2.0, size=n)       # g (toy)
velocities = rng.normal(0.0, 1.0e7, size=n)  # cm/s (toy)

def ke_loop(m, v):
    out = []
    for mi, vi in zip(m, v):
        out.append(0.5 * mi * vi**2)
    return out

def ke_vec(m, v):
    return 0.5 * m * v**2

print("Loop:", timeit.timeit("ke_loop(masses, velocities)", globals=globals(), number=3))
print("Vec :", timeit.timeit("ke_vec(masses, velocities)", globals=globals(), number=50))
```

Memoization for Expensive Calculations
Memoization Caching function results to avoid recomputing expensive operations.
Fibonacci(28) = 317811
Without memoization: 33.1 ms
With memoization: 0.113 ms
Speedup: 293x
Cache info: CacheInfo(hits=26, misses=29, maxsize=128, currsize=29)
Planck function at 500nm, 5778K: 2.64e+14 erg/(s cm^2 cm sr)
Cache info: CacheInfo(hits=0, misses=1, maxsize=128, currsize=1)
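The code behind these outputs is not shown; a sketch using `functools.lru_cache` that reproduces the Fibonacci result and the Planck value (the CGS constants and the recursive structure are standard, but treat the exact function names as illustrative):

```python
import math
import timeit
from functools import lru_cache

def fib_plain(n):
    """Naive recursion: exponential number of calls."""
    if n < 2:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)

@lru_cache(maxsize=128)
def fib_cached(n):
    """Same recursion, but every result is cached after first computation."""
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

print(f"Fibonacci(28) = {fib_cached(28)}")  # Fibonacci(28) = 317811
print("Without memoization:", timeit.timeit(lambda: fib_plain(28), number=1))
# For a fresh cache: 29 misses (n = 0..28 each computed once), 26 hits
print("Cache info:", fib_cached.cache_info())

@lru_cache(maxsize=128)
def planck_lambda(wavelength, temperature):
    """Planck spectral radiance B_lambda in erg/(s cm^2 cm sr); CGS inputs."""
    h = 6.626e-27  # erg s
    c = 2.998e10   # cm/s
    k = 1.381e-16  # erg/K
    prefactor = 2.0 * h * c**2 / wavelength**5
    return prefactor / (math.exp(h * c / (wavelength * k * temperature)) - 1.0)

print(f"Planck at 500nm, 5778K: {planck_lambda(5.0e-5, 5778):.2e} erg/(s cm^2 cm sr)")
```

Caching pays off whenever the same arguments recur, as in the recursive Fibonacci; for the Planck function a fresh cache shows one miss per distinct (wavelength, temperature) pair, matching the `CacheInfo` line above.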
Main Takeaways
Functions transform programming from repetitive scripting into modular, maintainable software engineering. When you encapsulate logic in well-designed functions, you create building blocks that can be tested independently, shared with collaborators, and combined into complex analysis pipelines. The progression from simple functions to modules to packages mirrors how scientific software naturally grows—what starts as a quick calculation evolves into a shared tool used by entire research communities.
The distinction between positional, keyword, and default arguments gives you the flexibility to design interfaces that are both powerful and intuitive. Positional arguments work well for obvious parameters like power(base, exponent), while keyword arguments with defaults enable complex functions that remain simple for common cases. Understanding when to use each type—and the critical danger of mutable default arguments—prevents the subtle bugs that have plagued major scientific packages.
The scope rules and namespace concepts you’ve learned explain why variables sometimes behave unexpectedly in complex programs. Understanding the LEGB rule prevents frustrating bugs where variables have unexpected values or modifications in one place affect seemingly unrelated code. The mutable default argument trap demonstrates why understanding Python’s evaluation model is crucial for writing reliable code. These aren’t just academic concepts—they’ve caused real disasters in production systems.
Functional programming concepts like map, filter, and pure functions prepare you for modern scientific computing frameworks. JAX requires functional style for automatic differentiation, parallel processing works best with stateless functions, and testing becomes trivial when functions have no side effects. The ability to pass functions as arguments and return them from other functions enables powerful patterns like the specialized integrators we created with closures.
The performance measurements showed that function call overhead only matters for trivial operations in tight loops—exactly where you’ll want to use NumPy’s vectorized operations (Chapter 7) instead. For complex calculations, overhead is negligible compared to computation time. Memoization can provide dramatic speedups when expensive calculations repeat, as often happens in optimization and parameter searching.
Looking forward, the functions you’ve learned to write here form the foundation for object-oriented programming in Chapter 6, where functions become methods attached to objects. The module organization skills prepare you for building larger scientific packages, while the documentation practices ensure your code can be understood and maintained by others. Most importantly, thinking in terms of functional contracts and clear interfaces will make you a better computational scientist, capable of building the robust, efficient tools that modern research demands.
With functions and modules, you can hide complexity behind clean interfaces. In Chapter 6, you’ll go one step further: you’ll bundle data + behavior into objects, which is often the most natural way to model scientific systems.
Definitions
argument - The actual value passed to a function when calling it (e.g., in f(5), 5 is an argument)
closure - A function that remembers variables from its enclosing scope even after that scope has finished executing
decorator - A function that modifies another function’s behavior without changing its code
default argument - A parameter value used when no argument is provided during function call
docstring - A string literal that appears as the first statement in a function, module, or class to document its purpose
function - A reusable block of code that performs a specific task, taking inputs and optionally returning outputs
global - A keyword that allows a function to modify a variable in the global scope
keyword argument - An argument passed to a function by explicitly naming the parameter
lambda - An anonymous function defined inline using the lambda keyword
LEGB - The order Python searches for variables: Local, Enclosing, Global, Built-in
memoization - Caching function results to avoid recomputing expensive operations
module - A Python file containing definitions and statements that can be imported and reused
namespace - A container that holds a set of identifiers and their associated objects
package - A directory containing multiple Python modules and an __init__.py file
parameter - A variable in a function definition that receives a value when the function is called
positional argument - An argument passed to a function based on its position in the parameter list
pure function - A function that always returns the same output for the same input with no side effects
return value - The result that a function sends back to the code that called it
scope - The region of a program where a variable is accessible
sentinel value - A special value (like None) used to signal a particular condition, often as default for mutable arguments
side effect - Any state change that occurs beyond returning a value from a function
`*args` - Syntax for collecting variable positional arguments into a tuple
`**kwargs` - Syntax for collecting variable keyword arguments into a dictionary
Key Takeaways
- Functions are contracts: they promise specific outputs for given inputs
- Choose positional arguments for obvious parameters, keyword arguments for optional ones
- The mutable default argument trap occurs because defaults are evaluated once at definition time
- Always use `None` as a sentinel for mutable default arguments
- Python searches for variables using LEGB: Local, Enclosing, Global, Built-in
- Global variables make code hard to test, debug, and parallelize
- Lambda functions are useful for simple operations but limited to single expressions
- Functional programming concepts (map, filter, reduce) prepare you for modern frameworks
- The `if __name__ == "__main__"` pattern makes modules both importable and executable
- Never use `from module import *` in production code—it causes namespace pollution
- Docstrings are essential for scientific code that others will use and maintain
- Function call overhead matters only in tight loops with trivial operations
- Memoization can dramatically speed up expensive repeated calculations
- Testing and validation functions are as important as calculation functions
- Performance optimization should follow: algorithm \(\to\) vectorization \(\to\) caching \(\to\) parallelization
Quick Reference Tables
Function Definition Patterns
| Pattern | Syntax | Use Case |
|---|---|---|
| Basic function | `def func(x, y):` | Simple operations |
| Positional only | `def func(a, b, /):` | Force positional |
| Default arguments | `def func(x, y=10):` | Optional parameters |
| Keyword only | `def func(*, x, y):` | Force keywords |
| Variable args | `def func(*args):` | Unknown number of inputs |
| Keyword args | `def func(**kwargs):` | Flexible options |
| Combined | `def func(a, *args, x=1, **kwargs):` | Maximum flexibility |
| Lambda | `lambda x: x**2` | Simple inline functions |
When to Use Different Argument Types
| Argument Type | When to Use | Example |
|---|---|---|
| Positional | Obvious meaning, 1-3 params | power(2, 3) |
| Keyword | Many params, optional | plot(data, color='red') |
| Default | Common values | round(3.14, ndigits=2) |
| *args | Variable inputs | maximum(1, 2, 3, 4) |
| **kwargs | Configuration options | setup(debug=True, verbose=False) |
Module Import Patterns
| Pattern | Example | When to Use |
|---|---|---|
| Import module | `import numpy` | Use many functions |
| Import with alias | `import numpy as np` | Standard abbreviations |
| Import specific | `from math import sin, cos` | Few specific functions |
| Import all (avoid!) | `from math import *` | Never in production |
| Relative import | `from . import module` | Within packages |
Common Function Bugs and Fixes
| Problem | Symptom | Fix |
|---|---|---|
| Mutable default | Data persists between calls | Use None sentinel |
| UnboundLocalError | Can’t modify global | Use global or pass value |
| Missing return | Function returns None | Add return statement |
| Namespace pollution | Name conflicts | Avoid wildcard imports |
| Slow recursion | Exponential time | Add memoization |
| Type confusion | Unexpected types | Add type hints/validation |
Next Chapter Preview
With functions and modules mastered, Chapter 6 introduces Object-Oriented Programming (OOP)—a paradigm that bundles data and behavior together. You’ll learn to create classes that model physical systems naturally: a Particle class with position and velocity attributes, methods to calculate energy and momentum, and special methods that make your objects work seamlessly with Python’s built-in functions.
The functional programming concepts from this chapter provide essential background for OOP. Methods are just functions attached to objects, and understanding scope prepares you for the self parameter that confuses many beginners. The module organization skills you’ve developed will expand to organizing classes and building object hierarchies. Most importantly, the design thinking you’ve practiced—creating clean interfaces and thinking about contracts—directly applies to designing effective classes that model the complex systems you’ll encounter in computational physics.
References
Historical Events and Technical Details
- Ariane 5 Flight 501 Failure (1996)
- Lions, J. L. et al. (1996). “ARIANE 5 Flight 501 Failure: Report by the Inquiry Board.” European Space Agency. Paris: ESA.
- Gleick, J. (1996). “A Bug and A Crash.” The New York Times Magazine, December 1, 1996.
- Nuseibeh, B. (1997). “Ariane 5: Who Dunnit?” IEEE Software, 14(3), 15-16.
- LIGO Gravitational Wave Detection (2015)
- Abbott, B. P. et al. (LIGO Scientific Collaboration) (2016). “Observation of Gravitational Waves from a Binary Black Hole Merger.” Physical Review Letters, 116(6), 061102.
- Abbott, B. P. et al. (2016). “GW150914: The Advanced LIGO Detectors in the Era of First Discoveries.” Physical Review Letters, 116(13), 131103.
- LIGO Scientific Collaboration. (2021). “LIGO Algorithm Library - LALSuite.” Available at: https://lscsoft.docs.ligo.org/lalsuite/
- LIGO Open Science Center. (2024). “LIGO Open Data.” Available at: https://www.gw-openscience.org/
- Therac-25 Radiation Overdose Disasters (1985-1987)
- Leveson, N.G. & Turner, C.S. (1993). “An Investigation of the Therac-25 Accidents.” IEEE Computer, 26(7), 18-41.
- “Therac-25.” Wikipedia, The Free Encyclopedia. Wikimedia Foundation, Inc. Retrieved from: https://en.wikipedia.org/wiki/Therac-25
- Mars Climate Orbiter Loss (1999)
- Stephenson, A. G. et al. (1999). “Mars Climate Orbiter Mishap Investigation Board Report.” NASA.
- Oberg, J. (1999). “Why the Mars Probe Went Off Course.” IEEE Spectrum, 36(12), 34-39.
Python Documentation
- Python Language Reference
- Van Rossum, G., & Drake, F. L. (2024). “Python Language Reference, version 3.12.” Python Software Foundation. Available at: https://docs.python.org/3/reference/
- Scientific Python Resources
- Harris, C. R. et al. (2020). “Array programming with NumPy.” Nature, 585(7825), 357-362.
- VanderPlas, J. (2016). Python Data Science Handbook. O’Reilly Media. ISBN: 978-1491912058.