---
title: "Chapter 3: Control Flow & Logic"
subtitle: "Python Fundamentals | COMP 536"
author: "Anna Rosen"
draft: false
execute:
  freeze: auto
  echo: true
  warning: true
  error: false
format:
  html:
    toc: true
    code-fold: true
    code-summary: "Show code"
---

Learning Objectives

By the end of this chapter, you will be able to:

Note✅ Before Starting This Chapter

If any boxes are unchecked, review the indicated chapters first.


Chapter Overview

Programming is fundamentally about teaching computers to make decisions and repeat tasks. When you write an if statement or a loop, you’re translating human logic into instructions a machine can follow. But here’s the critical insight that separates computational thinkers from mere coders: the logic must be designed before it’s implemented. This chapter transforms you from someone who writes code to someone who designs algorithms.

We’ll start with the lost art of pseudocode — not as a bureaucratic exercise, but as the difference between code that works by accident and code that works by design. You’ll learn to recognize five universal algorithmic patterns that appear across scientific computing: accumulation, filtering, mapping, searching, and convergence. These patterns will appear in every project you build, from N-body simulations to data-driven modeling.

Important💡 The Five Universal Patterns (Canonical List)
  1. Accumulation: combine many values into one (sum, mean, running variance).
  2. Filtering: keep only items that satisfy a condition (quality cuts, SNR thresholds).
  3. Mapping: transform each item (magnitude \(\to\) flux, pixels \(\to\) calibrated pixels).
  4. Searching: find an item or best candidate (first detection, best period, nearest neighbor).
  5. Convergence: iterate until a stopping criterion is met (tolerance, max iterations).
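Each pattern fits in a line or two of Python. Here is a minimal sketch, with illustrative fluxes and thresholds:

```python
import math

fluxes = [3.0, 7.5, 1.2, 9.8, 5.5]  # illustrative measurements

total = sum(fluxes)                                   # accumulation
detections = [f for f in fluxes if f > 5.0]           # filtering
magnitudes = [-2.5 * math.log10(f) for f in fluxes]   # mapping: flux -> magnitude
brightest = max(fluxes)                               # searching: best candidate

# convergence: Newton's method for sqrt(2), iterate until the residual is tiny
x, guess = 2.0, 1.0
while abs(guess * guess - x) > 1e-12:
    guess = 0.5 * (guess + x / guess)
```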

The control flow structures we explore here are where your numerical calculations from Chapter 2 become dynamic algorithms. Every convergence test, every adaptive timestep, every Monte Carlo acceptance criterion depends on mastering these concepts deeply, not just syntactically. By chapter’s end, you’ll see code not as a sequence of commands, but as a carefully orchestrated flow of decisions and iterations that solve real scientific problems — the same style of thinking used in research pipelines and mission-scale software.

TipTL;DR — Why This Chapter Matters

This chapter is the turning point. If you genuinely internalize the material here — not memorize it, but understand it — you stop being someone who writes Python and become someone who designs algorithms that happen to be expressed in Python.

The concepts in this chapter underpin everything that follows: simulations, convergence tests, data pipelines, scientific workflows. When you’re debugging a loop that won’t terminate or a conditional that silently produces wrong results, you’ll return here. When you’re designing your first N-body integrator or writing quality filters for real data, these patterns will guide you.

This chapter is worth revisiting. Come back to it mid-semester. The pseudocode strategies and algorithmic patterns will make more sense once you’ve seen them in action.

Warning🚫 Course Policy: No Jupyter Notebooks

COMP 536 does not permit Jupyter notebooks — not because notebooks are bad tools, but because they actively interfere with what this chapter teaches.

Why notebooks undermine control flow learning:

  • Hidden state: Variables persist invisibly between cells, making it impossible to reason about what your code actually does when run fresh.
  • Non-deterministic execution order: Running cells out of order creates bugs that only appear sometimes — the worst kind.
  • Silent logic errors: A loop that should fail might “work” because a variable was set by a cell you ran earlier and forgot about.
  • Unreproducible results: Your code may not produce the same output when someone else runs it (or when you run it tomorrow).

Control flow is about predictable, reproducible execution. Notebooks make execution unpredictable by design. For exploratory data analysis elsewhere in your career, notebooks can be useful — but for learning to think algorithmically, they create more confusion than they solve.

In this course: Write .py scripts. Run them from the terminal. Know exactly what state your program has at every moment.

NoteHow This Chapter Differs from Most Programming Courses

Most introductory courses teach control flow as syntax: “here’s how to write an if statement, here’s how to write a for loop.” You’ll learn that syntax here too — but that’s not the point.

What this chapter emphasizes instead:

  • Design before code: You’ll learn to sketch algorithms in pseudocode before touching Python, catching logical flaws while they’re still cheap to fix.
  • Universal patterns: The five patterns (accumulation, filtering, mapping, searching, convergence) appear in every scientific domain. Learn them once, apply them everywhere.
  • Failure modes: Real scientific code must handle edge cases, invalid inputs, and non-convergence. You’ll learn to think about what can go wrong, not just what should go right.
  • Correctness over cleverness: A loop that’s easy to verify beats a one-liner that’s hard to debug. We prioritize code you can trust.

The same algorithmic thinking you develop here is used in telescope scheduling systems, climate models, and spacecraft navigation. You’re not learning “Python tricks” — you’re learning how computational scientists think.

Note📖 How to Use This Chapter

This chapter is a reference as much as a reading. You’re not expected to memorize every pattern or internalize every example on first pass.

What we do expect:

  • Read through the chapter to build familiarity with the concepts
  • Return to specific sections when you’re designing algorithms for projects
  • Use the Quick Reference Tables at the end as a lookup resource
  • Recognize when a problem fits one of the five universal patterns

What’s normal:

  • Feeling like some material “clicks” only after you’ve tried to use it in a project
  • Returning to re-read sections on guard clauses or convergence loops mid-semester
  • Finding that pseudocode feels awkward at first but becomes essential later

This material compounds. Give it time, and revisit it when you need it.


3.1 Algorithmic Thinking: The Lost Art of Pseudocode

**pseudocode**: Human-readable algorithm description focusing on logic over syntax

Most students jump straight from problem to code, then wonder why they spend hours debugging. Professional computational scientists spend more time thinking than typing. Pseudocode is how we think precisely about algorithms without getting distracted by syntax. Think of it as your algorithm’s blueprint — you wouldn’t build a telescope without optical designs, so why write code without algorithmic designs?

Why Pseudocode Matters in Scientific Computing

Consider this scenario: You need to implement adaptive timestepping for an orbital integrator. Without pseudocode, you’ll likely write code, run it, watch orbits spiral incorrectly, debug for hours, and maybe get it working through trial and error. With pseudocode, you’ll identify edge cases, boundary conditions, and logical flaws before writing a single line of Python.

#| eval: false
# NOTE: This snippet is intentionally incomplete (placeholder functions/variables)
# and is meant for reading, not running.
# WITHOUT PSEUDOCODE (typical student approach):
# "I'll figure it out as I code..."
def integrate_naive(state, t_end):
    dt = 0.01
    while state.time < t_end:
        new_state = step(state, dt)
        error = estimate_error(state, new_state)
        if error > tolerance:
            dt = dt * 0.5  # Seems reasonable?
        state = new_state
    return state
# Wait, this doesn't work... infinite loop when error is bad!
# Also, dt never increases... hours of debugging ahead

Now let’s see how pseudocode reveals problems immediately! This is exactly how professional scientists and engineers design algorithms: make the logic explicit first, then implement.

Tip✅ Pseudocode Checklist (Use This Every Time)
  • Goal: What should the algorithm compute?
  • State: What values change each iteration?
  • Termination: What makes the loop stop (and what if it never happens)?
  • Invariants: What must remain true throughout?
  • Failure modes: What can go wrong (bad inputs, non-convergence, overflow)?

The Three Levels of Pseudocode Refinement

Professional algorithm development happens in stages, each revealing different issues. Don’t worry if this feels strange at first — every programmer has felt that way! But once you embrace pseudocode, you’ll save countless hours of debugging. Let’s build this skill together:

Level 1: Conceptual Overview (The Big Picture)

WHILE simulation not done:       # WHILE means "repeat as long as condition is true"
    Take a step
    Check IF step was good        # IF means "only do this when condition is true"
    Adjust timestep

This level helps you understand the overall flow. The WHILE construct creates a loop that continues until some condition becomes false. The IF construct makes a decision based on a condition. Already, we can ask critical questions: What defines “done”? What makes a step “good”? How much should we adjust? These questions matter!

Tip🤔 Check Your Understanding

Before continuing, identify at least two problems with the Level 1 pseudocode above. What could go wrong?

  1. No exit condition if step is never “good” — infinite loop risk!
  2. No bounds on timestep adjustment — could grow infinitely or shrink to zero
  3. “Simulation done” is vague — need precise termination condition
  4. No error handling — what if the integration fails completely?

These aren’t nitpicks — they’re the difference between code that runs and code that runs correctly!

Level 2: Structural Detail (The Flow)

FUNCTION adaptive_integrate(initial_state, end_time):  # FUNCTION groups reusable code
    state <- initial_state                              # <- means "assign value to variable"
    dt <- estimate_initial_timestep(state)

    WHILE time < end_time:                             # Loop continues while time hasn't reached end
        DO:                                             # DO-UNTIL creates a loop that runs at least once
            trial_step <- integrate(state, dt)
            error <- compute_error(trial_step)
        UNTIL error < tolerance OR dt < dt_min         # OR means "either condition can be true"

        state <- trial_step
        dt <- adjust_timestep(error, dt)

    RETURN state                                        # RETURN sends value back to caller

Now we see the retry logic and minimum timestep safeguard. The DO-UNTIL construct ensures we attempt at least one integration step. The OR operator means either condition being true will exit the inner loop. FUNCTION defines a reusable block of code that can be called with arguments and RETURN a result.
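Python has no DO-UNTIL construct; the idiomatic translation is `while True` plus `break`. In this runnable sketch, `integrate` and `compute_error` are stubs standing in for the pseudocode's placeholders:

```python
def integrate(state, dt):
    """Stub standing in for a real integration step."""
    return state + dt

def compute_error(trial_step):
    """Stub standing in for a real error estimate."""
    return abs(trial_step) * 1e-6

state, dt = 1.0, 0.1
tolerance, dt_min = 1e-3, 1e-8

while True:                                  # body always runs at least once (the DO)
    trial_step = integrate(state, dt)
    error = compute_error(trial_step)
    if error < tolerance or dt < dt_min:     # the UNTIL condition
        break
    dt = dt * 0.5                            # otherwise refine and retry
```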

Level 3: Implementation-Ready (Stage 1: Core Logic)

FUNCTION adaptive_integrate(initial_state, end_time, tolerance):
    state <- initial_state
    dt <- estimate_initial_timestep(state)

    WHILE state.time < end_time:
        trial_state <- rk4_step(state, dt)
        error <- estimate_error(state, trial_state)

        IF error < tolerance:                          # Decision point
            state <- trial_state
            dt <- min(dt * 1.5, dt_max)                # Can grow
        ELSE:                                          # ELSE handles "otherwise" case
            dt <- max(dt * 0.5, dt_min)                # Must shrink

Level 3: Implementation-Ready (Stage 2: Add Safety)

FUNCTION adaptive_integrate(initial_state, end_time, tolerance):
    state <- initial_state
    dt <- estimate_initial_timestep(state)
    dt_min <- 1e-10 * (end_time - initial_state.time)
    dt_max <- 0.1 * (end_time - initial_state.time)

    WHILE state.time < end_time:
        step_accepted <- False                          # Boolean flag (True/False)
        attempts <- 0

        WHILE NOT step_accepted AND attempts < MAX_ATTEMPTS:  # NOT inverts, AND requires both
            trial_state <- rk4_step(state, dt)
            error <- estimate_error(state, trial_state)

            IF error < tolerance:
                step_accepted <- True
                state <- trial_state
            ELSE:
                dt <- max(dt * 0.5, dt_min)
            attempts <- attempts + 1

Each refinement level reveals new issues and solutions. The NOT operator inverts a boolean value (True becomes False, False becomes True). The AND operator requires both conditions to be true. This is computational thinking in action!
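Stage 2 translates directly into Python. In this sketch the pseudocode's placeholders are made concrete with the toy ODE dy/dt = -y: `rk4_step` is a classic RK4 step, and step doubling (one full step versus two half steps) stands in for a real embedded error estimator:

```python
MAX_ATTEMPTS = 20

def rk4_step(t, y, dt):
    """Classic RK4 step for the toy ODE dy/dt = -y."""
    f = lambda t, y: -y
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt * k1 / 2)
    k3 = f(t + dt / 2, y + dt * k2 / 2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def adaptive_integrate(y0, t_end, tolerance=1e-8):
    t, y = 0.0, y0
    dt = 0.1
    dt_min, dt_max = 1e-10 * t_end, 0.1 * t_end

    while t < t_end:
        dt = min(dt, t_end - t)              # never step past the end
        step_accepted = False
        attempts = 0

        while not step_accepted and attempts < MAX_ATTEMPTS:
            y_full = rk4_step(t, y, dt)      # one full step
            y_half = rk4_step(t + dt / 2,    # two half steps for comparison
                              rk4_step(t, y, dt / 2), dt / 2)
            error = abs(y_full - y_half)

            if error < tolerance:
                step_accepted = True
                t, y = t + dt, y_half        # keep the more accurate result
                dt = min(dt * 1.5, dt_max)   # can grow after success
            else:
                dt = max(dt * 0.5, dt_min)   # must shrink after failure
            attempts += 1

        if not step_accepted:
            raise RuntimeError(f"Step rejected {MAX_ATTEMPTS} times; dt stuck near dt_min")
    return y

print(adaptive_integrate(1.0, 1.0))  # close to exp(-1)
```

Note that on acceptance we advance with `y_half`, the more accurate of the two trial results, which is standard practice in step-doubling schemes.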

**sentinel value**: A special marker value (like -999, ‘END’, or None) that signals the end of data or a special condition, allowing loops to know when to stop processing

Important💡 Computational Thinking: The Sentinel Pattern

PATTERN: Sentinel Values

A sentinel is a special value that signals “stop processing.” This pattern appears everywhere in computing:

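For instance, a loop can consume a raw stream until it reaches the sentinel (here -999 plays the stop marker):

```python
raw_stream = [1.1, 2.2, 3.3, -999, 7.7]   # -999 marks the end of valid data

measurements = []
for value in raw_stream:
    if value == -999:        # sentinel: stop processing here
        break
    measurements.append(value)

print(measurements)          # [1.1, 2.2, 3.3]
```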

Real-world applications:

  • FITS files: END keyword marks end of header
  • Network protocols: Message terminators like \r\n
  • Telescope data: -999 for missing observations
  • String processing: Null terminators in C strings

The sentinel pattern is how computers know when to stop! You’re using the same technique that shows up in network protocols and instrument data streams.

Important💡 Computational Thinking: The Universal Pattern of Adaptive Algorithms

Adaptive timestepping is an instance of a universal pattern:

PATTERN: Adaptive Refinement

  1. Attempt action with current parameters
  2. Evaluate quality of result
  3. If quality insufficient: refine parameters and retry
  4. If quality acceptable: proceed and possibly coarsen
  5. Include safeguards against infinite refinement

This pattern appears everywhere in computational science:

  • Adaptive mesh refinement (AMR) in galaxy formation simulations
  • Step size control in stellar evolution codes like MESA
  • Learning rate scheduling in neural networks for photometric redshifts
  • Convergence acceleration in self-consistent field calculations
  • Importance sampling in Monte Carlo radiative transfer

Once you recognize this pattern, you’ll see it in every sophisticated scientific code!


3.2 Boolean Logic in Scientific Computing

Every decision in your code ultimately reduces to true or false. But in scientific computing, these decisions often involve floating-point numbers, where equality is treacherous and precision is limited. Let’s master this fundamental building block that underlies everything from data quality checks to convergence criteria!

The Complete Set of Comparison Operators

Python provides six comparison operators that return boolean values (True or False):

Greater than: False
Less than: True
Greater or equal: True
Less or equal: True
Equal to: True
Not equal to: True

Main sequence star? True

The != operator (not equal) is particularly useful for filtering out sentinel values or checking if something has changed. The ability to chain comparisons like 3000 < temperature < 50000 is a Python feature that makes code more readable and matches mathematical notation.
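A sketch of all six operators plus chaining, with assumed magnitudes and temperature:

```python
mag_a, mag_b = 4.5, 6.1   # two apparent magnitudes (assumed values)

print("Greater than:", mag_a > mag_b)     # False
print("Less than:", mag_a < mag_b)        # True
print("Greater or equal:", mag_b >= 6.1)  # True
print("Less or equal:", mag_a <= 4.5)     # True
print("Equal to:", mag_a == 4.5)          # True
print("Not equal to:", mag_a != mag_b)    # True

# Chained comparison reads like mathematical notation
temperature = 5778  # K, roughly solar
print("Main sequence star?", 3000 < temperature < 50000)  # True
```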

The Three Logical Operators: AND, OR, NOT

Python’s logical operators combine or modify boolean values:

Bright AND variable: False
Bright OR variable: True
NOT bright: False
NOT variable: True

Can observe? True

Truth Table for AND:
      1 AND     1 = True
      1 AND     0 = False
      0 AND     1 = False
      0 AND     0 = False
NoteOperator Precedence: not > and > or

When in doubt, add parentheses.

True
False
True
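A minimal demonstration of how grouping changes the answer:

```python
# `not` binds tighter than `and`, which binds tighter than `or`
print(True or False and False)     # True: parsed as True or (False and False)
print((True or False) and False)   # False: parentheses change the grouping
print(not False or False)          # True: parsed as (not False) or False
```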

Special Comparison Operators: is, in

Python has two special operators that are incredibly useful in scientific programming:

a == b: True
a is b: False
a is c: True
Checking None: True

Is 'G' a stellar type? True
Is 'X' a stellar type? False
Is FITS file? True
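A sketch with assumed lists and a set of spectral types:

```python
a = [1, 2, 3]
b = [1, 2, 3]      # equal contents, but a different object
c = a              # another name for the same object

print("a == b:", a == b)   # True: same contents
print("a is b:", a is b)   # False: different objects in memory
print("a is c:", a is c)   # True: same object
result = None
print("Checking None:", result is None)  # True: the idiomatic None test

stellar_types = {'O', 'B', 'A', 'F', 'G', 'K', 'M'}
print("Is 'G' a stellar type?", 'G' in stellar_types)  # True
print("Is 'X' a stellar type?", 'X' in stellar_types)  # False
filename = 'ngc1300.fits'
print("Is FITS file?", '.fits' in filename)            # True
```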

Handy Reductions: any() and all()

These are extremely common in scientific code: they reduce many boolean checks to one decision.

True
False
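For example, with assumed signal-to-noise values and a threshold of 10:

```python
snr_values = [5.2, 7.1, 12.3, 4.8]   # assumed measurements

print(any(snr > 10 for snr in snr_values))  # True: at least one strong detection
print(all(snr > 10 for snr in snr_values))  # False: not every point passes
```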

The Walrus Operator: Assignment Expressions (Python 3.8+)

Python 3.8 introduced the walrus operator (:=) which allows assignment within expressions:

Large dataset: 123 observations
Large dataset: 123 observations
Note: Walrus operator requires Python 3.8+
It's useful but not essential - all code can be written without it
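Both branches below print the same line; the walrus version folds the assignment into the condition (the dataset is assumed):

```python
observations = list(range(123))   # assumed dataset

# Traditional: assign first, then test
n = len(observations)
if n > 100:
    print(f"Large dataset: {n} observations")

# Walrus (Python 3.8+): assign inside the condition
if (n := len(observations)) > 100:
    print(f"Large dataset: {n} observations")
```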
Note📝 Note on the Walrus Operator

The walrus operator (:=) is optional syntactic sugar introduced in Python 3.8. While it can make some code more concise, it’s perfectly fine to write code without it. Some environments (especially shared clusters or system installs) may lag behind, so be cautious about requiring new syntax in shared code.

Use it when:

  • You need to use a value in a condition and then reuse it in the body
  • Reading files line by line in a while loop
  • Avoiding repeated expensive calculations

Avoid it when:

  • It makes the code harder to read
  • Working with Python < 3.8
  • The traditional approach is clearer

The Floating-Point Equality Trap

Never use == with floating-point numbers! Even tiny rounding errors break equality:

Calculated == Expected? False
Tiny difference: 1.00e-10
math.isclose? True

0.1 + 0.2 == 0.3? False
Safe equal? True
math.isclose? True
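The classic demonstration:

```python
import math

calculated = 0.1 + 0.2
expected = 0.3

print("0.1 + 0.2 == 0.3?", calculated == expected)           # False
print(f"Tiny difference: {abs(calculated - expected):.2e}")  # about 5.55e-17
print("Safe equal?", abs(calculated - expected) < 1e-9)      # True
print("math.isclose?", math.isclose(calculated, expected))   # True
```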
Warning⚠️ Common Bug Alert: The Equality Trap

Never use == with floating-point numbers! Even tiny rounding errors break equality.

Wrong (dangerous for critical systems):

if velocity == 0.0:  # Dangerous!
    print("At rest")

Right (tolerance-based check):

import math

if math.isclose(velocity, 0.0, abs_tol=1e-10):  # Safe!
    print("Effectively at rest")

This avoids brittle == comparisons and makes your intent explicit.

Short-Circuit Evaluation: Order Matters!

**short-circuit evaluation**: Stopping logical evaluation once the result is determined

Python’s and and or operators use short-circuit evaluation — they stop evaluating as soon as the result is determined:

No data or first measurement not positive

Short-circuit OR demonstration:
Result: True
  Running expensive calculation...
Result: True
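A sketch of both behaviors (the empty dataset and `expensive_check` are illustrative):

```python
data = []   # assumed empty dataset

# Short-circuit AND: with an empty list, data[0] is never evaluated,
# so no IndexError can occur
if not (data and data[0] > 0):
    print("No data or first measurement not positive")

def expensive_check():
    print("  Running expensive calculation...")
    return True

print("Short-circuit OR demonstration:")
print("Result:", True or expensive_check())    # left side decides; call is skipped
print("Result:", False or expensive_check())   # now it must run
```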

Space agencies use automated collision avoidance systems that evaluate multiple conditions in sequence for efficiency. The logic follows similar principles: check simple conditions first, then expensive calculations only if needed.

#| eval: false
# PSEUDOCODE: variables like distance/screening_threshold are placeholders.
def check_collision_risk(satellite1, satellite2):
    """
    Simplified collision risk logic similar to conjunction assessment.
    Real systems use complex probability calculations, but follow
    similar efficiency principles.

    Based on standard conjunction assessment practices where:
    - Initial screening uses simple distance checks
    - Detailed analysis only for close approaches
    - Probability calculations only when necessary
    """

    # Check cheap calculations first
    if distance > screening_threshold:
        return "No risk"

    # Only if close, calculate relative velocity
    if closing_velocity < 0:
        return "Moving apart"

    # Only if approaching, compute collision probability
    if collision_probability > probability_threshold:
        return "COLLISION RISK!"

    return "Monitor"

Checking distance first avoids millions of expensive velocity calculations per day. A single wrong comparison can mean missed warnings or wasted compute. For a reminder of the stakes, see the Wikipedia article on the 2009 Iridium 33 / Cosmos 2251 collision: https://en.wikipedia.org/wiki/2009_satellite_collision.

Note: This example simplifies the actual collision avoidance algorithms for pedagogical clarity. Real systems use complex orbital mechanics and probability distributions, but the core principle of ordered boolean evaluation remains crucial.


3.3 Conditional Statements: Teaching Computers to Decide

**guard clause**: Early return statement that handles edge cases before main logic

Conditional statements are where your code makes decisions. In scientific computing, these decisions often involve numerical thresholds, convergence criteria, and boundary conditions. Let’s build your intuition for writing robust conditionals you can trust in long-running analyses and pipelines.

The if Statement: Your First Decision Maker

The if statement is the simplest conditional — it executes code only when a condition is true:

Star is visible to naked eye (mag 4.5)
Massive star detected!
Mass: 10.0 Msun
Will end as supernova
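A sketch with assumed values:

```python
magnitude = 4.5   # apparent magnitude, assumed
if magnitude < 6.0:   # rough naked-eye limit
    print(f"Star is visible to naked eye (mag {magnitude})")

mass = 10.0   # solar masses, assumed
if mass > 8.0:   # rough core-collapse threshold
    print("Massive star detected!")
    print(f"Mass: {mass} Msun")
    print("Will end as supernova")
```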

The if-else Statement: Binary Decisions

The else clause provides an alternative when the condition is false:

z = 0.8: distant galaxy
Low SNR - needs verification
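For example, with assumed redshift and signal-to-noise values:

```python
z = 0.8   # redshift, assumed
if z > 0.5:
    print(f"z = {z}: distant galaxy")
else:
    print(f"z = {z}: nearby galaxy")

snr = 3.2   # detection signal-to-noise, assumed
if snr >= 5.0:
    print("Solid detection")
else:
    print("Low SNR - needs verification")
```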

The elif Statement: Multiple Choices

The elif (else if) statement allows multiple conditions to be checked in sequence:

white dwarf
white dwarf (near boundary - uncertain)
black hole

Scientific note: This toy classifier uses coarse mass thresholds for pedagogy. Real stellar endpoints depend on additional physics (metallicity, rotation, binarity, mass loss).
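A runnable toy version of such a classifier, using the pedagogical mass thresholds noted above:

```python
def stellar_endpoint(mass_msun):
    """Toy endpoint classifier by initial mass in Msun (pedagogical thresholds)."""
    if mass_msun < 8:
        return "white dwarf"
    elif mass_msun < 20:
        return "neutron star"
    else:
        return "black hole"

print(stellar_endpoint(1.0))    # white dwarf
print(stellar_endpoint(25.0))   # black hole
```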

Guard Clauses: Fail Fast, Fail Clear

Guard clauses handle special cases immediately, preventing deep nesting and making code clearer. This pattern is essential for scientific code where invalid inputs can cause subtle bugs hours into a simulation!

Earth: 1.00 years
Mercury: 0.24 years
Proxima Centauri b: 0.031 years
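A sketch using Kepler's third law (P in years, a in AU, stellar mass in Msun) with guard clauses up front; the Proxima b numbers are approximate literature values:

```python
def orbital_period(a_au, star_mass_msun=1.0):
    """Kepler's third law with fail-fast guard clauses."""
    if a_au <= 0:
        raise ValueError(f"Semi-major axis must be positive, got {a_au}")
    if star_mass_msun <= 0:
        raise ValueError(f"Stellar mass must be positive, got {star_mass_msun}")
    return (a_au ** 3 / star_mass_msun) ** 0.5

print(f"Earth: {orbital_period(1.0):.2f} years")                        # 1.00
print(f"Mercury: {orbital_period(0.387):.2f} years")                    # 0.24
print(f"Proxima Centauri b: {orbital_period(0.0485, 0.12):.3f} years")  # 0.031
```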

The Ternary Operator: Compact Conditionals

Python’s ternary operator provides a compact way to write simple if-else statements:

Star with magnitude 3.5 is visible
Class G star: ~5778K
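For example:

```python
magnitude = 3.5   # assumed
status = "visible" if magnitude < 6.0 else "too faint"
print(f"Star with magnitude {magnitude} is {status}")

spectral_class = 'G'   # assumed
temp = "~5778K" if spectral_class == 'G' else "(lookup needed)"
print(f"Class {spectral_class} star: {temp}")
```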

In 1999, NASA lost the Mars Climate Orbiter after a units mismatch contributed to navigation errors. A simple guard clause can help catch these mistakes early:

Note: The $327 million figure represents total mission cost. This anecdote simplifies a complex failure to emphasize the importance of unit validation. The actual failure involved multiple factors, but the unit confusion was the primary cause identified in NASA’s investigation reports.

import warnings

def process_thrust_data(force, units):
    """Guard clause example: validate units before using the value."""

    # Validate units BEFORE processing
    valid_units = {'N': 1.0, 'lbf': 4.448222}  # Conversion factors

    if units not in valid_units:
        raise ValueError(f"Unknown units: {units}. Use 'N' or 'lbf'")

    # Convert to standard units (Newtons)
    force_newtons = force * valid_units[units]

    # Additional sanity check (toy threshold)
    if force_newtons > 1000:  # Typical max thruster force
        warnings.warn(f"Unusually high thrust: {force_newtons} N")

    return force_newtons

# Example failure mode: one component produces lbf, another assumes N.
# A guard clause forces the mismatch into an explicit error early.

The orbiter was lost when it entered the Martian atmosphere far lower than intended. Guard clauses won’t prevent every failure, but they do prevent many expensive “garbage in, garbage out” mistakes.


3.4 Loops: The Heart of Scientific Computation

Now that you’ve mastered making decisions with conditionals, let’s make your code repeat tasks efficiently! Loops are where your programs gain superpowers — they’re the difference between analyzing one star and analyzing millions. Every N-body simulation, every light curve analysis, every Monte Carlo calculation depends on loops. The patterns you learn here will appear in every algorithm you write for the rest of your career.

The for Loop: Iterating Over Sequences

The for loop iterates over any sequence (list, tuple, string, range):

Class O star
Class B star
Class A star
Class F star
Class G star
Class K star
Class M star

Counting with range:
Observation 0
Observation 1
Observation 2
Observation 3
Observation 4

Every 2nd hour from 20:00 to 02:00 (wrapping past midnight):
20:00
22:00
00:00
02:00
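Loops like these produce the outputs above:

```python
# Iterating over a list
for spectral_class in ['O', 'B', 'A', 'F', 'G', 'K', 'M']:
    print(f"Class {spectral_class} star")

# Counting with range
print("Counting with range:")
for i in range(5):
    print(f"Observation {i}")

# Stepping by 2 and wrapping past midnight with the modulo operator
print("Every 2nd hour from 20:00 to 02:00 (wrapping past midnight):")
for hour in range(20, 28, 2):
    print(f"{hour % 24:02d}:00")
```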

The Accumulator Pattern in Scientific Computing

**accumulator pattern**: Iteratively combining values into a running aggregate

The accumulator pattern is fundamental to scientific computing:

Cluster center of mass: 0.510 pc
Total cluster mass: 6.1 Msun
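A minimal center-of-mass accumulator, with illustrative masses and positions:

```python
# Accumulator pattern: mass-weighted mean position (illustrative data)
masses = [1.0, 2.0, 0.5, 1.5]        # Msun
positions = [0.2, 0.6, 0.9, 0.4]     # pc

total_mass = 0.0
weighted_sum = 0.0
for m, x in zip(masses, positions):
    total_mass += m            # accumulate total mass
    weighted_sum += m * x      # accumulate mass-weighted position

center_of_mass = weighted_sum / total_mass
print(f"Cluster center of mass: {center_of_mass:.3f} pc")
print(f"Total cluster mass: {total_mass:.1f} Msun")
```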

Common for Loop Patterns

Python provides several useful functions for loop patterns:

Finding bright events with enumerate:
  Alert! Index 4: magnitude 8.2

Parallel iteration with zip:
  t=1s: v=4.9 m/s
  t=2s: v=9.8 m/s
  t=3s: v=14.7 m/s
  t=4s: v=19.6 m/s
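For example, with assumed magnitudes and an illustrative time/velocity table:

```python
# enumerate: index and value together
magnitudes = [12.1, 11.8, 12.3, 12.0, 8.2]   # assumed survey data
print("Finding bright events with enumerate:")
for i, mag in enumerate(magnitudes):
    if mag < 10:   # unusually bright for this survey
        print(f"  Alert! Index {i}: magnitude {mag}")

# zip: iterate two sequences in parallel
print("Parallel iteration with zip:")
times = [1, 2, 3, 4]                   # s
velocities = [4.9, 9.8, 14.7, 19.6]    # m/s
for t, v in zip(times, velocities):
    print(f"  t={t}s: v={v} m/s")
```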

The while Loop: Conditional Iteration

The while loop continues as long as a condition remains true:

Iteration 0
Iteration 1
Iteration 2

Convergence example:
  Iter 1: value = 90.100
  Iter 2: value = 81.190
  Iter 3: value = 73.171
  Iter 10: value = 35.519
  Iter 20: value = 13.036
  Iter 30: value = 5.197
  Iter 40: value = 2.463
  Iter 50: value = 1.510
  Iter 60: value = 1.178
  Iter 70: value = 1.062
  Iter 80: value = 1.022
Converged to 1.009 after 88 iterations
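The traces above come from loops of this shape; the convergence example assumes a toy relaxation that closes 10% of the remaining gap each pass:

```python
# A simple counted while loop
i = 0
while i < 3:
    print(f"Iteration {i}")
    i += 1

# A convergence loop: relax toward a target until within tolerance
print("Convergence example:")
value, target = 100.0, 1.0
tolerance = 0.01
max_iterations = 1000          # safeguard against non-convergence
iteration = 0
while abs(value - target) > tolerance and iteration < max_iterations:
    value = target + 0.9 * (value - target)   # close 10% of the gap
    iteration += 1
    if iteration <= 3 or iteration % 10 == 0:
        print(f"  Iter {iteration}: value = {value:.3f}")
print(f"Converged to {value:.3f} after {iteration} iterations")
```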

Loop Control: break, continue, and else

Python provides additional loop control statements:

Using break to find first detection:
First significant detection: 5.8

Using continue to skip bad data:
Processing: 1.2
Processing: 2.3
Processing: 3.4

Loop else clause:
Target 7 not found in list
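A sketch of all three, with illustrative data:

```python
# break: stop scanning at the first hit
print("Using break to find first detection:")
signals = [1.2, 2.1, 3.0, 5.8, 6.4]
for snr in signals:
    if snr > 5:
        print(f"First significant detection: {snr}")
        break

# continue: skip sentinel-flagged bad data
print("Using continue to skip bad data:")
readings = [1.2, -999, 2.3, -999, 3.4]
for r in readings:
    if r == -999:
        continue
    print(f"Processing: {r}")

# for-else: the else block runs only if the loop never hit break
print("Loop else clause:")
targets = [2, 4, 6, 8]
for t in targets:
    if t == 7:
        print("Found 7")
        break
else:
    print("Target 7 not found in list")
```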

The pass Statement: Placeholder

The pass statement does nothing — useful as a placeholder:

Processing 0
Processing 2
Continued after pass
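For example:

```python
for i in range(3):
    if i == 1:
        pass   # placeholder: handling for this case is not written yet
    else:
        print(f"Processing {i}")
print("Continued after pass")
```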

Nested Loops: Processing 2D Data

Loops can be nested to process multi-dimensional data:

Processing 3x3 pixel grid:
(0,0)=0  (0,1)=1  (0,2)=2  
(1,0)=3  (1,1)=4  (1,2)=5  
(2,0)=6  (2,1)=7  (2,2)=8  

Finding peaks in 2D array:
Peak at (1,1): value=9
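A sketch of both, using a toy 3x3 image:

```python
print("Processing 3x3 pixel grid:")
for row in range(3):
    for col in range(3):
        value = row * 3 + col          # illustrative pixel value
        print(f"({row},{col})={value}", end="  ")
    print()

print("Finding peaks in 2D array:")
image = [[1, 2, 1],
         [2, 9, 2],
         [1, 2, 1]]
for r in range(1, len(image) - 1):         # interior rows only
    for c in range(1, len(image[0]) - 1):  # interior columns only
        v = image[r][c]
        neighbors = [image[r-1][c], image[r+1][c], image[r][c-1], image[r][c+1]]
        if v > max(neighbors):
            print(f"Peak at ({r},{c}): value={v}")
```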
Warning⚠️ Common Bug Alert: Off-by-One Errors

The most common bug in all of programming! Python’s zero-indexing catches everyone:

Classic Mistake (quietly skipping data):

observations = [1, 2, 3, 4, 5]
# Trying to process all elements
for i in range(1, len(observations)):  # OOPS! Skips first element
    print(observations[i])

# Or worse - going past the end
for i in range(len(observations) + 1):  # IndexError on last iteration!
    print(observations[i])

Remember:

  • range(n) gives 0, 1, …, n-1 (NOT including n!)
  • A list of length n has indices 0 to n-1
  • The last element is at index len(list) - 1

Off-by-one errors show up everywhere. Double-check your ranges!

Warning⚠️ Common Bug Alert: Infinite While Loops

Don’t worry — everyone writes an infinite loop occasionally! Here are two common causes:

Case 1: Floating-point precision prevents exact equality

x = 0.0
while x != 1.0:  # INFINITE LOOP!
    x += 0.1  # After 10 additions, x approx 0.9999999999

# Fix: Use tolerance
while abs(x - 1.0) > 1e-10:
    x += 0.1

Case 2: Forgetting to update loop variable

i = 0
while i < 10:
    print("still looping...")
    # Forgot: i += 1  # INFINITE LOOP!

Always add a maximum iteration safeguard!


3.5 List Comprehensions: Elegant and Efficient

Now that you’ve mastered loops, let’s evolve them into something even more powerful! List comprehensions are Python’s gift to scientific programmers. They transform verbose loops into concise, readable, and faster expressions.

From Loop to Comprehension

Loop result: [0, 4, 16, 36, 64]
Comprehension: [0, 4, 16, 36, 64]
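For example, collecting the squares of even numbers:

```python
# Loop version: squares of the even numbers below 10
squares = []
for n in range(10):
    if n % 2 == 0:
        squares.append(n ** 2)
print("Loop result:", squares)

# Comprehension: same logic in one readable line
squares = [n ** 2 for n in range(10) if n % 2 == 0]
print("Comprehension:", squares)
```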

Real Scientific Applications

Observable star count: 6/9
Brightest flux: 1.91e-05
Faintest flux: 9.12e-07

Bright stars dictionary: {'star_0': 12.3, 'star_2': 13.7, 'star_4': 14.5, 'star_6': 11.8, 'star_8': 13.2}
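A sketch with assumed magnitudes and a limiting magnitude of 15; fluxes are relative, computed as 10^(-0.4 m):

```python
magnitudes = [12.3, 15.8, 13.7, 16.2, 14.5, 17.0, 11.8, 15.5, 13.2]
limit = 15.0   # assumed limiting magnitude

# Filtering: which stars are observable?
observable = [m for m in magnitudes if m < limit]
print(f"Observable star count: {len(observable)}/{len(magnitudes)}")

# Mapping: magnitudes -> relative fluxes
fluxes = [10 ** (-0.4 * m) for m in magnitudes]
print(f"Brightest flux: {max(fluxes):.2e}")   # 1.91e-05
print(f"Faintest flux: {min(fluxes):.2e}")    # 1.58e-07

# Dict comprehension: name each bright star by its index
bright = {f"star_{i}": m for i, m in enumerate(magnitudes) if m < limit}
print("Bright stars dictionary:", bright)
```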

When NOT to Use Comprehensions

Galaxy classifications: ['nearby faint', 'distant bright', 'nearby bright', 'distant faint']
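When the logic has several branches, an explicit loop usually beats a clever one-liner; the (z, magnitude) pairs here are assumed:

```python
galaxies = [(0.3, 14.0), (1.2, 11.5), (0.2, 10.8), (1.5, 16.0)]  # (z, mag)

# Hard to read: nested conditionals crammed into one comprehension
labels = [("nearby " if z < 0.5 else "distant ") +
          ("bright" if m < 12 else "faint")
          for z, m in galaxies]
print("Galaxy classifications:", labels)

# Clearer: an explicit loop with named intermediate results
labels = []
for z, m in galaxies:
    distance = "nearby" if z < 0.5 else "distant"
    brightness = "bright" if m < 12 else "faint"
    labels.append(f"{distance} {brightness}")
```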

3.6 Advanced Control Flow Patterns

Now let’s explore powerful patterns that appear throughout scientific computing. These aren’t just code tricks — they’re fundamental algorithmic building blocks!

Welford’s Algorithm: Numerically Stable Statistics

Stable algorithm: mean=100000000.5, std=1.12
Naive variance (can go negative due to cancellation): 2.000e+00
Naive std (after clipping at 0 for display): 1.41

Why is Welford’s algorithm numerically stable?

The naive approach (sum all values, then divide) accumulates large sums that can lose precision. For values like [1e8, 1e8+1, 1e8+2], the sum becomes ~3e8, and the tiny variations (1, 2) get lost in floating-point representation.

Welford’s algorithm maintains a running mean and updates it incrementally with small deltas. For [1e8, 1e8+1, 1e8+2], instead of computing (1e8 + (1e8+1) + (1e8+2))/3 directly, it computes:

  • mean = 1e8
  • mean += ((1e8+1) - 1e8)/2 = 0.5, giving 1e8 + 0.5
  • mean += ((1e8+2) - (1e8+0.5))/3 = 0.5, giving 1e8 + 1

The deltas stay of order 1, so every arithmetic operation happens on similar scales, preserving precision.

The variance calculation similarly avoids subtracting large nearly-equal numbers (catastrophic cancellation) by accumulating squared deviations incrementally.

Why not use log space? Students often ask why we don’t use logarithms here like we did for the luminosity calculation. Log space is perfect for products (multiplication becomes addition) but wrong for these statistics: Welford’s algorithm computes the arithmetic mean and variance, which require addition. In log space you’d compute the geometric mean instead, a completely different statistic. Logarithms also fail on negative values, which are common in astronomy (radial velocities, position residuals). Welford’s incremental approach is the right tool here.
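A compact implementation of the updates described above; the data sit near 1e8, where a naive one-pass formula loses precision, and the population variance m2/n is used:

```python
def welford_stats(values):
    """One-pass, numerically stable mean and (population) standard deviation."""
    mean = 0.0
    m2 = 0.0    # running sum of squared deviations
    n = 0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n               # incremental mean update
        m2 += delta * (x - mean)        # uses delta vs. old AND new mean
    return mean, (m2 / n) ** 0.5 if n else 0.0

data = [1e8 - 1, 1e8, 1e8 + 1, 1e8 + 2]
mean, std = welford_stats(data)
print(f"Stable algorithm: mean={mean}, std={std:.2f}")  # mean=100000000.5, std=1.12
```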

Important💡 Computational Thinking: The Convergence Pattern

PATTERN: Iterative Convergence

initialize state
iteration_count = 0

while not converged and iteration_count < max_iterations:
    new_state = update(state)
    converged = check_convergence(state, new_state, tolerance)
    state = new_state
    iteration_count += 1

if not converged:
    handle_failure()

This pattern appears throughout computational science:

  • Kepler’s equation solver (finding true anomaly)
  • Stellar structure integration (hydrostatic equilibrium)
  • Radiative transfer (temperature iterations)
  • N-body orbit integration (adaptive timesteps)

Master this pattern and you’ve mastered half of computational physics!
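As one concrete instance, a fixed-point solver for Kepler’s equation E - e sin(E) = M has exactly this shape (fixed-point iteration converges for e < 1; Newton’s method is the faster choice at high eccentricity):

```python
import math

def solve_kepler(mean_anomaly, eccentricity, tolerance=1e-12, max_iterations=100):
    """Fixed-point iteration for the eccentric anomaly E in E - e*sin(E) = M."""
    E = mean_anomaly                  # initialize state
    for iteration in range(max_iterations):
        E_new = mean_anomaly + eccentricity * math.sin(E)   # update
        if abs(E_new - E) < tolerance:                      # convergence check
            return E_new
        E = E_new
    raise RuntimeError(f"No convergence after {max_iterations} iterations")

E = solve_kepler(mean_anomaly=1.0, eccentricity=0.3)
print(f"E = {E:.6f}, residual = {abs(E - 0.3 * math.sin(E) - 1.0):.1e}")
```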


3.7 Debugging Control Flow

Logic errors are the hardest bugs because the code runs without crashing but produces wrong results. Let’s build your debugging arsenal!

Strategic Print Debugging

Testing convergence:
Iter  0: 0.0000 -> 10.0000 (delta=+10.00000)
Iter  1: 10.0000 -> 19.0000 (delta=+9.00000)
Iter  2: 19.0000 -> 27.1000 (delta=+8.10000)
Iter  5: 40.9510 -> 46.8559 (delta=+5.90490)
Iter 10: 65.1322 -> 68.6189 (delta=+3.48678)
Iter 15: 79.4109 -> 81.4698 (delta=+2.05891)
FAILED after 20 iterations
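The trace above comes from instrumenting a loop to print only at informative iterations; this toy relaxation closes 10% of the gap per step, with a tolerance deliberately unreachable in 20 iterations:

```python
value, target = 0.0, 100.0
max_iterations = 20
tolerance = 1.0   # too tight to reach in 20 iterations (on purpose)

print("Testing convergence:")
for i in range(max_iterations):
    new_value = value + 0.1 * (target - value)   # close 10% of the gap
    delta = new_value - value
    if i < 3 or i % 5 == 0:                      # strategic: don't print every step
        print(f"Iter {i:2d}: {value:.4f} -> {new_value:.4f} (delta={delta:+.5f})")
    value = new_value
    if abs(delta) < tolerance:
        print(f"Converged after {i + 1} iterations")
        break
else:
    print(f"FAILED after {max_iterations} iterations")
```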

Using Assertions for Validation

The assert statement helps catch bugs during development:

Average magnitude: 10.33
Warning⚠️ Critical Warning: Assertions Are Not for Production!

Never use assertions for user input validation or critical checks! Assertions can be completely disabled when Python runs with optimization (python -O), causing them to be skipped entirely.

WRONG - Don’t do this for user-facing code:

import math

def process_user_data(value):
    assert value > 0  # DANGEROUS! Might not run in production!
    return math.sqrt(value)

RIGHT - Use explicit validation for production:

import math

def process_user_data(value):
    if value <= 0:
        raise ValueError(f"Value must be positive, got {value}")
    return math.sqrt(value)

When to use assertions:

  • Documenting internal assumptions during development
  • Catching programming errors early (not user errors)
  • Self-checks in algorithms (but have a fallback plan)
  • Test suites and debugging

When NOT to use assertions:

  • Validating user input
  • Checking file existence or permissions
  • Network availability checks
  • Any check that must run in production

Think of assertions as “developer notes that can catch bugs” rather than “guards that protect your code.”

The Kepler mission’s pipeline used exactly the control flow patterns you just learned. Here’s an intentionally simplified pseudocode skeleton showing the shape of that logic:

Note: This algorithm is greatly simplified for pedagogical purposes. The actual Kepler pipeline used sophisticated techniques including Fourier transforms, multiple detrending algorithms, and extensive validation checks. However, the control flow patterns shown here — guard clauses, filtering, iteration, and conditional validation — formed the backbone of the real system.

#| eval: false
# PSEUDOCODE: helper functions/variables like median, sigma, and fold_light_curve
# are placeholders to highlight control flow, not implementation details.
def kepler_planet_search(star_id, light_curve):
    """Simplified Kepler planet detection algorithm"""

    # Guard clause - data quality check
    if len(light_curve) < 1000:
        return None

    # Remove outliers (cosmic rays, etc.)
    cleaned = [point for point in light_curve
               if abs(point - median) < 5 * sigma]

    # Search for periodic dips
    best_period = None
    best_depth = 0
    best_folded = None

    for trial_period in range(1, 365):  # Days
        folded = fold_light_curve(cleaned, trial_period)
        depth = measure_transit_depth(folded)

        if depth > best_depth and depth > 3 * noise_level:
            best_period = trial_period
            best_depth = depth
            best_folded = folded  # keep the fold that produced the best depth

    # Validate as planet (not eclipsing binary)
    if best_period:
        if is_v_shaped(best_folded):  # Binary check: use the best fold, not the last one
            return None
        if best_depth > 0.5:  # Too deep to be a planet
            return None

        return {'period': best_period, 'depth': best_depth}

    return None
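The outlier-removal step above leans on placeholder names (median, sigma). A runnable sketch of just that filter, on a short synthetic flux list, could look like this:

```python
import statistics

def sigma_clip(flux, n_sigma):
    """Keep only points within n_sigma standard deviations of the median."""
    med = statistics.median(flux)
    sigma = statistics.stdev(flux)  # naive scale estimate, for illustration only
    return [f for f in flux if abs(f - med) < n_sigma * sigma]

flux = [1.00, 0.99, 1.01, 1.00, 5.00, 1.02]  # 5.00 mimics a cosmic-ray hit
print(sigma_clip(flux, n_sigma=2))  # -> [1.0, 0.99, 1.01, 1.0, 1.02]
```

A 2-sigma cut is used here only because the outlier itself inflates the naive standard deviation; real pipelines estimate scatter with outlier-resistant statistics so that a 5-sigma cut behaves as intended.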

Kepler monitored roughly 150,000 stars. The control flow patterns you’ve mastered in this chapter — guard clauses, filtering, iteration, and validation — are exactly what enables pipelines to scale. For an overview of the real pipeline, see Jenkins et al. (2010): https://ui.adsabs.harvard.edu/abs/2010ApJ...713L..87J/abstract (and the open-source code: https://github.com/nasa/kepler-pipeline).

Note🛠️ Debug This! The Telescope Priority Bug

A telescope scheduling system has a subtle bug in its priority logic. Can you find and fix it?

Priority (galaxy, time-critical): 70
Priority (asteroid, time-critical): 80
Priority (dim variable star): 50

The Bug: Operator precedence + missing parentheses.

This line:

elif obj_type == 'variable_star' and magnitude < 12 or time_critical:

is parsed as:

(obj_type == 'variable_star' and magnitude < 12) or time_critical

So when time_critical is True, that branch triggers for every object type (including galaxies and asteroids), and it can even steal priority from later elif cases.

Fix: Add parentheses to match the intended logic.

def assign_telescope_priority_fixed(observation):
    """Fixed version with clearer logic."""
    magnitude = observation['magnitude']
    obj_type = observation['type']
    time_critical = observation['time_critical']

    # Assign base priority by object type
    if obj_type == 'supernova':
        priority = 100
    elif obj_type == 'asteroid' and time_critical:
        priority = 80
    elif obj_type == 'variable_star' and (magnitude < 12 or time_critical):
        priority = 70
    elif obj_type == 'variable_star':
        priority = 50
    elif obj_type == 'galaxy':
        priority = 30
    else:
        priority = 10

    # Boost for bright objects
    if magnitude < 10:
        priority += 20

    return priority

Key Lessons:

  1. Remember precedence: not binds tighter than and, which binds tighter than or
  2. Add parentheses when mixing and/or
  3. Write a tiny test case that breaks the buggy logic
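Lesson 3 in action: a minimal check that exposes the precedence difference, using hypothetical values:

```python
obj_type = 'galaxy'
magnitude = 14.0
time_critical = True

# Buggy: parsed as (obj_type == 'variable_star' and magnitude < 12) or time_critical
buggy = obj_type == 'variable_star' and magnitude < 12 or time_critical
# Fixed: parentheses group the intended condition
fixed = obj_type == 'variable_star' and (magnitude < 12 or time_critical)

print(buggy, fixed)  # -> True False
```

A galaxy should never match a variable-star branch, yet the buggy expression evaluates to True the moment time_critical is set.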

Main Takeaways

What an incredible journey you’ve just completed! You’ve transformed from someone who writes code line by line to someone who designs algorithms systematically. This transformation mirrors the evolution every computational scientist goes through, from tentative beginner to confident algorithm designer.

You started by learning to think in pseudocode, a skill that gives you the power to design before you code. Those three levels of refinement you practiced are your blueprint for success. Every hour you invest in pseudocode saves many hours of debugging. When you design your next algorithm for analyzing galaxy spectra or simulating stellar evolution, you’ll catch logical flaws on paper instead of after hours of computation.

The complete set of comparison and logical operators you’ve mastered — from simple greater-than checks to complex boolean combinations with and, or, and not — gives you the full vocabulary for expressing any logical condition. You understand that == is dangerous with floats, that is checks identity not equality, and that in elegantly tests membership. These aren’t just syntax details; they’re the building blocks of every data validation, every convergence check, every quality filter you’ll ever write.
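Those recapped pitfalls in one short, runnable block (toy values, chosen for illustration):

```python
import math

print(0.1 + 0.2 == 0.3)              # False: floating-point rounding
print(math.isclose(0.1 + 0.2, 0.3))  # True: tolerance-based comparison
result = None
print(result is None)                # True: identity test, the idiomatic None check
print('fits' in 'image.fits')        # True: membership test
x = 0
print(x != 0 and 1 / x > 1)          # False: short-circuit avoids ZeroDivisionError
```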

Your understanding of conditional statements goes beyond syntax to defensive programming philosophy. Guard clauses help you fail fast and avoid wasting hours (or compute budgets) on invalid inputs. The elif chains you practiced will classify objects, determine processing paths, and control how your code responds to real, messy data.

The loop patterns you’ve mastered are universal across scientific computing. Accumulators power statistics and reductions, convergence loops power solvers, and nested loops show up whenever you traverse grids or images. Whether using for loops to iterate through catalogs, while loops to converge solutions, or list comprehensions to filter data, you now have the core toolkit.
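Two of those patterns in miniature, with toy data (the numbers are illustrative only):

```python
data = [3.2, 4.1, 5.0, 2.8]  # hypothetical measurements

# Accumulator: reduce many values to a single aggregate
total = 0.0
for x in data:
    total += x
mean = total / len(data)

# Convergence: iterate until a tolerance is met, with an explicit cap
guess, target, n = 1.0, 2.0, 0
while abs(guess * guess - target) > 1e-10 and n < 50:
    guess = 0.5 * (guess + target / guess)  # Newton's method for sqrt(target)
    n += 1

print(f"mean = {mean:.3f}, sqrt(2) ≈ {guess:.6f} after {n} iterations")
```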

Most importantly, you’ve learned that bugs aren’t failures — they’re learning opportunities. Every infinite loop teaches you about termination conditions. Every off-by-one error reinforces indexing. The debugging strategies you’ve developed, from strategic print statements to assertions, will serve you throughout your career.

Remember that every major computational achievement relies on these fundamentals. You’re not just learning Python syntax — you’re building algorithmic literacy that transfers across domains.


Definitions

Accumulator Pattern: An algorithmic pattern where values are iteratively combined into a running total or aggregate, fundamental to reductions and statistical calculations in scientific data processing.

Adaptive Refinement: A universal pattern where parameters are adjusted based on quality metrics, with safeguards against infinite refinement, appearing in timestepping, mesh refinement, and optimization throughout computational science.

and: Logical operator that returns True only if both operands are true, using short-circuit evaluation.

assert: Statement that raises an AssertionError if a condition is false, used for debugging and documenting assumptions during development (not for production validation).

Boolean Logic: The system of true/false values and logical operations (and, or, not) that underlies all conditional execution in programs.

break: Statement that immediately exits the current loop, skipping any remaining iterations.

Conditional Statement: A control structure (if/elif/else) that executes different code blocks based on whether conditions evaluate to true or false.

continue: Statement that skips the rest of the current loop iteration and proceeds to the next iteration.

elif: “Else if” statement that checks an additional condition when the previous if or elif was false.

else: Clause that executes when all previous if/elif conditions were false, or when a loop completes without breaking.

for: Loop that iterates over elements in a sequence or iterable object.

Guard Clause: A conditional statement at the beginning of a function that handles special cases or invalid inputs immediately, preventing deep nesting.

if: Statement that executes code only when a specified condition is true.

in: Operator that tests membership in a sequence or collection.

is: Operator that tests object identity (same object in memory), not just equality of values.

List Comprehension: A concise Python syntax for creating lists: [expression for item in iterable if condition].

not: Logical operator that inverts a boolean value (True becomes False, False becomes True).

or: Logical operator that returns True if at least one operand is true, using short-circuit evaluation.

pass: Null statement that does nothing, used as a placeholder where syntax requires a statement.

Pseudocode: A human-readable description of an algorithm that focuses on logic and structure without syntactic details.

Sentinel Value: A special marker value (like -999, 'END', or None) that signals the end of data or a special condition, allowing loops to know when to stop processing.

Short-circuit Evaluation: The behavior where logical operators stop evaluating as soon as the result is determined.

Walrus Operator (:=): Assignment expression operator (Python 3.8+) that assigns a value to a variable as part of an expression, allowing both assignment and testing in a single statement.

while: Loop that continues executing as long as a specified condition remains true.


Key Takeaways

✓ Pseudocode reveals logical flaws before they become bugs — always design before implementing

✓ Master all six comparison operators (>, <, >=, <=, ==, !=) and three logical operators (and, or, not)

✓ Never use == with floating-point numbers; always use tolerance-based comparisons like math.isclose()

✓ The is operator checks identity, not equality — use it for None checks

✓ The in operator elegantly tests membership in sequences or strings

✓ Guard clauses handle special cases first, making main logic clearer

✓ for loops iterate over sequences, while loops continue until a condition becomes false

✓ break exits loops early, continue skips to the next iteration, and a loop's else runs only if the loop completes without break

✓ List comprehensions are faster than loops for simple transformations but become unreadable for complex logic

✓ Short-circuit evaluation in and/or prevents errors and improves performance

✓ The accumulator pattern is fundamental to scientific computing, appearing in all statistical calculations

✓ Always include maximum iteration limits in while loops to prevent infinite loops

✓ Welford’s algorithm solves numerical stability issues in streaming statistics

✓ Use assert statements to document and enforce assumptions during development

✓ Modern scientific pipelines, from Kepler onward, rely on the control flow patterns you’ve learned
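The Welford takeaway above, sketched as a minimal single-pass routine (the sample stream is made up for illustration):

```python
def welford(stream):
    """Single-pass running mean and sample variance (Welford's update)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n           # update the running mean
        m2 += delta * (x - mean)    # accumulate squared deviations stably
    variance = m2 / (n - 1) if n > 1 else 0.0
    return mean, variance

mean, var = welford([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(f"mean = {mean:.1f}, variance = {var:.3f}")
```

Unlike the naive sum-of-squares formula, this update never subtracts two large, nearly equal numbers, which is what keeps it stable for long streams.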


Quick Reference Tables

Comparison Operators

| Operator | Description | Example |
|----------|-------------|---------|
| `>` | Greater than | `if magnitude > 6.0:` |
| `<` | Less than | `if redshift < 0.1:` |
| `>=` | Greater or equal | `if snr >= 5.0:` |
| `<=` | Less or equal | `if error <= tolerance:` |
| `==` | Equal (avoid with floats!) | `if status == 'complete':` |
| `!=` | Not equal | `if flag != -999:` |
Logical Operators

| Operator | Description | Example |
|----------|-------------|---------|
| `and` | Both must be true | `if x > 0 and y > 0:` |
| `or` | At least one true | `if bright or variable:` |
| `not` | Inverts boolean | `if not converged:` |
Special Operators

| Operator | Description | Example |
|----------|-------------|---------|
| `in` | Membership test | `if 'fits' in filename:` |
| `is` | Identity test | `if result is None:` |
| `is not` | Negative identity | `if data is not None:` |
| `not in` | Negative membership test | `if 'error' not in log_file:` |
| `:=` | Walrus operator (Python 3.8+) | `if (n := len(data)) > 100:` |
Control Flow Statements

| Statement | Purpose | Example |
|-----------|---------|---------|
| `if/elif/else` | Conditional execution | `if mag < 6: visible = True` |
| `for` | Iterate over sequence | `for star in catalog:` |
| `while` | Loop while condition true | `while error > tolerance:` |
| `break` | Exit loop early | `if converged: break` |
| `continue` | Skip to next iteration | `if bad_data: continue` |
| `pass` | Do nothing (placeholder) | `if not ready: pass` |
| `assert` | Debug check | `assert len(data) > 0` |
Built-in Functions for Loops

| Function | Purpose | Example |
|----------|---------|---------|
| `range(n)` | Generate 0 to n-1 | `for i in range(10):` |
| `range(start, stop, step)` | Generate with step | `for i in range(0, 10, 2):` |
| `enumerate(seq)` | Get index and value | `for i, val in enumerate(data):` |
| `zip(seq1, seq2)` | Parallel iteration | `for x, y in zip(xs, ys):` |
| `len(seq)` | Sequence length | `for i in range(len(data)):` |
Comparison Functions

| Function | Purpose | Example |
|----------|---------|---------|
| `all(iterable)` | All elements true | `if all(x > 0 for x in data):` |
| `any(iterable)` | Any element true | `if any(x < 0 for x in data):` |
| `math.isclose()` | Safe float comparison | `if math.isclose(a, b):` |
| `math.isfinite()` | Check not inf/nan | `if math.isfinite(result):` |
| `math.isnan()` | Check for NaN | `if not math.isnan(value):` |
| `math.isinf()` | Check for infinity | `if math.isinf(value):` |
| `isinstance()` | Type checking | `if isinstance(x, float):` |
Common Algorithmic Patterns

| Pattern | Purpose | Structure |
|---------|---------|-----------|
| Accumulator | Aggregate values | `total = 0; for x in data: total += x` |
| Filter | Select subset | `[x for x in data if condition(x)]` |
| Map | Transform all | `[f(x) for x in data]` |
| Search | Find first match | `for x in data: if test(x): return x` |
| Convergence | Iterate to solution | `while not converged and n < max:` |
| Guard clause | Handle edge cases | `if invalid: return None` |
| Sentinel | Signal termination | `if value == -999: break` |
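The Convergence pattern from the list above, expanded into a runnable sketch (relaxation toward a hypothetical target, with both a tolerance and an iteration cap):

```python
def relax_to_target(x, target=100.0, rate=0.1, tol=1e-3, max_iter=200):
    """Fixed-point relaxation: x_{n+1} = x_n + rate * (target - x_n)."""
    for n in range(1, max_iter + 1):
        delta = rate * (target - x)
        x += delta
        if abs(delta) < tol:  # stopping criterion satisfied
            return x, n
    raise RuntimeError(f"no convergence after {max_iter} iterations")

value, iters = relax_to_target(10.0)
print(f"Converged to {value:.3f} after {iters} iterations")
```

Note the two exits: a success path when the step size drops below tolerance, and a hard failure when the iteration cap is hit, so the loop can never run forever.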

Python Module & Method Reference (Chapter 3 Additions)

New Built-in Functions

Logical Testing

  • all(iterable) - Returns True if all elements are true
  • any(iterable) - Returns True if any element is true
  • isinstance(obj, type) - Check if object is of specified type

Loop Support

  • enumerate(iterable, start=0) - Returns index-value pairs
  • zip(*iterables) - Combines multiple iterables for parallel iteration
  • range(start, stop, step) - Generate arithmetic progression

Control Flow Keywords

Conditionals

  • if - Execute block if condition is true
  • elif - Check additional condition if previous was false
  • else - Execute if all previous conditions were false

Loops

  • for - Iterate over sequence
  • while - Loop while condition is true
  • break - Exit loop immediately
  • continue - Skip to next iteration
  • else - Execute if loop completes without break

Other

  • pass - Null operation placeholder
  • assert - Raise AssertionError if condition is false

Operators

Comparison

  • >, <, >=, <=, ==, != - Numerical comparisons
  • is, is not - Identity comparisons
  • in, not in - Membership testing

Logical

  • and - Logical AND with short-circuit evaluation
  • or - Logical OR with short-circuit evaluation
  • not - Logical NOT (inversion)

New Math Module Functions

import math
  • math.isclose(a, b, rel_tol=1e-9, abs_tol=0.0) - Safe floating-point comparison
  • math.isfinite(x) - Check if neither infinite nor NaN
  • math.isnan(x) - Check if value is NaN
  • math.isinf(x) - Check if value is infinite

Debugging Support

IPython Magic Commands

  • %debug - Enter debugger after exception
  • %pdb - Automatic debugger on exceptions

Debugger Commands (when in pdb)

  • p variable - Print variable value
  • pp variable - Pretty-print variable
  • l - List code around current line
  • n - Next line
  • s - Step into function
  • c - Continue execution
  • u/d - Move up/down call stack
  • q - Quit debugger

Next Chapter Preview

You’ve conquered control flow — now get ready for the next level! Chapter 4 will reveal how to organize data efficiently using Python’s powerful data structures. You’ll discover when to use lists versus dictionaries versus sets, and more importantly, you’ll understand why these choices can make your algorithms run 100 times faster or 100 times slower.

Imagine trying to find a specific star in a catalog of millions. With a list, you’d check each star one by one — taking minutes or hours. With a dictionary, you’ll find it instantly — in microseconds! The data structures you’ll learn next are the difference between simulations that finish in minutes and ones that run for days.
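That claim can be previewed with a hypothetical mini-catalog; the star names and magnitudes below are made up, and Chapter 4 covers the details:

```python
# Build a toy catalog two ways: a list of (name, magnitude) pairs and a dictionary
catalog_list = [('star-%05d' % i, 10.0 + i % 7) for i in range(100_000)]
catalog_dict = dict(catalog_list)

target = 'star-09999'

# List: the Search pattern, a linear scan that checks entries one by one (O(n))
mag_slow = None
for name, mag in catalog_list:
    if name == target:
        mag_slow = mag
        break

# Dictionary: a hash lookup, effectively constant time (O(1))
mag_fast = catalog_dict[target]

print(mag_slow == mag_fast)  # -> True
```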

The control flow patterns you’ve mastered here will operate on the data structures you’ll learn next. Your loops will iterate through dictionaries of astronomical objects. Your conditionals will filter sets of observations. Your comprehensions will transform lists of measurements into meaningful results. Together, control flow and data structures give you the power to handle the massive datasets of modern science — from Gaia’s billion-star catalog to the petabytes of data from the Square Kilometre Array.

Get excited — Chapter 4 is where your code goes from processing dozens of data points to handling thousands efficiently!