---
title: "Chapter 3: Control Flow & Logic"
subtitle: "Python Fundamentals | COMP 536"
author: "Anna Rosen"
draft: false
execute:
  freeze: auto
  echo: true
  warning: true
  error: false
format:
  html:
    toc: true
    code-fold: true
    code-summary: "Show code"
---
Learning Objectives
By the end of this chapter, you will be able to:
If any boxes are unchecked, review the indicated chapters first.
Chapter Overview
Programming is fundamentally about teaching computers to make decisions and repeat tasks. When you write an if statement or a loop, you’re translating human logic into instructions a machine can follow. But here’s the critical insight that separates computational thinkers from mere coders: the logic must be designed before it’s implemented. This chapter transforms you from someone who writes code to someone who designs algorithms.
We’ll start with the lost art of pseudocode — not as a bureaucratic exercise, but as the difference between code that works by accident and code that works by design. You’ll learn to recognize five universal algorithmic patterns that appear across scientific computing: accumulation, filtering, mapping, searching, and convergence. These patterns will appear in every project you build, from N-body simulations to data-driven modeling.
- Accumulation: combine many values into one (sum, mean, running variance).
- Filtering: keep only items that satisfy a condition (quality cuts, SNR thresholds).
- Mapping: transform each item (magnitude \(\to\) flux, pixels \(\to\) calibrated pixels).
- Searching: find an item or best candidate (first detection, best period, nearest neighbor).
- Convergence: iterate until a stopping criterion is met (tolerance, max iterations).
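Each of these patterns fits in a few lines of Python. The magnitude values below are invented for illustration:

```python
# Hypothetical apparent magnitudes for a handful of stars
mags = [12.3, 15.1, 13.7, 16.4, 14.5]

total = sum(mags)                          # accumulation: many values -> one
bright = [m for m in mags if m < 15.0]     # filtering: keep mags brighter than 15
fluxes = [10 ** (-0.4 * m) for m in mags]  # mapping: magnitude -> relative flux
brightest = min(mags)                      # searching: best candidate

# convergence: refine until a stopping criterion is met
step = 1.0
while step > 1e-3:
    step /= 2
```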
The control flow structures we explore here are where your numerical calculations from Chapter 2 become dynamic algorithms. Every convergence test, every adaptive timestep, every Monte Carlo acceptance criterion depends on mastering these concepts deeply, not just syntactically. By chapter’s end, you’ll see code not as a sequence of commands, but as a carefully orchestrated flow of decisions and iterations that solve real scientific problems — the same style of thinking used in research pipelines and mission-scale software.
This chapter is the turning point. If you genuinely internalize the material here — not memorize it, but understand it — you stop being someone who writes Python and become someone who designs algorithms that happen to be expressed in Python.
The concepts in this chapter underpin everything that follows: simulations, convergence tests, data pipelines, scientific workflows. When you’re debugging a loop that won’t terminate or a conditional that silently produces wrong results, you’ll return here. When you’re designing your first N-body integrator or writing quality filters for real data, these patterns will guide you.
This chapter is worth revisiting. Come back to it mid-semester. The pseudocode strategies and algorithmic patterns will make more sense once you’ve seen them in action.
COMP 536 does not permit Jupyter notebooks — not because notebooks are bad tools, but because they actively interfere with what this chapter teaches.
Why notebooks undermine control flow learning:
- Hidden state: Variables persist invisibly between cells, making it impossible to reason about what your code actually does when run fresh.
- Non-deterministic execution order: Running cells out of order creates bugs that only appear sometimes — the worst kind.
- Silent logic errors: A loop that should fail might “work” because a variable was set by a cell you ran earlier and forgot about.
- Unreproducible results: Your code may not produce the same output when someone else runs it (or when you run it tomorrow).
Control flow is about predictable, reproducible execution. Notebooks make execution unpredictable by design. For exploratory data analysis elsewhere in your career, notebooks can be useful — but for learning to think algorithmically, they create more confusion than they solve.
In this course: Write .py scripts. Run them from the terminal. Know exactly what state your program has at every moment.
Most introductory courses teach control flow as syntax: “here’s how to write an if statement, here’s how to write a for loop.” You’ll learn that syntax here too — but that’s not the point.
What this chapter emphasizes instead:
- Design before code: You’ll learn to sketch algorithms in pseudocode before touching Python, catching logical flaws while they’re still cheap to fix.
- Universal patterns: The five patterns (accumulation, filtering, mapping, searching, convergence) appear in every scientific domain. Learn them once, apply them everywhere.
- Failure modes: Real scientific code must handle edge cases, invalid inputs, and non-convergence. You’ll learn to think about what can go wrong, not just what should go right.
- Correctness over cleverness: A loop that’s easy to verify beats a one-liner that’s hard to debug. We prioritize code you can trust.
The same algorithmic thinking you develop here is used in telescope scheduling systems, climate models, and spacecraft navigation. You’re not learning “Python tricks” — you’re learning how computational scientists think.
This chapter is a reference as much as a reading. You’re not expected to memorize every pattern or internalize every example on first pass.
What we do expect:
- Read through the chapter to build familiarity with the concepts
- Return to specific sections when you’re designing algorithms for projects
- Use the Quick Reference Tables at the end as a lookup resource
- Recognize when a problem fits one of the five universal patterns
What’s normal:
- Feeling like some material “clicks” only after you’ve tried to use it in a project
- Returning to re-read sections on guard clauses or convergence loops mid-semester
- Finding that pseudocode feels awkward at first but becomes essential later
This material compounds. Give it time, and revisit it when you need it.
3.1 Algorithmic Thinking: The Lost Art of Pseudocode
pseudocode: A human-readable algorithm description that focuses on logic over syntax
Most students jump straight from problem to code, then wonder why they spend hours debugging. Professional computational scientists spend more time thinking than typing. Pseudocode is how we think precisely about algorithms without getting distracted by syntax. Think of it as your algorithm’s blueprint — you wouldn’t build a telescope without optical designs, so why write code without algorithmic designs?
Why Pseudocode Matters in Scientific Computing
Consider this scenario: You need to implement adaptive timestepping for an orbital integrator. Without pseudocode, you’ll likely write code, run it, watch orbits spiral incorrectly, debug for hours, and maybe get it working through trial and error. With pseudocode, you’ll identify edge cases, boundary conditions, and logical flaws before writing a single line of Python.
```python
#| eval: false
# NOTE: This snippet is intentionally incomplete (placeholder functions/variables)
# and is meant for reading, not running.

# WITHOUT PSEUDOCODE (typical student approach):
# "I'll figure it out as I code..."
def integrate_naive(state, t_end):
    dt = 0.01
    while state.time < t_end:
        new_state = step(state, dt)
        error = estimate_error(state, new_state)
        if error > tolerance:
            dt = dt * 0.5  # Seems reasonable?
        state = new_state
    return state

# Wait, this doesn't work... infinite loop when error is bad!
# Also, dt never increases... hours of debugging ahead
```

Now let's see how pseudocode reveals problems immediately! This is exactly how professional scientists and engineers design algorithms: make the logic explicit first, then implement.
- Goal: What should the algorithm compute?
- State: What values change each iteration?
- Termination: What makes the loop stop (and what if it never happens)?
- Invariants: What must remain true throughout?
- Failure modes: What can go wrong (bad inputs, non-convergence, overflow)?
The Three Levels of Pseudocode Refinement
Professional algorithm development happens in stages, each revealing different issues. Don’t worry if this feels strange at first — every programmer has felt that way! But once you embrace pseudocode, you’ll save countless hours of debugging. Let’s build this skill together:
Level 1: Conceptual Overview (The Big Picture)
```
WHILE simulation not done:    # WHILE means "repeat as long as condition is true"
    Take a step
    Check IF step was good    # IF means "only do this when condition is true"
    Adjust timestep
```
This level helps you understand the overall flow. The WHILE construct creates a loop that continues until some condition becomes false. The IF construct makes a decision based on a condition. Already, we can ask critical questions: What defines “done”? What makes a step “good”? How much should we adjust? These questions matter!
Before continuing, identify at least two problems with the Level 1 pseudocode above. What could go wrong?
- No exit condition if step is never “good” — infinite loop risk!
- No bounds on timestep adjustment — could grow infinitely or shrink to zero
- “Simulation done” is vague — need precise termination condition
- No error handling — what if the integration fails completely?
These aren’t nitpicks — they’re the difference between code that runs and code that runs correctly!
Level 2: Structural Detail (The Flow)
```
FUNCTION adaptive_integrate(initial_state, end_time):  # FUNCTION groups reusable code
    state <- initial_state                             # <- means "assign value to variable"
    dt <- estimate_initial_timestep(state)
    WHILE time < end_time:                   # Loop continues while time hasn't reached end
        DO:                                  # DO-UNTIL creates a loop that runs at least once
            trial_step <- integrate(state, dt)
            error <- compute_error(trial_step)
        UNTIL error < tolerance OR dt < dt_min  # OR means "either condition can be true"
        state <- trial_step
        dt <- adjust_timestep(error, dt)
    RETURN state                             # RETURN sends value back to caller
```
Now we see the retry logic and minimum timestep safeguard. The DO-UNTIL construct ensures we attempt at least one integration step. The OR operator means either condition being true will exit the inner loop. FUNCTION defines a reusable block of code that can be called with arguments and RETURN a result.
Level 3: Implementation-Ready (Stage 1: Core Logic)
```
FUNCTION adaptive_integrate(initial_state, end_time, tolerance):
    state <- initial_state
    dt <- estimate_initial_timestep(state)
    WHILE state.time < end_time:
        trial_state <- rk4_step(state, dt)
        error <- estimate_error(state, trial_state)
        IF error < tolerance:               # Decision point
            state <- trial_state
            dt <- min(dt * 1.5, dt_max)     # Can grow
        ELSE:                               # ELSE handles "otherwise" case
            dt <- max(dt * 0.5, dt_min)     # Must shrink
```
Level 3: Implementation-Ready (Stage 2: Add Safety)
```
FUNCTION adaptive_integrate(initial_state, end_time, tolerance):
    state <- initial_state
    dt <- estimate_initial_timestep(state)
    dt_min <- 1e-10 * (end_time - initial_state.time)
    dt_max <- 0.1 * (end_time - initial_state.time)
    WHILE state.time < end_time:
        step_accepted <- False                                # Boolean flag (True/False)
        attempts <- 0
        WHILE NOT step_accepted AND attempts < MAX_ATTEMPTS:  # NOT inverts, AND requires both
            trial_state <- rk4_step(state, dt)
            error <- estimate_error(state, trial_state)
            IF error < tolerance:
                step_accepted <- True
                state <- trial_state
            ELSE:
                dt <- max(dt * 0.5, dt_min)
            attempts <- attempts + 1
```
Each refinement level reveals new issues and solutions. The NOT operator inverts a boolean value (True becomes False, False becomes True). The AND operator requires both conditions to be true. This is computational thinking in action!
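The Stage 2 pseudocode translates almost line-for-line into Python. The sketch below is a hedged, self-contained illustration: the `rk4_step` helper, the step-doubling error estimate, and the exponential-decay test problem are my own stand-ins, not the chapter's code.

```python
import math

def rk4_step(y, t, dt, f):
    """One classic fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt * k1 / 2)
    k3 = f(t + dt / 2, y + dt * k2 / 2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def adaptive_integrate(y0, t0, t_end, f, tolerance=1e-8):
    """Adaptive timestepping: attempt, evaluate, refine or coarsen."""
    y, t = y0, t0
    dt = (t_end - t0) / 100
    dt_min = 1e-10 * (t_end - t0)   # safeguard against runaway refinement
    dt_max = 0.1 * (t_end - t0)
    while t < t_end:
        dt = min(dt, t_end - t)      # never step past the end time
        # Error estimate by step doubling: one full step vs. two half steps
        full = rk4_step(y, t, dt, f)
        mid = rk4_step(y, t, dt / 2, f)
        half = rk4_step(mid, t + dt / 2, dt / 2, f)
        error = abs(full - half)
        if error < tolerance:
            y, t = half, t + dt          # accept the more accurate result
            dt = min(dt * 1.5, dt_max)   # step was easy: grow
        else:
            dt = max(dt * 0.5, dt_min)   # too much error: shrink and retry
    return y

# Exponential decay dy/dt = -y over [0, 1]; the exact answer is e^{-1}
result = adaptive_integrate(1.0, 0.0, 1.0, lambda t, y: -y)
```

Note how every safeguard from the pseudocode survives in the code: a minimum and maximum timestep, acceptance only when the error passes the tolerance, and growth only after success.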
sentinel value: A special marker value (like -999, 'END', or None) that signals the end of data or a special condition, allowing loops to know when to stop processing
PATTERN: Sentinel Values
A sentinel is a special value that signals “stop processing.” This pattern appears everywhere in computing:
Real-world applications:
- FITS files: END keyword marks end of header
- Network protocols: Message terminators like \r\n
- Telescope data: -999 for missing observations
- String processing: Null terminators in C strings
The sentinel pattern is how computers know when to stop! You’re using the same technique that shows up in network protocols and instrument data streams.
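A minimal sketch of the sentinel pattern, using the -999 missing-data convention mentioned above (the readings are invented):

```python
SENTINEL = -999  # conventional marker for "missing observation"
readings = [10.2, 11.5, -999, 9.8, -999]

# Filtering style: drop sentinel-marked values
valid = [r for r in readings if r != SENTINEL]

# Stream style: stop at the first sentinel (end-of-data marker)
prefix = []
for r in readings:
    if r == SENTINEL:
        break
    prefix.append(r)
```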
Adaptive timestepping is an instance of a universal pattern:
PATTERN: Adaptive Refinement
- Attempt action with current parameters
- Evaluate quality of result
- If quality insufficient: refine parameters and retry
- If quality acceptable: proceed and possibly coarsen
- Include safeguards against infinite refinement
This pattern appears everywhere in computational science:
- Adaptive mesh refinement (AMR) in galaxy formation simulations
- Step size control in stellar evolution codes like MESA
- Learning rate scheduling in neural networks for photometric redshifts
- Convergence acceleration in self-consistent field calculations
- Importance sampling in Monte Carlo radiative transfer
Once you recognize this pattern, you’ll see it in every sophisticated scientific code!
3.2 Boolean Logic in Scientific Computing
Every decision in your code ultimately reduces to true or false. But in scientific computing, these decisions often involve floating-point numbers, where equality is treacherous and precision is limited. Let’s master this fundamental building block that underlies everything from data quality checks to convergence criteria!
The Complete Set of Comparison Operators
Python provides six comparison operators that return boolean values (True or False):
Greater than: False
Less than: True
Greater or equal: True
Less or equal: True
Equal to: True
Not equal to: True
Main sequence star? True
The != operator (not equal) is particularly useful for filtering out sentinel values or checking if something has changed. The ability to chain comparisons like 3000 < temperature < 50000 is a Python feature that makes code more readable and matches mathematical notation.
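The chapter's executed code is folded out of this rendering; here is a minimal sketch of the six operators plus a chained comparison. The temperature value is illustrative, so the printed booleans may differ from the folded cell's output above.

```python
temperature = 5778.0  # Sun-like effective temperature in K (illustrative)

print("Greater than:", temperature > 6000)       # False
print("Less than:", temperature < 6000)          # True
print("Greater or equal:", temperature >= 5778)  # True
print("Less or equal:", temperature <= 6000)     # True
print("Equal to:", temperature == 5778.0)        # True
print("Not equal to:", temperature != -999)      # True

# Chained comparison reads like the math it expresses
is_main_sequence = 3000 < temperature < 50000
print("Main sequence star?", is_main_sequence)   # True
```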
The Three Logical Operators: AND, OR, NOT
Python’s logical operators combine or modify boolean values:
Bright AND variable: False
Bright OR variable: True
NOT bright: False
NOT variable: True
Can observe? True
Truth Table for AND:
True  AND True  = True
True  AND False = False
False AND True  = False
False AND False = False
Precedence: not binds tightest, then and, then or.
When in doubt, add parentheses.
True
False
True
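A short sketch of how precedence and parentheses change the result (the boolean values are invented for illustration):

```python
a, b, c = True, False, True

print(not b and a)       # not binds tightest: (not b) and a -> True
print(not (b or a))      # parentheses force `or` first -> False
print(a or b and not c)  # and before or: a or (b and (not c)) -> True
```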
Special Comparison Operators: is, in
Python has two special operators that are incredibly useful in scientific programming:
a == b: True
a is b: False
a is c: True
Checking None: True
Is 'G' a stellar type? True
Is 'X' a stellar type? False
Is FITS file? True
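A hedged reconstruction of the folded demonstration above: `is` tests object identity, `==` tests value equality, and `in` tests membership. The lists, set, and filename are illustrative.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a distinct object
c = a           # another name for the same object

print("a == b:", a == b)   # True  (same values)
print("a is b:", a is b)   # False (different objects)
print("a is c:", a is c)   # True  (same object)

result = None
print("Checking None:", result is None)  # True (idiomatic None check)

stellar_types = {'O', 'B', 'A', 'F', 'G', 'K', 'M'}
print("Is 'G' a stellar type?", 'G' in stellar_types)  # True
print("Is 'X' a stellar type?", 'X' in stellar_types)  # False

filename = "ngc1300.fits"
print("Is FITS file?", '.fits' in filename)  # True (substring membership)
```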
Handy Reductions: any() and all()
These are extremely common in scientific code: they reduce many boolean checks to one decision.
True
False
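A minimal sketch (illustrative SNR values): `any()` asks "does at least one check pass?", `all()` asks "do they all pass?".

```python
snr = [12.0, 8.5, 30.1, 4.9]  # illustrative signal-to-noise ratios

print(any(s > 20 for s in snr))  # True: at least one strong detection
print(all(s > 5 for s in snr))   # False: one measurement falls below 5
```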
The Walrus Operator: Assignment Expressions (Python 3.8+)
Python 3.8 introduced the walrus operator (:=) which allows assignment within expressions:
Large dataset: 123 observations
Large dataset: 123 observations
Note: Walrus operator requires Python 3.8+
It's useful but not essential - all code can be written without it
The walrus operator (:=) is optional syntactic sugar introduced in Python 3.8. While it can make some code more concise, it’s perfectly fine to write code without it. Some environments (especially shared clusters or system installs) may lag behind, so be cautious about requiring new syntax in shared code.
Use it when:
- You need to use a value in a condition and then reuse it in the body
- Reading files line by line in a while loop
- Avoiding repeated expensive calculations
Avoid it when:
- It makes the code harder to read
- Working with Python < 3.8
- The traditional approach is clearer
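The two styles side by side; the dataset is a stand-in for whatever collection you are checking, and both branches print the same message:

```python
observations = list(range(123))  # hypothetical dataset

# Without the walrus operator: compute, then test
n = len(observations)
if n > 100:
    print(f"Large dataset: {n} observations")

# With the walrus operator (Python 3.8+): assign inside the condition
if (n := len(observations)) > 100:
    print(f"Large dataset: {n} observations")
```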
The Floating-Point Equality Trap
Never use == with floating-point numbers! Even tiny rounding errors break equality:
Calculated == Expected? False
Tiny difference: 1.00e-10
math.isclose? True
0.1 + 0.2 == 0.3? False
Safe equal? True
math.isclose? True
Wrong (dangerous for critical systems):

```python
if velocity == 0.0:  # Dangerous!
    print("At rest")
```

Right (tolerance-based check):

```python
import math
if math.isclose(velocity, 0.0, abs_tol=1e-10):  # Safe!
    print("Effectively at rest")
```

This avoids brittle == comparisons and makes your intent explicit.
Short-Circuit Evaluation: Order Matters!
short-circuit evaluation: Stopping logical evaluation once the result is determined
Python’s and and or operators use short-circuit evaluation — they stop evaluating as soon as the result is determined:
No data or first measurement not positive
Short-circuit OR demonstration:
Result: True
Running expensive calculation...
Result: True
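A sketch of the demonstration whose output appears above (the empty data list and the print inside the "expensive" function make the short-circuiting visible):

```python
def expensive_check():
    print("Running expensive calculation...")
    return True

data = []  # illustrative: no measurements yet

# `and` short-circuits: data is empty (falsy), so data[0] is never evaluated
if data and data[0] > 0:
    print("First measurement positive")
else:
    print("No data or first measurement not positive")

# `or` short-circuits: a True left side skips the right side entirely
print("Result:", True or expensive_check())   # expensive_check never runs
print("Result:", False or expensive_check())  # now it must run (prints its message)
```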
Space agencies use automated collision avoidance systems that evaluate multiple conditions in sequence for efficiency. The logic follows similar principles: check simple conditions first, then expensive calculations only if needed.
```python
#| eval: false
# PSEUDOCODE: variables like distance/screening_threshold are placeholders.
def check_collision_risk(satellite1, satellite2):
    """
    Simplified collision risk logic similar to conjunction assessment.
    Real systems use complex probability calculations, but follow
    similar efficiency principles.

    Based on standard conjunction assessment practices where:
    - Initial screening uses simple distance checks
    - Detailed analysis only for close approaches
    - Probability calculations only when necessary
    """
    # Check cheap calculations first
    if distance > screening_threshold:
        return "No risk"
    # Only if close, calculate relative velocity
    if closing_velocity < 0:
        return "Moving apart"
    # Only if approaching, compute collision probability
    if collision_probability > probability_threshold:
        return "COLLISION RISK!"
    return "Monitor"
```

Checking distance first avoids millions of expensive velocity calculations per day. A single wrong comparison can mean missed warnings or wasted compute. As a secondary overview example, see the Wikipedia article on the 2009 Iridium 33 / Cosmos 2251 collision: https://en.wikipedia.org/wiki/2009_satellite_collision.
Note: This example simplifies the actual collision avoidance algorithms for pedagogical clarity. Real systems use complex orbital mechanics and probability distributions, but the core principle of ordered boolean evaluation remains crucial.
3.3 Conditional Statements: Teaching Computers to Decide
guard clause: An early return that handles edge cases before the main logic
Conditional statements are where your code makes decisions. In scientific computing, these decisions often involve numerical thresholds, convergence criteria, and boundary conditions. Let’s build your intuition for writing robust conditionals you can trust in long-running analyses and pipelines.
The if Statement: Your First Decision Maker
The if statement is the simplest conditional — it executes code only when a condition is true:
Star is visible to naked eye (mag 4.5)
Massive star detected!
Mass: 10.0 Msun
Will end as supernova
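A hedged sketch of code that would produce output like the above (the magnitude, mass, and the 6.0 / 8.0 thresholds are illustrative round numbers):

```python
magnitude = 4.5  # illustrative apparent magnitude
if magnitude < 6.0:  # rough naked-eye limit under dark skies
    print(f"Star is visible to naked eye (mag {magnitude})")

mass = 10.0  # solar masses (illustrative)
if mass > 8.0:  # rough threshold for core-collapse supernovae
    print("Massive star detected!")
    print(f"Mass: {mass} Msun")
    print("Will end as supernova")
```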
The if-else Statement: Binary Decisions
The else clause provides an alternative when the condition is false:
z = 0.8: distant galaxy
Low SNR - needs verification
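A minimal sketch matching the output above (the redshift, SNR, and both thresholds are invented for illustration):

```python
z = 0.8  # illustrative redshift
if z > 0.5:
    print(f"z = {z}: distant galaxy")
else:
    print(f"z = {z}: nearby galaxy")

snr = 3.2  # illustrative signal-to-noise ratio
if snr >= 5.0:
    print("Confident detection")
else:
    print("Low SNR - needs verification")
```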
The elif Statement: Multiple Choices
The elif (else if) statement allows multiple conditions to be checked in sequence:
white dwarf
white dwarf (near boundary - uncertain)
black hole
Scientific note: This toy classifier uses coarse mass thresholds for pedagogy. Real stellar endpoints depend on additional physics (metallicity, rotation, binarity, mass loss).
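A simplified variant of such an elif chain, without the boundary-uncertainty check. As the note says, the 8 and 20 Msun thresholds are coarse pedagogical choices, not physics:

```python
def stellar_endpoint(mass):
    """Toy classifier: coarse mass thresholds in Msun, pedagogy only."""
    if mass < 8.0:
        return "white dwarf"
    elif mass < 20.0:
        return "neutron star"
    else:
        return "black hole"

print(stellar_endpoint(1.0))   # white dwarf
print(stellar_endpoint(25.0))  # black hole
```

Only the first matching branch runs, so the conditions are checked strictly in order; `mass < 20.0` never sees a value below 8.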
Guard Clauses: Fail Fast, Fail Clear
Guard clauses handle special cases immediately, preventing deep nesting and making code clearer. This pattern is essential for scientific code where invalid inputs can cause subtle bugs hours into a simulation!
Earth: 1.00 years
Mercury: 0.24 years
Proxima Centauri b: 0.031 years
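A sketch of a guard-clause version of the calculation above, using Kepler's third law in solar units (period in years, semi-major axis in AU, stellar mass in Msun; the Proxima b parameters are approximate literature values used for illustration):

```python
import math

def orbital_period(a_au, m_star=1.0):
    """Period in years from Kepler's third law: P = sqrt(a^3 / M).
    Guard clauses reject invalid inputs before any math runs."""
    if a_au <= 0:
        raise ValueError(f"Semi-major axis must be positive, got {a_au}")
    if m_star <= 0:
        raise ValueError(f"Stellar mass must be positive, got {m_star}")
    return math.sqrt(a_au ** 3 / m_star)

print(f"Earth: {orbital_period(1.0):.2f} years")
print(f"Mercury: {orbital_period(0.387):.2f} years")
print(f"Proxima Centauri b: {orbital_period(0.0485, 0.12):.3f} years")
```

The guards fire immediately on bad input, with a message naming the offending value, instead of returning a silently meaningless period.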
The Ternary Operator: Compact Conditionals
Python’s ternary operator provides a compact way to write simple if-else statements:
Star with magnitude 3.5 is visible
Class G star: ~5778K
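A minimal sketch of the ternary form, `value_if_true if condition else value_if_false` (the magnitude and the G-star lookup are illustrative):

```python
magnitude = 3.5
status = "visible" if magnitude < 6.0 else "too faint"
print(f"Star with magnitude {magnitude} is {status}")

spectral_class = 'G'
temp = "~5778K" if spectral_class == 'G' else "unknown"
print(f"Class {spectral_class} star: {temp}")
```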
In 1999, NASA lost the Mars Climate Orbiter after a units mismatch contributed to navigation errors. A simple guard clause can help catch these mistakes early:
Note: The $327 million figure represents total mission cost. This anecdote simplifies a complex failure to emphasize the importance of unit validation. The actual failure involved multiple factors, but the unit confusion was the primary cause identified in NASA’s investigation reports.
```python
import warnings

def process_thrust_data(force, units):
    """Guard clause example: validate units before using the value."""
    # Validate units BEFORE processing
    valid_units = {'N': 1.0, 'lbf': 4.448222}  # Conversion factors
    if units not in valid_units:
        raise ValueError(f"Unknown units: {units}. Use 'N' or 'lbf'")
    # Convert to standard units (Newtons)
    force_newtons = force * valid_units[units]
    # Additional sanity check (toy threshold)
    if force_newtons > 1000:  # Typical max thruster force
        warnings.warn(f"Unusually high thrust: {force_newtons} N")
    return force_newtons

# Example failure mode: one component produces lbf, another assumes N.
# A guard clause forces the mismatch into an explicit error early.
```

The mission was lost after the spacecraft entered the atmosphere too low. Guard clauses won't prevent every failure, but they do prevent many expensive "garbage in, garbage out" mistakes.
3.4 Loops: The Heart of Scientific Computation
Now that you’ve mastered making decisions with conditionals, let’s make your code repeat tasks efficiently! Loops are where your programs gain superpowers — they’re the difference between analyzing one star and analyzing millions. Every N-body simulation, every light curve analysis, every Monte Carlo calculation depends on loops. The patterns you learn here will appear in every algorithm you write for the rest of your career.
The for Loop: Iterating Over Sequences
The for loop iterates over any sequence (list, tuple, string, range):
Class O star
Class B star
Class A star
Class F star
Class G star
Class K star
Class M star
Counting with range:
Observation 0
Observation 1
Observation 2
Observation 3
Observation 4
Every 2nd hour from 20:00 to 02:00 (wrapping past midnight):
20:00
22:00
00:00
02:00
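The wrapping schedule above can be generated with `range` plus the modulo operator:

```python
# Every 2nd hour starting at 20:00; % 24 wraps past midnight
for h in range(20, 28, 2):
    print(f"{h % 24:02d}:00")  # 20:00, 22:00, 00:00, 02:00
```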
The Accumulator Pattern in Scientific Computing
accumulator pattern: Iteratively combining values into a running aggregate
The accumulator pattern is fundamental to scientific computing:
Cluster center of mass: 0.510 pc
Total cluster mass: 6.1 Msun
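A sketch of the accumulator pattern behind output like the above: two running totals built up in one loop. The masses and 1-D positions are invented for illustration.

```python
# Illustrative cluster data: masses (Msun) and 1-D positions (pc)
masses = [1.2, 0.8, 2.1, 1.5, 0.5]
positions = [0.2, 0.4, 0.5, 0.7, 0.9]

total_mass = 0.0     # accumulator: running mass total
weighted_sum = 0.0   # accumulator: running mass-weighted position
for m, x in zip(masses, positions):
    total_mass += m
    weighted_sum += m * x

center = weighted_sum / total_mass
print(f"Cluster center of mass: {center:.3f} pc")
print(f"Total cluster mass: {total_mass:.1f} Msun")
```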
Common for Loop Patterns
Python provides several useful functions for loop patterns:
Finding bright events with enumerate:
Alert! Index 4: magnitude 8.2
Parallel iteration with zip:
t=1s: v=4.9 m/s
t=2s: v=9.8 m/s
t=3s: v=14.7 m/s
t=4s: v=19.6 m/s
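A hedged sketch of the two patterns whose output appears above: `enumerate` pairs each item with its index, and `zip` walks two sequences in lockstep. The light-curve magnitudes and velocity values are illustrative.

```python
# enumerate: index + value together, no manual counter
magnitudes = [12.1, 11.8, 12.3, 11.9, 8.2, 12.0]  # illustrative light curve
for i, mag in enumerate(magnitudes):
    if mag < 9.0:  # sudden brightening -> possible transient
        print(f"Alert! Index {i}: magnitude {mag}")

# zip: parallel iteration over matched sequences
times = [1, 2, 3, 4]                 # s
velocities = [4.9, 9.8, 14.7, 19.6]  # m/s (illustrative)
for t, v in zip(times, velocities):
    print(f"t={t}s: v={v} m/s")
```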
The while Loop: Conditional Iteration
The while loop continues as long as a condition remains true:
Iteration 0
Iteration 1
Iteration 2
Convergence example:
Iter 1: value = 90.100
Iter 2: value = 81.190
Iter 3: value = 73.171
Iter 10: value = 35.519
Iter 20: value = 13.036
Iter 30: value = 5.197
Iter 40: value = 2.463
Iter 50: value = 1.510
Iter 60: value = 1.178
Iter 70: value = 1.062
Iter 80: value = 1.022
Converged to 1.009 after 88 iterations
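A sketch of the kind of loop behind the trace above: a value relaxed 10% of the way toward a target each iteration, with both a tolerance and a maximum-iteration safeguard. The 0.9 relaxation factor and 0.01 tolerance are my assumed parameters, chosen to reproduce the printed endpoint.

```python
value = 100.0
target = 1.0
tolerance = 0.01
max_iterations = 1000  # safeguard against non-convergence
iterations = 0

while abs(value - target) > tolerance and iterations < max_iterations:
    value = target + 0.9 * (value - target)  # keep 90% of the remaining gap
    iterations += 1

print(f"Converged to {value:.3f} after {iterations} iterations")
```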
Loop Control: break, continue, and else
Python provides additional loop control statements:
Using break to find first detection:
First significant detection: 5.8
Using continue to skip bad data:
Processing: 1.2
Processing: 2.3
Processing: 3.4
Loop else clause:
Target 7 not found in list
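A hedged reconstruction of the three demonstrations above (the SNR values, sentinel-marked data, and target list are invented):

```python
# break: stop at the first significant detection
snr_values = [1.1, 2.4, 5.8, 9.9]
for snr in snr_values:
    if snr > 5.0:
        print(f"First significant detection: {snr}")
        break

# continue: skip sentinel-marked bad data
data = [1.2, -999, 2.3, -999, 3.4]
for value in data:
    if value == -999:
        continue
    print(f"Processing: {value}")

# for-else: the else block runs only if the loop never hit break
targets = [1, 3, 5, 9]
for t in targets:
    if t == 7:
        print("Found 7")
        break
else:
    print("Target 7 not found in list")
```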
The pass Statement: Placeholder
The pass statement does nothing — useful as a placeholder:
Processing 0
Processing 2
Continued after pass
Nested Loops: Processing 2D Data
Loops can be nested to process multi-dimensional data:
Processing 3x3 pixel grid:
(0,0)=0 (0,1)=1 (0,2)=2
(1,0)=3 (1,1)=4 (1,2)=5
(2,0)=6 (2,1)=7 (2,2)=8
Finding peaks in 2D array:
Peak at (1,1): value=9
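A sketch of a nested-loop peak finder like the one whose output appears above. The 3x3 image is invented, and this simple version only compares the four edge-sharing neighbors of interior pixels:

```python
image = [[1, 2, 1],
         [2, 9, 3],
         [1, 2, 1]]  # illustrative image with a peak in the middle

for row in range(1, len(image) - 1):          # skip the border pixels
    for col in range(1, len(image[0]) - 1):
        val = image[row][col]
        neighbors = [image[row - 1][col], image[row + 1][col],
                     image[row][col - 1], image[row][col + 1]]
        if all(val > n for n in neighbors):   # strictly above all neighbors
            print(f"Peak at ({row},{col}): value={val}")
```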
The most common bug in all of programming! Python’s zero-indexing catches everyone:
Classic Mistake (quietly skipping data):

```python
observations = [1, 2, 3, 4, 5]

# Trying to process all elements
for i in range(1, len(observations)):  # OOPS! Skips first element
    print(observations[i])

# Or worse - going past the end
for i in range(len(observations) + 1):  # IndexError on last iteration!
    print(observations[i])
```

Remember:

- range(n) gives 0, 1, ..., n-1 (NOT including n!)
- A list of length n has indices 0 to n-1
- The last element is at index len(list) - 1

Off-by-one errors show up everywhere. Double-check your ranges!
Don’t worry — everyone writes an infinite loop occasionally! Here are two common causes:
Case 1: Floating-point precision prevents exact equality

```python
x = 0.0
while x != 1.0:  # INFINITE LOOP!
    x += 0.1     # After 10 additions, x is approximately 0.9999999999

# Fix: Use a tolerance
while abs(x - 1.0) > 1e-10:
    x += 0.1
```

Case 2: Forgetting to update the loop variable

```python
i = 0
while i < 10:
    print("still looping...")
    # Forgot: i += 1  # INFINITE LOOP!
```

Always add a maximum iteration safeguard!
3.5 List Comprehensions: Elegant and Efficient
Now that you’ve mastered loops, let’s evolve them into something even more powerful! List comprehensions are Python’s gift to scientific programmers. They transform verbose loops into concise, readable, and faster expressions.
From Loop to Comprehension
Loop result: [0, 4, 16, 36, 64]
Comprehension: [0, 4, 16, 36, 64]
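The two equivalent versions that produce the output above (squares of the even numbers below 10):

```python
# Loop version: three lines of bookkeeping
squares = []
for n in range(10):
    if n % 2 == 0:
        squares.append(n ** 2)
print("Loop result:", squares)

# Comprehension: the same filter-and-map in one readable line
squares2 = [n ** 2 for n in range(10) if n % 2 == 0]
print("Comprehension:", squares2)
```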
Real Scientific Applications
Observable star count: 6/9
Brightest flux: 1.91e-05
Faintest flux: 9.12e-07
Bright stars dictionary: {'star_0': 12.3, 'star_2': 13.7, 'star_4': 14.5, 'star_6': 11.8, 'star_8': 13.2}
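A hedged sketch of comprehensions like those behind the output above; the catalog, the survey limit, and the magnitude-to-flux mapping are illustrative (flux here is a relative value, 10^(-0.4 m)):

```python
# Hypothetical catalog: star names -> apparent magnitudes
catalog = {'star_0': 12.3, 'star_1': 16.1, 'star_2': 13.7,
           'star_3': 15.8, 'star_4': 14.5}
limit = 15.0  # illustrative survey limit

# Filtering comprehension: which stars are observable?
observable = [m for m in catalog.values() if m < limit]
print(f"Observable star count: {len(observable)}/{len(catalog)}")

# Mapping comprehension, then reductions
fluxes = [10 ** (-0.4 * m) for m in catalog.values()]
print(f"Brightest flux: {max(fluxes):.2e}")
print(f"Faintest flux: {min(fluxes):.2e}")

# Dict comprehension: keep only the bright stars, names attached
bright_stars = {name: m for name, m in catalog.items() if m < limit}
print("Bright stars dictionary:", bright_stars)
```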
When NOT to Use Comprehensions
Galaxy classifications: ['nearby faint', 'distant bright', 'nearby bright', 'distant faint']
3.6 Advanced Control Flow Patterns
Now let’s explore powerful patterns that appear throughout scientific computing. These aren’t just code tricks — they’re fundamental algorithmic building blocks!
Welford’s Algorithm: Numerically Stable Statistics
Stable algorithm: mean=100000000.5, std=1.12
Naive variance (can go negative due to cancellation): 2.000e+00
Naive std (after clipping at 0 for display): 1.41
Why is Welford’s algorithm numerically stable?
The naive approach (sum all values, then divide) accumulates large sums that can lose precision. For values like [1e8, 1e8+1, 1e8+2], the sum becomes ~3e8, and the tiny variations (1, 2) get lost in floating-point representation.
Welford’s algorithm maintains a running mean and updates it incrementally with small deltas. For the values [1e8, 1e8+1, 1e8+2], instead of forming the huge sum 3e8 + 3, it computes:

- mean <- 1e8
- mean <- mean + ((1e8 + 1) - mean) / 2 = 1e8 + 0.5
- mean <- mean + ((1e8 + 2) - mean) / 3 = 1e8 + 1

Each update involves only small deltas (here 1 and 1.5), keeping all arithmetic operations on similar scales and preserving precision.

The variance calculation similarly avoids subtracting large nearly-equal numbers (catastrophic cancellation) by accumulating squared deviations incrementally.
Why not use log space? Students often ask why we don’t use logarithms here like we did for the luminosity calculation. Log space is perfect for products (multiplication becomes addition) but wrong for statistics. Welford’s algorithm computes the arithmetic mean and variance, which require addition. In log space, you’d compute the geometric mean instead, a completely different statistic! Plus, log fails on negative values, common in astronomy (radial velocities, position residuals). Welford’s incremental approach is already optimal.
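The incremental update described above fits in a dozen lines. This is a hedged sketch (not the chapter's folded implementation), computing the sample variance in a single pass:

```python
def welford(values):
    """One-pass, numerically stable mean and sample variance."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    n = 0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n            # small-delta update of the mean
        m2 += delta * (x - mean)     # note: uses the *updated* mean
    variance = m2 / (n - 1) if n > 1 else 0.0
    return mean, variance

# Values that defeat the naive sum-of-squares approach
data = [1e8, 1e8 + 1, 1e8 + 2]
mean, var = welford(data)
print(mean, var)
```

Because every `delta` is order 1 even though the values are order 1e8, no operation ever subtracts two nearly-equal large numbers.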
PATTERN: Iterative Convergence
```
initialize state
iteration_count = 0
while not converged and iteration_count < max_iterations:
    new_state = update(state)
    converged = check_convergence(state, new_state, tolerance)
    state = new_state
    iteration_count += 1
if not converged:
    handle_failure()
```

This pattern appears throughout computational science:

- Kepler's equation solver (finding the eccentric anomaly)
- Stellar structure integration (hydrostatic equilibrium)
- Radiative transfer (temperature iterations)
- N-body orbit integration (adaptive timesteps)
Master this pattern and you’ve mastered half of computational physics!
3.7 Debugging Control Flow
Logic errors are the hardest bugs because the code runs without crashing but produces wrong results. Let’s build your debugging arsenal!
Strategic Print Debugging
Testing convergence:
Iter 0: 0.0000 -> 10.0000 (delta=+10.00000)
Iter 1: 10.0000 -> 19.0000 (delta=+9.00000)
Iter 2: 19.0000 -> 27.1000 (delta=+8.10000)
Iter 5: 40.9510 -> 46.8559 (delta=+5.90490)
Iter 10: 65.1322 -> 68.6189 (delta=+3.48678)
Iter 15: 79.4109 -> 81.4698 (delta=+2.05891)
FAILED after 20 iterations
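A sketch of the kind of instrumented loop that produces a trace like the one above. The update rule (relax 10% of the way toward 100) and the tolerance are invented, and the tolerance is deliberately too tight for 20 iterations so the failure branch fires; the point is the selective printing.

```python
def relax(x):
    return x + 0.1 * (100.0 - x)  # illustrative update rule

x = 0.0
max_iter = 20
tolerance = 0.5  # deliberately unreachable within max_iter
converged = False

for i in range(max_iter):
    new_x = relax(x)
    delta = new_x - x
    # Strategic printing: every early iteration, then periodic spot checks
    if i < 3 or i % 5 == 0:
        print(f"Iter {i}: {x:.4f} -> {new_x:.4f} (delta={delta:+.5f})")
    if abs(delta) < tolerance:
        converged = True
        break
    x = new_x

if not converged:
    print(f"FAILED after {max_iter} iterations")
```

Printing the old value, new value, and signed delta on one line makes stalls, oscillations, and divergence visible at a glance.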
Using Assertions for Validation
The assert statement helps catch bugs during development:
Average magnitude: 10.33
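A hedged sketch of an assertion-guarded helper that would produce output like the above (the magnitudes and the plausibility bounds are invented):

```python
def average_magnitude(mags):
    """Development-time self-checks via assert (not user-input validation)."""
    assert len(mags) > 0, "Need at least one magnitude"
    avg = sum(mags) / len(mags)
    assert 0 < avg < 30, f"Average magnitude {avg} outside plausible range"
    return avg

print(f"Average magnitude: {average_magnitude([9.5, 10.2, 11.3]):.2f}")
```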
Never use assertions for user input validation or critical checks! Assertions can be completely disabled when Python runs with optimization (python -O), causing them to be skipped entirely.
WRONG - Don’t do this for user-facing code:

```python
import math

def process_user_data(value):
    assert value > 0  # DANGEROUS! Might not run in production!
    return math.sqrt(value)
```

RIGHT - Use explicit validation for production:

```python
import math

def process_user_data(value):
    if value <= 0:
        raise ValueError(f"Value must be positive, got {value}")
    return math.sqrt(value)
```

When to use assertions:

- Documenting internal assumptions during development
- Catching programming errors early (not user errors)
- Self-checks in algorithms (but have a fallback plan)
- Test suites and debugging

When NOT to use assertions:

- Validating user input
- Checking file existence or permissions
- Network availability checks
- Any check that must run in production
Think of assertions as “developer notes that can catch bugs” rather than “guards that protect your code.”
The Kepler mission’s pipeline used exactly the control flow patterns you just learned. Here’s an intentionally simplified pseudocode skeleton showing the shape of that logic:
Note: This algorithm is greatly simplified for pedagogical purposes. The actual Kepler pipeline used sophisticated techniques including Fourier transforms, multiple detrending algorithms, and extensive validation checks. However, the control flow patterns shown here — guard clauses, filtering, iteration, and conditional validation — formed the backbone of the real system.
```python
#| eval: false
# PSEUDOCODE: helper functions/variables like median, sigma, and fold_light_curve
# are placeholders to highlight control flow, not implementation details.
def kepler_planet_search(star_id, light_curve):
    """Simplified Kepler planet detection algorithm"""
    # Guard clause - data quality check
    if len(light_curve) < 1000:
        return None

    # Remove outliers (cosmic rays, etc.)
    cleaned = [point for point in light_curve
               if abs(point - median) < 5 * sigma]

    # Search for periodic dips
    best_period = None
    best_depth = 0
    for trial_period in range(1, 365):  # Days
        folded = fold_light_curve(cleaned, trial_period)
        depth = measure_transit_depth(folded)
        if depth > best_depth and depth > 3 * noise_level:
            best_period = trial_period
            best_depth = depth

    # Validate as planet (not eclipsing binary)
    if best_period:
        if is_v_shaped(folded):  # Binary check
            return None
        if best_depth > 0.5:  # Too deep
            return None
        return {'period': best_period, 'depth': best_depth}
    return None
```

Kepler monitored on the order of ~150,000 stars. The control flow patterns you’ve mastered in this chapter — guard clauses, filtering, iteration, and validation — are exactly what enables pipelines to scale. For a real overview, see Jenkins et al. (2010): https://ui.adsabs.harvard.edu/abs/2010ApJ...713L..87J/abstract (and the open-source code: https://github.com/nasa/kepler-pipeline).
A telescope scheduling system has a subtle bug in its priority logic. Can you find and fix it?
Priority (galaxy, time-critical): 70
Priority (asteroid, time-critical): 80
Priority (dim variable star): 50
The Bug: Operator precedence + missing parentheses.
This line:
elif obj_type == 'variable_star' and magnitude < 12 or time_critical:
is parsed as:
(obj_type == 'variable_star' and magnitude < 12) or time_critical
So when time_critical is True, that branch triggers for every object type (including galaxies and asteroids), and it can even steal priority from later elif cases.
Fix: Add parentheses to match the intended logic.
```python
def assign_telescope_priority_fixed(observation):
    """Fixed version with clearer logic."""
    magnitude = observation['magnitude']
    obj_type = observation['type']
    time_critical = observation['time_critical']

    # Assign base priority by object type
    if obj_type == 'supernova':
        priority = 100
    elif obj_type == 'asteroid' and time_critical:
        priority = 80
    elif obj_type == 'variable_star' and (magnitude < 12 or time_critical):
        priority = 70
    elif obj_type == 'variable_star':
        priority = 50
    elif obj_type == 'galaxy':
        priority = 30
    else:
        priority = 10

    # Boost for bright objects
    if magnitude < 10:
        priority += 20
    return priority
```

Key Lessons:
- Remember precedence: not > and > or
- Add parentheses when mixing and/or
- Write a tiny test case that breaks the buggy logic
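The “tiny test case” lesson can be made concrete. A minimal sketch (the observation values are invented) that evaluates the buggy and the parenthesized expressions side by side for a time-critical galaxy:

```python
# Hypothetical observation that exposes the bug: a time-critical galaxy
obj_type = 'galaxy'
magnitude = 15.0
time_critical = True

# Buggy parse: and binds tighter than or, so the or is evaluated last
buggy = obj_type == 'variable_star' and magnitude < 12 or time_critical

# Intended logic: the or belongs inside the variable-star check
fixed = obj_type == 'variable_star' and (magnitude < 12 or time_critical)

print(buggy)  # True  -> galaxy wrongly grabs the variable-star branch
print(fixed)  # False -> galaxy falls through to its own branch
```

Any object with `time_critical=True` exposes the bug, because the trailing `or time_critical` rescues the whole expression regardless of object type.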
Main Takeaways
What an incredible journey you’ve just completed! You’ve transformed from someone who writes code line by line to someone who designs algorithms systematically. This transformation mirrors the evolution every computational scientist goes through, from tentative beginner to confident algorithm designer.
You started by learning to think in pseudocode, a skill that gives you the power to design before you code. Those three levels of refinement you practiced are your blueprint for success. Every hour you invest in pseudocode saves many hours of debugging. When you design your next algorithm for analyzing galaxy spectra or simulating stellar evolution, you’ll catch logical flaws on paper instead of after hours of computation.
The complete set of comparison and logical operators you’ve mastered — from simple greater-than checks to complex boolean combinations with and, or, and not — gives you the full vocabulary for expressing any logical condition. You understand that == is dangerous with floats, that is checks identity not equality, and that in elegantly tests membership. These aren’t just syntax details; they’re the building blocks of every data validation, every convergence check, every quality filter you’ll ever write.
Your understanding of conditional statements goes beyond syntax to defensive programming philosophy. Guard clauses help you fail fast and avoid wasting hours (or compute budgets) on invalid inputs. The elif chains you practiced will classify objects, determine processing paths, and control how your code responds to real, messy data.
The loop patterns you’ve mastered are universal across scientific computing. Accumulators power statistics and reductions, convergence loops power solvers, and nested loops show up whenever you traverse grids or images. Whether using for loops to iterate through catalogs, while loops to converge solutions, or list comprehensions to filter data, you now have the core toolkit.
Most importantly, you’ve learned that bugs aren’t failures — they’re learning opportunities. Every infinite loop teaches you about termination conditions. Every off-by-one error sharpens your grasp of indexing. The debugging strategies you’ve developed, from strategic print statements to assertions, will serve you throughout your career.
Remember that every major computational achievement relies on these fundamentals. You’re not just learning Python syntax — you’re building algorithmic literacy that transfers across domains.
Definitions
Accumulator Pattern: An algorithmic pattern where values are iteratively combined into a running total or aggregate, fundamental to reductions and statistical calculations in scientific data processing.
Adaptive Refinement: A universal pattern where parameters are adjusted based on quality metrics, with safeguards against infinite refinement, appearing in timestepping, mesh refinement, and optimization throughout computational science.
and: Logical operator that returns True only if both operands are true, using short-circuit evaluation.
assert: Statement that raises an AssertionError if a condition is false, used for debugging and documenting assumptions during development (not for production validation).
Boolean Logic: The system of true/false values and logical operations (and, or, not) that underlies all conditional execution in programs.
break: Statement that immediately exits the current loop, skipping any remaining iterations.
Conditional Statement: A control structure (if/elif/else) that executes different code blocks based on whether conditions evaluate to true or false.
continue: Statement that skips the rest of the current loop iteration and proceeds to the next iteration.
elif: “Else if” statement that checks an additional condition when the previous if or elif was false.
else: Clause that executes when all previous if/elif conditions were false, or when a loop completes without breaking.
for: Loop that iterates over elements in a sequence or iterable object.
Guard Clause: A conditional statement at the beginning of a function that handles special cases or invalid inputs immediately, preventing deep nesting.
if: Statement that executes code only when a specified condition is true.
in: Operator that tests membership in a sequence or collection.
is: Operator that tests object identity (same object in memory), not just equality of values.
List Comprehension: A concise Python syntax for creating lists: [expression for item in iterable if condition].
not: Logical operator that inverts a boolean value (True becomes False, False becomes True).
or: Logical operator that returns True if at least one operand is true, using short-circuit evaluation.
pass: Null statement that does nothing, used as a placeholder where syntax requires a statement.
Pseudocode: A human-readable description of an algorithm that focuses on logic and structure without syntactic details.
Sentinel Value: A special marker value (like -999, 'END', or None) that signals the end of data or a special condition, allowing loops to know when to stop processing.
Short-circuit Evaluation: The behavior where logical operators stop evaluating as soon as the result is determined.
Walrus Operator (:=): Assignment expression operator (Python 3.8+) that assigns a value to a variable as part of an expression, allowing both assignment and testing in a single statement.
while: Loop that continues executing as long as a specified condition remains true.
Key Takeaways
✓ Pseudocode reveals logical flaws before they become bugs — always design before implementing
✓ Master all six comparison operators (>, <, >=, <=, ==, !=) and three logical operators (and, or, not)
✓ Never use == with floating-point numbers; always use tolerance-based comparisons like math.isclose()
✓ The is operator checks identity, not equality — use it for None checks
✓ The in operator elegantly tests membership in sequences or strings
✓ Guard clauses handle special cases first, making main logic clearer
✓ for loops iterate over sequences, while loops continue until a condition becomes false
✓ break exits loops early, continue skips to the next iteration, else runs if loop completes
✓ List comprehensions are faster than loops for simple transformations but become unreadable for complex logic
✓ Short-circuit evaluation in and/or prevents errors and improves performance
✓ The accumulator pattern is fundamental to scientific computing, appearing in all statistical calculations
✓ Always include maximum iteration limits in while loops to prevent infinite loops
✓ Welford’s algorithm solves numerical stability issues in streaming statistics
✓ Use assert statements to document and enforce assumptions during development
✓ Every major scientific discovery relies on the control flow patterns you’ve learned
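Since Welford’s algorithm appears in the takeaways, here is a minimal sketch of it; the function name and sample data are ours, but the update formulas are the standard ones:

```python
def streaming_stats(data):
    """Welford's algorithm: numerically stable mean and variance in one pass."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n          # update the running mean
        m2 += delta * (x - mean)   # note: uses the *new* mean
    variance = m2 / (n - 1) if n > 1 else 0.0
    return mean, variance

mean, var = streaming_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, var)  # mean 5.0, sample variance 32/7 ≈ 4.571
```

Unlike the naive sum-of-squares formula, the running-mean update never subtracts two large, nearly equal numbers, which is what keeps it numerically stable for long data streams.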
Quick Reference Tables
**Comparison operators**

| Operator | Description | Example |
|---|---|---|
| `>` | Greater than | `if magnitude > 6.0:` |
| `<` | Less than | `if redshift < 0.1:` |
| `>=` | Greater or equal | `if snr >= 5.0:` |
| `<=` | Less or equal | `if error <= tolerance:` |
| `==` | Equal (avoid with floats!) | `if status == 'complete':` |
| `!=` | Not equal | `if flag != -999:` |
**Logical operators**

| Operator | Description | Example |
|---|---|---|
| `and` | Both must be true | `if x > 0 and y > 0:` |
| `or` | At least one true | `if bright or variable:` |
| `not` | Inverts boolean | `if not converged:` |
**Membership, identity, and assignment expressions**

| Operator | Description | Example |
|---|---|---|
| `in` | Membership test | `if 'fits' in filename:` |
| `is` | Identity test | `if result is None:` |
| `is not` | Negative identity | `if data is not None:` |
| `not in` | Negative membership test | `if 'error' not in log_file:` |
| `:=` | Walrus operator (Python 3.8+) | `if (n := len(data)) > 100:` |
**Control flow statements**

| Statement | Purpose | Example |
|---|---|---|
| `if/elif/else` | Conditional execution | `if mag < 6: visible = True` |
| `for` | Iterate over sequence | `for star in catalog:` |
| `while` | Loop while condition true | `while error > tolerance:` |
| `break` | Exit loop early | `if converged: break` |
| `continue` | Skip to next iteration | `if bad_data: continue` |
| `pass` | Do nothing (placeholder) | `if not ready: pass` |
| `assert` | Debug check | `assert len(data) > 0` |
**Loop helper functions**

| Function | Purpose | Example |
|---|---|---|
| `range(n)` | Generate 0 to n-1 | `for i in range(10):` |
| `range(start, stop, step)` | Generate with step | `for i in range(0, 10, 2):` |
| `enumerate(seq)` | Get index and value | `for i, val in enumerate(data):` |
| `zip(seq1, seq2)` | Parallel iteration | `for x, y in zip(xs, ys):` |
| `len(seq)` | Sequence length | `for i in range(len(data)):` |
**Validity and type checks**

| Function | Purpose | Example |
|---|---|---|
| `all(iterable)` | All elements true | `if all(x > 0 for x in data):` |
| `any(iterable)` | Any element true | `if any(x < 0 for x in data):` |
| `math.isclose()` | Safe float comparison | `if math.isclose(a, b):` |
| `math.isfinite()` | Check not inf/nan | `if math.isfinite(result):` |
| `math.isnan()` | Check for NaN | `if not math.isnan(value):` |
| `math.isinf()` | Check for infinity | `if math.isinf(value):` |
| `isinstance()` | Type checking | `if isinstance(x, float):` |
**Universal algorithmic patterns**

| Pattern | Purpose | Structure |
|---|---|---|
| Accumulator | Aggregate values | `total = 0; for x in data: total += x` |
| Filter | Select subset | `[x for x in data if condition(x)]` |
| Map | Transform all | `[f(x) for x in data]` |
| Search | Find first match | `for x in data: if test(x): return x` |
| Convergence | Iterate to solution | `while not converged and n < max:` |
| Guard clause | Handle edge cases | `if invalid: return None` |
| Sentinel | Signal termination | `if value == -999: break` |
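The Convergence pattern deserves a full example. A minimal sketch using Newton’s method for square roots (our choice of problem, not from the chapter), combining a tolerance test with a hard iteration cap:

```python
import math

def newton_sqrt(a, tolerance=1e-12, max_iterations=100):
    """Convergence pattern: iterate until tolerance OR the iteration cap."""
    x = a  # initial guess
    for n in range(max_iterations):
        x_new = 0.5 * (x + a / x)  # Newton update for f(x) = x^2 - a
        if math.isclose(x, x_new, rel_tol=tolerance):
            return x_new  # converged: successive iterates agree
        x = x_new
    raise RuntimeError(f"No convergence after {max_iterations} iterations")

print(newton_sqrt(2.0))  # ≈ 1.4142135623730951
```

A `for` loop with a cap is a drop-in alternative to `while not converged and n < max:` with the same guarantee: the loop cannot run forever.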
Python Module & Method Reference (Chapter 3 Additions)
New Built-in Functions
Logical Testing
- `all(iterable)` - Returns True if all elements are true
- `any(iterable)` - Returns True if any element is true
- `isinstance(obj, type)` - Check if object is of specified type
Loop Support
- `enumerate(iterable, start=0)` - Returns index-value pairs
- `zip(*iterables)` - Combines multiple iterables for parallel iteration
- `range(start, stop, step)` - Generate arithmetic progression
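A quick demonstration of all three loop helpers together, on made-up star data:

```python
names = ['Vega', 'Sirius', 'Deneb']
mags = [0.03, -1.46, 1.25]

# enumerate: index and value together, starting the count at 1
for i, name in enumerate(names, start=1):
    print(f"{i}. {name}")  # 1. Vega, 2. Sirius, 3. Deneb

# zip: iterate over two sequences in parallel
pairs = [(name, mag) for name, mag in zip(names, mags)]

# range with a step: every other index
evens = list(range(0, 6, 2))  # [0, 2, 4]
```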
Control Flow Keywords
Conditionals
- `if` - Execute block if condition is true
- `elif` - Check additional condition if previous was false
- `else` - Execute if all previous conditions were false
Loops
- `for` - Iterate over sequence
- `while` - Loop while condition is true
- `break` - Exit loop immediately
- `continue` - Skip to next iteration
- `else` - Execute if loop completes without break
Other
- `pass` - Null operation placeholder
- `assert` - Raise AssertionError if condition is false
Operators
Comparison
- `>`, `<`, `>=`, `<=`, `==`, `!=` - Numerical comparisons
- `is`, `is not` - Identity comparisons
- `in`, `not in` - Membership testing
Logical
- `and` - Logical AND with short-circuit evaluation
- `or` - Logical OR with short-circuit evaluation
- `not` - Logical NOT (inversion)
New Math Module Functions
- `import math`
- `math.isclose(a, b, rel_tol=1e-9, abs_tol=0.0)` - Safe floating-point comparison
- `math.isfinite(x)` - Check if neither infinite nor NaN
- `math.isnan(x)` - Check if value is NaN
- `math.isinf(x)` - Check if value is infinite
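Used together, these functions make a compact validity filter; the sample readings below are invented:

```python
import math

readings = [1.0, float('nan'), float('inf'), 0.1 + 0.2]

# Keep only finite values (one test drops both NaN and infinity)
finite = [x for x in readings if math.isfinite(x)]

# Tolerance-based comparison instead of ==
print(0.1 + 0.2 == 0.3)              # False: binary rounding error
print(math.isclose(0.1 + 0.2, 0.3))  # True: within relative tolerance
```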
Debugging Support
IPython Magic Commands
- `%debug` - Enter debugger after exception
- `%pdb` - Automatic debugger on exceptions
Debugger Commands (when in pdb)
- `p variable` - Print variable value
- `pp variable` - Pretty-print variable
- `l` - List code around current line
- `n` - Next line
- `s` - Step into function
- `c` - Continue execution
- `u` / `d` - Move up/down call stack
- `q` - Quit debugger
Next Chapter Preview
You’ve conquered control flow — now get ready for the next level! Chapter 4 will reveal how to organize data efficiently using Python’s powerful data structures. You’ll discover when to use lists versus dictionaries versus sets, and more importantly, you’ll understand why these choices can make your algorithms run 100 times faster or 100 times slower.
Imagine trying to find a specific star in a catalog of millions. With a list, you’d check each star one by one — taking minutes or hours. With a dictionary, you’ll find it instantly — in microseconds! The data structures you’ll learn next are the difference between simulations that finish in minutes and ones that run for days.
The control flow patterns you’ve mastered here will operate on the data structures you’ll learn next. Your loops will iterate through dictionaries of astronomical objects. Your conditionals will filter sets of observations. Your comprehensions will transform lists of measurements into meaningful results. Together, control flow and data structures give you the power to handle the massive datasets of modern science — from Gaia’s billion-star catalog to the petabytes of data from the Square Kilometre Array.
Get excited — Chapter 4 is where your code goes from processing dozens of data points to handling thousands efficiently!