Project 1: Published Model to Verified Code
COMP 536 | Short Projects
The Big Idea
Real computational scientists don’t just write code — they turn models into instruments:
- Take a published model (paper)
- Extract the equations + coefficients
- Implement with correct units + careful validation
- Prove correctness with tests + reproducible plots
The domain here is stellar astrophysics. The skills are universal.
Who this is for
This is a 500-level computational science assignment. You are expected to:
- read technical material with intent
- reason from specs (derive code from requirements, not guesswork)
- debug systematically (form hypotheses, test them)
You are not expected to already know astrophysics.
If terms like “contracts,” “invariants,” or “validation-first” are unfamiliar, read the Software Engineering for Scientists guide first. It covers what to think about before you touch the keyboard.
If the astrophysics vocabulary is unfamiliar, skim the companion page first: Project 1 Science Background
Overview
| Assigned | Tuesday, January 27, 2026 |
| Due | Tuesday, February 10, 2026 (11:59 pm PT) |
| Duration | 2 weeks |
Learning Objectives (in priority order): software engineering and reproducibility, then model formulation, then numerical methods
Textbook Reference: Python Fundamentals Module, Chapters 1-6
This project follows the course AI Use & Growth Mindset Policy.
For Project 1, assume Phase 1 rules apply: struggle first (about 20–30 minutes of real effort), then use AI only for tutoring/clarification (not writing or restructuring your core solution code).
Background
Stars on the Zero-Age Main Sequence (ZAMS) represent stars at the moment they begin stable hydrogen fusion in their cores — essentially their “birth” as true stars. The ZAMS is a theoretical starting point where stars have just achieved hydrostatic equilibrium between gravity and radiation pressure, with a homogeneous composition throughout.
Tout et al. (1996): Download PDF from ADS
This paper provides fitting formulae for ZAMS luminosity and radius as functions of mass and metallicity. You will extract equations and coefficients from this paper.
Stars don’t stay at their ZAMS properties! They evolve continuously on the main sequence. The Sun was about 30% less luminous when it formed 4.6 billion years ago (\(L_{\odot,\mathrm{ZAMS}} \approx 0.7\, L_\odot\)) compared to today. If you get \(L = 1.0\, L_\odot\) for a 1 \(M_\odot\) star, you have the wrong model!
You are not being asked to become a stellar-physics expert. You are being asked to practice a realistic workflow: turn a published model into verified code.
Recommended workflow (about 20–40 min total):
- Skim pass (5–10 min): abstract -> introduction -> section headings -> conclusions. Goal: learn what the model outputs are and where the actual model lives.
- Locate the model equations: find the specific equations defining the mapping \(M \to L\), \(R\), \(T_\mathrm{eff}\).
- Locate the parameter tables: identify which table(s) contain the coefficients used in those equations.
- Translate paper -> code (faithfully):
- One function per equation (small, testable units)
- Coefficients/constants live in
constants.py(single source of truth) - Docstrings cite which equation/table the function implements
- Validate early, validate often: run your
python run.py validatebefore making plots. Most mistakes are transcription/interpretation issues — not “math errors.”
Most failures come from “paper parsing,” not coding ability. Watch for:
log10vsln: papers often mean base-10 logs unless stated otherwise. NumPy’snp.log()is natural log — usenp.log10()instead.- Units & normalization: Tout et al. uses solar units (\(M/M_\odot\), \(L/L_\odot\)). Don’t mix them with CGS.
- Piecewise regimes: many fits change coefficients across different mass ranges. Make sure your code uses the correct branch.
- Scientific notation in tables:
1.23E-04style values are easy to mistype. Double-check exponent signs. - Transcription errors: a single missing minus sign or exponent will destroy your curve.
- Vectorization gotchas:
np.power()for arrays, notmath.pow(). - Table structure: The coefficient tables have 5 columns (\(a, b, c, d, e\)) — make sure you’re reading rows and columns correctly.
Required Modules
| Module | Class? | Required Implementation |
|---|---|---|
| constants.py | No | Define physical constants in CGS units |
| zams.py | No | ZAMS luminosity and radius functions |
| star.py | Yes | Star class with physics methods |
| stellar_population.py | Yes | Collection of stars with vectorization |
| astro_plot.py | No | Reusable plotting functions |
Testing Requirements
Write at least 8 tests in tests/ covering:
| Category | Minimum | Examples |
|---|---|---|
| Scalar inputs | 2 | test_luminosity_solar_mass(), test_radius_solar_mass() |
| Array inputs | 2 | test_luminosity_array_shape(), test_population_vectorized() |
| Invalid inputs | 2 | test_negative_mass_raises(), test_zero_mass_raises() |
| \(Z\) dependence | 2 | test_metallicity_effect_on_luminosity(), test_low_Z_higher_luminosity() |
Functions should raise ValueError for masses outside \([0.1, 100]\, M_\odot\) or metallicities outside \([0.0001, 0.03]\).
Trend tests for metallicity:
def test_luminosity_increases_with_mass():
"""L(M) should increase strongly with mass."""
masses = np.logspace(-1, 2, 100) # 0.1 to 100 M_sun
L = luminosity(masses)
assert np.all(np.diff(L) > 0), "L must increase with M"
def test_metallicity_effect_on_luminosity():
"""Higher Z -> lower L at fixed mass."""
M = 1.0
L_solar = luminosity(M, Z=0.02)
L_low_Z = luminosity(M, Z=0.0001)
assert L_low_Z > L_solar, "Low-Z stars should be more luminous"pytest discovers tests automatically by naming convention. When you run pytest, it:
- Scans for files matching
test_*.pyor*_test.py - Imports each file and looks for functions named
test_* - Runs each discovered function as a test
| Pattern | What pytest does |
|---|---|
test_*.py |
Import as test module |
def test_*() |
Run as a test |
class Test* |
Look inside for test_* methods |
| Anything else | Ignore |
No registration, no decorators, no config. Just name things correctly.
import pytest
import numpy as np
from zams import luminosity
def test_solar_mass_luminosity():
"""Docstring explains what the test checks."""
L = luminosity(1.0)
assert L == pytest.approx(0.698, rel=0.02) # 2% tolerance
def test_luminosity_preserves_shape():
masses = np.array([0.5, 1.0, 10.0])
L = luminosity(masses)
assert L.shape == masses.shape
def test_negative_mass_raises():
with pytest.raises(ValueError):
luminosity(-1.0)Run tests: pytest -v or python run.py test
Submission Contract (read this first)
This course uses GitHub Classroom + CI. That means I need to be able to run every project the same way, on a clean clone, without clicking around in notebooks.
Required repo entrypoint: run.py
Your repository must include a run.py file at the repository root that supports these commands:
python run.py --help
python run.py validate
python run.py make-figures
python run.py testMinimum requirements:
python run.py testruns your test suite (viapytest) and exits non-zero if anything fails.python run.py validateruns fast numerical sanity checks (including the 1 \(M_\odot\) reference values) and exits non-zero if checks fail.python run.py make-figuresgenerates the required figures intooutputs/figures/(no interactive prompts).- Generated outputs (including
outputs/figures/) should be reproducible and should not be committed to git. - Everything is deterministic by default (fixed random seed unless explicitly overridden).
You must also document how to run these commands in your README.md (a short “Quickstart” section is sufficient).
Your README.md must also include a short Overview (3–6 sentences) describing what the project does (what model you implemented and what outputs the repo produces). Scientific interpretation belongs in research_memo.pdf, but the README should orient a reader on a clean clone.
No notebooks for submission
For this course, .ipynb notebooks are not accepted unless a project explicitly says otherwise. Use scripts and modules so CI and graders can run your work non-interactively.
Keep your repo clean (graded)
Your repo should be easy to review. You can lose points if it’s hard for me to run or audit your work. Common issues:
- committing large generated files (plots, data dumps) outside
outputs/, - unclear “main” entrypoint (no
run.py, or multiple competing entrypoints), - tangled imports / code copied into notebooks,
- environment clutter (virtualenvs, caches, OS cruft) committed to git.
Part A: Constants Module
Create constants.py with physical constants in CGS units. Include at minimum:
LSUN,RSUN,MSUN,TSUN— solar reference valuesG,SIGMA_SB— fundamental constantsYEAR,MYR,GYR— time unit conversions
Constants Documentation (required)
Include a constants table in README.md or CONSTANTS.md:
| Name in code | Meaning | Value | Units | Source |
|---|---|---|---|---|
MSUN |
Solar mass | g | ||
RSUN |
Solar radius | cm | ||
LSUN |
Solar luminosity | erg/s | ||
TSUN |
Solar effective temperature | K | ||
SIGMA_SB |
Stefan-Boltzmann constant | erg/(cm\(^2\) s K\(^4\)) | ||
| … | … |
You will lose points for unexplained “mystery numbers.” Every constant in code must appear in constants.py AND in your constants table with a source citation.
Part B: ZAMS Functions Module
Create zams.py implementing Tout et al. (1996) relations for luminosity and radius as functions of mass and metallicity.
Required: Full metallicity dependence
Your ZAMS functions must implement the full \(Z\)-dependence from Equations 3 and 4 of Tout et al. (1996), using all five columns (\(a, b, c, d, e\)) from Tables 1 and 2:
def luminosity(mass, Z=0.02):
"""ZAMS luminosity from Tout et al. (1996) Eq. 1.
Parameters
----------
mass : float or array
Stellar mass in solar units (M/M_sun)
Z : float, optional
Metallicity (default: 0.02 = solar)
Returns
-------
L : float or array
Luminosity in solar units (L/L_sun)
"""The Z=0.02 default means your code works for solar metallicity without specifying \(Z\), but you can also call luminosity(1.0, Z=0.004) for sub-solar metallicity.
Part C: Star Class
Create star.py for individual star objects.
A star has mass, luminosity, radius, temperature, and derived quantities like \(t_\mathrm{MS}\). Without a class, you’d pass tuples like (M, L, R, T) everywhere — easy to mix up argument order. The Star class bundles these together: star.luminosity and star.kelvin_helmholtz_time are unambiguous attributes. Everything is computed once at initialization.
Required constructor signature:
class Star:
def __init__(self, mass, Z=0.02):
"""Initialize a ZAMS star.
Parameters
----------
mass : float
Stellar mass in solar units (M/M_sun)
Z : float, optional
Metallicity (default: 0.02 = solar)
"""Required attributes (computed at initialization):
Your Star should compute and store these as attributes when __init__ runs:
luminosity— fromzams.luminosity(mass, Z)radius— fromzams.radius(mass, Z)effective_temperature(\(T_\mathrm{eff}\))main_sequence_lifetime(\(t_\mathrm{MS}\)) — in Gyrkelvin_helmholtz_time(\(t_\mathrm{KH}\)) — in Myr
Your Star class should call the functions from zams.py — don’t reimplement the Tout fits. This is the single-responsibility principle: zams.py owns the model; Star owns the interface.
Part D: StellarPopulation Class
Create stellar_population.py to handle collections of stars.
A StellarPopulation is not just a list of Star objects — it’s a collection that supports bulk operations. Vectorized methods like population.luminosities compute all values at once without Python loops. This pattern (collection class with vectorized methods) is common in scientific computing: think DataFrames, xarray Datasets, or astropy Tables.
Required attributes (computed at initialization):
When __init__ runs, compute and store these as array attributes:
masses— the input mass arrayluminosities— fromzams.luminosity(masses, Z)(one vectorized call)radii— fromzams.radius(masses, Z)(one vectorized call)effective_temperatures— computed from \(L\) and \(R\)
Call zams.luminosity(masses) and zams.radius(masses) directly on the full array — don’t loop over individual Star objects.
Optional Extension (IMF Sampling):
- Pareto distribution sampling for power-law mass functions
- Comparison with uniform sampling
- Histogram visualization of mass distributions
Mass Sampling Guidance
For analysis: Log-uniform sampling
Sample masses logarithmically for fair coverage across orders of magnitude:
def sample_masses_loguniform(n, m_min, m_max, seed=None):
rng = np.random.default_rng(seed)
log_m = rng.uniform(np.log10(m_min), np.log10(m_max), n)
return 10**log_mMust support a seed argument for reproducibility.
For required plots: Deterministic grid
Do NOT rely on randomness for required plots. Use a deterministic log-spaced grid:
masses = np.logspace(np.log10(M_min), np.log10(M_max), N)Part E: Performance Testing
Compare timing between these two approaches for computing properties of \(N = 1000\) stars:
- Loop approach: Create \(N\)
Starobjects in a loop, accessstar.luminosityfor each - Vectorized approach: Create one
StellarPopulationwith \(N\) masses, accesspopulation.luminositiesonce
Report the speedup factor as (loop time) / (vectorized time) in your research memo. You should typically see roughly 10x speedup, depending on your machine.
Part F: Plotting Module
Create astro_plot.py with reusable plotting functions for:
- HR diagrams
- Mass-luminosity relations
- Multi-panel figures
Validation Workflow (mandatory checkpoints)
Students who defer validation until the end spend hours debugging compounding errors. Follow this workflow — it’s designed to catch mistakes early.
Complete these checkpoints in order:
Checkpoint 1: Implement with \(Z = 0.02\) default
All functions take a Z parameter with default Z=0.02:
L = luminosity(1.0) # Uses Z=0.02 (solar)
L = luminosity(1.0, Z=0.004) # Explicit sub-solar ZAt this stage, you can implement the simpler solar-only case first (just the \(a\) column from Tables 1-2), then extend.
Checkpoint 2: Validate at solar \(Z\)
Run python run.py validate and verify the anchor values below pass. Fix any transcription errors before proceeding.
Checkpoint 3: Generate all plots at solar \(Z\)
Run python run.py make-figures and visually inspect:
- Does \(L(M)\) increase strongly with mass?
- Does \(R(M)\) increase with mass?
- Does the HR diagram look like a main sequence (hot stars top-left, cool stars bottom-right)?
If not, debug now — don’t move to multi-\(Z\) with broken solar-\(Z\) code.
Checkpoint 4: Add full \(Z\) dependence
Implement Equations 3 and 4 (quartic polynomials in \(\xi_Z = \log_{10}(Z/Z_\odot)\)) using all five columns from Tables 1 and 2.
Checkpoint 5: Generate multi-\(Z\) plots
Generate the multi-metallicity plots with \(Z = 0.02, 0.004, 0.0001\). Verify the trends match physical intuition:
- Lower \(Z\) means higher \(L\) at fixed mass (less opacity)
- Lower \(Z\) means smaller \(R\) at fixed mass
- Lower \(Z\) means hotter \(T_\mathrm{eff}\) at fixed mass
Compare visually to Tout+1996 Figures 1, 3, 5 — your curves should show similar trends (not identical styling).
Required Plots
Generate these figures and save them to outputs/figures/.
Solar Metallicity Plots (validate first!)
These correspond to Tout+1996 Figures 1, 3, and 5 — use them to validate your implementation visually.
| Plot | Purpose | Requirements |
|---|---|---|
luminosity_vs_mass.png |
Validate \(L(M)\) — cf. Tout Fig. 1 | Log-log axes, labeled axes with units |
radius_vs_mass.png |
Validate \(R(M)\) — cf. Tout Fig. 3 | Log-log axes, labeled axes |
hr_diagram.png |
Validate \(T_\mathrm{eff}\) — cf. Tout Fig. 5 | log \(L\) vs log \(T_\mathrm{eff}\), invert the temperature axis (hot on left) |
timescales.png |
Compare \(t_\mathrm{KH}\) and \(t_\mathrm{MS}\) | 2-panel figure: left panel \(t_\mathrm{KH}\) (Myr) vs mass, right panel \(t_\mathrm{MS}\) (Gyr) vs mass; both log-log |
Multi-Metallicity Plots (required)
After validating at solar \(Z\), generate the same three plot types with multiple metallicity curves:
| Plot | Purpose | Requirements |
|---|---|---|
luminosity_vs_mass_multiZ.png |
Show \(L(M, Z)\) dependence | Three curves: \(Z = 0.02, 0.004, 0.0001\) |
radius_vs_mass_multiZ.png |
Show \(R(M, Z)\) dependence | Three curves with same \(Z\) values |
hr_diagram_multiZ.png |
Compare ZAMS at different \(Z\) | Three curves with same \(Z\) values |
Required \(Z\) values:
| \(Z\) | Meaning | \(\xi_Z = \log_{10}(Z/Z_\odot)\) |
|---|---|---|
| 0.02 | Solar | 0 |
| 0.004 | 1/5 solar | \(-0.699\) |
| 0.0001 | Nearly primordial | \(-2.301\) |
The Tout+1996 plots were made in 1996. Your plots should reflect modern visualization standards:
- Use LaTeX math in labels:
$L/L_\odot$notL/Lsun - Include units in axis labels:
$M$ [$M_\odot$]orMass ($M_\odot$) - Add legends showing \(Z\) values
- Use distinct line styles for different metallicities (solid, dashed, dotted)
- Choose readable colors and fonts
- Consider colorblind-friendly palettes
Aesthetics matter. How you visualize data reflects how you think about it.
Optional Extension Plots
| Plot | Purpose | Notes |
|---|---|---|
| Performance comparison | Demonstrate vectorization speedup | Multi-panel: loop vs array timing |
| Mass distributions | Compare sampling methods | 2x2 histogram grid (if implementing IMF sampling) |
| Timescales at multiple \(Z\) | Show \(t_\mathrm{MS}(M, Z)\) | How does MS lifetime depend on metallicity? |
Validation Requirements
Sanity checks (trend-based)
Your python run.py validate should verify that over the supported mass range:
- \(L(M)\) increases strongly with mass
- \(R(M)\) increases with mass
- \(T_\mathrm{eff}\) is physically plausible and smoothly varying (except at known piecewise boundaries)
Anchor checks (checksum-style)
These are checksum values to detect transcription errors — they are not “the solution.”
| Mass (\(M_\odot\)) | \(L_\mathrm{ZAMS}\) (\(L_\odot\)) | \(R_\mathrm{ZAMS}\) (\(R_\odot\)) | \(T_\mathrm{eff}\) (K) | Purpose |
|---|---|---|---|---|
| 0.5 | \(0.038 \pm 0.005\) | \(0.459 \pm 0.01\) | \(3760 \pm 100\) | Low-mass regime |
| 1.0 | \(0.698 \pm 0.01\) | \(0.888 \pm 0.01\) | \(5600 \pm 50\) | Solar comparison |
| 10.0 | \(5550 \pm 100\) | \(3.94 \pm 0.1\) | \(25100 \pm 500\) | High-mass regime |
Checkpoints are not a substitute for reading the paper — you still need to locate the correct equations and tables. Think of them as unit-test targets for a published model.
If values don’t match, check: (1) coefficient transcription, (2) \(\log_{10}\) vs \(\ln\), (3) mass range for piecewise fits.
Performance: Vectorization should show roughly 10x speedup over loops
Deliverables
Code Files:
constants.pyzams.pystar.pystellar_population.pyastro_plot.pyrun.pytests/directory with at least 8 tests- (any additional modules under
src/if you choose a package layout)
Provenance Documentation (required):
Include a “Provenance” section in your README.md documenting:
- Which equation numbers from Tout et al. (1996) you implemented
- Which table(s) you extracted coefficients from
- Any interpretation choices you made (log base, units/normalization, piecewise regimes)
- One validation check you used to confirm your extraction wasn’t wrong
Keep it short (5–15 lines). The goal is reproducibility, not writing a book.
Memos (PDF required):
research_memo.pdf(formal; 2–3 pages of text, figures allowed beyond page count)growth_memo.pdf(informal; ~1–2 pages)
You may write in LaTeX or Markdown, but you must commit the final PDFs to your repository.
Research Memo: Figure Interpretation (required)
If you can’t explain what your plot shows, you haven’t validated your implementation.
For each required figure, your research memo must include:
- Trend description: How does the quantity depend on mass? On metallicity?
- Example: “Luminosity increases steeply with mass, approximately as \(L \propto M^{3.5}\) for intermediate masses…”
- Sanity check: Do the trends match physical intuition from the science background?
- Example: “Higher metallicity leads to lower luminosity at fixed mass, consistent with increased opacity…”
- Comparison to paper: How do your plots compare to Tout+1996 Figures 1, 3, 5?
- Example: “The mass-luminosity relation shows the same characteristic steepening above 1 \(M_\odot\) visible in Figure 1…”
Keep it concise — 1–2 sentences per point per figure is enough. The goal is demonstrating understanding, not writing a textbook.
Grading Rubric
| Category | Weight | Criteria |
|---|---|---|
| Core Functionality | 40% | Correct values, working vectorization |
| Code Quality | 30% | Clean design, documentation, modularity |
| Visualization | 30% | Plot quality, reusable module, research memo |
Equations Reference
| Equation | Formula | Notes |
|---|---|---|
| Effective Temperature | \(T_\mathrm{eff}/T_\odot = (L/L_\odot)^{0.25} \times (R/R_\odot)^{-0.5}\) | Use ZAMS \(L\), \(R\); \(T_\odot = 5772\) K |
| MS Lifetime | \(t_\mathrm{MS} = 10\,\mathrm{Gyr} \times (M/M_\odot) \times (L/L_\odot)^{-1}\) | Return in Gyr |
| KH Timescale | \(t_\mathrm{KH} = GM^2/(RL)\) | Compute in CGS, return in Myr |
The KH formula in CGS gives seconds. Convert: \(t_\mathrm{KH}\,[\mathrm{Myr}] = t_\mathrm{KH}\,[\mathrm{s}] / (3.156 \times 10^{13})\)