Project 4 Planning + Build Checklist
Pipeline first — validate always
This worksheet is a guide for Project 4. You are not required to complete every section in one sitting.
The goal is to help you build a working inference pipeline without wandering: posterior ingredients \(\to\) forward model \(\to\) toy Gaussian \(\to\) JLA likelihood \(\to\) diagnostics \(\to\) scientific interpretation.
Bring this sheet to lab, office hours, or your own debugging sessions when you want a clean checkpoint.
Group info
Names:
Section / group # (if applicable):
Roles (circle): Posterior lead / Validation checker / Skeptic / Timekeeper
1) What you are building (in 4 sentences)
Write a four-sentence description of what this repo does. This becomes the first draft of your README.md overview.
2) Parameter contract
Write the exact parameter contract for your posterior.
Parameters
- \(\Omega_m\) means:
- \(h\) means:
Prior bounds
- \(\Omega_m \in\) ________
- \(h \in\) ________
What should your code return if a proposal is outside the prior bounds?
Answer: ________
Why is that useful for MCMC?
Answer: ________
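A common convention is to return \(-\infty\) from the log-prior outside the bounds, so the Metropolis accept/reject step rejects the proposal automatically. Here is a minimal sketch; the bound values `OMEGA_M_BOUNDS` and `H_BOUNDS` are placeholders, not the required contract — substitute whatever you write above.

```python
import numpy as np

# Hypothetical flat-prior bounds; replace with your own contract values.
OMEGA_M_BOUNDS = (0.0, 1.0)
H_BOUNDS = (0.5, 1.0)

def log_prior(theta):
    """Flat log-prior: 0.0 inside the bounds, -inf outside.

    Returning -np.inf (rather than raising an exception) is convenient
    for MCMC: exp(-inf) = 0, so an out-of-bounds proposal is rejected by
    the ordinary accept/reject step with no special-case code.
    """
    omega_m, h = theta
    in_bounds = (OMEGA_M_BOUNDS[0] <= omega_m <= OMEGA_M_BOUNDS[1]
                 and H_BOUNDS[0] <= h <= H_BOUNDS[1])
    return 0.0 if in_bounds else -np.inf
```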
3) Data and likelihood contract
Data files
- Redshift / distance modulus file:
- Covariance matrix file:
Residual vector
Write the residual definition you will implement:
\[ r_i = \]
Likelihood expression
Write the matrix form of the log-likelihood:
\[ \ln \mathcal{L}(\theta) = \]
Numerical method
Which linear-algebra method will you use to apply \(\mathbf{C}^{-1}\) to the residual vector?
Answer: ________
Why should you not explicitly invert the covariance matrix?
Answer: ________
If you accidentally treat the covariance as diagonal, what kind of scientific mistake do you make?
Answer: ________
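Whatever residual definition you settle on, a safe way to apply \(\mathbf{C}^{-1}\) is to factorize the covariance once and solve, never forming the explicit inverse. A sketch, assuming a generic residual vector and positive-definite covariance:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def log_likelihood(residual, cov):
    """Gaussian log-likelihood via a Cholesky solve.

    Solving C x = r through the factorization C = L L^T is cheaper and
    numerically more stable than computing C^{-1} explicitly.  The
    log-determinant comes for free from the factor's diagonal.
    """
    factor = cho_factor(cov)                          # C = L L^T
    chi2 = residual @ cho_solve(factor, residual)     # r^T C^{-1} r
    log_det = 2.0 * np.sum(np.log(np.diag(factor[0])))
    n = residual.size
    return -0.5 * (chi2 + log_det + n * np.log(2.0 * np.pi))
```

For an identity covariance this reduces to the familiar \(-\tfrac{1}{2}\chi^2\) plus constants, which makes a quick unit test.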
4) Forward-model sanity checks
Compute or record the checks you will use before sampling.
Worked example
- Redshift: \(z = 0.5\)
- Matter density: \(\Omega_m = 0.3\)
- Hubble parameter: \(h = 0.7\)
- Expected distance modulus: \(\mu \approx\) ________ mag
Boundary condition
What should \(D_L(0)\) be?
Answer: ________
Production choice
Which forward-model implementation will you use first?
- Numerical integration
- Pen (1999) analytic approximation
Why is this your first choice?
Answer: ________
What validation would convince you the forward model is trustworthy?
Answer: ________
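If you choose numerical integration, the forward model for a flat \(\Lambda\)CDM universe can be sketched as below. The function name `distance_modulus` is illustrative, not prescribed; check it against the worked example above before trusting it.

```python
import numpy as np
from scipy.integrate import quad

C_KM_S = 299792.458  # speed of light [km/s]

def distance_modulus(z, omega_m, h):
    """Distance modulus mu(z) for flat LambdaCDM via numerical integration.

    D_L(z) = (1+z) * (c/H0) * int_0^z dz'/E(z'),
    E(z)   = sqrt(Omega_m (1+z)^3 + (1 - Omega_m)),
    mu     = 5 log10(D_L / 10 pc) = 5 log10(D_L [Mpc]) + 25.
    """
    H0 = 100.0 * h  # km/s/Mpc
    E = lambda zp: np.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
    integral, _ = quad(lambda zp: 1.0 / E(zp), 0.0, z)
    d_L = (1.0 + z) * (C_KM_S / H0) * integral  # in Mpc
    return 5.0 * np.log10(d_L) + 25.0
```

Note the boundary behavior: the comoving integral vanishes at \(z = 0\), so \(D_L(0) = 0\) and \(\mu\) diverges there — evaluate the worked-example point, not \(z=0\), when spot-checking.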
5) Toy Gaussian validation plan
Before touching the supernova posterior, use the canonical Project 4 Gaussian below. Do not invent your own unless you have a specific reason.
Your test target
- Mean vector: \[ \begin{pmatrix} 0.30 \\ 0.70 \end{pmatrix} \]
- Covariance matrix: \[ \begin{pmatrix} 0.04^2 & -0.00048 \\ -0.00048 & 0.02^2 \end{pmatrix} \]
Recommended initial point:
\[ \theta^{(0)} = (0.50, 0.60) \]
Recommended initial diagonal proposal covariance:
\[ \mathrm{diag}(0.02^2, 0.01^2) \]
What success looks like
How will you check that Metropolis-Hastings is working?
- Sample mean should be within about: ________ of the true mean in each parameter
- Sample covariance should be within about: ________ of the target covariance
- Acceptance-rate target should be roughly: ________
Suggested run lengths:
- Short tuning run: 2,000 steps
- Production validation run: 20,000 steps
Good vs. bad trace behavior
What should a good trace plot look like?
Answer: ________
What should a bad trace plot look like?
Answer: ________
If the Gaussian test fails, what do you debug first?
Answer: ________
Red-flag rule: If you have not passed this Gaussian test by the start of Week 3, you are behind. Stop expanding scope and get help before moving on.
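The whole Gaussian test fits in a short script. This is one minimal random-walk Metropolis sketch using the canonical target, start point, and proposal above; the helper name `metropolis` and the burn-in cut are illustrative choices, not requirements.

```python
import numpy as np

def metropolis(log_post, theta0, prop_cov, n_steps, rng):
    """Random-walk Metropolis with a fixed Gaussian proposal.

    Returns the chain (n_steps x dim) and the overall acceptance rate.
    """
    theta = np.asarray(theta0, dtype=float)
    chain = np.empty((n_steps, theta.size))
    lp = log_post(theta)
    L = np.linalg.cholesky(prop_cov)  # draw correlated proposal steps
    n_accept = 0
    for i in range(n_steps):
        proposal = theta + L @ rng.standard_normal(theta.size)
        lp_new = log_post(proposal)
        if np.log(rng.uniform()) < lp_new - lp:   # Metropolis accept rule
            theta, lp = proposal, lp_new
            n_accept += 1
        chain[i] = theta
    return chain, n_accept / n_steps

# Canonical Project 4 toy target from this worksheet.
mean = np.array([0.30, 0.70])
cov = np.array([[0.04**2, -0.00048],
                [-0.00048, 0.02**2]])
cov_inv = np.linalg.inv(cov)  # fine for a 2x2 toy target

def log_post(theta):
    d = theta - mean
    return -0.5 * d @ cov_inv @ d

rng = np.random.default_rng(4)
chain, acc = metropolis(log_post, [0.50, 0.60],
                        np.diag([0.02**2, 0.01**2]), 20_000, rng)
burned = chain[2_000:]  # discard the short tuning-length burn-in
```

Success here means `burned.mean(axis=0)` and `np.cov(burned.T)` land close to the target values, with a sensible acceptance rate.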
6) MCMC tuning plan
Write the proposal-tuning workflow you will follow.
Initial choices
- Initial point \(\theta^{(0)} =\) use the canonical test value above unless you have a reason not to
- Initial proposal covariance: start with the canonical diagonal proposal above, then tune from there
Short tuning loop
Complete this logic:
- If acceptance rate is too low, I will:
- If acceptance rate is too high, I will:
- I will stop tuning when:
One important reminder
Why is an acceptance rate near 100% not actually ideal?
Answer: ________
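One simple (not the only) way to mechanize the tuning loop is a multiplicative update toward a target acceptance rate: shrink the proposal when acceptance is too low, grow it when acceptance is too high. The function below is a hypothetical heuristic, assuming a scalar proposal scale and a random-walk target around 25–40%.

```python
import numpy as np

def tune_proposal_scale(scale, acceptance_rate, target=0.3):
    """One tuning update for a scalar proposal scale.

    Multiplicative updates keep the scale positive and move it
    proportionally to how far the measured rate is from the target.
    Near-100% acceptance is NOT the goal: it means the steps are so
    small the chain barely moves and mixes very slowly.
    """
    return scale * np.exp(acceptance_rate - target)

scale = 0.02
# Hypothetical short tuning runs: pretend each one measured these rates.
for measured_rate in [0.05, 0.15, 0.28]:
    scale = tune_proposal_scale(scale, measured_rate)
```

Stop tuning once the measured rate stabilizes in your target band across consecutive short runs, then freeze the proposal for production.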
7) Multi-chain diagnostics plan
Write down the checks you will use on the real JLA chains.
Chain setup
- Number of independent chains:
- Burn-in rule:
- Production length target:
Diagnostics to compute
- Trace plots:
- Acceptance rate:
- Autocorrelation:
- ESS:
- Split-\(\hat{R}\) or other multi-chain check:
Convergence judgment
What evidence would make you say “these chains are usable”?
Answer: ________
What evidence would make you stop and debug instead of writing the memo?
Answer: ________
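For the multi-chain check, split-\(\hat{R}\) can be computed in a few lines per parameter. This is a minimal sketch of the standard split-chain formula (each chain cut in half, between-half variance compared to within-half variance); libraries like ArviZ offer more robust versions.

```python
import numpy as np

def split_rhat(chains):
    """Split-Rhat for one parameter.

    chains: array of shape (n_chains, n_steps), burn-in already removed.
    Each chain is split in half so within-chain drift also inflates the
    statistic.  Values near 1.0 indicate the half-chains agree; a common
    rule of thumb is to worry above ~1.01-1.1 depending on strictness.
    """
    n_chains, n_steps = chains.shape
    half = n_steps // 2
    halves = np.vstack([chains[:, :half], chains[:, half:2 * half]])
    m, n = halves.shape
    chain_means = halves.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = halves.var(axis=1, ddof=1).mean()     # within-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_plus / W)
```

Chains sampling the same distribution should give \(\hat{R}\approx 1\); chains stuck in different regions give values well above 1.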
8) JLA analysis outputs
List the minimum figures and tables your repo must produce.
- Validation figure or table:
- Trace plots:
- Corner plot:
- Data vs. model plot:
- Posterior summary table:
For each figure, write one sentence about what it must prove.
Trace plots must prove:
Corner plot must prove:
Data vs. model plot must prove:
9) Memo claims before you make the figures
Write the scientific claims you expect your memo to support.
- My code validates the forward model by:
- My sampler validates on a toy Gaussian by:
- The JLA posterior suggests:
- The correlation between \(\Omega_m\) and \(h\) means:
If one of those claims ends up unsupported by the evidence, what will you do?
Answer: ________
10) Graduate HMC lane (optional for undergraduates)
If you are a graduate student, fill this in now. If you are an undergraduate, you may leave this section blank unless you want the extension.
Gradient plan
How will you compute \(\nabla \log p(\theta \mid D)\)?
Answer: ________
HMC tuning plan
- Initial step size \(\epsilon\):
- Initial leapfrog steps \(L\):
- What will you monitor for energy behavior?
Comparison plan
How will you compare HMC to your random-walk Metropolis sampler?
- Mixing evidence:
- Efficiency evidence:
- Posterior consistency evidence:
What would count as a convincing HMC success?
Answer: ________
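For the energy-behavior question above, the core object to monitor is the change in the Hamiltonian \(H = -\log p(\theta) + \tfrac{1}{2}|p|^2\) over each leapfrog trajectory. A minimal leapfrog sketch, exercised here on a hypothetical standard-normal target (not the JLA posterior), assuming a unit mass matrix:

```python
import numpy as np

def leapfrog(theta, p, grad_log_post, eps, n_steps):
    """Leapfrog integration of Hamiltonian dynamics (unit mass matrix).

    A good (eps, L) choice keeps |Delta H| small over a trajectory;
    large or growing |Delta H| means the step size is too big.  The
    integrator is time-reversible, which the HMC accept step relies on.
    """
    theta, p = theta.copy(), p.copy()
    p += 0.5 * eps * grad_log_post(theta)      # initial half-step momentum
    for _ in range(n_steps - 1):
        theta += eps * p                       # full-step position
        p += eps * grad_log_post(theta)        # full-step momentum
    theta += eps * p
    p += 0.5 * eps * grad_log_post(theta)      # final half-step momentum
    return theta, p

# Hypothetical target: standard normal, log p = -|theta|^2/2, grad = -theta.
grad = lambda th: -th
rng = np.random.default_rng(0)
theta0 = rng.standard_normal(2)
p0 = rng.standard_normal(2)
H0 = 0.5 * theta0 @ theta0 + 0.5 * p0 @ p0
theta1, p1 = leapfrog(theta0, p0, grad, eps=0.1, n_steps=20)
H1 = 0.5 * theta1 @ theta1 + 0.5 * p1 @ p1
dH = abs(H1 - H0)  # energy drift: monitor this per trajectory
```

Running the trajectory backward from `(theta1, -p1)` should return you to `theta0` to floating-point precision — a quick correctness check on the integrator before wiring it into HMC.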