Project 4 Planning + Build Checklist
Pipeline first — validate always
This worksheet is a guide for Project 4. You are not required to complete every section in one sitting.
The goal is to help you build a working inference pipeline without wandering: posterior ingredients \(\to\) forward model \(\to\) toy Gaussian \(\to\) JLA likelihood \(\to\) diagnostics \(\to\) scientific interpretation.
Bring this sheet to lab, office hours, or your own debugging sessions when you want a clean checkpoint.
Group info
Names:
Section / group # (if applicable):
Roles (circle): Posterior lead / Validation checker / Skeptic / Timekeeper
1) What you are building (in 4 sentences)
Write a four-sentence description of what this repo does. This becomes the first draft of your README.md overview.
2) Parameter contract
Write the exact parameter contract for your posterior.
Parameters
- \(\Omega_m\) means:
- \(h\) means:
Prior bounds
- \(\Omega_m \in\) ________
- \(h \in\) ________
What should your code return if a proposal is outside the prior bounds?
Answer: ________
Why is that useful for MCMC?
Answer: ________
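A common convention is to return \(-\infty\) from the log-prior outside the bounds, so the Metropolis accept/reject step rejects the proposal automatically. Here is a minimal sketch; the bound values `OMEGA_M_BOUNDS` and `H_BOUNDS` are placeholders, not the required contract — substitute whatever you write above.

```python
import numpy as np

# Hypothetical flat-prior bounds; replace with your own contract values.
OMEGA_M_BOUNDS = (0.0, 1.0)
H_BOUNDS = (0.5, 1.0)

def log_prior(theta):
    """Flat log-prior: 0.0 inside the bounds, -inf outside.

    Returning -np.inf (rather than raising an exception) is convenient
    for MCMC: exp(-inf) = 0, so an out-of-bounds proposal is rejected by
    the ordinary accept/reject step with no special-case code.
    """
    omega_m, h = theta
    in_bounds = (OMEGA_M_BOUNDS[0] <= omega_m <= OMEGA_M_BOUNDS[1]
                 and H_BOUNDS[0] <= h <= H_BOUNDS[1])
    return 0.0 if in_bounds else -np.inf
```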
3) Data and likelihood contract
Data files
- Redshift / distance modulus file:
- Covariance matrix file:
Residual vector
Write the residual definition you will implement:
\[ r_i = \]
Likelihood expression
Write the matrix form of the log-likelihood:
\[ \ln \mathcal{L}(\theta) = \]
Numerical method
Which linear-algebra method will you use to apply \(\mathbf{C}^{-1}\) to the residual vector?
Answer: ________
Why should you not explicitly invert the covariance matrix?
Answer: ________
If you accidentally treat the covariance as diagonal, what kind of scientific mistake do you make?
Answer: ________
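Whatever residual definition you settle on, a safe way to apply \(\mathbf{C}^{-1}\) is to factorize the covariance once and solve, never forming the explicit inverse. A sketch, assuming a generic residual vector and positive-definite covariance:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def log_likelihood(residual, cov):
    """Gaussian log-likelihood via a Cholesky solve.

    Solving C x = r through the factorization C = L L^T is cheaper and
    numerically more stable than computing C^{-1} explicitly.  The
    log-determinant comes for free from the factor's diagonal.
    """
    factor = cho_factor(cov)                          # C = L L^T
    chi2 = residual @ cho_solve(factor, residual)     # r^T C^{-1} r
    log_det = 2.0 * np.sum(np.log(np.diag(factor[0])))
    n = residual.size
    return -0.5 * (chi2 + log_det + n * np.log(2.0 * np.pi))
```

For an identity covariance this reduces to the familiar \(-\tfrac{1}{2}\chi^2\) plus constants, which makes a quick unit test.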
4) Forward-model sanity checks
Compute or record the checks you will use before sampling.
Worked example
- Redshift: \(z = 0.5\)
- Matter density: \(\Omega_m = 0.3\)
- Hubble parameter: \(h = 0.7\)
- Expected distance modulus: \(\mu \approx\) ________ mag
Boundary condition
What should \(D_L(0)\) be?
Answer: ________
Production choice
Which forward-model implementation will you use first?
- Numerical integration
- Pen (1999) analytic approximation
Why is this your first choice?
Answer: ________
What validation would convince you the forward model is trustworthy?
Answer: ________
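If you choose numerical integration, the forward model for a flat \(\Lambda\)CDM universe can be sketched as below. The function name `distance_modulus` is illustrative, not prescribed; check it against the worked example above before trusting it.

```python
import numpy as np
from scipy.integrate import quad

C_KM_S = 299792.458  # speed of light [km/s]

def distance_modulus(z, omega_m, h):
    """Distance modulus mu(z) for flat LambdaCDM via numerical integration.

    D_L(z) = (1+z) * (c/H0) * int_0^z dz'/E(z'),
    E(z)   = sqrt(Omega_m (1+z)^3 + (1 - Omega_m)),
    mu     = 5 log10(D_L / 10 pc) = 5 log10(D_L [Mpc]) + 25.
    """
    H0 = 100.0 * h  # km/s/Mpc
    E = lambda zp: np.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
    integral, _ = quad(lambda zp: 1.0 / E(zp), 0.0, z)
    d_L = (1.0 + z) * (C_KM_S / H0) * integral  # in Mpc
    return 5.0 * np.log10(d_L) + 25.0
```

Note the boundary behavior: the comoving integral vanishes at \(z = 0\), so \(D_L(0) = 0\) and \(\mu\) diverges there — evaluate the worked-example point, not \(z=0\), when spot-checking.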
5) Toy Gaussian validation plan
Before touching the supernova posterior, use the canonical Project 4 Gaussian below. Do not invent your own unless you have a specific reason.
Your test target
- Mean vector: \[ \begin{pmatrix} 0.30 \\ 0.70 \end{pmatrix} \]
- Covariance matrix: \[ \begin{pmatrix} 0.04^2 & -0.00048 \\ -0.00048 & 0.02^2 \end{pmatrix} \]
Recommended initial point:
\[ \theta^{(0)} = (0.50, 0.60) \]
Recommended initial diagonal proposal covariance:
\[ \mathrm{diag}(0.02^2, 0.01^2) \]
What success looks like
How will you check that Metropolis-Hastings is working?
- Sample mean should be within about: ________ of the true mean in each parameter
- Sample covariance should be within about: ________ of the target covariance
- Acceptance-rate target should be roughly: ________
Suggested run lengths:
- Short tuning run: 2,000 steps
- Production validation run: 20,000 steps
Good vs. bad trace behavior
What should a good trace plot look like?
Answer: ________
What should a bad trace plot look like?
Answer: ________
If the Gaussian test fails, what do you debug first?
Answer: ________
Red-flag rule: If you have not passed this Gaussian test by the start of Week 3, you are behind. Stop expanding scope and get help before moving on.
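The whole Gaussian test fits in a short script. This is one minimal random-walk Metropolis sketch using the canonical target, start point, and proposal above; the helper name `metropolis` and the burn-in cut are illustrative choices, not requirements.

```python
import numpy as np

def metropolis(log_post, theta0, prop_cov, n_steps, rng):
    """Random-walk Metropolis with a fixed Gaussian proposal.

    Returns the chain (n_steps x dim) and the overall acceptance rate.
    """
    theta = np.asarray(theta0, dtype=float)
    chain = np.empty((n_steps, theta.size))
    lp = log_post(theta)
    L = np.linalg.cholesky(prop_cov)  # draw correlated proposal steps
    n_accept = 0
    for i in range(n_steps):
        proposal = theta + L @ rng.standard_normal(theta.size)
        lp_new = log_post(proposal)
        if np.log(rng.uniform()) < lp_new - lp:   # Metropolis accept rule
            theta, lp = proposal, lp_new
            n_accept += 1
        chain[i] = theta
    return chain, n_accept / n_steps

# Canonical Project 4 toy target from this worksheet.
mean = np.array([0.30, 0.70])
cov = np.array([[0.04**2, -0.00048],
                [-0.00048, 0.02**2]])
cov_inv = np.linalg.inv(cov)  # fine for a 2x2 toy target

def log_post(theta):
    d = theta - mean
    return -0.5 * d @ cov_inv @ d

rng = np.random.default_rng(4)
chain, acc = metropolis(log_post, [0.50, 0.60],
                        np.diag([0.02**2, 0.01**2]), 20_000, rng)
burned = chain[2_000:]  # discard the short tuning-length burn-in
```

Success here means `burned.mean(axis=0)` and `np.cov(burned.T)` land close to the target values, with a sensible acceptance rate.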
6) MCMC tuning plan
Write the proposal-tuning workflow you will follow.
Initial choices
- Initial point \(\theta^{(0)} =\) use the canonical test value above unless you have a reason not to
- Initial proposal covariance: start with the canonical diagonal proposal above, then tune from there
Short tuning loop
Complete this logic:
- If acceptance rate is too low, I will:
- If acceptance rate is too high, I will:
- I will stop tuning when:
One important reminder
Why is an acceptance rate near 100% not actually ideal?
Answer: ________
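One simple (not the only) way to mechanize the tuning loop is a multiplicative update toward a target acceptance rate: shrink the proposal when acceptance is too low, grow it when acceptance is too high. The function below is a hypothetical heuristic, assuming a scalar proposal scale and a random-walk target around 25–40%.

```python
import numpy as np

def tune_proposal_scale(scale, acceptance_rate, target=0.3):
    """One tuning update for a scalar proposal scale.

    Multiplicative updates keep the scale positive and move it
    proportionally to how far the measured rate is from the target.
    Near-100% acceptance is NOT the goal: it means the steps are so
    small the chain barely moves and mixes very slowly.
    """
    return scale * np.exp(acceptance_rate - target)

scale = 0.02
# Hypothetical short tuning runs: pretend each one measured these rates.
for measured_rate in [0.05, 0.15, 0.28]:
    scale = tune_proposal_scale(scale, measured_rate)
```

Stop tuning once the measured rate stabilizes in your target band across consecutive short runs, then freeze the proposal for production.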
7) Multi-chain diagnostics plan
Write down the checks you will use on the real JLA chains.
Chain setup
- Number of independent chains:
- Burn-in rule:
- Production length target:
Diagnostics to compute
- Trace plots:
- Acceptance rate:
- Autocorrelation:
- ESS:
- Split-\(\hat{R}\) or other multi-chain check:
Convergence judgment
What evidence would make you say “these chains are usable”?
Answer: ________
What evidence would make you stop and debug instead of writing the memo?
Answer: ________
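For the multi-chain check, split-\(\hat{R}\) can be computed in a few lines per parameter. This is a minimal sketch of the standard split-chain formula (each chain cut in half, between-half variance compared to within-half variance); libraries like ArviZ offer more robust versions.

```python
import numpy as np

def split_rhat(chains):
    """Split-Rhat for one parameter.

    chains: array of shape (n_chains, n_steps), burn-in already removed.
    Each chain is split in half so within-chain drift also inflates the
    statistic.  Values near 1.0 indicate the half-chains agree; a common
    rule of thumb is to worry above ~1.01-1.1 depending on strictness.
    """
    n_chains, n_steps = chains.shape
    half = n_steps // 2
    halves = np.vstack([chains[:, :half], chains[:, half:2 * half]])
    m, n = halves.shape
    chain_means = halves.mean(axis=1)
    B = n * chain_means.var(ddof=1)           # between-chain variance
    W = halves.var(axis=1, ddof=1).mean()     # within-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_plus / W)
```

Chains sampling the same distribution should give \(\hat{R}\approx 1\); chains stuck in different regions give values well above 1.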
8) JLA analysis outputs
List the minimum figures and tables your repo must produce.
- Validation figure or table:
- Trace plots:
- Corner plot:
- Data vs. model plot:
- Posterior summary table:
For each figure, write one sentence about what it must prove.
Trace plots must prove:
Corner plot must prove:
Data vs. model plot must prove:
9) Memo claims before you make the figures
Write the scientific claims you expect your memo to support.
- My code validates the forward model by:
- My sampler validates on a toy Gaussian by:
- The JLA posterior suggests:
- The correlation between \(\Omega_m\) and \(h\) means:
If one of those claims ends up unsupported by the evidence, what will you do?
Answer: ________
10) Graduate HMC lane (optional for undergraduates)
If you are a graduate student, fill this in now. If you are an undergraduate, you may leave this section blank unless you want the extension.
Gradient plan
How will you compute \(\nabla \log p(\theta \mid D)\)?
Answer: ________
HMC tuning plan
- Initial step size \(\epsilon\):
- Initial leapfrog steps \(L\):
- What will you monitor for energy behavior?
Comparison plan
How will you compare HMC to your random-walk Metropolis sampler?
- Mixing evidence:
- Efficiency evidence:
- Posterior consistency evidence:
What would count as a convincing HMC success?
Answer: ________
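For the energy-behavior question above, the core object to monitor is the change in the Hamiltonian \(H = -\log p(\theta) + \tfrac{1}{2}|p|^2\) over each leapfrog trajectory. A minimal leapfrog sketch, exercised here on a hypothetical standard-normal target (not the JLA posterior), assuming a unit mass matrix:

```python
import numpy as np

def leapfrog(theta, p, grad_log_post, eps, n_steps):
    """Leapfrog integration of Hamiltonian dynamics (unit mass matrix).

    A good (eps, L) choice keeps |Delta H| small over a trajectory;
    large or growing |Delta H| means the step size is too big.  The
    integrator is time-reversible, which the HMC accept step relies on.
    """
    theta, p = theta.copy(), p.copy()
    p += 0.5 * eps * grad_log_post(theta)      # initial half-step momentum
    for _ in range(n_steps - 1):
        theta += eps * p                       # full-step position
        p += eps * grad_log_post(theta)        # full-step momentum
    theta += eps * p
    p += 0.5 * eps * grad_log_post(theta)      # final half-step momentum
    return theta, p

# Hypothetical target: standard normal, log p = -|theta|^2/2, grad = -theta.
grad = lambda th: -th
rng = np.random.default_rng(0)
theta0 = rng.standard_normal(2)
p0 = rng.standard_normal(2)
H0 = 0.5 * theta0 @ theta0 + 0.5 * p0 @ p0
theta1, p1 = leapfrog(theta0, p0, grad, eps=0.1, n_steps=20)
H1 = 0.5 * theta1 @ theta1 + 0.5 * p1 @ p1
dH = abs(H1 - H0)  # energy drift: monitor this per trajectory
```

Running the trajectory backward from `(theta1, -p1)` should return you to `theta0` to floating-point precision — a quick correctness check on the integrator before wiring it into HMC.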