Final Project: From Simulation to Surrogate
COMP 536 | Final Project
Overview
| Assigned | Saturday, April 18, 2026 |
| Due | Wednesday, May 13, 2026 (11:59 pm PT) |
| Duration | About 3.5 weeks |
Learning priorities: numerical methods and verification \(\to\) JAX-native simulation design \(\to\) emulator design \(\to\) probabilistic inference \(\to\) scientific communication
This page follows the public course contract in the syllabus. The final project includes:
- a code repository with a reproducible end-to-end pipeline,
- a formal 5-7 page research report,
- a Growth Synthesis reflection.
The final project uses Phase 3 - Professional Practice, described in the AI Use & Growth Mindset Policy. AI can support productivity, but you still need to understand, defend, and modify your own work.
If you want the strongest conceptual support for this assignment, pair the project docs with the related course readings.
The Big Idea
This project brings the semester together. You have already built numerical methods, Monte Carlo tools, Bayesian inference machinery, and JAX-based scientific code. The final project asks you to connect those pieces into one modern scientific workflow: take the physics model, validation logic, and debugging lessons from Project 2, rebuild the simulator in JAX-native Leapfrog form, use it to generate expensive simulation outputs, train a fast emulator on those outputs, and then use that surrogate model for inference.
The scientific throughline is simple but powerful: if direct simulation is too expensive to evaluate thousands of times, can we learn a surrogate that is fast enough for exploration and inference without losing the essential physics?
This project is not only about building a pipeline. It is also a chance to notice real scientific structure. As you vary \(Q_0\) and \(a\), you may see that some clusters stay compact while others expand or lose bound mass, that some observables are much more informative than others, and that the inverse problem is easier in some parts of parameter space than in others. Those are exactly the kinds of patterns that make surrogate models scientifically interesting rather than merely computationally convenient.
What You Are Building
Your final project should produce a reproducible pipeline with these ingredients:
- A student-owned JAX-native N-body simulator that uses Leapfrog integration and is clearly rebuilt from the validated ideas, tests, and physical model of Project 2.
- A validation layer showing that the simulator is numerically trustworthy before it is used for data generation.
- A surrogate emulator, typically a neural network, that predicts summary statistics from simulation inputs.
- An evaluation layer that shows whether the emulator is trustworthy on held-out cases.
- An inference layer that uses the emulator to recover initial conditions or otherwise solve an inverse problem.
- A scientific report that explains your design choices, evidence, and conclusions.
The recommended scientific framing for this course is the one developed in the technical guide: vary the initial virial ratio \(Q_0\) and Plummer scale radius \(a\), emulate the resulting cluster diagnostics, and use the emulator for parameter recovery. If you make a different modeling choice, it still needs to satisfy the same reproducibility and validation standards.
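As a concrete starting point, the \(Q_0\) and \(a\) framing can be sketched as a Plummer-sphere sampler whose random velocities are rescaled to hit a target virial ratio. This is an illustrative sketch, not the required implementation: the function name `sample_plummer`, the convention \(Q_0 = T/|W|\), and the unit choices (\(G = 1\), total mass 1) are assumptions you should adapt to your own Project 2 definitions.

```python
import jax
import jax.numpy as jnp

def sample_plummer(key, n, a=1.0, q0=0.5):
    """Sketch: Plummer positions plus velocities rescaled to Q0 = T/|W|."""
    k1, k2, k3 = jax.random.split(key, 3)
    # Inverse-CDF radii for a Plummer profile with scale radius a.
    u = jax.random.uniform(k1, (n,), minval=1e-6, maxval=1.0 - 1e-6)
    r = a / jnp.sqrt(u ** (-2.0 / 3.0) - 1.0)
    # Isotropic directions on the sphere.
    d = jax.random.normal(k2, (n, 3))
    pos = r[:, None] * d / jnp.linalg.norm(d, axis=1, keepdims=True)
    mass = jnp.full((n,), 1.0 / n)
    # Pairwise potential energy W (G = 1); the eye term only pads the
    # diagonal inside the sqrt, and the (1 - eye) mask removes self-pairs.
    dr = pos[:, None, :] - pos[None, :, :]
    dist = jnp.sqrt((dr ** 2).sum(-1) + jnp.eye(n))
    w = -0.5 * jnp.sum(mass[:, None] * mass[None, :] * (1 - jnp.eye(n)) / dist)
    # Random velocities rescaled so the kinetic energy is T = q0 * |W|.
    v = jax.random.normal(k3, (n, 3))
    t_raw = 0.5 * jnp.sum(mass[:, None] * v ** 2)
    v = v * jnp.sqrt(q0 * jnp.abs(w) / t_raw)
    return pos, v, mass
```

A sampler like this makes \((Q_0, a)\) the only knobs in your data-generation loop, which is exactly what the emulator later needs.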
Because the project is only about 3.5 weeks long, a strong baseline matters more than an over-ambitious stretch goal. A solid final project should include:
- a validated JAX-native Leapfrog simulator rebuilt from Project 2,
- a modest but reproducible dataset,
- a simple emulator that you can evaluate on held-out cases,
- one synthetic or held-out recovery example for the inference stage,
- a clear repo and report that explain what you trust and why.
If you start to run short on time, keep that baseline intact and cut optional complexity first.
Required Deliverables
1. Code Repository
Your repository should make the end-to-end workflow easy to inspect and rerun. At minimum, it should contain:
- the source code for your JAX-native Leapfrog simulator, emulation, evaluation, and inference,
- a clear non-interactive run path,
- generated figures and outputs needed to support the report,
- a `README.md` that explains installation and reproduction steps.
2. Final Research Report
Submit a formal scientific writeup of 5-7 pages. The report should include:
- background and motivation,
- methods,
- results,
- conclusions.
The strongest reports also make the verification logic explicit: what you checked, why those checks matter, and what limitations remain.
3. Growth Synthesis
Submit the final reflective synthesis described in the Growth Synthesis Guide. This replaces the short-project Growth Memo pattern for the end of the semester.
Validation Expectations
The final project is graded on correctness, evaluation methodology, reproducibility, and scientific communication. That means a strong repo needs more than output plots.
Your project should make it easy for a reader to answer three questions:
- Does the pipeline run?
- Do the results look scientifically believable?
- What evidence supports that trust?
For this project, that usually means showing:
- Leapfrog simulation sanity checks,
- emulator accuracy on held-out data,
- uncertainty or failure-mode analysis,
- an inference result whose interpretation is clearly explained.
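One minimal shape for that inference result is a synthetic-recovery grid scan: fix a true parameter pair, generate its observables, and locate the likelihood maximum over a \((Q_0, a)\) grid. In the sketch below, `emulator` is a toy stand-in for your trained network, and the Gaussian noise scale `sigma` is an assumed value; only the scan-and-argmax pattern is the point.

```python
import jax
import jax.numpy as jnp

def emulator(theta):
    # Placeholder for a trained emulator mapping (q0, a) -> summaries.
    q0, a = theta
    return jnp.array([q0 * a, q0 + a])

def log_likelihood(theta, obs, sigma=0.05):
    # Independent Gaussian likelihood on the emulated summaries.
    pred = emulator(theta)
    return -0.5 * jnp.sum(((pred - obs) / sigma) ** 2)

# Dense grid over the assumed parameter ranges.
q0s = jnp.linspace(0.1, 0.9, 41)
a_s = jnp.linspace(0.5, 2.0, 41)
grid = jnp.stack(jnp.meshgrid(q0s, a_s, indexing="ij"), axis=-1).reshape(-1, 2)

# Synthetic truth for a recovery test, then evaluate the whole grid at once.
obs = emulator(jnp.array([0.5, 1.0]))
logp = jax.vmap(log_likelihood, in_axes=(0, None))(grid, obs)
best = grid[jnp.argmax(logp)]  # should land near the truth (0.5, 1.0)
```

A recovery test like this is cheap to run and easy to interpret, which makes it a good default for the "one synthetic or held-out recovery example" in the baseline.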
Before you move on to emulation, your repo should already show all four of the following:
- One simple validation case with expected qualitative behavior. For example, a small-\(N\) orbit or interaction test where the trajectories behave the way your Project 2 physics says they should.
- One quantitative conservation diagnostic. Show bounded total-energy behavior over time, and also check center-of-mass or total-momentum behavior when that quantity should be conserved in your setup.
- One timestep justification. Compare at least two timestep choices and explain why the timestep you use for data generation is accurate enough for this project.
- One reproducible validation path. A reader should be able to run a non-interactive command or script and regenerate the validation artifact without guessing what you did.
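The conservation diagnostic above can be sketched in a few lines of JAX. This is a minimal kick-drift-kick Leapfrog with a softened pairwise force and a total-energy function; the softening value, the `lax.scan` structure, and the function names are illustrative assumptions, not the required design.

```python
import jax
import jax.numpy as jnp

def accel(pos, mass, eps=1e-3):
    # Softened pairwise gravitational acceleration (G = 1).
    n = pos.shape[0]
    dr = pos[None, :, :] - pos[:, None, :]          # dr[i, j] = pos_j - pos_i
    r2 = (dr ** 2).sum(-1) + eps ** 2
    inv_r3 = jnp.where(jnp.eye(n, dtype=bool), 0.0, r2 ** -1.5)
    return (dr * (mass[None, :, None] * inv_r3[:, :, None])).sum(axis=1)

def leapfrog(pos, vel, mass, dt, n_steps):
    # Kick-drift-kick Leapfrog, rolled with lax.scan so it jits cleanly.
    def step(carry, _):
        p, v = carry
        v = v + 0.5 * dt * accel(p, mass)
        p = p + dt * v
        v = v + 0.5 * dt * accel(p, mass)
        return (p, v), None
    (pos, vel), _ = jax.lax.scan(step, (pos, vel), None, length=n_steps)
    return pos, vel

def total_energy(pos, vel, mass, eps=1e-3):
    # Kinetic plus softened pairwise potential energy, self-pairs masked.
    n = pos.shape[0]
    ke = 0.5 * jnp.sum(mass[:, None] * vel ** 2)
    dr = pos[:, None, :] - pos[None, :, :]
    dist = jnp.sqrt((dr ** 2).sum(-1) + eps ** 2)
    pe = -0.5 * jnp.sum(jnp.where(jnp.eye(n, dtype=bool), 0.0,
                                  mass[:, None] * mass[None, :] / dist))
    return ke + pe
```

Running a two-body circular orbit through `leapfrog` and comparing `total_energy` before and after is a one-command version of the conservation check: the energy should oscillate within a small bounded band rather than drift.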
If that evidence is missing, you should treat the simulator as untrusted and stay in the validation phase rather than moving on to emulator training.
Recommended Public Workflow
The assignment is intentionally open enough for you to make real design choices, but the following sequence is a good default:
- Rebuild the core Project 2 simulator logic in a small JAX-native Leapfrog code and validate it on simple cases.
- Make the simulation pipeline reproducible on a small dataset.
- Define summary statistics and verify that they behave sensibly.
- Train a baseline emulator and confirm that it generalizes beyond the training set.
- Add uncertainty analysis and stress-test edge behavior.
- Run inference only after the emulator is trustworthy enough to use as a scientific instrument.
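The baseline-emulator step above can be as small as a pure-JAX multilayer perceptron trained with `jax.grad`. This is a hedged sketch: the layer sizes, learning rate, and plain gradient-descent update are placeholder choices, not recommendations, and a library like Flax or Optax is a perfectly good substitute.

```python
import jax
import jax.numpy as jnp

def init_mlp(key, sizes):
    # sizes like [2, 32, 2]: e.g. inputs (Q0, a) -> summary statistics.
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mlp(params, x):
    # tanh hidden layers, linear output layer.
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b

def mse(params, x, y):
    return jnp.mean((mlp(params, x) - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=1e-2):
    # One step of plain gradient descent on the mean-squared error.
    grads = jax.grad(mse)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```

The key habit the workflow asks for is evaluating `mse` on held-out inputs the network never saw during training, not just watching the training loss fall.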
If you skip that order, you usually end up debugging the model, the data pipeline, and the inference machinery all at once.
For the final project, you should reuse the physics, validation habits, and conceptual design from Project 2, but you should rebuild the simulator in a JAX-native style. The goal is not to wrap your old NumPy/Python simulator and call it done. The goal is to show that you can carry forward the scientific understanding while changing the computational substrate.
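One payoff of the JAX-native rebuild is that a pure `simulate(q0, a)` function can be batched and compiled directly, turning data generation into a single vectorized call. The `simulate` body below is a toy stand-in for your real simulator; only the `vmap` plus `jit` pattern is the point.

```python
import jax
import jax.numpy as jnp

def simulate(q0, a):
    # Toy stand-in for the real simulator: in your project this would run
    # the Leapfrog integration and return summary statistics.
    return jnp.array([q0 * a, q0 + a])

# Batch over parameter pairs and compile the whole sweep once.
batched_simulate = jax.jit(jax.vmap(simulate))

q0s = jnp.linspace(0.2, 0.8, 5)
a_s = jnp.full(5, 1.0)
summaries = batched_simulate(q0s, a_s)  # one row of summaries per (q0, a)
```

Writing the simulator as a pure function of its parameters is what makes this composition possible; side effects and in-place NumPy mutation are the usual blockers when porting Project 2 code.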
Getting Help
When you ask for help, bring evidence:
- the command you ran,
- the behavior you expected,
- the behavior you got instead,
- the smallest artifact that demonstrates the issue.
That habit matters even more on the final project, because the debugging surface is larger and your verification choices are part of the grade.