Part 3: The Virial Theorem as Universal Diagnostic

When Stars Become Particles | Statistical Thinking Module 2 | COMP 536

Author

Anna Rosen

Learning Objectives

By the end of Part 3, you will be able to:


Part 3: The Virial Theorem

3.1 The Universal Energy Balance

Priority: 🔴 Essential ::: {.column-margin} Virial Theorem
For gravitationally bound systems in equilibrium, \(2K + W = 0\), where \(K\) is the total kinetic energy and \(W\) is the total gravitational potential energy (negative). This means \(K = -\frac{W}{2}\). :::

The virial theorem emerges from taking the second moment of the collisionless Boltzmann equation. Just as we derived conservation laws by taking moments of the Boltzmann equation, the virial theorem emerges from a special moment - one that involves both position and velocity.

Note on Notation: In stellar dynamics literature, the gravitational potential energy is traditionally denoted \(W\) (from German “Werk” = work). We’ll use \(W\) here for the gravitational potential energy, which is equivalent to the familiar \(U\) you’ve seen. This distinguishes our approach from Module 2, where we used \(\Omega\) for the gravitational energy in stellar interiors and derived the virial theorem through pressure integrals. Here we’ll use the moment of inertia tensor approach - a fundamentally different derivation that reveals the same universal truth.

The Tensor Virial Theorem Derivation

Consider the moment of inertia tensor for our system of stars:

\[I_{ij} = \sum_k m_k x_{ki} x_{kj}\]

where \(x_{ki}\) is the \(i\)-th coordinate of the \(k\)-th star. The scalar moment of inertia is the trace: \(I = \sum_i I_{ii} = \sum_k m_k r_k^2\).

Taking the first time derivative: \[\dot{I} = 2\sum_k m_k \vec{r}_k \cdot \vec{v}_k\]

Taking the second time derivative and using Newton’s second law (\(m_k \vec{a}_k = \vec{F}_k = -m_k \vec{\nabla}\Phi\)): \[\ddot{I} = 2\sum_k m_k |\vec{v}_k|^2 + 2\sum_k m_k \vec{r}_k \cdot \vec{a}_k = 4K + 2\sum_k m_k \vec{r}_k \cdot (-\vec{\nabla}\Phi)\]

For the gravitational potential \(\Phi = -GM/r\), we need to use Euler’s theorem for homogeneous functions. A function is homogeneous of degree \(n\) if \(f(\lambda\vec{r}) = \lambda^n f(\vec{r})\). Our potential \(\Phi \propto r^{-1}\) is homogeneous of degree \(n = -1\).

Euler’s theorem states that for homogeneous functions of degree \(n\): \[\vec{r} \cdot \vec{\nabla}\Phi = n\Phi\]

Since our gravitational potential has degree \(n = -1\): \[\vec{r} \cdot \vec{\nabla}\Phi = -\Phi\]

This is the key property that makes gravity special - the specific factor of 2 in the virial theorem comes from the \(1/r\) nature of gravity! Therefore:

\[\ddot{I} = 4K + 2W\]

In equilibrium, the system is neither collapsing (\(\ddot{I} < 0\)) nor expanding (\(\ddot{I} > 0\)), so \(\ddot{I} = 0\):

\[\boxed{2K + W = 0}\]

where: \[K = \frac{1}{2}\sum_i m_i v_i^2 \quad \text{(total kinetic energy)}\] \[W = -\sum_{i<j} \frac{Gm_im_j}{r_{ij}} \quad \text{(total gravitational potential energy, negative)}\]

Physical intuition: In equilibrium, the kinetic energy (trying to disperse the system) exactly balances half the binding energy (trying to collapse it). This specific ratio of 2:1 emerges from the 1/r nature of gravity - it’s not arbitrary but mathematically inevitable!

Key Difference from Module 2: In Module 2, we derived the virial theorem for stellar interiors using pressure integrals over the stellar volume. Here, we’ve used the moment of inertia approach for a collection of point particles (stars). Same theorem, completely different derivation - showing the universality of this fundamental principle!

Note💡 N-body Implementation Tips

After decades of debugging N-body codes, the community has learned crucial lessons:

1. Choose your units wisely: Work in scaled physical units:

  • Length in parsecs (pc)
  • Time in megayears (Myr)
  • Mass in solar masses (M_)

With these units, G ≈ 0.00449 pc³/(M_·Myr²). Your positions might range from 0-100 pc, velocities 0-100 km/s, making them easy to interpret.

2. Energy as your truth meter: Track total energy E = K + W at every timestep: \[\frac{|E(t) - E(0)|}{|E(0)|} < 10^{-10} \times N_{steps}\]

3. Initial conditions matter: Use a Plummer sphere or King model for realistic initial conditions already close to equilibrium.

4. The 2-body problem is your friend: Before running N=1000, test with N=2. An equal-mass binary should have a perfectly circular orbit if started correctly.

This applies to all gravitationally bound systems:

  • Molecular clouds: K from thermal motion + turbulence
  • Stars: K from thermal motion of particles
  • Star clusters: K from orbital motion of stars
  • Galaxies: K from stars + gas + dark matter (~85% of total mass)
  • Galaxy clusters: K from galaxies + hot gas + dark matter

The profound discovery: When we apply the virial theorem using only visible matter, galaxies and clusters appear unbound! The observed kinetic energy requires 5-10\(\times\) more mass than we can see. This “missing mass” is dark matter – discovered through the very equation you’re learning.

3.2 Different Forms, Same Physics

The virial theorem takes different mathematical forms depending on what we’re measuring:

For a star cluster (velocity dispersion): \[M\sigma^2 = -\frac{W}{2} = \frac{GM^2}{2R}\]

This gives the fundamental scaling relation: \[\boxed{\sigma^2 \sim \frac{GM}{R}}\]

Key insight: Measuring velocity dispersion \(\sigma\) lets us “weigh” the system! If you observe \(\sigma\) and know \(R\), you can determine \(M\) – this is how we measure dark matter in galaxies.

::: {.callout-note info collapse=“true”} ## 🌟 The More You Know: Vera Rubin’s Revolutionary Persistence

While Fritz Zwicky first noticed the “missing mass problem” in galaxy clusters (1933) using the virial theorem, his work was largely dismissed. It was Vera Rubin who made dark matter undeniable through meticulous observations of galaxy rotation curves in the 1970s.

What she expected: Stars orbiting far from galactic centers should move slowly, like planets far from the Sun (Kepler’s laws: \(v \propto 1/\sqrt{r}\)).

What she found: Stars at all radii orbit at roughly the same speed – the rotation curves are flat! This violated everything we knew about gravity unless…

The revelation: Each galaxy must be embedded in a massive halo of invisible matter. Using the virial theorem and her velocity measurements, she showed that galaxies contain 5-10\(\times\) more dark matter than visible matter.

The human story: Rubin faced significant discrimination as one of the few women in astronomy. Princeton wouldn’t admit women to their astronomy program. She was prohibited from using Palomar Observatory until 1965 because it “lacked proper bathroom facilities for women.” Despite these obstacles, her careful, irrefutable data transformed our understanding of the universe.

Her work exemplifies how careful observation and simple physics (the virial theorem!) can reveal profound truths. Sometimes the universe’s biggest secrets hide in the simplest equations. :::

3.3 The Virial Theorem as Diagnostic Tool

Priority: 🔴 Essential The virial theorem provides the most important diagnostic for N-body simulations:

\[\text{Virial ratio} = \frac{|2K + W|}{|W|}\]

Warning

Common Misconception: The virial theorem (\(2K + W = 0\)) is NOT the same as energy conservation (\(K + W = E = \text{constant}\)).

  • Energy conservation: Always true for isolated systems
  • Virial theorem: Only true for systems in equilibrium

A system can conserve energy perfectly while being far from virial equilibrium!

What different values mean:

  • ≈ 0: System is perfectly virialized (equilibrium)
  • < 0.01: Excellent equilibrium
  • 0.01-0.1: Acceptable for most purposes
  • > 0.1: System not in equilibrium or numerical errors
  • Growing with time: Energy conservation failing!

Implementation hint for Project 2:

def check_virial(particles):
    """
    Check virial equilibrium for N-body simulation.
    Returns virial ratio - should be near zero for equilibrium.
    """
    # Calculate kinetic energy
    K = # Sum of (1/2) * m * v^2 for all particles
    
    # Calculate potential energy (the tricky part!)
    W = 0.0
    for i in range(N):
        for j in range(i+1, N):  # Count each pair once
            r_ij = # distance between particles i and j
            W -= G * m[i] * m[j] / r_ij
    
    virial_ratio = abs(2*K + W) / abs(W)
    
    if virial_ratio > 0.1:
        print(f"WARNING: System not virialized! Ratio = {virial_ratio:.3f}")
    
    return virial_ratio

3.4 Connection to Ergodicity and MCMC

Priority: 🟡 Standard Path The virial theorem assumes something profound that connects to Module 1: time averages equal ensemble averages.

For the virial theorem to hold, a system must explore its entire accessible phase space. Just as gas molecules sample all possible velocities through collisions, stars must sample all possible orbits consistent with their energy. This is ergodicity.

\[\langle K \rangle_{\text{time}} = \lim_{T \to \infty} \frac{1}{T} \int_0^T K(t) \, dt = \langle K \rangle_{\text{ensemble}}\]

This is the ergodic hypothesis – a system explores all accessible phase space given enough time. This connects directly to your future work:

Note🔗 Connection to Project 4 (MCMC)

The ergodic hypothesis underpins MCMC methods:

In stellar dynamics: A single star’s orbit, followed long enough, samples the entire cluster’s phase space distribution.

In MCMC: A single Markov chain, run long enough, samples the entire posterior distribution.

Both rely on ergodicity: \[\langle A \rangle_{\text{time/chain}} = \langle A \rangle_{\text{ensemble/posterior}}\]

The virial theorem works because stellar orbits ergodically fill phase space. MCMC works because properly constructed chains ergodically sample parameter space.

Note🔗 Connection to Project 5 (Gaussian Processes)

Gaussian Processes model functions by their covariance structure - just like stellar systems are characterized by their velocity dispersion tensor!

Stellar Systems: - Velocity dispersion tensor: \(\sigma_{ij}^2 = \langle v_i v_j \rangle - \langle v_i \rangle \langle v_j \rangle\) - Describes “correlation” between velocity components - Virial theorem relates this to gravitational potential

Gaussian Processes: - Covariance kernel: \(k(x_i, x_j)\) - Describes correlation between function values at different points - Kernel encodes our assumptions about smoothness

Both describe systems through their second moments (covariance structure). Both assume underlying smooth fields (gravitational potential for stars, latent function for GPs). The mathematics of characterizing systems by their correlations is universal!

Part 3 Synthesis: The Universal Diagnostic

The virial theorem isn’t just an equation – it’s nature’s way of telling us when a gravitational system has found its balance. Its universality is breathtaking:

Why It Works Everywhere

The virial theorem applies to any system where:

  1. Gravity is the dominant force (\(1/r\) potential)
  2. The system has had time to relax toward equilibrium
  3. External influences are negligible

This covers:

  • Molecular clouds (with thermal pressure included)
  • Star clusters (pure gravity)
  • Galaxies (stars + gas + dark matter)
  • Galaxy clusters (galaxies + hot gas + dark matter)
  • The cosmic web (on appropriate scales)

The Dark Matter Revolution

The virial theorem’s greatest triumph was revealing the invisible universe:

  1. Observe: Velocity dispersions in clusters
  2. Apply virial: \(M = R\sigma^2/G\)
  3. Compare: Virial mass >> visible mass
  4. Conclude: Dark matter exists!

This simple application of statistical mechanics revolutionized cosmology.

Your Computational Toolkit

For Project 2, the virial theorem becomes your:

  • Equilibrium diagnostic: Is your system settled?
  • Energy conservation check: Is your integrator working?
  • Initial condition validator: Are you starting near equilibrium?
  • Physical insight tool: Understanding energy partition

The same theorem that revealed dark matter will debug your code!

Important Limitations

The virial theorem assumes:

  • Spherical symmetry: Many real clusters are triaxial (football or disk-shaped). The scalar virial theorem is an approximation; the full tensor version accounts for shape.
  • Isolated system: Tidal fields from neighbors can strip stars
  • Time-averaged equilibrium: Real systems evolve, merge, and disrupt
  • Point masses: Close binaries and extended objects violate this
  • Pure \(1/r\) potential: Modified gravity or dark energy would change the factor

Despite these limitations, the virial theorem remains remarkably useful because: 1. The tensor virial theorem can handle non-spherical shapes 2. Many clusters are approximately isolated over several dynamical times 3. Systems naturally evolve toward virial equilibrium (though may never fully reach it) 4. Deviations from \(2K + W = 0\) tell us about non-equilibrium dynamics

The virial theorem is like the ideal gas law - never perfectly true but almost always useful!

Warning📊 Note for Observers: Systematic Uncertainties

When measuring cluster masses using the virial theorem, always consider:

  • Projection effects: We see 2D projections of 3D systems
  • Anisotropy: The \(\beta\) parameter affects mass estimates (see Part 2)
  • Binaries: Unresolved binary stars inflate velocity dispersion measurements
  • Membership: Which stars actually belong to the cluster? Interlopers contaminate measurements
  • Non-spherical shape: Real clusters are often triaxial

These systematic effects typically introduce 20-50% uncertainties in mass estimates. Always quote error bars!