---
title: "Chapter 6: OOP Fundamentals - Organizing Scientific Code"
subtitle: "COMP 536 | Python Fundamentals"
author: "Anna Rosen"
draft: false
format:
  html:
    toc: true
execute:
  echo: true
  warning: true
  error: false
  freeze: auto
---
Learning Objectives
By the end of this chapter, you will be able to:
Prerequisites Check
Before starting this chapter, verify you can:
Quick diagnostic:
If you said both print “200”, you’re ready! Python passes object references (call-by-sharing). If the object is mutable, a function or method can mutate it in-place—which becomes crucial when methods modify object state.
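A minimal sketch of the behavior this diagnostic checks. The function name `double_first` and the starting value are assumptions for illustration, not the chapter's original code:

```python
def double_first(values):
    """Mutate the caller's list in place and return the new first element."""
    values[0] *= 2          # same list object the caller holds
    return values[0]


def rebind(values):
    """Rebinding the local name does NOT affect the caller's list."""
    values = [0]            # 'values' now points at a brand-new list
    return values


data = [100]
result = double_first(data)
print(f"Original: {data[0]}")   # the caller's list was changed
print(f"Result: {result}")

untouched = [1, 2, 3]
rebind(untouched)
print(untouched)                # still [1, 2, 3]
```

Mutation through a shared reference is visible to the caller; rebinding a parameter name is not.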
Chapter Overview
You’ve mastered functions to organize behavior and modules to organize related functions. But what happens when data and the functions that operate on it are inseparable? When tracking particles in a simulation, each particle has position, velocity, and mass, along with methods to update position, calculate kinetic energy, and check collisions. Passing all this data between separate functions becomes error-prone and verbose. This is where Object-Oriented Programming transforms your code from a collection of functions to a model of your problem domain.
Object-Oriented Programming (OOP) isn’t just another way to organize code - it’s a fundamental shift in how we think about programs. Instead of viewing code as a sequence of operations on data, we model it as interactions between objects that combine data and behavior. A thermometer knows its temperature and how to convert units. A dataset knows its values and how to calculate statistics. A simulation particle knows its state and how to evolve. This paradigm mirrors how we naturally think about scientific systems, making complex programs more intuitive and maintainable.
This chapter introduces OOP’s essential concepts through practical scientific examples. You’ll learn to create classes (blueprints for objects), instantiate objects (specific instances), and define methods (functions attached to objects). We’ll explore how properties provide computed attributes and validation, ensuring your scientific constraints are always satisfied. Most importantly, you’ll develop judgment about when OOP clarifies code (managing stateful systems, modeling entities) versus when it adds unnecessary complexity (simple calculations, stateless transformations). By the end, you’ll understand why NumPy arrays are objects with methods, setting the foundation for leveraging Python’s scientific ecosystem.
OOP isn’t about syntax — it’s about preventing errors in systems where data and behavior must stay synchronized. When a particle’s position changes, its kinetic energy must update consistently. When a measurement is recorded, its uncertainty must propagate correctly. OOP gives you tools to enforce these invariants automatically: validation in setters catches impossible values before they corrupt your simulation; properties guarantee derived quantities stay consistent with underlying state; encapsulation prevents external code from putting objects into invalid configurations.
The real lesson: You’re not just learning to write classes. You’re learning to build self-protecting data structures that make entire categories of bugs structurally impossible.
Read actively, not passively. When you see a class definition:
- Trace the state: What attributes does __init__ create? How do methods change them?
- Identify the invariants: What must always be true about this object? (e.g., radius > 0, energy \(\geq\) 0)
- Find the contracts: Which methods mutate state vs compute values?
- Question the design: Could this be a simple function instead? What would break?
Run every code block. Modify values. Break things on purpose. The error messages teach you more than the working code.
6.1 From Functions to Objects: The Conceptual Leap
Object-Oriented Programming A programming paradigm that organizes code around objects (data) and methods (behavior) rather than functions and logic.
Let’s start with a problem you’ve already solved with functions, then transform it into objects to see the difference. In Python, everything is actually an object - even functions and modules! But some objects are more complex than others, and creating your own classes lets you model your specific problem domain.
Energy: 62.5 ergs
Now let’s see the same problem with OOP:
Energy: 62.5 ergs
Both approaches solve the problem, but notice the differences:
- Organization: Data and methods stay together in the class
- Syntax: Methods are called on objects (p2.kinetic_energy())
- State: The object maintains its own state between method calls
- Clarity: The object-oriented version reads more naturally
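The comparison can be sketched as follows. The mass and velocity values (5 g, 3 and 4 cm/s) are assumptions chosen so that both versions give 62.5 ergs. The `Particle` class and its attribute names are illustrative.

```python
# Procedural version: the caller carries mass and velocity around.
def kinetic_energy(mass, vx, vy):
    """Kinetic energy in ergs for mass in g and velocity in cm/s."""
    return 0.5 * mass * (vx**2 + vy**2)


e1 = kinetic_energy(5.0, 3.0, 4.0)
print(f"Energy: {e1:.1f} ergs")


# Object-oriented version: the particle owns its data and its behavior.
class Particle:
    def __init__(self, mass, vx, vy):
        self.mass = mass    # g
        self.vx = vx        # cm/s
        self.vy = vy        # cm/s

    def kinetic_energy(self):
        """Kinetic energy in ergs, computed from the object's own state."""
        return 0.5 * self.mass * (self.vx**2 + self.vy**2)


p2 = Particle(5.0, 3.0, 4.0)
e2 = p2.kinetic_energy()
print(f"Energy: {e2:.1f} ergs")
```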
In 2004, NASA’s Spirit rover suddenly stopped responding, 18 days into its mission. The cause? Procedural code managing 250+ hardware components through global variables and scattered functions. When flash memory filled up, the initialization functions couldn’t track which subsystems were already started, causing an infinite reboot loop.
The fix required remotely clearing flash memory and implementing better state tracking. While the actual fix involved procedural error handling and filesystem limits, the incident highlighted why modern rovers use object-oriented design principles for state management. JPL engineer Jennifer Trosper, who had warned about potential state management issues, helped lead the recovery effort. The team’s solution involved better encapsulation of subsystem states - a principle now implemented through OOP in modern missions:
class RoverComponent:
    def __init__(self, name):
        self.name = name
        self.initialized = False
        self.error_count = 0

    def initialize(self):
        if not self.initialized:
            # Safe initialization
            self.initialized = True

This pattern - objects knowing their own state - is now standard in spacecraft software. Spirit went on to operate for 6 years instead of the planned 90 days. When Curiosity launched in 2011, its entire control system used OOP from the start. Each instrument is an object, each motor is an object, even each wheel is an object with its own wear tracking.
You’re learning the same pattern that keeps billion-dollar spacecraft alive on other planets!
[Source: Reeves, G., & Neilson, T. (2005). “The Mars Rover Spirit FLASH Anomaly.” IEEE Aerospace Conference Proceedings.]
6.2 Classes and Objects: Building Blocks
Class A blueprint or template for creating objects that defines attributes and methods.
Before we dive into creating classes, let’s understand why we need them beyond the simple example we just saw. As your programs grow, you face several challenges that classes elegantly solve:
Namespace pollution: Without classes, you might have functions like calculate_star_luminosity(), calculate_planet_mass(), calculate_galaxy_distance() - your namespace becomes cluttered with hundreds of related functions.
Object A specific instance of a class containing data (attributes) and behavior (methods).
Data consistency: When data and functions are separate, nothing prevents you from passing a galaxy’s data to a star’s calculation function, potentially causing silent errors or crashes.
Constructor The __init__ method that initializes new objects when they’re created.
Code reusability: With functions alone, similar behaviors must be duplicated. Every object type needs its own set of functions even when the logic is similar.
Conceptual clarity: We naturally think of stars, planets, and galaxies as entities with properties and behaviors. Classes let us model this intuition directly in code.
A class is a blueprint for creating objects. An object (or instance) is a specific realization of that blueprint. Think of a class as the concept “thermometer” and objects as specific thermometers in your lab.
Temperature: 293.15 +/- 0.1 K
Pressure: 1.01e+06 +/- 500 dyne/cm^2
Pressure relative error: 0.049%
Understanding self
self The first parameter of instance methods, referring to the specific object being operated on.
The self parameter is how each object keeps track of its own data. When you call temp.relative_error(), Python automatically passes temp as the first argument. Here’s what happens behind the scenes:
Counter 1: 2
Counter 2: 1
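A small sketch of what `self` does, assuming a hypothetical `Counter` class with an `increment` method. Calling a method on an instance is equivalent to calling the function on the class and passing the instance explicitly:

```python
class Counter:
    def __init__(self):
        self.count = 0       # each instance gets its own count

    def increment(self):
        self.count += 1      # 'self' is whichever object the call was made on


c1 = Counter()
c2 = Counter()

c1.increment()               # Python passes c1 as 'self' automatically
Counter.increment(c1)        # the explicit form of the same call
c2.increment()

print(f"Counter 1: {c1.count}")
print(f"Counter 2: {c2.count}")
```

The two counters stay independent because every method call receives its own object as `self`.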
This seemingly simple concept of bundling data with behavior revolutionized programming. Let me tell you how it started…
In 1962, Norwegian computer scientists Kristen Nygaard and Ole-Johan Dahl faced a mounting challenge at the Norwegian Computing Center. Nygaard had been developing simulations since 1957 - first for nuclear reactor calculations, then for operations research problems. Their early projects included analyzing factory layouts, airport departure systems, and harbor operations. The existing approaches using ALGOL 60 were becoming unwieldy for modeling these complex, interconnected systems with hundreds of interacting components.
Their revolutionary solution? Create “objects” that bundled data with behavior. In 1965, they successfully used SIMULA I to analyze the Raufoss ammunitions factory layout - determining optimal arrangements for cranes and storage points. The program, punched on 1,130 cards, could simulate 2.5 days of factory operations in just 22 seconds. Each crane, storage point, and workstation became an object that knew its own state and could respond to events.
# Simplified concept in modern Python:
class FactoryStation:
    def __init__(self, name, capacity, processing_time):
        self.name = name
        self.capacity = capacity
        self.processing_time = processing_time
        self.queue = []

    def receive_item(self, item):
        # Each station manages its own queue and processing
        self.queue.append(item)

By 1967, Simula 67 formalized these concepts into classes, inheritance, and virtual methods - the foundation of modern OOP. Alan Kay, influenced by Simula (along with Sketchpad and his biology background), coined “object-oriented programming” around the same time. He later explained: “I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages.”
The impact was profound but gradual. Simula influenced Smalltalk in the 1970s, then C++ in the 1980s, and eventually Java in the 1990s. The Norwegian Computing Center, initially focused on practical simulation problems, had accidentally created one of computing’s most transformative paradigms. Today, when you create a Particle class or a Galaxy object, you’re using concepts born from the need to simulate factories, airports, and harbors in 1960s Norway!
[Sources: Dahl & Nygaard (1978). “The development of the SIMULA languages.” ACM SIGPLAN Notices; Kay, A. (2003). Email to Stefan Ram on OOP definition]
# WRONG - Missing self parameter
class BadClass:
    def method():  # Missing self!
        return "something"

# This fails:
# obj = BadClass()
# obj.method()  # TypeError: ... method() takes 0 positional arguments but 1 was given

# CORRECT - Always include self
class GoodClass:
    def method(self):  # self is required
        return "something"

obj = GoodClass()
print(obj.method())  # Works!

This is probably the most common OOP error. Remember: instance methods ALWAYS need self as their first parameter.
Instance vs Class Attributes
Instance Attribute Data unique to each object, defined with self.attribute.
Instance attributes belong to specific objects; class attributes are shared by all instances. Keeping this data together with the methods that operate on it is called encapsulation, a core principle of OOP:
Class Attribute Data shared by all instances of a class, defined directly in the class body.
Total simulations: 2
Sim1 particles: 1000
Sim2 particles: 5000
Speed of light: 3.00e+10 cm/s
Box size: 1.00e-04 cm
What’s the output of this code? Why?
class DataPoint:
    count = 0  # Class attribute

    def __init__(self, value):
        self.value = value  # Instance attribute
        DataPoint.count += 1

p1 = DataPoint(10)
p2 = DataPoint(20)
p1.count = 100  # What happens here?
print(f"p1.count: {p1.count}")
print(f"p2.count: {p2.count}")
print(f"DataPoint.count: {DataPoint.count}")

Output:

p1.count: 100
p2.count: 2
DataPoint.count: 2
When you write p1.count = 100, you create a new instance attribute that shadows the class attribute for p1 only. The class attribute remains unchanged at 2, and p2 still sees the class attribute. This is a common source of confusion - instance attributes can hide class attributes with the same name!
When you access obj.attr, Python searches in this order:
- Instance __dict__: attributes specific to this object
- Class attributes: shared by all instances
- Base classes: inherited attributes (Chapter 10)
This is why p1.count = 100 creates a new instance attribute that shadows the class attribute. The class attribute still exists at DataPoint.count.
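The lookup order can be inspected directly with the built-in `vars()`, which returns an object's `__dict__`. This sketch reuses the `DataPoint` class from the question above. Deleting the instance attribute makes the class attribute visible again:

```python
class DataPoint:
    count = 0                      # class attribute

    def __init__(self, value):
        self.value = value         # instance attribute
        DataPoint.count += 1


p1 = DataPoint(10)
p2 = DataPoint(20)

before = dict(vars(p1))            # only 'value' lives on the instance
p1.count = 100                     # creates an instance attribute named 'count'
after = dict(vars(p1))             # now 'count' is in the instance __dict__ too

print(before)
print(after)
print("count" in vars(DataPoint))  # the class attribute still exists

del p1.count                       # remove the shadowing instance attribute
print(p1.count)                    # lookup falls through to the class again
```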
Encapsulation The bundling of data and methods that operate on that data within a single unit (class).
6.3 Methods: Functions Attached to Objects
Method A function defined inside a class that operates on instances of that class.
Methods are functions that belong to a class. They can access and modify the object’s state through self. Let’s build up from simple to complex.
Iron density: 7.87 g/cm^3
Sinks in water: True
Now let’s advance to more complex mathematical methods:
v1 magnitude: 5.0 cm
Dot product: 3 cm^2
Angle: 0.93 radians
After normalization: (0.60, 0.80)
New magnitude: 1.00
Why does normalize() modify the vector in place while magnitude() returns a value?
This follows a fundamental method contract convention:
- Mutation methods modify object state in place and return None (like list.sort(), v.normalize())
- Computation methods return values without changing the object (like sorted(list), v.magnitude())
Why this matters for scientific code: When you call particle.update_position(dt), you expect the particle’s state to change. When you call particle.kinetic_energy(), you expect to get a number back without side effects. Mixing these contracts leads to bugs: if kinetic_energy() secretly modified velocity, your simulation would be wrong.
The names hint at intent: verbs (“normalize”, “update”, “add”) suggest mutation; nouns (“magnitude”, “energy”, “mean”) suggest computation. Document the contract explicitly in docstrings when the name is ambiguous.
Some libraries return self from mutation methods to enable chaining (v.normalize().scale(2)). This breaks the “mutate \(\to\) None” convention but is consistent within those libraries. Pick one convention and be consistent.
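One way to sketch the mutate vs compute contract described above. The `Vector2D` class here is an illustrative assumption, not the chapter's original listing:

```python
import math


class Vector2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def magnitude(self):
        """Computation: returns a value, leaves the vector unchanged."""
        return math.hypot(self.x, self.y)

    def dot(self, other):
        """Computation: returns the dot product with another vector."""
        return self.x * other.x + self.y * other.y

    def normalize(self):
        """Mutation: rescales this vector in place and returns None."""
        mag = self.magnitude()
        if mag == 0:
            raise ValueError("Cannot normalize a zero vector")
        self.x /= mag
        self.y /= mag


v1 = Vector2D(3.0, 4.0)
v2 = Vector2D(1.0, 0.0)
length_before = v1.magnitude()
dot_product = v1.dot(v2)
returned = v1.normalize()          # mutates v1; nothing is returned

print(f"v1 magnitude: {length_before:.1f}")
print(f"Dot product: {dot_product:.0f}")
print(f"normalize() returned: {returned}")
print(f"After normalization: ({v1.x:.2f}, {v1.y:.2f})")
```

Because `normalize()` returns `None`, writing `v = v.normalize()` by mistake fails loudly on the next use of `v`. This is the same protection `list.sort()` gives.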
Method Types: Instance, Class, and Static
Average: 20.0
Data valid: True
Temperature: 298.15 K
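The three method types can be sketched in one hypothetical class. The name `Dataset` and its methods are assumptions for illustration. An instance method receives the object (`self`), a class method receives the class (`cls`) and is often used as an alternative constructor, and a static method receives neither.

```python
class Dataset:
    def __init__(self, values):
        self.values = list(values)

    def average(self):
        """Instance method: uses this object's data through self."""
        return sum(self.values) / len(self.values)

    @classmethod
    def from_string(cls, text):
        """Class method: alternative constructor that builds an instance."""
        return cls(float(x) for x in text.split(","))

    @staticmethod
    def is_valid(values):
        """Static method: a helper that needs neither self nor cls."""
        return all(isinstance(v, (int, float)) for v in values)

    @staticmethod
    def celsius_to_kelvin(temp_c):
        """Static method: stateless unit conversion grouped with the class."""
        return temp_c + 273.15


data = Dataset.from_string("10, 20, 30")
print(f"Average: {data.average():.1f}")
print(f"Data valid: {Dataset.is_valid(data.values)}")
print(f"Temperature: {Dataset.celsius_to_kelvin(25):.2f} K")
```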
PATTERN: Public Interface vs Private Implementation
In scientific software, methods define how objects interact. Think of methods as the object’s “API” - what it promises to do regardless of internal implementation.
Public Interface (what users see):
- particle.update_position(dt)
- measurement.get_uncertainty()
- simulation.run_steps(100)
Private Implementation (internal details):
- How position is stored (Cartesian? polar?)
- How uncertainty is calculated
- What algorithm updates the simulation
This separation allows you to change implementation without breaking code that uses your objects. NumPy arrays exemplify this: arr.mean() works the same whether the array is stored in row-major or column-major order, in RAM or memory-mapped.
Best Practice: Start method names with underscore (_) to indicate internal methods not meant for external use.
Here’s how Astropy uses the OOP patterns you’re learning:
from astropy.coordinates import SkyCoord
from astropy import units as u

# SkyCoord is a class with properties and methods!
m31 = SkyCoord(ra=10.68*u.degree, dec=41.27*u.degree)
m33 = SkyCoord(ra=23.46*u.degree, dec=30.66*u.degree)  # approximate M33 position, added so the example runs
print(m31.galactic)  # Attribute access triggers a coordinate transformation
print(m31.separation(m33))  # Method for angular distance
# You're learning the same patterns that power professional astronomy code!

6.4 Properties: Smart Attributes
Property A special attribute that executes code when accessed or set, created with the @property decorator.
Setter A property method that validates and sets attribute values, defined with @attribute.setter.
Properties are Python’s mechanism for enforcing invariants — conditions that must always be true about your object. A circle’s radius must be positive. A temperature can’t go below absolute zero. A probability must be between 0 and 1. Without properties, any code could set circle.radius = -5 and corrupt your simulation. With properties, the object protects itself:
What properties give you:
- Validation on assignment: Reject invalid values immediately, not when they cause cryptic errors downstream
- Computed attributes: Derive area from radius automatically, guaranteeing consistency
- Encapsulation: Hide internal storage (_radius) while exposing a clean interface (radius)
- Invariant enforcement: Make it structurally impossible to create invalid object states
Radius: 5 cm
Area: 78.54 cm^2
Circumference: 31.42 cm
New area: 314.16 cm^2
Error: Radius must be positive, got -5
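A minimal sketch of a self-protecting `Circle`, written to be consistent with the output above. The class layout is an assumption, not necessarily the chapter's original listing. Note that `__init__` assigns through the property, so the constructor gets the same validation as later assignments.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius            # goes through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value <= 0:
            raise ValueError(f"Radius must be positive, got {value}")
        self._radius = value

    @property
    def area(self):
        """Computed on access, so it can never be stale."""
        return math.pi * self._radius**2

    @property
    def circumference(self):
        return 2 * math.pi * self._radius


c = Circle(5)
print(f"Radius: {c.radius} cm")
print(f"Area: {c.area:.2f} cm^2")
print(f"Circumference: {c.circumference:.2f} cm")

c.radius = 10
print(f"New area: {c.area:.2f} cm^2")

try:
    c.radius = -5
except ValueError as err:
    error_message = str(err)
    print(f"Error: {error_message}")
```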
Properties for Unit Safety
Water at 300.0 K
= 26.9 degC
= 80.3 degF
Boiling: 373.1 K
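Unit safety with properties can be sketched like this: the object stores one canonical unit (kelvin, by assumption) and exposes the others as computed views. The `Temperature` class and its attribute names are illustrative.

```python
class Temperature:
    """Temperature stored internally in kelvin."""

    def __init__(self, kelvin):
        self.kelvin = kelvin                 # validated by the setter

    @property
    def kelvin(self):
        return self._kelvin

    @kelvin.setter
    def kelvin(self, value):
        if value < 0:
            raise ValueError(f"Temperature below absolute zero: {value} K")
        self._kelvin = value

    @property
    def celsius(self):
        return self._kelvin - 273.15

    @celsius.setter
    def celsius(self, value):
        self.kelvin = value + 273.15         # reuse the kelvin validation

    @property
    def fahrenheit(self):
        return self.celsius * 9 / 5 + 32


water = Temperature(300.0)
print(f"Water at {water.kelvin:.1f} K")
print(f"  = {water.celsius:.1f} degC")
print(f"  = {water.fahrenheit:.1f} degF")

water.celsius = 100.0                        # set in Celsius, stored in kelvin
print(f"Boiling: {water.kelvin:.2f} K")
```

Every assignment funnels through the kelvin setter, so there is a single place where the physical constraint is checked.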
Astronomy-Specific Properties Example
Observing M31:
Airmass: 1.15
Moon illumination: 45.9%
Sky conditions: Grey
The airmass and moon_phase properties above are toy models for teaching purposes. Real astronomical calculations use ephemeris data and more sophisticated formulas. For production work, use Astropy which handles these correctly.
Properties with validation could have prevented one of medical history’s worst software disasters. Between 1985 and 1987, the Therac-25 radiation therapy machine caused at least six accidents where patients received massive radiation overdoses—up to 100 times the intended dose. Three patients died directly from the overdoses.
The root cause was a lack of validation in the software’s state management. The machine could operate in electron-beam mode (low power) or X-ray mode (high power with metal target). A race condition meant the machine could be left in a lethal state: high power WITHOUT the metal target.
With proper validation using properties:
class RadiationTherapyMachine:
    ELECTRON_MAX = 100  # Hypothetical electron-mode power limit (arbitrary units)

    def __init__(self):
        self._mode = None
        self._power_level = 0
        self._target_in_place = False

    @property
    def power_level(self):
        return self._power_level

    @power_level.setter
    def power_level(self, value):
        # Validation prevents lethal configuration
        if self._mode == "electron" and value > self.ELECTRON_MAX:
            raise ValueError("Power too high for electron mode!")
        if self._mode == "xray" and not self._target_in_place:
            raise ValueError("X-ray mode requires target!")
        self._power_level = value

Today, medical device software uses extensive validation at every state change. Properties ensure that impossible values trigger immediate alerts, not patient deaths. Every validation in your setters follows safety practices written in the aftermath of preventable tragedies.
[Source: Leveson, N.G. & Turner, C.S. (1993). “An Investigation of the Therac-25 Accidents”. IEEE Computer, 26(7), 18-41.]
In 1990, the Hubble Space Telescope reached orbit with a catastrophic flaw - its primary mirror was ground to the wrong shape by 2.2 micrometers, about 1/50th the width of a human hair. The error occurred because a measuring device called a null corrector had been assembled incorrectly, with one lens positioned 1.3mm out of place. But here’s the tragic part: the computer software accepting test measurements had no validation. It accepted clearly impossible values without question.
During testing, technicians actually got measurements showing the mirror was wrong. But other tests (using the faulty null corrector) showed it was “perfect.” The software happily stored both sets of contradictory data. No validation checks asked: “Why do these measurements disagree by orders of magnitude?” or “Is this curvature physically possible for a mirror this size?”
The servicing mission to install COSTAR cost over $600 million (the total impact including delays and lost science time exceeded $1.5 billion). The repair required a daring Space Shuttle mission in 1993 to install COSTAR - essentially giving Hubble “glasses.” But the software fix was equally important. NASA completely rewrote their testing software with aggressive validation:
# Simplified version of the validation concept:
import warnings

class MirrorMeasurement:
    def __init__(self, expected_curvature):
        self.expected = expected_curvature
        self._curvature_mm = None

    @property
    def curvature_mm(self):
        return self._curvature_mm

    @curvature_mm.setter
    def curvature_mm(self, value):
        # Physical limits based on mirror specifications
        if not (2200.0 <= value <= 2400.0):
            raise ValueError(f"Impossible curvature: {value}mm")
        # Check against expected value; warn (rather than raise) so the value is still recorded
        deviation = abs(value - self.expected) / self.expected
        if deviation > 0.001:  # 0.1% tolerance
            warnings.warn(f"Curvature {value} deviates {deviation*100:.2f}% from expected")
        self._curvature_mm = value

Note: Modern NASA testing actually uses far more sophisticated validation including statistical process control, multiple sensor cross-validation, and machine learning-based anomaly detection - this example shows the core concept.
Today, every NASA mirror goes through validation software that checks measurements at the moment of entry. Properties ensure that impossible values trigger immediate alerts, not billion-dollar disasters. The James Webb Space Telescope, Hubble’s successor, had its mirrors tested with software that validates every measurement against physical constraints, expected ranges, and cross-checks with redundant sensors.
When you write validation in your setters, you’re implementing the same safeguards that now protect every space telescope from Hubble’s fate. That simple if value <= 0: raise ValueError() in your code? That’s the pattern that could have saved one of humanity’s greatest scientific instruments from launching half-blind into space!
[Sources: Allen, L. (1990). The Hubble Space Telescope Optical Systems Failure Report. NASA. Simplified technical details for pedagogical purposes.]
In 1999, NASA’s Mars Climate Orbiter burned up in Mars’ atmosphere. The cause? One team’s software reported thruster impulse in pound-force seconds, while another team’s software expected newton-seconds. The navigation calculations therefore underestimated the thrusters’ effect by a factor of 4.45.
While Python wasn’t used in the 1999 mission (spacecraft used Ada and C++), modern spacecraft software prevents such disasters using property-based validation - a pattern we can now demonstrate in Python:
class Thruster:
    LBF_PER_NEWTON = 0.224809  # Pounds-force per newton

    def __init__(self, thrust_newtons=0.0):
        self._thrust_n = thrust_newtons  # Always stored internally in newtons

    @property
    def thrust_newtons(self):
        return self._thrust_n

    @thrust_newtons.setter
    def thrust_newtons(self, value):
        self._thrust_n = value

    @property
    def thrust_pounds(self):
        return self._thrust_n * self.LBF_PER_NEWTON

    @thrust_pounds.setter
    def thrust_pounds(self, value):
        self._thrust_n = value / self.LBF_PER_NEWTON

Properties ensure units are always consistent internally, regardless of what units the user provides. This pattern is now mandatory in NASA’s modern flight software, preventing the type of error that destroyed the Mars Climate Orbiter.
What happens if you create a property without a setter but try to assign to it?
class ReadOnly:
    @property
    def value(self):
        return 42

obj = ReadOnly()
obj.value = 100  # What happens?

You get an AttributeError. The exact message depends on the Python version: older versions say "can't set attribute", while newer ones name the property and report that it has no setter. Properties without setters are read-only. This is actually useful for computed values that should never be directly modified, like the area of a circle (which should only change when the radius changes). This pattern enforces data consistency by preventing invalid states.
# WRONG - Infinite recursion!
class BadExample:
    @property
    def value(self):
        return self.value  # Calls itself forever!

# CORRECT - Use different internal name
class GoodExample:
    def __init__(self):
        """Initialize with internal storage."""
        self._value = 0  # Underscore prefix

    @property
    def value(self):
        """Access the value."""
        return self._value  # Different name

example = GoodExample()
print(f"Value: {example.value}")

Always use a different internal name (usually with underscore) for the actual storage.
6.5 Special Methods: Making Objects Pythonic
Special Method Methods with double underscores (like __init__, __str__) that define object behavior for built-in operations.
Special methods (also called “magic methods” or “dunder methods”) let your objects work with Python’s built-in functions and operators. The term “duck typing” comes from the saying “If it walks like a duck and quacks like a duck, it’s a duck” - meaning Python cares about what an object can do, not what type it is.
Duck Typing Python’s philosophy that an object’s suitability is determined by its methods, not its type.
In December 1989, Guido van Rossum was frustrated. Working at CWI (Centrum Wiskunde & Informatica) in Amsterdam on the Amoeba distributed operating system, he found existing languages inadequate. ABC was too rigid and couldn’t be extended. C was too low-level for rapid development. So during the Christmas vacation (he was bored and the office was closed), he started writing his own language, naming it after the British comedy group Monty Python’s Flying Circus.
Guido made a radical decision that would change programming forever: instead of hiding object behavior behind compiler magic like C++ did, Python would expose everything through special methods that anyone could implement. Want your object to work with len()? Just add __len__(). Want it to support addition? Add __add__(). No special compiler support needed - just simple methods with funny names.
This transparency was revolutionary. In C++, only the compiler could decide what + meant for built-in types. In Python, ANY object could define it:
# Any class can define what + means for its own instances
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # YOU decide what + means for YOUR objects
        return Vector(self.x + other.x, self.y + other.y)

v1 = Vector(1, 2)
v2 = Vector(3, 4)
v1 + v2  # Calls YOUR __add__ method

The scientific community immediately saw the implications. Jim Hugunin created Numeric (NumPy’s ancestor) in 1995, using special methods to make arrays feel like native Python objects:
# Scientific arrays that felt built-in!
array1 + array2  # Element-wise addition via __add__
array[5:10]      # Slicing via __getitem__
len(array)       # Size via __len__
print(array)     # Readable output via __str__

Guido later reflected: “I wanted Python to be a bridge between the shell and C. I never imagined it would become the language of scientific computing” (paraphrased from various interviews). That bridge was built on special methods - the democratic principle that any object could be a first-class citizen.
Alex Martelli, who would later coin the term “duck typing” for Python’s approach, explained it perfectly: “In Python, you don’t check if it IS-a duck, you check if it QUACKS-like-a duck, WALKS-like-a duck” (2000, comp.lang.python newsgroup). This philosophy meant scientific libraries could create objects that integrated seamlessly with Python’s syntax.
When you implement __str__ or __add__, you’re using the democratic principle that made Python the world’s most popular scientific language: your objects are equals with Python’s built-in types. No special privileges needed - just implement the methods, and Python treats your objects as first-class citizens!
[Sources: Van Rossum, G. (1996). Foreword for “Programming Python” (1st ed.). Various interviews compiled. Martelli’s “duck typing” coined circa 2000.]
f1 = 1/2
f1 + f2 = 5/6
f1 as float: 0.5
f1 == Fraction(2,4): True
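A compact sketch of a `Fraction` class consistent with the output above. The attribute names `num` and `den` are assumptions, and for real work the standard library's `fractions.Fraction` is the better choice. Each special method connects the class to one built-in operation.

```python
from math import gcd


class Fraction:
    def __init__(self, numerator, denominator):
        if denominator == 0:
            raise ZeroDivisionError("denominator must be non-zero")
        if denominator < 0:                       # keep the sign on the numerator
            numerator, denominator = -numerator, -denominator
        common = gcd(numerator, denominator)      # store in lowest terms
        self.num = numerator // common
        self.den = denominator // common

    def __str__(self):                            # used by print() and f-strings
        return f"{self.num}/{self.den}"

    def __add__(self, other):                     # used by the + operator
        if not isinstance(other, Fraction):
            return NotImplemented
        return Fraction(self.num * other.den + other.num * self.den,
                        self.den * other.den)

    def __float__(self):                          # used by float()
        return self.num / self.den

    def __eq__(self, other):                      # used by ==
        if not isinstance(other, Fraction):
            return NotImplemented
        return (self.num, self.den) == (other.num, other.den)

    def __hash__(self):                           # keep hashing consistent with ==
        return hash((self.num, self.den))


f1 = Fraction(1, 2)
f2 = Fraction(1, 3)
print(f"f1 = {f1}")
print(f"f1 + f2 = {f1 + f2}")
print(f"f1 as float: {float(f1)}")
print(f"f1 == Fraction(2,4): {f1 == Fraction(2, 4)}")
```

Returning `NotImplemented` for unsupported operand types lets Python try the other operand's methods before raising a `TypeError`.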
Essential Special Methods
Length: 5
First: 10
Contains 30: True
Is non-empty: True
10 25 30 40 50
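The container-style special methods behind output like this can be sketched with a hypothetical `DataSet` class that keeps its values sorted. The class name, the sorting choice, and the sample values are assumptions.

```python
class DataSet:
    def __init__(self, values):
        self._values = sorted(values)

    def __len__(self):                 # len(data)
        return len(self._values)

    def __getitem__(self, index):      # data[0], data[1:3]
        return self._values[index]

    def __contains__(self, item):      # 30 in data
        return item in self._values

    def __bool__(self):                # if data: ...
        return len(self._values) > 0

    def __iter__(self):                # for x in data: ...
        return iter(self._values)


data = DataSet([30, 10, 50, 25, 40])
print(f"Length: {len(data)}")
print(f"First: {data[0]}")
print(f"Contains 30: {30 in data}")
print(f"Is non-empty: {bool(data)}")
print(*data)                           # iteration via __iter__
```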
Which special method would you implement to make your object work with the abs() function?
You would implement __abs__(). Python’s built-in abs() function calls the object’s __abs__() method if it exists. For example:
class Vector2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __abs__(self):
        # Return magnitude for abs()
        return (self.x**2 + self.y**2)**0.5

v = Vector2D(3, 4)
print(abs(v))  # Prints 5.0

This pattern extends to many built-ins: len() calls __len__(), str() calls __str__(), etc. Understanding this connection helps you make objects that feel native to Python.
PATTERN: Duck Typing Through Special Methods “If it walks like a duck and quacks like a duck, it’s a duck”
Python doesn’t check types - it checks capabilities. Any object implementing the right special methods can be used anywhere:
Iterator Protocol:
- __iter__() and __next__() \(\to\) works in for loops
Container Protocol:
- __len__() and __getitem__() \(\to\) works with len(), indexing
Numeric Protocol:
- __add__(), __mul__(), etc. \(\to\) works with math operators
Context Manager Protocol:
- __enter__() and __exit__() \(\to\) works with the with statement
Array Protocol (NumPy):
- .shape, .dtype, __getitem__ \(\to\) works where NumPy expects array-like
This is why your custom objects can work with built-in functions! A DataSet with __len__ works with len(). A Vector with __add__ works with +. This protocol-based design is central to Python’s flexibility and why scientific libraries integrate so well.
Real-world example: Objects that expose .shape, .dtype, and __getitem__ look array-like to a lot of NumPy-style code. For conversion, NumPy itself relies on protocols such as __array__, which is how PyTorch tensors can be passed to many NumPy functions.
6.6 When to Use Objects vs Functions
Now that you understand HOW to create classes with all their powerful features - properties for validation, special methods for integration, inheritance for code reuse (Chapter 10) - you need wisdom about WHEN to use them. Not every problem needs objects. Creating unnecessary classes can make code harder to understand, not easier. The art of programming lies in choosing the right tool for the right job.
Learning OOP does not mean you should use classes everywhere. Many scientific computations are better expressed as pure functions:
- Kepler’s third law: period = orbital_period(semi_major_axis) (no state needed)
- Unit conversion: kelvin = celsius_to_kelvin(temp) (stateless transformation)
- Array operations: mean = np.mean(data) (let NumPy handle it)
Use classes when:
- You have state that changes over time (particle positions, running statistics)
- You need to enforce invariants (radius > 0, probability \(\in\) [0,1])
- Data and behavior are inseparable (a measurement knows its uncertainty)
Use functions when:
- The operation is stateless (input \(\to\) output, no memory)
- The logic is generic (works on any data, not tied to an entity)
- A class would just wrap a single method
The test: If your class has only __init__ and one method, it’s probably just a function in disguise. If you’re passing the same 5 variables to every function, they might want to be an object.
Here’s how to decide. Note: In Chapter 10, we’ll explore the “is-a” relationship (inheritance) versus “has-a” relationship (composition) in detail. For now, focus on single classes.
Use Objects When:
- Managing State Over Time
The variance formula above (sum_sq - n*mean**2) can suffer from catastrophic cancellation when values are large but variance is small. For production code, use Welford’s algorithm (introduced in Chapter 3) which computes running variance stably. This example prioritizes clarity over numerical robustness.
After 1: mean=1.0, var=0.0
After 2: mean=1.5, var=0.5
After 3: mean=2.0, var=1.0
After 4: mean=2.5, var=1.7
After 5: mean=3.0, var=2.5
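For comparison, here is a sketch of a running-statistics object that uses Welford's update instead of the sum-of-squares formula. This is a swapped-in technique, not the listing that produced the output above, and the class and attribute names are assumptions. It yields the same sample variances for the values 1 through 5.

```python
class RunningStats:
    """Running mean and sample variance using Welford's algorithm."""

    def __init__(self):
        self.n = 0
        self._mean = 0.0
        self._m2 = 0.0          # running sum of squared deviations from the mean

    def update(self, x):
        """Mutation: fold one new value into the running state."""
        self.n += 1
        delta = x - self._mean
        self._mean += delta / self.n
        self._m2 += delta * (x - self._mean)   # uses the updated mean

    @property
    def mean(self):
        return self._mean

    @property
    def variance(self):
        """Sample variance (n - 1 in the denominator); 0.0 for fewer than 2 values."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
history = []
for x in [1, 2, 3, 4, 5]:
    stats.update(x)
    history.append((stats.mean, stats.variance))
    print(f"After {x}: mean={stats.mean:.1f}, var={stats.variance:.1f}")
```

The object is a natural fit here because the state (`n`, the mean, and the accumulated squared deviations) must persist and stay mutually consistent between calls.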
- Modeling Real Entities
Andromeda: v=-300 km/s
Use Functions When:
- Simple Transformations
Temperature: 298.15 K
Period: 1.0 years
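Simple transformations like these read best as plain functions. The sketch below assumes Kepler's third law in solar-system units (years, AU, solar masses), where P^2 = a^3 / M. The parameter names and the contrasting `KelvinConverter` class are illustrative.

```python
def celsius_to_kelvin(temp_c):
    """Stateless unit conversion: input in, output out."""
    return temp_c + 273.15


def orbital_period(semi_major_axis_au, total_mass_msun=1.0):
    """Kepler's third law in years: P^2 = a^3 / M (a in AU, M in solar masses)."""
    return (semi_major_axis_au**3 / total_mass_msun) ** 0.5


# A class that only wraps one method adds ceremony without adding value.
class KelvinConverter:
    def __init__(self, temp_c):
        self.temp_c = temp_c

    def convert(self):
        return self.temp_c + 273.15


print(f"Temperature: {celsius_to_kelvin(25):.2f} K")
print(f"Period: {orbital_period(1.0):.1f} years")

# Same answer, more typing: a function in disguise.
same = KelvinConverter(25).convert() == celsius_to_kelvin(25)
print(f"Class gives same result: {same}")
```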
- Stateless Operations
Mean: 3.0
Std: 1.41
When Travis Oliphant designed NumPy in 2005, he faced this exact decision. Should arrays be simple functions operating on data, or objects with methods?
He chose objects, and it transformed scientific Python:
# If NumPy used only functions (hypothetical API, shown as comments):
# array = create_array([1, 2, 3])
# mean = calculate_mean(array)
# reshaped = reshape_array(array, (3, 1))

# Because NumPy uses objects:
import numpy as np

array = np.array([1, 2, 3])
mean = array.mean()
reshaped = array.reshape(3, 1)

The object approach won because arrays maintain state (shape, dtype, memory layout) and operations naturally belong to the data. This decision made NumPy intuitive and helped it become the foundation of scientific Python. You’re learning to make the same architectural decisions!
Note: For arrays with millions of elements (common in N-body simulations), the performance difference between OOP and functional approaches can matter. NumPy solves this by implementing operations in C while exposing an OOP interface - the best of both worlds!
By 2011, Python astronomy had descended into chaos. Every research group had created their own packages with incompatible interfaces. There was PyFITS for reading FITS files, PyWCS for world coordinate systems, vo.table for Virtual Observatory tables, asciitable for text data, cosmolopy for cosmological calculations, and dozens more. Installing a working astronomy environment was a nightmare - each package had different conventions, different dependencies, and different ways of representing the same concepts.
Erik Tollerud, a graduate student at UC Irvine, described the situation: “I spent more time converting between data formats than doing science” (paraphrased from development discussions). A coordinate might be represented as a tuple in one package, a list in another, and a custom object in a third. Unit conversions were handled differently everywhere. Even reading a simple FITS file could require three different packages that didn’t talk to each other.
At the 2011 Python in Astronomy conference, something remarkable happened. Thomas Robitaille, Perry Greenfield, Erik Tollerud, and developers from competing packages made a radical decision: merge everything into one coherent framework using consistent OOP principles. The design philosophy was simple but powerful:
- If it’s an entity with state and behavior, make it a class (SkyCoord for coordinates, Table for data, Quantity for values with units)
- If it’s a simple transformation, keep it a function (unit conversions, mathematical operations)
- Everything has units, always (no more Mars Climate Orbiter disasters)
- One obvious way to do things (borrowed from Python’s philosophy)
The transformation was remarkable. This incompatible mess:
# Old way - three packages, incompatible outputs
import pyfits
import pywcs
import coords
data = pyfits.getdata('image.fits') # Returns numpy array
header = pyfits.getheader('image.fits') # Returns header object
wcs = pywcs.WCS(header) # Different coordinate object
# Convert pixel to sky - returns plain numpy array, no units!
sky = wcs.wcs_pix2sky([[100, 200]], 1)
# Now convert to different coordinate system - different package!
galactic = coords.Position((sky[0][0], sky[0][1])).galactic()

Became this unified interface:
# Astropy way - one package, consistent OOP
from astropy.io import fits
from astropy.wcs import WCS
from astropy.coordinates import SkyCoord
hdu = fits.open('image.fits')[0] # Unified HDU object
wcs = WCS(hdu.header) # Same package, consistent interface
# Returns SkyCoord object with units and frame info!
sky = wcs.pixel_to_world(100, 200)
galactic = sky.galactic # Simple property access for conversion

The key insight? Objects should model astronomical concepts the way astronomers think about them. A coordinate isn’t just numbers - it’s a position with a reference frame, epoch, and possibly distance. A table isn’t just an array - it has columns with units, metadata, and masks. A quantity isn’t just a float - it has units that propagate through calculations.
Today, Astropy has over 10 million downloads and is astronomy’s most-used package. The Large Synoptic Survey Telescope (now the Vera C. Rubin Observatory), the Event Horizon Telescope that imaged black holes, and the James Webb Space Telescope data pipelines all build on Astropy’s OOP foundation. When you’re deciding whether to use a class or a function, you’re making the same kind of architectural decision that unified an entire scientific community’s software.
[Sources: Robitaille, T., et al. (2013). Astropy: A community Python package for astronomy. Astronomy & Astrophysics, 558, A33. Development history simplified for pedagogical purposes.]
6.7 Debugging Classes
Understanding how to inspect and debug objects is crucial. Python provides powerful introspection tools to examine objects at runtime:
Type: <class '__main__.Instrument'>
Class name: Instrument
Is Instrument?: True
Has 'calibrate'?: True
Wavelength: 500
Public attributes: ['calibrate', 'name', 'wavelength_nm']
Instance __dict__: {'name': 'HARPS', 'wavelength_nm': 500, '_calibrated': False}
dir() has 31 items (includes inherited)
__dict__ has 3 items (instance only)
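Output of this kind can be produced with a handful of built-ins. The sketch below is an assumption about what such a session might look like: the `Instrument` class body is hypothetical and reconstructed only from the attribute names visible in the output above.

```python
class Instrument:
    """Hypothetical instrument class, used only to demonstrate introspection."""

    def __init__(self, name, wavelength_nm):
        self.name = name
        self.wavelength_nm = wavelength_nm
        self._calibrated = False  # leading underscore marks it as internal

    def calibrate(self):
        self._calibrated = True


inst = Instrument("HARPS", 500)

print(f"Type: {type(inst)}")
print(f"Class name: {type(inst).__name__}")
print(f"Is Instrument?: {isinstance(inst, Instrument)}")
print(f"Has 'calibrate'?: {hasattr(inst, 'calibrate')}")
print(f"Wavelength: {getattr(inst, 'wavelength_nm', None)}")

# dir() lists everything reachable from the object; filter out underscore names
public = [name for name in dir(inst) if not name.startswith("_")]
print(f"Public attributes: {public}")

# vars() returns only what is stored on this particular instance
print(f"Instance __dict__: {vars(inst)}")
print(f"dir() has {len(dir(inst))} items (includes inherited)")
print(f"__dict__ has {len(vars(inst))} items (instance only)")
```

The exact `dir()` count depends on the Python version, because the set of methods inherited from `object` has changed between releases.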
Type Hints with Classes (Optional)
Python supports type hints to document expected types, making code clearer:
Observer: J. Smith
Entries: 2
Duration: 2.5 hours
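A type-hinted class that prints output like the above might look like this. The class name `ObservationLog`, its methods, and the target names are hypothetical and chosen only to match the three printed lines.

```python
from typing import List, Tuple


class ObservationLog:
    """Hypothetical observing log illustrating type hints on a class."""

    def __init__(self, observer: str) -> None:
        self.observer: str = observer
        # Each entry is a (target name, exposure time in hours) pair
        self.entries: List[Tuple[str, float]] = []

    def add_entry(self, target: str, hours: float) -> None:
        """Record one observation."""
        self.entries.append((target, hours))

    def total_hours(self) -> float:
        """Total time across all recorded observations."""
        return sum(hours for _, hours in self.entries)


log = ObservationLog("J. Smith")
log.add_entry("M31", 1.5)
log.add_entry("M42", 1.0)

print(f"Observer: {log.observer}")
print(f"Entries: {len(log.entries)}")
print(f"Duration: {log.total_hours():.1f} hours")
```

Python does not enforce these hints at runtime. They document intent and let editors and static checkers such as mypy flag mismatches before the code runs.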
This code has a subtle but critical bug. Can you find it?
class Observatory:
    def __init__(self, name, telescopes=[]):  # Bug here!
        self.name = name
        self.telescopes = telescopes

    def add_telescope(self, telescope):
        """Add a telescope to this observatory."""
        self.telescopes.append(telescope)

# Test the code
keck = Observatory("Keck")
keck.add_telescope("Keck I")
vlt = Observatory("VLT")
vlt.add_telescope("Antu")
print(f"Keck telescopes: {keck.telescopes}")
print(f"VLT telescopes: {vlt.telescopes}")  # Unexpected output!

The bug is the mutable default argument telescopes=[]. Python evaluates a default value once, when the function is defined, so every instance created without an explicit telescopes argument shares the same list! When you add a telescope to one observatory, it appears in all of them.
Fix:
def __init__(self, name, telescopes=None):
    self.name = name
    self.telescopes = telescopes if telescopes is not None else []

This bug has caused real problems in production systems. Always use None as default for mutable arguments, then create a new object in the method.
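To see the difference directly, the buggy and fixed constructors can be run side by side. In this sketch the name `BuggyObservatory` is introduced only to keep both versions in one script.

```python
class BuggyObservatory:
    def __init__(self, name, telescopes=[]):  # default list is created once
        self.name = name
        self.telescopes = telescopes


class Observatory:
    def __init__(self, name, telescopes=None):
        self.name = name
        # A fresh list is created on every call that omits the argument
        self.telescopes = telescopes if telescopes is not None else []


buggy_keck = BuggyObservatory("Keck")
buggy_vlt = BuggyObservatory("VLT")
buggy_keck.telescopes.append("Keck I")
print(buggy_vlt.telescopes)                           # ['Keck I'], leaked from Keck
print(buggy_keck.telescopes is buggy_vlt.telescopes)  # True: one shared list

keck = Observatory("Keck")
vlt = Observatory("VLT")
keck.telescopes.append("Keck I")
print(vlt.telescopes)                                 # []
print(keck.telescopes is vlt.telescopes)              # False: independent lists
```

The `is` comparison is the quickest diagnostic: if two supposedly separate objects report the same list identity, suspect a shared default.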
Main Takeaways
You’ve just made a fundamental leap in how you think about programming. Object-Oriented Programming isn’t just a different syntax—it’s a different mental model. Instead of thinking “what operations do I need to perform on this data?”, you now think “what is this thing and what can it do?” This shift from procedural to object-oriented thinking mirrors how we naturally conceptualize scientific systems. A particle isn’t just three numbers for position; it’s an entity with mass, velocity, and behaviors like moving and colliding. This conceptual alignment makes complex programs more intuitive and maintainable.
The power of OOP becomes clear when managing complexity. That simple Particle class with five attributes and three methods might seem like overkill compared to a dictionary. But when your simulation has thousands of particles, each needing consistent updates, validation, and state tracking, the object-oriented approach guards against the kind of unit mismatch that doomed the Mars Climate Orbiter mission. Properties can enforce consistent units. Methods can ensure that state updates follow physical laws. Special methods make your objects work seamlessly with Python’s syntax. These aren’t just programming conveniences; they’re safety mechanisms against very costly failures. The Therac-25 radiation overdoses and Hubble’s mirror error are reminders of what unchecked assumptions can cost, and encapsulation and validation are how software makes its assumptions explicit and testable.
The historical journey from SIMULA to modern Python reveals how OOP emerged from practical needs. Norwegian scientists needed to simulate factories and harbors, leading them to bundle data with behavior—the birth of objects. This paradigm spread through Smalltalk, C++, and Java, eventually reaching Python where Guido van Rossum’s radical transparency (special methods anyone can implement) democratized programming. When you write __add__ to define addition for your objects, you’re using the same mechanism that made NumPy arrays feel native to Python. This is why scientific libraries integrate so seamlessly—they all follow the same protocols.
But perhaps the most important lesson is knowing when NOT to use objects. Not every function needs to become a method. Not every data structure needs to become a class. Simple calculations should stay as functions. Stateless transformations don’t need objects. The art lies in recognizing when you’re modeling entities with state and behavior (use classes) versus performing operations on data (use functions). The NumPy decision to make arrays objects wasn’t arbitrary—arrays maintain complex state (shape, dtype, memory layout) and operations naturally belong to the data. The Astropy unification succeeded because astronomical concepts map naturally to objects—a coordinate is more than numbers, it’s a position with a reference frame.
Looking ahead, everything in Python’s scientific stack builds on these concepts. NumPy arrays are objects with methods like .mean() and .reshape(). Every Matplotlib plot is an object maintaining state about axes, data, and styling. When you write array.sum() or figure.savefig(), you’re using the same patterns you just learned. More importantly, you can now create your own scientific classes that integrate seamlessly with these tools. You’re not just learning to use objects—you’re learning to think in objects, and that’s a superpower for scientific computing that will serve you throughout your career.
What’s Next: You now have the conceptual tools to understand why NumPy arrays are objects with methods, why properties enforce dtype constraints, and why special methods like __getitem__ enable slicing syntax. In Chapter 7, you’ll see these OOP patterns in action at scale — arrays with millions of elements, operations that complete in milliseconds instead of minutes, and broadcasting rules that eliminate explicit loops. The classes you learned to build here are the foundation; NumPy is where you learn to leverage them for real scientific work.
Definitions
attribute - A variable that belongs to an object. Instance attributes are unique to each object; class attributes are shared by all instances
class - A blueprint or template for creating objects. Defines what attributes and methods objects will have
class attribute - Data shared by all instances of a class, defined directly in the class body
constructor - The __init__ method that initializes new objects when they’re created
duck typing - Python’s philosophy that an object’s suitability is determined by its methods and attributes, not its type
encapsulation - The bundling of data and methods that operate on that data within a single unit (class)
instance - A specific object created from a class. Each instance has its own set of instance attributes
instance attribute - Data unique to each object, defined with self.attribute
method - A function defined inside a class that operates on instances of that class
object - A specific instance of a class containing data (attributes) and behavior (methods)
object-oriented programming - A programming paradigm that organizes code around objects, each bundling data (attributes) with the behavior (methods) that operates on it
property - A special attribute that executes code when accessed or set, created with the @property decorator
self - The first parameter of instance methods, referring to the specific object being operated on
setter - A property method that validates and sets attribute values, defined with @attribute.setter
special method - Methods with double underscores (like __init__, __str__) that define object behavior for built-in operations
static method - A method that doesn’t receive self or cls, defined with @staticmethod
Key Takeaways
✓ Classes combine data and behavior – Objects bundle related attributes and methods, keeping code organized and preventing errors from mismatched data and functions
✓ The self parameter connects methods to objects – It’s automatically passed to methods and refers to the specific instance being operated on
✓ Properties provide smart attributes – Use @property for computed values and validation, ensuring data consistency without explicit method calls
✓ Special methods make objects Pythonic – Implementing __str__, __len__, __add__ lets your objects work naturally with built-in functions and operators
✓ Instance attributes belong to objects, class attributes are shared – Choose instance for object-specific data, class for constants and shared state
✓ Not everything needs to be a class – Use objects for stateful entities with behavior, functions for simple calculations and transformations
✓ Properties prevent unit disasters – Validation in setters catches errors immediately, preventing Mars Climate Orbiter-style catastrophes
✓ Everything in Python is an object – Even functions and modules are objects, making Python’s object model consistent and powerful
✓ Duck typing enables flexibility – Objects work based on capabilities (methods) not types, allowing seamless integration with Python’s protocols
✓ OOP emerged from practical simulation needs – SIMULA’s factory simulations birthed the paradigm that now powers scientific computing
Quick Reference Tables
Class Definition Syntax
| Element | Syntax | Example |
|---|---|---|
| Define class | class Name: | class Particle: |
| Constructor | def __init__(self): | def __init__(self, mass): |
| Instance attribute | self.attr = value | self.mass = 1.67e-24 |
| Class attribute | attr = value | SPEED_OF_LIGHT = 3e10 |
| Instance method | def method(self): | def velocity(self): |
| Property getter | @property | @property def energy(self): |
| Property setter | @attr.setter | @energy.setter |
| Class method | @classmethod | @classmethod def from_file(cls): |
| Static method | @staticmethod | @staticmethod def validate(): |
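The elements in the table can be combined in one small class. This is a hedged sketch, not a canonical Particle implementation: the `rest_energy` property and the `from_string` constructor are hypothetical, and `from_string` stands in for the table's `from_file` so that the example needs no external file.

```python
class Particle:
    SPEED_OF_LIGHT = 3e10  # class attribute: cm/s (CGS), shared by all instances

    def __init__(self, mass):
        self.mass = mass  # assignment goes through the property setter below

    @property
    def mass(self):
        """Mass in grams."""
        return self._mass

    @mass.setter
    def mass(self, value):
        if not self.validate(value):
            raise ValueError(f"mass must be positive, got {value}")
        self._mass = value

    @property
    def rest_energy(self):
        """Rest energy E = m c^2 in erg (computed on access, read-only)."""
        return self.mass * self.SPEED_OF_LIGHT**2

    @classmethod
    def from_string(cls, text):
        """Alternative constructor: build a Particle from text such as '1.67e-24'."""
        return cls(float(text))

    @staticmethod
    def validate(mass):
        """Pure check that needs neither the instance nor the class."""
        return mass > 0


proton = Particle.from_string("1.67e-24")
print(proton.mass)
print(f"{proton.rest_energy:.3e} erg")
```

Because `__init__` assigns through the property, validation runs at construction time as well as on later assignments, so an invalid particle can never exist.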
Essential Special Methods
| Method | Purpose | Called By |
|---|---|---|
| __init__ | Initialize object | MyClass() |
| __str__ | Human-readable string | str(obj), print(obj) |
| __repr__ | Developer string | repr(obj) |
| __len__ | Get length | len(obj) |
| __getitem__ | Get by index | obj[i] |
| __setitem__ | Set by index | obj[i] = val |
| __contains__ | Check membership | x in obj |
| __iter__ | Make iterable | for x in obj |
| __add__ | Addition | obj1 + obj2 |
| __eq__ | Equality test | obj1 == obj2 |
| __bool__ | Truth value | if obj:, bool(obj) |
| __call__ | Make callable | obj() |
| __abs__ | Absolute value | abs(obj) |
| __float__ | Convert to float | float(obj) |
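Several of these methods are easiest to understand together. The container below is a hypothetical sketch (the `Catalog` class and its contents are invented for illustration) showing how each special method is triggered by ordinary Python syntax.

```python
class Catalog:
    """Hypothetical container of target names, illustrating special methods."""

    def __init__(self, names=None):
        self._names = list(names) if names is not None else []

    def __repr__(self):
        return f"Catalog({self._names!r})"

    def __len__(self):
        return len(self._names)

    def __getitem__(self, index):
        return self._names[index]

    def __contains__(self, name):
        return name in self._names

    def __iter__(self):
        return iter(self._names)

    def __add__(self, other):
        if not isinstance(other, Catalog):
            return NotImplemented  # lets Python try the other operand
        return Catalog(self._names + other._names)

    def __eq__(self, other):
        if not isinstance(other, Catalog):
            return NotImplemented
        return self._names == other._names


north = Catalog(["M31", "M33"])
south = Catalog(["LMC"])
combined = north + south            # calls __add__

print(repr(combined))               # Catalog(['M31', 'M33', 'LMC'])
print(len(combined))                # 3
print(combined[0])                  # M31
print("LMC" in combined)            # True
print([name for name in combined])  # ['M31', 'M33', 'LMC']
print(bool(Catalog()))              # False: with no __bool__, Python uses __len__
```

Returning `NotImplemented` rather than raising lets Python fall back to the other operand's method, which is the conventional way to write binary special methods.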
When to Use Classes vs Functions
| Use Classes When | Use Functions When |
|---|---|
| Managing state over time | Simple transformations |
| Modeling real entities | Stateless operations |
| Operations belong to data | One-way data flow |
| Need data validation | No state to maintain |
| Complex initialization | Simple input \(\to\) output |
| Multiple related methods | Single operation |
Debugging Object Tools
| Function | Purpose | Example |
|---|---|---|
| type(obj) | Get object’s class | type(particle) |
| isinstance(obj, cls) | Check if object is instance | isinstance(p, Particle) |
| hasattr(obj, 'attr') | Check if attribute exists | hasattr(p, 'mass') |
| getattr(obj, 'attr') | Get attribute safely | getattr(p, 'mass', 0) |
| setattr(obj, 'attr', val) | Set attribute | setattr(p, 'mass', 1.0) |
| dir(obj) | List all accessible attributes | dir(particle) |
| vars(obj) or obj.__dict__ | Get instance attributes only | vars(particle) |
| help(obj) | Get documentation | help(Particle) |
Next Chapter Preview
In Chapter 7: NumPy Fundamentals, you’ll see how the OOP concepts you just learned power the foundation of scientific Python. NumPy arrays aren’t just data containers—they’re sophisticated objects with methods like .reshape(), .mean(), and .dot(). You’ll discover how NumPy combines the intuitive OOP interface you now understand with blazing-fast C implementations, achieving the best of both worlds. We’ll explore array creation, indexing, broadcasting, and vectorization—concepts that eliminate explicit loops and make calculations orders of magnitude faster. Most importantly, you’ll see how NumPy’s object-oriented design enables the entire scientific Python ecosystem, from plotting with Matplotlib to machine learning with scikit-learn. The objects you just learned to create? They’re the same pattern that processes terabytes of astronomical data and simulates the universe’s evolution!
References
- Mars Exploration Rover Spirit Recovery (2004)
  - NASA JPL. (2004). Mars Exploration Rover Mission: Spirit Anomaly Report. Jet Propulsion Laboratory.
  - Reeves, G., & Neilson, T. (2005). “The Mars Rover Spirit FLASH Anomaly.” IEEE Aerospace Conference Proceedings.
- SIMULA and OOP Origins (1960s)
  - Nygaard, K., & Dahl, O. J. (1978). “The development of the SIMULA languages.” ACM SIGPLAN Notices, 13(8), 245-272.
  - Holmevik, J. R. (1994). “Compiling SIMULA: A Historical Study of Technological Genesis.” IEEE Annals of the History of Computing, 16(4), 25-37.
- Therac-25 Radiation Accidents (1985-1987)
  - Leveson, N.G. & Turner, C.S. (1993). “An Investigation of the Therac-25 Accidents.” IEEE Computer, 26(7), 18-41.
- Hubble Space Telescope Mirror Error (1990)
  - Allen, L. (1990). The Hubble Space Telescope Optical Systems Failure Report. NASA-TM-103443.
  - Chaisson, E. (1994). The Hubble Wars. New York: HarperCollins. ISBN 0-06-017114-6.
- Mars Climate Orbiter Loss (1999)
  - Stephenson, A. G. et al. (1999). Mars Climate Orbiter Mishap Investigation Board Report. NASA.
  - Oberg, J. (1999). “Why the Mars Probe Went Off Course.” IEEE Spectrum, 36(12), 34-39.
- NumPy Design Decisions (2005)
  - Oliphant, T. E. (2006). A guide to NumPy (Vol. 1). USA: Trelgol Publishing.
  - Van Der Walt, S., Colbert, S. C., & Varoquaux, G. (2011). “The NumPy array: a structure for efficient numerical computation.” Computing in Science & Engineering, 13(2), 22-30.
- Astropy Unification (2011-2013)
  - Robitaille, T., et al. (2013). “Astropy: A community Python package for astronomy.” Astronomy & Astrophysics, 558, A33.
  - Greenfield, P. (2011). “What Python Can Do for Astronomy.” Proceedings of the 20th Annual Python in Science Conference.
- Python Language Design
  - Van Rossum, G. (1996). Foreword for “Programming Python” (1st ed.). O’Reilly Media.
  - Van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual. CreateSpace.
  - Martelli, A. (2000). “The Python ‘Duck Typing’ Principle.” comp.lang.python newsgroup archives.
- Object-Oriented Design Principles
  - Kay, A. (1993). “The early history of Smalltalk.” ACM SIGPLAN Notices, 28(3), 69-95.
  - Kay, A. (2003). Email correspondence to Stefan Ram on the definition of object-oriented programming.
- Python OOP Resources
  - Lutz, M. (2013). Learning Python (5th ed.). O’Reilly Media.
  - Ramalho, L. (2015). Fluent Python. O’Reilly Media.
  - Beazley, D., & Jones, B. K. (2013). Python Cookbook (3rd ed.). O’Reilly Media.