Signal Simulation with LALSuite
In gravitational wave (GW) research, generating synthetic data is a prerequisite for training machine learning models. This section covers the use of LALSuite (LIGO Scientific Collaboration Algorithm Library) to simulate Binary Black Hole (BBH) waveforms and detector noise.
Overview
The bootcamp uses the data_prep_bbh.py module to interface with lalsimulation. This module automates:
- Generating relativistic waveforms based on physical parameters (mass, spin, distance).
- Projecting signals onto specific detector networks (e.g., H1, L1).
- Simulating colored detector noise using Power Spectral Densities (PSDs).
- Injecting signals into noise at specific Signal-to-Noise Ratios (SNR).
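The PSD and SNR steps in the list above are linked through the standard definition of the optimal matched-filter SNR. It is given here as a reference only: the exact frequency limits and normalization used inside data_prep_bbh.py are not shown in this section, so the correspondence is an assumption. For a signal with Fourier transform $\tilde{h}(f)$ and one-sided noise PSD $S_n(f)$:

```latex
\rho_{\mathrm{opt}}^{2}
  = 4 \int_{f_{\min}}^{f_{\max}} \frac{|\tilde{h}(f)|^{2}}{S_n(f)} \, \mathrm{d}f,
\qquad
\rho_{\mathrm{net}}^{2} = \sum_{d} \rho_{\mathrm{opt},d}^{2}
```

The second expression combines the per-detector values (index $d$ running over, e.g., H1 and L1) in quadrature into a network SNR.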
Core Simulation Workflow
1. Generating Power Spectral Density (PSD)
To simulate realistic detector noise, we first define the sensitivity curve of the instrument. The gen_psd function supports several standard sensitivity configurations.
from data_prep_bbh import gen_psd
fs = 8192 # Sampling frequency (Hz)
T_obs = 1 # Observation duration (seconds)
# Generate an Advanced LIGO Design Sensitivity PSD for the Hanford detector
psd = gen_psd(fs, T_obs, op='AdvDesign', det='H1')
Commonly used op (Operating Point) values:
- AdvDesign: Advanced LIGO design sensitivity.
- AdvEarlyLow/AdvEarlyHigh: Early-stage Advanced LIGO configurations.
- EinsteinTelescopeP1600143: Future third-generation detector sensitivity.
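The fs and T_obs arguments fix the discrete grid on which any PSD for the segment is sampled. The sketch below does not call gen_psd (its internals are not shown here); it only illustrates the standard discrete Fourier relations, and the helper name frequency_grid is hypothetical.

```python
def frequency_grid(fs, T_obs):
    """Return basic sampling quantities for a segment of T_obs seconds at fs Hz."""
    n_samples = int(fs * T_obs)      # number of time-domain samples
    df = 1.0 / T_obs                 # frequency resolution (Hz)
    f_nyquist = fs / 2.0             # highest representable frequency (Hz)
    n_freqs = n_samples // 2 + 1     # bins in a one-sided spectrum (DC to Nyquist)
    return n_samples, df, f_nyquist, n_freqs


n_samples, df, f_nyquist, n_freqs = frequency_grid(fs=8192, T_obs=1)
print(n_samples, df, f_nyquist, n_freqs)  # 8192 1.0 4096.0 4097
```

With fs = 8192 Hz and T_obs = 1 s, the segment therefore resolves frequencies in 1 Hz steps up to 4096 Hz.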
2. Simulating Waveforms with sim_data
The high-level interface for generating a complete dataset is the sim_data function. It handles parameter sampling, waveform generation, and noise injection.
from data_prep_bbh import sim_data
# Configuration
fs = 8192
T_obs = 1
target_snr = 20
detectors = ['H1', 'L1']
# Generate 1000 samples
strains, labels = sim_data(
    fs=fs,
    T_obs=T_obs,
    snr=target_snr,
    detectors=detectors,
    Nnoise=1,
    size=1000,
    mdist='astro'
)
Parameter Reference
| Parameter | Type | Description |
| :--- | :--- | :--- |
| fs | int | Sampling rate in Hz. Typical values: 2048, 4096, 8192. |
| T_obs | int | The duration of the data segment in seconds. |
| snr | float | The desired optimal Signal-to-Noise Ratio for the injected signal. |
| detectors | list | List of detector prefixes (e.g., ['H1', 'L1', 'V1']). |
| mdist | str | Mass distribution strategy: astro (astrophysical), gh (heavy), or metric. |
| size | int | Number of unique gravitational wave signals to generate. |
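To make the snr parameter concrete, the following sketch rescales a signal to a target optimal SNR before adding it to noise. It is a simplified illustration, not the method used by sim_data: it assumes white Gaussian noise with a known standard deviation, for which the optimal SNR reduces to the signal norm divided by sigma, whereas the real pipeline works with colored noise weighted by a PSD. The function names and the toy waveform are hypothetical.

```python
import math
import random


def optimal_snr_white(signal, sigma):
    """Optimal matched-filter SNR of `signal` in white Gaussian noise of std `sigma`."""
    return math.sqrt(sum(s * s for s in signal)) / sigma


def inject_at_snr(signal, noise, sigma, target_snr):
    """Rescale `signal` to `target_snr` and add it to `noise` sample by sample."""
    scale = target_snr / optimal_snr_white(signal, sigma)
    scaled = [scale * s for s in signal]
    data = [n + s for n, s in zip(noise, scaled)]
    return data, scaled


random.seed(0)
n = 1024
sigma = 1.0
# Toy sine-Gaussian burst standing in for a real BBH waveform (assumption).
signal = [math.exp(-((i - n / 2) / 50.0) ** 2) * math.sin(0.2 * i) for i in range(n)]
noise = [random.gauss(0.0, sigma) for _ in range(n)]

data, scaled = inject_at_snr(signal, noise, sigma, target_snr=20.0)
print(round(optimal_snr_white(scaled, sigma), 6))  # 20.0
```

Because the SNR is linear in the signal amplitude, a single multiplicative factor is enough to hit the target value.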
Data Augmentation and Noise
The simulation engine allows for multiple noise realizations per signal, a common data augmentation technique in deep learning.
- Nnoise: Increasing this parameter generates multiple independent noise segments for each physical waveform.
- Time Shifting: The convert_beta helper function is used to randomly slide the waveform within the observation window, so that the model does not overfit to a specific merger timestamp.
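The two augmentation ideas above can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the module's implementation: place_waveform and augment are hypothetical names, the beta range (a fraction of the window in which the peak may land) is an illustrative choice, and the noise is white rather than PSD-colored.

```python
import random


def place_waveform(waveform, n_window, peak_index, beta=(0.75, 0.95), rng=random):
    """Embed `waveform` in a zero-padded window so that its peak lands at a
    random position drawn uniformly from the fraction range `beta` of the window."""
    lo = int(beta[0] * n_window)
    hi = int(beta[1] * n_window)
    target_peak = rng.randint(lo, hi)   # inclusive on both ends
    offset = target_peak - peak_index   # shift applied to every sample index
    window = [0.0] * n_window
    for i, value in enumerate(waveform):
        j = i + offset
        if 0 <= j < n_window:           # samples shifted outside the window are dropped
            window[j] = value
    return window, target_peak


def augment(waveform, n_window, peak_index, n_noise, sigma=1.0, seed=0):
    """Return `n_noise` noisy copies of one waveform, each with a fresh
    time shift and an independent white-noise realization."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_noise):
        shifted, peak = place_waveform(waveform, n_window, peak_index, rng=rng)
        noisy = [s + rng.gauss(0.0, sigma) for s in shifted]
        samples.append((noisy, peak))
    return samples


# Toy triangular pulse with its peak at index 4.
pulse = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
samples = augment(pulse, n_window=100, peak_index=4, n_noise=3)
print([peak for _, peak in samples])
```

Each returned sample shares the same underlying pulse but differs in both noise and peak position, which is the property the network needs to see during training.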
Implementation Details
The underlying simulation uses the lalsimulation C library. Key physical parameters are encapsulated in the bbhparams class:
class bbhparams:
    def __init__(self, mc, M, eta, m1, m2, ra, dec, iota, phi, psi, idx, fmin, snr, SNR):
        self.m1 = m1    # Mass of first object
        self.m2 = m2    # Mass of second object
        self.ra = ra    # Right Ascension
        self.dec = dec  # Declination
        self.snr = snr  # Calculated SNR
        # ... and other orbital parameters
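The constructor takes mc, M, and eta alongside the component masses. Assuming these follow the usual conventions (chirp mass, total mass, and symmetric mass ratio), they are all determined by m1 and m2. The helper below is a hypothetical illustration of those standard relations, not code from data_prep_bbh.py.

```python
def mass_parameters(m1, m2):
    """Return (M, eta, mc) for component masses m1 and m2 given in the same units."""
    M = m1 + m2                      # total mass
    eta = (m1 * m2) / M ** 2         # symmetric mass ratio, at most 0.25
    mc = M * eta ** (3.0 / 5.0)      # chirp mass
    return M, eta, mc


M, eta, mc = mass_parameters(30.0, 30.0)
print(M, eta, round(mc, 3))
```

For an equal-mass system eta reaches its maximum of 0.25, and the chirp mass is roughly 0.435 times the total mass.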
Integration with PyTorch
In the bootcamp's main.py, these simulations are wrapped in a DatasetGenerator. This allows for "on-the-fly" data generation during training, so large volumes of synthetic strain data do not have to be stored on disk.
# Example of integration in training loop
from torch.utils.data import DataLoader
# DatasetGenerator is defined in the bootcamp's main.py

dataset = DatasetGenerator(
    fs=8192,
    T=1,
    snr=20,
    nsample_perepoch=100
)
dataloader = DataLoader(dataset, batch_size=32)
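The idea behind on-the-fly generation can be shown without any framework. The class below is a hypothetical stand-in, not the bootcamp's DatasetGenerator (whose implementation is not shown in this section): it implements only the map-style protocol (__len__ and __getitem__), synthesizes a toy signal in white noise on every access, and returns plain lists. In a PyTorch setting such a class would typically subclass torch.utils.data.Dataset and return tensors.

```python
import math
import random


class OnTheFlyDataset:
    """Map-style dataset that synthesizes each sample when it is requested.

    Hypothetical illustration: toy sinusoid plus unit-variance white noise.
    """

    def __init__(self, fs, T, snr, nsample_perepoch, seed=0):
        self.n = int(fs * T)                      # samples per segment
        self.snr = snr
        self.nsample_perepoch = nsample_perepoch
        self.rng = random.Random(seed)

    def __len__(self):
        return self.nsample_perepoch

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        label = self.rng.randint(0, 1)            # 1: signal present, 0: noise only
        strain = [self.rng.gauss(0.0, 1.0) for _ in range(self.n)]
        if label:
            burst = [math.sin(0.3 * i) for i in range(self.n)]
            norm = math.sqrt(sum(b * b for b in burst))
            # In unit-variance white noise, a signal of norm `snr` has optimal SNR `snr`.
            strain = [x + self.snr * b / norm for x, b in zip(strain, burst)]
        return strain, label


dataset = OnTheFlyDataset(fs=256, T=1, snr=20, nsample_perepoch=100)
strain, label = dataset[0]
print(len(dataset), len(strain), label)
```

Nothing is cached between calls, so memory use stays at one segment per access regardless of how many samples an epoch contains.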