Reproducibility Configuration¶
Overview¶
Reproducibility settings control random number generation and deterministic behavior across the entire training pipeline. These settings are crucial for debugging, research, and comparing experiments.
Configuration Parameters¶
seed¶
- Type: Integer
- Default: 42
- Description: Random seed for all random number generators (Python, NumPy, PyTorch, CUDA)
- Purpose: Ensures reproducibility of experiments
Usage¶
What Gets Seeded¶
- Python's random module
- NumPy's np.random
- PyTorch's random number generators (CPU)
- All CUDA devices
- DataLoader worker processes
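A minimal sketch of what seeding these sources might look like. The helper name seed_everything is hypothetical and is not necessarily what the framework calls internally; NumPy and PyTorch are seeded only if they are installed.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed the generators listed above (hypothetical helper, not the framework's API)."""
    random.seed(seed)  # Python's random module

    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global generator
    except ImportError:
        pass  # NumPy not installed

    try:
        import torch
        torch.manual_seed(seed)           # PyTorch generators
        torch.cuda.manual_seed_all(seed)  # explicit for all CUDA devices; ignored without a GPU
    except ImportError:
        pass  # PyTorch not installed


seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)  # True: the same seed yields the same draw
```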
When to Change¶
Use Different Seeds:
- Running multiple independent experiments for statistical significance
- Training ensemble models (each model should use a different seed)
- Exploring variance in results
Keep Same Seed:
- Debugging training issues (consistent behavior across runs)
- Comparing different architectures or hyperparameters fairly
- Reproducing published results
Example¶
# Train 5 models with different seeds for ensemble
ml-train --seed 42
ml-train --seed 123
ml-train --seed 456
ml-train --seed 789
ml-train --seed 101112
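The same five runs could be generated from a script rather than typed by hand. This is only a sketch: it builds and prints the commands, and it assumes nothing about ml-train beyond the --seed flag shown above.

```python
import shlex

seeds = [42, 123, 456, 789, 101112]

# One command per ensemble member; --seed is the only flag taken from the example above.
commands = [["ml-train", "--seed", str(seed)] for seed in seeds]

for cmd in commands:
    # Dry run: print each command. Replace with subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```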
deterministic¶
- Type: Boolean
- Default: false
- Description: Enable fully deterministic operations at the cost of performance
- Purpose: Bit-exact reproducibility across runs on the same hardware and software stack
Usage¶
deterministic: false # Fast, approximately reproducible (default)
# OR
deterministic: true # Slower, fully reproducible
Technical Details¶
When false (default):
- Uses cuDNN benchmark mode to find fastest algorithms
- Algorithms may be non-deterministic (slight variations across runs)
- Faster training (baseline speed: 1.0x)
- Approximate reproducibility (same seed → similar results)
When true:
- Calls torch.use_deterministic_algorithms(True), which forces PyTorch to use deterministic algorithms and raises an error for operations that have no deterministic implementation
- Disables cuDNN benchmark mode
- Slower training (typically 0.7-0.9x speed)
- Exact reproducibility (same seed → identical results)
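The two modes can be sketched roughly as follows. The helper name configure_determinism is hypothetical. Setting torch.backends.cudnn.deterministic and the CUBLAS_WORKSPACE_CONFIG environment variable are assumptions based on PyTorch's general reproducibility guidance, not something this page confirms the framework does.

```python
import os


def configure_determinism(deterministic: bool) -> dict:
    """Apply the settings described above and return them (hypothetical helper)."""
    settings = {
        "cudnn_benchmark": not deterministic,          # benchmark mode only when non-deterministic
        "cudnn_deterministic": deterministic,          # assumption: not stated in this page
        "use_deterministic_algorithms": deterministic,
    }
    if deterministic:
        # Assumption: some cuBLAS operations need this set before CUDA is
        # initialized when deterministic algorithms are enforced.
        os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")

    try:
        import torch
    except ImportError:
        return settings  # PyTorch not installed; nothing to apply

    torch.backends.cudnn.benchmark = settings["cudnn_benchmark"]
    torch.backends.cudnn.deterministic = settings["cudnn_deterministic"]
    torch.use_deterministic_algorithms(settings["use_deterministic_algorithms"])
    return settings


print(configure_determinism(False))
```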
Performance Trade-off¶
| Mode | Speed | Reproducibility | Use Case |
|---|---|---|---|
| false | Fast (1.0x) | Approximate | Production, general training |
| true | Slower (0.7-0.9x) | Bit-exact | Debugging, research papers |
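To see what the speed column means in wall-clock terms, divide the baseline time by the relative speed. The 10-hour baseline below is a made-up figure for illustration only.

```python
def wall_clock_hours(baseline_hours: float, relative_speed: float) -> float:
    """Duration of a run at `relative_speed`, where 1.0 is the non-deterministic baseline."""
    return baseline_hours / relative_speed


baseline = 10.0  # hypothetical 10-hour run with deterministic: false
for speed in (0.9, 0.7):
    print(f"{speed}x speed -> {wall_clock_hours(baseline, speed):.2f} h")
# 0.9x speed -> 11.11 h
# 0.7x speed -> 14.29 h
```

In other words, a 0.7-0.9x speed corresponds to roughly 11% to 43% more training time.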
When to Use true (Deterministic Mode)¶
✅ Debugging training issues
- Ensures consistent behavior when troubleshooting
- Makes it easier to identify problems
✅ Comparing optimization algorithms
- Fair comparison requires identical conditions
- Eliminates randomness as a confounding factor
✅ Publishing reproducible research
- Academic papers should provide reproducible results
- Enables others to verify your findings
✅ Legal/compliance requirements
- Some industries require deterministic model training
- Audit trails need exact reproducibility
When to Use false (Non-Deterministic Mode - Default)¶
✅ Production training
- Speed matters more than exact reproducibility
- Approximate reproducibility is sufficient
✅ Hyperparameter search
- Running many experiments quickly
- Exact reproducibility not critical
✅ General experimentation
- Exploring ideas and iterating fast
- Acceptable variation across runs
✅ Large-scale training
- Long training times make speed critical
- Avoiding the 10-30% slowdown of deterministic mode is significant
Complete Examples¶
Example 1: Research Paper (Full Reproducibility)¶
Why:
- Readers can reproduce exact results
- Eliminates randomness concerns
- Speed is less critical for one-time training
Example 2: Production Training (Fast)¶
Why:
- Need to train many models quickly
- Approximate reproducibility sufficient
- Speed is critical for iteration
Example 3: Hyperparameter Search¶
Why:
- Running hundreds of experiments
- Need quick feedback
- Statistical trends more important than exact numbers
Example 4: Debugging¶
Why:
- Need identical behavior across runs
- Easier to isolate problems
- Can step through code deterministically
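The four scenarios above can be summarized as presets. This is an illustrative sketch only: the preset names are hypothetical, the seed value 42 is simply the documented default, and where these keys sit inside a full config file is an assumption.

```python
# Hypothetical presets; only the `seed` and `deterministic` parameters come from this page.
PRESETS = {
    "research_paper": {"seed": 42, "deterministic": True},
    "production": {"seed": 42, "deterministic": False},
    "hyperparameter_search": {"seed": 42, "deterministic": False},
    "debugging": {"seed": 42, "deterministic": True},
}


def to_yaml(settings: dict) -> str:
    """Render a flat settings dict as YAML-style lines (booleans lowercased)."""
    lines = []
    for key, value in settings.items():
        text = str(value).lower() if isinstance(value, bool) else str(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines)


print(to_yaml(PRESETS["research_paper"]))
# seed: 42
# deterministic: true
```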
Implementation Details¶
What the Framework Does¶
When you set these parameters, the framework:
- Seeds all random number generators
- Configures cuDNN behavior
- Seeds DataLoader workers
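For the DataLoader step, one common approach is to derive a per-worker seed from the base seed and pass it through worker_init_fn. The sketch below assumes that approach; seed_worker and the seed derivation are hypothetical and may differ from what the framework actually does.

```python
import random
from functools import partial


def seed_worker(worker_id: int, base_seed: int = 42) -> None:
    """Give each DataLoader worker its own reproducible seed (hypothetical helper)."""
    worker_seed = (base_seed + worker_id) % 2**32  # NumPy seeds must fit in 32 bits
    random.seed(worker_seed)
    try:
        import numpy as np
        np.random.seed(worker_seed)
    except ImportError:
        pass  # NumPy not installed


# A top-level function wrapped in partial stays picklable for worker processes.
worker_init_fn = partial(seed_worker, base_seed=42)

# With PyTorch installed, this would be passed to the loader together with a
# seeded generator (`dataset` is a placeholder for your own dataset):
#   g = torch.Generator()
#   g.manual_seed(42)
#   DataLoader(dataset, num_workers=4, worker_init_fn=worker_init_fn, generator=g)
```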
Limitations¶
Even with deterministic: true, exact reproducibility requires:
- Same PyTorch version
- Same CUDA version
- Same GPU hardware
- Same operating system (potentially)
- Single-GPU training (multi-GPU has additional challenges)
Bottom line: deterministic: true helps significantly, but 100% reproducibility across different hardware/software is challenging.
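Because these factors all matter, it can help to record them alongside each run. The following is a sketch of such a snapshot; the function name and the choice of fields are assumptions, and the PyTorch details are collected only if PyTorch is installed.

```python
import json
import platform
import sys


def environment_snapshot() -> dict:
    """Collect the version details that affect bit-exact reproducibility."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
    }
    try:
        import torch
        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
        if torch.cuda.is_available():
            info["gpu"] = torch.cuda.get_device_name(0)
            info["gpu_count"] = torch.cuda.device_count()
    except ImportError:
        info["torch"] = None  # PyTorch not installed
    return info


# Store this next to the seed in your experiment log.
print(json.dumps(environment_snapshot(), indent=2))
```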
Best Practices¶
- Always set a seed (even if deterministic: false)
  - Enables approximate reproducibility
  - Useful for debugging
- Document your seed in experiment logs
  - Makes it possible to reproduce later
  - Include in paper/report methods sections
- Use deterministic mode for critical work
  - Research papers
  - Production models (after hyperparameter search)
  - Debugging
- Use non-deterministic mode for exploration
  - Faster iteration
  - Hyperparameter search
  - Prototyping
- Test reproducibility
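One simple way to test reproducibility is to run the same short job twice with the same seed and compare the outputs exactly. In the sketch below, run_once is a stand-in for a real training run (it only draws pseudo-random numbers), so treat it as a pattern rather than the framework's own check.

```python
import random


def run_once(seed: int, steps: int = 5) -> list:
    """Stand-in for a short training run: returns a list of pseudo 'loss' values."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]


def is_reproducible(seed: int) -> bool:
    """Run twice with the same seed and compare the results exactly."""
    return run_once(seed) == run_once(seed)


print(is_reproducible(42))  # True
```

With a real training run, the comparison would typically be on logged losses or final weights, and exact equality is expected only with deterministic: true.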
Related Configuration¶
- CLI Overrides - How to set seed via command line
- Training Configuration - Other training parameters
- Examples - See reproducibility in complete configs