Documentation

Tutorials & Learning

These tutorials form a learning path of Jupyter notebooks that covers the complete pipeline, from installation through training to inference. Follow the step-by-step guides to learn exoplanet detection with machine learning.

Learning Path

Our tutorials are organized into progressive phases, each building on the previous one. Start with Phase 1 if you're new to ExoBengal, or jump to advanced topics if you're already familiar with the basics.

  • Phase 1: Getting Started (15-30 min) - Basic setup and first prediction
  • Phase 2: Understanding the Data (30-45 min) - Exoplanet parameters and their meanings
  • Phase 3: Training Your First Model (45-60 min) - Train a Random Forest classifier
  • Phase 4: Exploring Different Algorithms (60-90 min) - Compare all four models
  • Phase 5: Hyperparameter Tuning (90-120 min) - Optimize model performance
  • Phase 6: Advanced Topics (120+ min) - Batch predictions, ensemble methods
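To give a feel for what Phase 3 involves, here is a minimal sketch of training and evaluating a Random Forest classifier with scikit-learn. It does not use the ExoBengal API, which is not shown on this page. The data is synthetic (generated with make_classification) and stands in for the real exoplanet table, so the feature count, label meaning, and hyperparameters are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the exoplanet table: 500 rows, 8 numeric features,
# and a binary label. This is NOT real archive data.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=42
)

# Hold out 20% of the rows for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Illustrative hyperparameters, not the tutorial's tuned values.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)               # hard class labels
y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

cm = confusion_matrix(y_test, y_pred)
auc = roc_auc_score(y_test, y_score)

print(classification_report(y_test, y_pred))
print("Confusion matrix:")
print(cm)
print(f"AUC-ROC: {auc:.3f}")
```

The same three outputs (classification report, confusion matrix, AUC-ROC) are the evaluation metrics the tutorials cover, so this sketch can serve as a reference point when reading the notebook results.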

Prerequisites

  • Python 3.8+ required
  • Dependencies: numpy, pandas, matplotlib, seaborn, scikit-learn, joblib, tensorflow
  • Data: NASA Exoplanet Archive cumulative table (cumulative.csv)
  • Hardware: 4 GB RAM minimum, 8 GB recommended
  • For Colab: Google account for Drive access
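The prerequisites above can be checked before opening a notebook. The following is a minimal sketch using only the standard library. The helper names python_ok and missing_packages are hypothetical and not part of ExoBengal. Note that scikit-learn is installed under that name but imported as sklearn.

```python
import importlib.util
import sys

# Maps pip package names (from the prerequisites list) to import names.
# Only scikit-learn differs: it is imported as "sklearn".
REQUIRED = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "joblib": "joblib",
    "tensorflow": "tensorflow",
}


def python_ok(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def missing_packages(required=REQUIRED):
    """Return the pip names of packages that cannot be found for import."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only looks up whether each module can be located; it does not import TensorFlow, so it runs quickly even on low-memory machines.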

Quick Start

Local Setup

cd ExoBengal/tutorial
jupyter lab
# Open test.ipynb

Google Colab

Click the "Open in Colab" badge in pip_test.ipynb, mount Google Drive when prompted, and run the pip install cell.

What You'll Learn

  • How to set up ExoParams with exoplanet parameters
  • Training workflows for all four ML models
  • Making predictions with trained models
  • Comparing model performance and outputs
  • Understanding evaluation metrics (classification reports, confusion matrices, AUC-ROC)
  • Calculating and interpreting Earth Similarity Index (ESI)
  • Hyperparameter tuning for better accuracy
  • Batch predictions and ensemble methods
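As background for the ESI topic in the list above, here is a minimal sketch of one commonly used two-parameter form of the Earth Similarity Index, based on planetary radius and stellar flux in Earth units. Whether ExoBengal uses this form or the multi-parameter weighted product is not stated on this page, so treat the formula choice and the function name earth_similarity_index as assumptions.

```python
import math


def earth_similarity_index(radius, flux):
    """Radius-flux ESI. Inputs are in Earth units (Earth = 1.0 for both).

    Assumed form: ESI = 1 - sqrt(0.5 * [((S-1)/(S+1))^2 + ((R-1)/(R+1))^2])
    """
    if radius <= 0 or flux <= 0:
        raise ValueError("radius and flux must be positive")
    radius_term = ((radius - 1.0) / (radius + 1.0)) ** 2
    flux_term = ((flux - 1.0) / (flux + 1.0)) ** 2
    return 1.0 - math.sqrt(0.5 * (radius_term + flux_term))


# A planet identical to Earth scores exactly 1.0.
print(earth_similarity_index(1.0, 1.0))

# A planet with twice Earth's radius and half its flux scores lower.
print(round(earth_similarity_index(2.0, 0.5), 3))
```

Values lie between 0 and 1, with values closer to 1 meaning the planet is more Earth-like in these two properties only. A high ESI says nothing about habitability by itself.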

Common Issues

  • Data file not found: Ensure cumulative.csv is in the data/ directory
  • TensorFlow installation issues: Install tensorflow-cpu on CPU-only machines
  • Memory errors: Reduce the CNN batch size or use a subset of the data
  • Slow training: Enable the GPU runtime in Colab or reduce the number of epochs
  • Import errors: Reinstall the exobengal package
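The most common of these issues, a missing data file, can be caught early with a small check. This is a minimal sketch. The helper find_data_file is hypothetical, and the assumption that data/ sits directly under the project root should be adjusted to match your checkout.

```python
from pathlib import Path


def find_data_file(project_root=".", filename="cumulative.csv"):
    """Return the path to data/<filename> under project_root.

    Raises FileNotFoundError with a hint if the file is not there.
    """
    path = Path(project_root) / "data" / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{filename} not found at {path.resolve()}. "
            "Download it from the NASA Exoplanet Archive and place it in data/."
        )
    return path


if __name__ == "__main__":
    try:
        print("Found data file:", find_data_file())
    except FileNotFoundError as err:
        print("Data check failed:", err)
```

Running this from the project root before starting a notebook gives a clearer message than a failure deep inside a training cell.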

Next Steps

After completing the tutorials:

  • Explore the API Reference for detailed class documentation
  • Read the Model Artifacts documentation for architecture details
  • Try the live Cerebrium API for production deployments
  • Contribute to the project on GitHub