Get Started
Welcome to Harness Chaos Engineering! This guide will help you set up your first chaos experiment and execute it on your target infrastructure in just a few minutes.
Prerequisites
- Harness Account with Chaos Engineering access: Sign up for free if you don't have one and ensure you have access to the Chaos Engineering module
- Target Infrastructure: A Kubernetes cluster with kubectl access, or a Linux machine with admin privileges
- Basic Permissions: Admin access to your target infrastructure for installing chaos agents
Create your first chaos experiment
Access Harness Chaos Engineering
- Sign up or log in to your Harness account
- Navigate to the Chaos Engineering module from the left sidebar
- Create a new project or ask your administrator to add you to an existing project
Create an Environment
A chaos experiment runs on a chaos infrastructure, and every infrastructure belongs to an environment, so start by creating the environment.
- Navigate to the Environments page and select New Environment
- Specify the environment name, description (optional), and tags (optional)
- Select the environment type: Production or Non-Production
- Select Create to add the new environment
Tip: You can also select an existing environment from the list, if one is available.
Set Up Chaos Infrastructure
After creating an environment, add an infrastructure to it:
For Kubernetes (Recommended for First Experiment)
- Select +New Infrastructure in your environment
- Choose Kubernetes as the infrastructure type
- Select installation mode:
  - Cluster-wide access: Target resources across all namespaces
  - Specific namespace access: Restrict chaos injection to a specific namespace
- Copy and run the provided installation command in your cluster:
# Example installation command (use the one provided in UI)
kubectl apply -f https://app.harness.io/chaos/delegate/manifest/...
- Wait for the infrastructure to show CONNECTED status
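While waiting for the CONNECTED status, you can also check the chaos agent's pods from the cluster side. The following is a minimal sketch, not an official Harness command: it assumes you know the namespace the infrastructure was installed into (the name `hce` in the comment is a placeholder), and it treats the installation as healthy when every pod in that namespace reports `Running` or `Completed`.

```shell
# Reads `kubectl get pods --no-headers` output on stdin.
# Succeeds only if at least one pod is listed and every pod's
# STATUS column (third field) is Running or Completed.
all_pods_running() {
  awk '
    { seen = 1 }
    $3 != "Running" && $3 != "Completed" { bad = 1 }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# Example usage (namespace name is an assumption; use your own):
#   kubectl get pods -n hce --no-headers | all_pods_running \
#     && echo "agent pods are up" \
#     || echo "agent pods are not ready yet"
```

An empty namespace counts as not ready, so the check does not pass before the manifest has been applied.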
For Linux
- Select +New Infrastructure and choose Linux
- Download and install the chaos agent:
# Download the agent (example; use the command provided in the UI)
curl -O https://app.harness.io/chaos/linux-agent
chmod +x linux-agent
# Install with your infrastructure ID and access key
sudo ./linux-agent --install --infra-id=<YOUR_INFRA_ID> --access-key=<YOUR_ACCESS_KEY>
Create Your First Chaos Experiment
Now let's create and run your first chaos experiment. We recommend starting with Pod Delete as it has a small blast radius and is safe for most applications.
Identify Your Target
- Identify the microservice in your application that you will target
- For Kubernetes, we'll delete a pod from your application
- Pod Delete is the simplest chaos fault, which makes it a good first experiment
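Before deleting a pod, it can help to confirm that the target workload has more than one ready replica, so that losing one pod keeps the blast radius small. The sketch below assumes the target pods share a label; `app=my-app` and `my-namespace` in the comment are placeholders, not names from your cluster.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints
# how many pods are Running with all of their containers ready
# (READY column of the form N/N).
count_ready_pods() {
  awk '
    $3 == "Running" {
      split($2, r, "/")
      if (r[1] == r[2]) n++
    }
    END { print n + 0 }
  '
}

# Example usage (label and namespace are placeholders):
#   kubectl get pods -n my-namespace -l app=my-app --no-headers | count_ready_pods
```

If the count is 1, deleting that pod will likely cause a brief outage until its replacement starts, which is worth knowing before the first run.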
Create the Experiment
- Navigate to Chaos Experiments and select New Experiment
- Choose Blank Canvas to create from scratch, or select a Template
- Configure your experiment:
  - Name: "My First Pod Delete Experiment"
  - Description: "Testing pod resilience"
  - Tags: Add relevant tags for organization
Add Chaos Fault
- In the experiment builder, select Add Fault
- Choose Kubernetes → Pod → Pod Delete
- Configure the fault:
  - Target Pods: Select specific pods or use label selectors
  - Chaos Duration: Start with 30 seconds
  - Force: Keep as false for graceful deletion
Add Resilience Probes (Recommended)
Probes validate your hypothesis during the experiment:
- Select Add Probe in your experiment
- Choose HTTP Probe to monitor application availability:
  - URL: Your application endpoint
  - Method: GET
  - Success Criteria: Response code 200
  - Run Properties: Execute during chaos
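To make the probe's job concrete, here is a rough shell approximation of the check described above: send a GET request to the endpoint repeatedly while chaos is running, and count every response that is not 200. This is only an illustration, not how Harness implements the probe; the URL, the function names, and the polling interval are all assumptions.

```shell
# Hypothetical application endpoint; replace with your own.
PROBE_URL="${PROBE_URL:-http://localhost:8080/health}"

# Print only the HTTP status code of a GET request ("000" if unreachable).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Succeed when the observed status ($1) equals the expected one ($2, default 200).
probe_passes() {
  [ "$1" = "${2:-200}" ]
}

# Poll the URL every <interval> seconds for <duration> seconds
# and print the number of failed checks.
run_probe() {
  url=$1; duration=$2; interval=$3
  failures=0; elapsed=0
  while [ "$elapsed" -lt "$duration" ]; do
    probe_passes "$(http_status "$url")" || failures=$((failures + 1))
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "$failures"
}

# Example usage: check every 2 seconds across a 30-second chaos window.
#   run_probe "$PROBE_URL" 30 2
```

A result of 0 corresponds to the hypothesis holding: the application kept answering with 200 while the pod was being deleted.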
Run Your First Experiment
- Review your experiment configuration
- Save the experiment
- Select Run to start the experiment
- Monitor the experiment execution in real time:
  - Watch the experiment timeline
  - Observe probe results
  - Check system metrics and logs
Analyze Results
After the experiment completes:
- Review the Resilience Score: Overall system resilience rating based on probe results
- Check Probe Results: Success/failure of health checks during chaos
- Examine Timeline: Detailed view of experiment execution phases
- View Logs: Detailed execution logs for troubleshooting
Understanding Results
- Passed Probes: Your application handled the chaos well
- Failed Probes: Areas that need improvement
- Resilience Score: Higher scores indicate better resilience
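As a rough illustration of how probe results could roll up into a single score, the sketch below computes a weighted average of per-fault probe success percentages. The weighted-average formula and the input format (one `weight percentage` pair per line) are assumptions made for this example; check the Harness documentation for the exact calculation used in your account.

```shell
# Reads lines of "<fault weight> <probe success percentage>" on stdin
# and prints the weighted average, rounded to a whole number.
# Prints "n/a" when there is no input.
resilience_score() {
  awk '
    { weighted += $1 * $2; total += $1 }
    END {
      if (total > 0) printf "%.0f\n", weighted / total
      else print "n/a"
    }
  '
}

# Example usage: one fault with weight 10 where all probes passed,
# and one with weight 5 where half of the probes passed.
#   printf '10 100\n5 50\n' | resilience_score
```

With a single fault, as in this guide, the score under this assumption is simply the percentage of probes that passed.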
Next Steps
Congratulations! You've successfully run your first chaos experiment. Here's what to explore next:
- Explore more chaos faults for different failure scenarios.
- Set up advanced probes for comprehensive monitoring.
- Organize GameDays for team chaos engineering events.
- Integrate with CI/CD to automate chaos testing in your pipelines.