Automated load testing framework for audit checks with Claude Code assistance.
- Access to k-repo repository
- S3 access to klaviyo-data-platform-orchestration-v1
- Airflow access for manual DAG triggers
- PyCharm (for reviewing config diffs)
cd ~/Klaviyo/Repos/audit-load-test
# Configure target repo path
cp .env.example .env
# Edit .env to set TARGET_REPO path
# Ensure k-repo is on the test branch
cd ~/Klaviyo/Repos/k-repo
git checkout main
git pull
git checkout -b audit_checks_load_testing
# Login to AWS
s2a-login

# Enter the audit load test repo
cd ~/Klaviyo/Repos/audit-load-test

# Log in to Claude
claude

First-time setup (if Claude doesn't know the context):
"Read README.md, start 2x load test"
This single command tells Claude to:
- Read this README to understand the entire workflow
- Begin the load test process
Tell Claude:
Starting tests:
- "Read README.md, start 2x load test" - Start 2x for 15m (default) - USE THIS FIRST TIME
- "Start 4x load test for hourly" - Start 4x for the hourly schedule
- "Start 8x load test for daily" - Start 8x for the daily schedule
During test execution:
- "I am going to run another round of test" - Captures a timestamp for round tracking
- "I just ran the hourly DAG" - Captures a timestamp for the hourly schedule test
After test completes:
- "Test is done, generate report" - Same cluster (Claude remembers)
- "Test is done, cluster id is j-XXXXX, generate report" - Different cluster
- "Same cluster, generate report" - Alternative for the same cluster
Progressing to next load level:
- "Commit results, start 4x load test" - Move from 2x to 4x (any schedule)
- "Commit results, start 8x load test for hourly" - Move to 8x for the hourly schedule
Note:
- Load multipliers (2x, 4x, 8x) work for any schedule (15m, hourly, daily)
- The Load Test Workflow section below explains the full test procedure, including how the human and Claude collaborate
- Example of a Claude-generated checks file for a 4x load test: https://github.com/klaviyo/k-repo/pull/19659
cd ~/Klaviyo/Repos/audit-load-test
s2a-login
# Ensure k-repo is on correct branch
cd ~/Klaviyo/Repos/k-repo
git status  # Should be on audit_checks_load_testing branch

Claude checkpoint: Verify k-repo is on the audit_checks_load_testing branch
Input: Load multiplier (e.g., 2x, 4x, 8x)
Claude will:
- Read .env to find TARGET_REPO and CHECKS_PATH
- Read existing configs from ${TARGET_REPO}/${CHECKS_PATH}/
- Modify configs in place with the load multiplier
- Prompt human to review changes
Config Location:
${TARGET_REPO}/python/klaviyo/data_platform/table_config_service/unified_table_definitions/checks/iceberg/
Modification Rules:
| Check Type | File Pattern | Modification | Example (2x) |
|---|---|---|---|
| 15m | *_15m.yml | Multiply INTERVAL 'X' hour by the load factor | INTERVAL '1' hour → INTERVAL '2' hour; INTERVAL '3' hour → INTERVAL '6' hour |
| Hourly | *_hourly.yml | NO CHANGES | Skip - uses Airflow runtime params (ts_start, ts_end) |
| Daily | *_daily.yml | Multiply INTERVAL 'X' day by the load factor | INTERVAL '1' day → INTERVAL '2' day; INTERVAL '31' day → INTERVAL '62' day |
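The multiplication rule in the table can be illustrated with a small regex transform. This is only a sketch of the rule, not the tooling Claude actually uses; `scale_intervals` is a hypothetical helper:

```python
import re

def scale_intervals(yaml_text: str, factor: int) -> str:
    """Multiply INTERVAL 'X' hour/day literals by the load factor.

    Sketch of the modification rule only; the real edits are made by
    Claude directly in the YAML files. Hourly configs contain no such
    literals, so they pass through unchanged.
    """
    def repl(m: re.Match) -> str:
        return f"INTERVAL '{int(m.group(1)) * factor}' {m.group(2)}"

    return re.sub(r"INTERVAL '(\d+)' (hour|day)", repl, yaml_text)

before = "where: created_at >= current_timestamp - INTERVAL '3' hour"
print(scale_intervals(before, 2))
# where: created_at >= current_timestamp - INTERVAL '6' hour
```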
Example Modifications (2x load):
# 15m check - BEFORE
where: created_at >= current_timestamp - INTERVAL '3' hour and created_at <= current_timestamp
# 15m check - AFTER (2x)
where: created_at >= current_timestamp - INTERVAL '6' hour and created_at <= current_timestamp

# Daily check - BEFORE
where: matched_at < current_timestamp - INTERVAL '31' day
# Daily check - AFTER (2x)
where: matched_at < current_timestamp - INTERVAL '62' day

Claude output:
✓ Modified 47 config files for 2x load
- 23 files in *_15m.yml
- 0 files in *_hourly.yml (skipped - uses runtime params)
- 24 files in *_daily.yml
Changes location: ${TARGET_REPO}/python/klaviyo/.../checks/iceberg/
→ HUMAN: Open k-repo in PyCharm and review git diff
→ Verify all interval multiplications are correct
→ Reply "approved" to continue, or tell me about any issues to fix
HUMAN CHECKPOINT:
- Open k-repo in PyCharm
- Review git diff for modified files
- Verify interval changes are correct
- Reply "approved" or report issues
If the configs are approved, Claude will run:

./scripts/upload_checks.sh 2x

Script behavior:
- Reads TARGET_REPO and CHECKS_PATH from .env
- Uploads modified configs from k-repo to S3:
  - *_15m.yml, *_15min.yml, *_freshness.yml → s3://.../checks_load_test/configs/15m/2x/
  - *_daily.yml → s3://.../checks_load_test/configs/daily/2x/
  - *_hourly.yml, *_uniqueness.yml → s3://.../checks_load_test/configs/hourly/2x/
- Copies *_table_metrics.yml from prod only to the hourly folders (1x, 2x, 4x)
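The file-pattern routing above can be sketched as a pure function. `s3_prefix` is a hypothetical helper that mirrors upload_checks.sh's mapping; it is not part of the repo:

```python
def s3_prefix(filename: str, load: str,
              bucket: str = "klaviyo-data-platform-orchestration-v1") -> str:
    """Map a check-config filename to its load-test S3 prefix.

    Hypothetical helper mirroring upload_checks.sh's routing rules.
    """
    if filename.endswith(("_15m.yml", "_15min.yml", "_freshness.yml")):
        schedule = "15m"
    elif filename.endswith(("_hourly.yml", "_uniqueness.yml")):
        schedule = "hourly"
    elif filename.endswith("_daily.yml"):
        schedule = "daily"
    else:
        raise ValueError(f"unrecognized check config: {filename}")
    return f"s3://{bucket}/checks_load_test/configs/{schedule}/{load}/"

print(s3_prefix("check_events_15m.yml", "2x"))
# s3://klaviyo-data-platform-orchestration-v1/checks_load_test/configs/15m/2x/
```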
./scripts/upload_dags.sh

Script behavior:
- Checks the current LOAD variable in yang_audit_load_test.py
- Prompts for a LOAD value (1, 2, 4, 8, etc.)
- This is CRITICAL: LOAD must match your config upload level
- LOAD = 2 → reads from s3://.../configs/{schedule}/2x/
- If you just press Enter, it keeps the current value
- Updates LOAD variable in the DAG file if changed
- Checks if s3://.../dags/datalake/audit_loadtest/ exists
- If it exists, prompts "Folder exists. Overwrite? (y/n)"
- Uploads from ./dags/:
  - yang_audit_load_test.py (with the updated LOAD variable)
  - cluster_configs/yang_audit_load_test_cluster_15m.json
  - cluster_configs/yang_audit_load_test_cluster_hourly.json
  - cluster_configs/yang_audit_load_test_cluster_daily.json
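The LOAD rewrite can be sketched as a one-line substitution. `set_load` is a hypothetical helper, assuming the DAG defines LOAD as a simple module-level assignment:

```python
import re

def set_load(dag_source: str, load: int) -> str:
    """Rewrite the module-level LOAD variable in the DAG source.

    Sketch of what upload_dags.sh does before uploading; assumes a
    plain `LOAD = <n>` assignment at the start of a line.
    """
    return re.sub(r"^LOAD\s*=\s*\d+", f"LOAD = {load}",
                  dag_source, count=1, flags=re.M)

src = "LOAD = 1\nSCHEDULE = '15m'\n"
print(set_load(src, 2))
# LOAD = 2
# SCHEDULE = '15m'
```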
HUMAN CHECKPOINT: Confirm both uploads succeeded
Actions:
- Open Airflow UI
- Find the DAG: yang_audit_load_test_15m (or _hourly, _daily)
- Trigger the DAG manually
- Wait for test to complete
After test completes, tell Claude:
Test is done
Or if using the same cluster from previous conversation:
Same cluster, generate report
Or provide a different cluster ID:
Cluster ID: j-XXXXXXXXXXXXX, generate report
Note: You don't need to tell Claude the load level - Claude already knows it from the previous steps.
Claude will:
- Monitor EMR cluster status using AWS CLI
- Poll until all steps complete
- Fetch step execution metadata:
- Step name
- Start timestamp
- End timestamp
- Elapsed time
- Status (SUCCESS/FAILED)
- Generate performance report
Report includes:
- Summary statistics (total steps, success/failure counts, total runtime)
- Top 20 slowest steps sorted by elapsed time
- Performance metrics (average, median, P95, P99)
- Failed steps with error details
- Optimization suggestions
Report saved to: ./reports/{load_level}_{timestamp}/report.md
If you need to run multiple test rounds on the same cluster, use timestamp filtering:
Before triggering new test round:
# Get current timestamp to mark the start of this round
./scripts/generate_report.py --current-timestamp
# Output: 2025-12-08T15:30:00.123456-05:00

Important: The timestamp includes the timezone offset (e.g., -05:00 for EST in December, -04:00 for EDT in summer). The script automatically uses your system's current timezone to match AWS EMR timestamps.
After test completes:
# Generate report only for steps created after the timestamp
./scripts/generate_report.py \
--cluster-id j-12LC3PZVYNXDJ \
--load 2x \
--after-timestamp "2025-12-08T15:30:00.123456-05:00"

Note: Use the complete timestamp with timezone exactly as output by --current-timestamp. This ensures the filtering matches AWS EMR's timestamp format.
This approach filters steps by creation time, ensuring you only see results from the current test round.
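The filtering boils down to comparing timezone-aware datetimes. A minimal sketch with made-up step records (the field names are assumptions, not the script's actual schema):

```python
from datetime import datetime, timezone, timedelta

# Hypothetical step creation timestamps (timezone-aware, like EMR's).
est = timezone(timedelta(hours=-5))
steps = [
    {"name": "round1_check", "created": datetime(2025, 12, 8, 14, 50, tzinfo=est)},
    {"name": "round2_check", "created": datetime(2025, 12, 8, 15, 45, tzinfo=est)},
]

# Marker captured before triggering the new round
# (the --current-timestamp output, offset included).
marker = datetime.fromisoformat("2025-12-08T15:30:00.123456-05:00")

current_round = [s["name"] for s in steps if s["created"] > marker]
print(current_round)  # ['round2_check']
```

Because both sides carry an offset, the comparison is unambiguous regardless of the machine's local timezone.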
Example Report:
# Load Test Report - 2x Load
Date: 2025-12-07 15:30:00
Cluster: j-2ABCDEFG12345
## Summary
- Total Steps: 150
- Successful: 148
- Failed: 2
- Total Runtime: 2h 34m 18s
## Top 20 Slowest Steps
| Rank | Step Name | Elapsed | Status | Start Time |
|------|-----------|---------|--------|------------|
| 1 | check_events_daily_2x | 45m 23s | SUCCESS | 15:30:05 |
| 2 | check_profiles_hourly_2x | 38m 12s | SUCCESS | 15:31:22 |
...
## Performance Metrics
- Average: 5m 12s
- Median (P50): 3m 45s
- P95: 32m 45s
- P99: 42m 18s
## Optimization Suggestions
1. check_events_daily (45m) - Consider increasing executor memory
2. check_profiles_hourly (38m) - Review join strategy
...

HUMAN CHECKPOINT: Review report → decide next action
Human: "The config for check_events_15m has wrong interval, should be 4 hours not 6"
Claude: I'll fix that issue.
Updating: ${TARGET_REPO}/.../check_events_15m.yml
Change: INTERVAL '6' hour → INTERVAL '4' hour
✓ Fixed
→ Review the change and reply "approved" to continue
Human: "Performance is slow. I'll update cluster config and rerun 2x"
Claude: Understood. Here's the iteration workflow:
1. Keep check configs as-is (already uploaded to S3 2x folder)
2. Update cluster configs in: ./dags/cluster_configs/*.json
3. Run: ./scripts/upload_dags.sh (re-upload with new cluster configs)
4. Trigger DAG again in Airflow (uses existing 2x check configs from S3)
5. Provide new cluster ID to Claude for monitoring
→ Ready to update cluster configs now?
Human: "2x test complete. Commit results and start 4x"
Claude: Recording 2x test results:
In k-repo (commit modified configs):
cd ${TARGET_REPO}
git add ./python/klaviyo/.../checks/
git commit -m "Load test 2x configs - 2025-12-07"
In audit-load-test repo (commit report):
cd ~/Klaviyo/Repos/audit-load-test
git add ./reports/2x_*
git commit -m "Load test 2x results - [key findings]"
✓ Ready for 4x load test
→ Proceed to generate 4x configs?
Human: "Revert configs to baseline"
Claude: Reverting configs in k-repo:
cd ${TARGET_REPO}
git checkout ./python/klaviyo/.../checks/iceberg/
✓ All configs reverted to baseline
→ Ready for new load test or other operations
s3://klaviyo-data-platform-orchestration-v1/
├── checks_load_test/
│ ├── configs/ # Check configs by load level
│ │ ├── 15m/
│ │ │ ├── 1x/ # Baseline
│ │ │ ├── 2x/
│ │ │ └── 4x/
│ │ ├── hourly/
│ │ │ ├── 1x/
│ │ │ ├── 2x/
│ │ │ └── 4x/ # ← includes *_table_metrics.yml
│ │ └── daily/
│ │ ├── 1x/
│ │ ├── 2x/
│ │ └── 4x/
│ └── results/ # EMR outputs (not used for timing)
│ └── ...
│
└── dags/datalake/audit_loadtest/ # DAG and cluster configs
├── yang_audit_load_test.py
├── cluster_configs/
│ ├── yang_audit_load_test_cluster_15m.json
│ ├── yang_audit_load_test_cluster_hourly.json
│ └── yang_audit_load_test_cluster_daily.json
audit-load-test/
├── README.md # This file
├── .env # Local config (not committed)
├── .env.example # Template
├── .gitignore
├── scripts/
│ ├── upload_checks.sh # Upload check configs to S3
│ └── upload_dags.sh # Upload DAG + cluster configs to S3
├── dags/
│ ├── yang_audit_load_test.py # Airflow DAG
│ └── cluster_configs/
│ ├── yang_audit_load_test_cluster_15m.json
│ ├── yang_audit_load_test_cluster_hourly.json
│ └── yang_audit_load_test_cluster_daily.json
└── reports/ # Generated reports
├── 2x_20251207_153000/
│ └── report.md
└── 4x_20251207_183000/
└── report.md
# Target repository path
TARGET_REPO=~/Klaviyo/Repos/k-repo
# Path to checks configs within target repo
CHECKS_PATH=python/klaviyo/data_platform/table_config_service/unified_table_definitions/checks/iceberg
# AWS S3 bucket
S3_BUCKET=klaviyo-data-platform-orchestration-v1
# Source for table_metrics configs
PROD_HOURLY_CHECKS=s3://klaviyo-data-platform-orchestration-v1/checks/configs/prod/hourly

# Make scripts executable
chmod +x ./scripts/*.sh
# Refresh AWS credentials
s2a-login

# Check current branch
cd ${TARGET_REPO}
git status
# Switch to test branch
git checkout audit_checks_load_testing
git pull

# Test AWS CLI access
aws s3 ls s3://klaviyo-data-platform-orchestration-v1/
# Check credentials
aws sts get-caller-identity

# Verify cluster ID format
aws emr list-clusters --active
# Check region
aws configure get region

Problem: Claude modified wrong intervals
Solution: Tell Claude the specific issue:
"Fix check_events_15m.yml: change INTERVAL '6' hour back to INTERVAL '4' hour"
- Always review git diff before uploading configs
- Commit after each successful load test for record keeping
- Use descriptive commit messages with key findings
- Don't merge test branches - keep them separate for load testing only
- Update cluster configs iteratively based on performance reports
- Use PyCharm for better diff visualization
- Keep reports in version control for historical comparison
- Test incrementally - start with 2x before jumping to 4x or higher
Q: Can I run multiple load levels in parallel?
A: No, run one load level at a time to get accurate performance measurements.
Q: Do I need to upload DAGs every time?
A: Only when cluster configs change. Check configs can be uploaded independently.
Q: What if I want to test 3x or 5x load?
A: Just tell Claude "start 3x load test" - it works with any multiplier.
Q: Can I revert configs without Claude?
A: Yes, just run git checkout . in k-repo to revert all changes.
Q: Where can I find audit test results?
A: Check CloudWatch Logs or S3 at s3://.../checks_load_test/results/
Q: How do I compare two load test results?
A: Ask Claude: "Compare reports from 2x and 4x load tests"
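A comparison like this can be sketched by taking ratios of the summary metrics from the two reports. The numbers below are illustrative, not real results:

```python
# Hypothetical summary metrics pulled from two generated reports
# (seconds; fabricated for illustration).
metrics_2x = {"avg_s": 312, "p95_s": 1965, "p99_s": 2538}
metrics_4x = {"avg_s": 540, "p95_s": 3120, "p99_s": 4410}

# Ratio > 2.0 for a 2x→4x jump would suggest worse-than-linear scaling.
comparison = {k: round(metrics_4x[k] / metrics_2x[k], 2) for k in metrics_2x}
print(comparison)
# {'avg_s': 1.73, 'p95_s': 1.59, 'p99_s': 1.74}
```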
- v1.0 (2025-12-07) - Initial framework with Claude Code integration