feat: Add recipe validation integ test for HP-ModelCustomization-RecipeValidator pipeline #5779

Closed
mollyheamazon wants to merge 3 commits into aws:master from mollyheamazon:feat/recipe-integ

@mollyheamazon
Contributor

Summary

Adds a pytest-based recipe validation test that will be invoked by the HP-ModelCustomization-RecipeValidator pipeline to validate that new/modified recipes in a private SageMaker Hub can be fetched, parsed, and used to instantiate the correct sagemaker.train Trainer class.

Design doc: https://tiny.amazon.com/mn08ehy8/quipubV4Desi

What does this change do?

When the RecipeValidator pipeline detects new or modified recipes, a CodeBuild project clones this repo and runs test_new_recipes_create_valid_trainers.

The test:

  1. Reads HYPERPOD_HUB_NAME from the environment (set by the pipeline's CodeBuildTrigger)
  2. Lists all models in the private hub via list_hub_contents
  3. For each model, parses the RecipeCollection and filters for FineTuning recipes
  4. Detects training type (SFT/DPO/RLAIF/RLVR) and LoRA vs full fine-tuning from the recipe name
  5. Instantiates the corresponding Trainer (SFTTrainer, DPOTrainer, RLAIFTrainer, RLVRTrainer)
  6. Collects all errors across all models and reports them in a single assertion
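
Steps 4 and 5 amount to a name-based dispatch. The following is a minimal sketch, not the code in this PR: the helper names `detect_training_type` and `is_lora`, the substring conventions, and the example recipe name are assumptions made for illustration, and the trainer classes are represented by their names as strings so the snippet runs without the SDK installed.

```python
# Hypothetical sketch: the real test may use different helpers and naming rules.
TRAINER_NAMES = {
    "SFT": "SFTTrainer",
    "DPO": "DPOTrainer",
    "RLAIF": "RLAIFTrainer",
    "RLVR": "RLVRTrainer",
}


def detect_training_type(recipe_name: str) -> str:
    """Return the first known training type found in the recipe name."""
    lowered = recipe_name.lower()
    for training_type in TRAINER_NAMES:
        if training_type.lower() in lowered:
            return training_type
    raise ValueError(f"Unsupported training type in recipe name: {recipe_name!r}")


def is_lora(recipe_name: str) -> bool:
    """Assume LoRA recipes carry 'lora' in their name; anything else is full fine-tuning."""
    return "lora" in recipe_name.lower()
```

Under these assumptions, `detect_training_type("llama-3-8b-dpo-lora")` returns `"DPO"`, and a name with no recognized technique raises `ValueError`, which the test can record as a validation error instead of silently skipping the recipe.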

If the test fails, the pipeline halts and recipes don't reach JumpStart.
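
The collect-then-assert behavior from step 6 can be sketched as below, so that a single pipeline failure lists every broken recipe at once. This is an illustrative outline under assumed names (`collect_failures`, `assert_no_failures`, and the `validate` callback are hypothetical), not the implementation in this PR.

```python
from typing import Callable, Iterable, List


def collect_failures(recipe_names: Iterable[str], validate: Callable[[str], None]) -> List[str]:
    """Run validate on every recipe and return one message per failure."""
    failures = []
    for name in recipe_names:
        try:
            validate(name)
        except Exception as exc:  # keep going so one bad recipe does not hide others
            failures.append(f"{name}: {type(exc).__name__}: {exc}")
    return failures


def assert_no_failures(failures: List[str]) -> None:
    """Fail once, listing every collected error in the assertion message."""
    assert not failures, f"{len(failures)} recipe(s) failed validation:\n" + "\n".join(failures)
```

Catching broadly inside the loop is a deliberate choice here: the goal is a complete report, and the single assertion at the end still fails the test if anything went wrong.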

Files changed

  • sagemaker-train/tests/integ/train/recipe_tests/__init__.py — new package
  • sagemaker-train/tests/integ/train/recipe_tests/test_recipe_validation.py — new test

Testing

Validated against SageMakerPublicHub in us-west-2 — test successfully iterated all models and validated fine-tuning recipes for gated Llama models (SFT, DPO, RLAIF, RLVR).
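
Iterating every model in a hub, as the validation run above did, typically means paging through `list_hub_contents`. The sketch below assumes a boto3-style SageMaker client is passed in; the PR may go through SDK wrappers instead, and the function names `list_hub_model_names` and `hub_name_from_env` are hypothetical.

```python
import os


def list_hub_model_names(sagemaker_client, hub_name: str) -> list:
    """Page through list_hub_contents and return every model name in the hub."""
    names = []
    kwargs = {"HubName": hub_name, "HubContentType": "Model"}
    while True:
        response = sagemaker_client.list_hub_contents(**kwargs)
        names.extend(s["HubContentName"] for s in response.get("HubContentSummaries", []))
        token = response.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token  # request the next page


def hub_name_from_env() -> str:
    """Read the hub name that the pipeline is expected to export."""
    hub_name = os.environ.get("HYPERPOD_HUB_NAME")
    if not hub_name:
        raise RuntimeError("HYPERPOD_HUB_NAME is not set")
    return hub_name
```

With boto3 installed and credentials configured, this could be called as `list_hub_model_names(boto3.client("sagemaker", region_name="us-west-2"), hub_name_from_env())`. Passing the client in as a parameter also makes the paging logic easy to exercise with a stub.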

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

mollyheamazon marked this pull request as ready for review on April 20, 2026, 21:34
training_type_enum = detect_lora_or_full(recipe_name)
trainer_class = TRAINER_MAPPING[training_type]

trainer = trainer_class(

Is it sufficient to check here that we can instantiate a trainer class? Could we also submit a test job and verify that interaction with smjobs/k8s will work?

We could use a small dummy dataset so that the job doesn't run for long but still verifies that the end-to-end customer interaction via PySDK works for new recipes.

Contributor Author

Instantiation-only is the right scope for this validation step — it catches the most likely breakages: schema mismatches, missing fields, and unsupported training types in the hub content fetch → recipe parsing → Trainer construction path.

Running real training jobs would require significant infrastructure changes to the validation account — GPU instance quotas, CreateTrainingJob permissions, per-technique dummy datasets, and cleanup logic, none of which exist today. We do already have e2e integ tests in the PySDK repo that submit real training jobs for a subset of recipes, so the full job path is partially covered. If we want broader e2e coverage for all new recipes, I'd suggest scoping that as a follow-up with its own infrastructure workstream.

Yes, we do want to test that the job can start and run, to verify the customer workflow before launch. Could you please add a note here as a follow-up task?

namannandan left a comment

Looks good. Please add a note that job submission should be added to the test as a follow-up task.

@mollyheamazon
Contributor Author

This test will be live in the HP-ModelCustomization-PySDKValidation package.
