feat: Add recipe validation integ test for HP-ModelCustomization-RecipeValidator pipeline #5779
mollyheamazon wants to merge 3 commits into aws:master
Conversation
```python
training_type_enum = detect_lora_or_full(recipe_name)
trainer_class = TRAINER_MAPPING[training_type_enum]

trainer = trainer_class(
```
Is it sufficient to check here that we can instantiate a trainer class? Could we also submit a test job and verify that interaction with smjobs/k8s will work?
We can potentially use a small/dummy dataset so that the job doesn't run for long but still verifies that the end-to-end customer interaction via PySDK will work for new recipes.
Instantiation-only is the right scope for this validation step — it catches the most likely breakages: schema mismatches, missing fields, and unsupported training types in the hub content fetch → recipe parsing → Trainer construction path.
Running real training jobs would require significant infrastructure changes to the validation account — GPU instance quotas, CreateTrainingJob permissions, per-technique dummy datasets, and cleanup logic, none of which exist today. We do already have e2e integ tests in the PySDK repo that submit real training jobs for a subset of recipes, so the full job path is partially covered. If we want broader e2e coverage for all new recipes, I'd suggest scoping that as a follow-up with its own infrastructure workstream.
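To make the scope of the instantiation-only check concrete, here is a minimal sketch of the validation path described above. The names `TRAINER_MAPPING`, `detect_lora_or_full`, and the trainer classes mirror the snippet under review, but the toy implementations here are hypothetical stand-ins, not the real `sagemaker.train` API:

```python
# Hypothetical sketch of instantiation-only recipe validation.
# All classes and the detection heuristic below are illustrative
# stand-ins; only the overall shape matches the code under review.
from enum import Enum


class TrainingType(Enum):
    LORA = "lora"
    FULL = "full"


class LoraTrainer:
    def __init__(self, recipe):
        self.recipe = recipe


class FullTrainer:
    def __init__(self, recipe):
        self.recipe = recipe


TRAINER_MAPPING = {TrainingType.LORA: LoraTrainer, TrainingType.FULL: FullTrainer}


def detect_lora_or_full(recipe_name: str) -> TrainingType:
    # Toy heuristic: classify by substring in the recipe name.
    return TrainingType.LORA if "lora" in recipe_name else TrainingType.FULL


def validate_recipe(recipe_name: str, recipe: dict):
    # Validation stops at construction: a schema mismatch, missing
    # field, or unsupported training type raises here and fails the test.
    training_type = detect_lora_or_full(recipe_name)
    trainer_class = TRAINER_MAPPING[training_type]
    return trainer_class(recipe)
```

The point of stopping at construction is that everything before `trainer_class(...)` returns exercises the fetch/parse/mapping path without needing GPU quotas or job-submission permissions.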
Yes, we do want to be able to test that the job is able to start/run to verify the customer workflow before launch. Could you please add a Note here as a follow up task?
namannandan left a comment
Looks good. Please add a note to include job submission as well to the test as a follow up task.
This test will live in the HP-ModelCustomization-PySDKValidation package.
Summary
Adds a pytest-based recipe validation test that will be invoked by the HP-ModelCustomization-RecipeValidator pipeline to validate that new/modified recipes in a private SageMaker Hub can be fetched, parsed, and used to instantiate the correct sagemaker.train Trainer class.
Design doc: https://tiny.amazon.com/mn08ehy8/quipubV4Desi
What does this change do?
When the RecipeValidator pipeline detects new or modified recipes, a CodeBuild project clones this repo and runs test_new_recipes_create_valid_trainers. The test fetches each new/modified recipe from the hub, parses it, and instantiates the corresponding sagemaker.train Trainer class.
If the test fails, the pipeline halts and recipes don't reach JumpStart.
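The fetch → parse → instantiate flow above could be sketched roughly as follows. This is an illustrative shape only, assuming a dict-backed stand-in for the hub client and a two-field recipe schema; the real test's hub access, schema, and trainer construction differ:

```python
# Hypothetical shape of test_new_recipes_create_valid_trainers.
# The in-memory "hub", the required-field list, and the training-type
# values are illustrative assumptions, not the real recipe schema.
import json


def fetch_recipe(hub, recipe_name):
    # Stand-in for the hub content fetch; returns the raw recipe document.
    return hub[recipe_name]


def parse_recipe(raw):
    # Parsing failures (malformed JSON, missing fields) surface here
    # and fail the test before any trainer is constructed.
    recipe = json.loads(raw)
    for field in ("model_id", "training_type"):
        if field not in recipe:
            raise ValueError(f"recipe missing required field: {field}")
    return recipe


def test_new_recipes_create_valid_trainers():
    hub = {
        "llama-3-sft": json.dumps({"model_id": "llama-3", "training_type": "full"}),
        "llama-3-lora": json.dumps({"model_id": "llama-3", "training_type": "lora"}),
    }
    for name in hub:
        recipe = parse_recipe(fetch_recipe(hub, name))
        # Any unexpected training type fails the run, halting the pipeline.
        assert recipe["training_type"] in ("lora", "full")
```

Because any assertion or exception fails the pytest run, a single bad recipe is enough to halt the pipeline before recipes reach JumpStart.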
Files changed
- sagemaker-train/tests/integ/train/recipe_tests/__init__.py — new package
- sagemaker-train/tests/integ/train/recipe_tests/test_recipe_validation.py — new test

Testing
Validated against SageMakerPublicHub in us-west-2 — test successfully iterated all models and validated fine-tuning recipes for gated Llama models (SFT, DPO, RLAIF, RLVR).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.