Summary
TranslateGemma 4B (`google/translategemma-4b-it`) is Google's translation-specific model built on the Gemma 3 4B architecture. It is a text-only model (no vision encoder is needed for inference), but gemma.cpp currently supports only the VLM variant of Gemma 3 4B.
Problem
When running TranslateGemma 4B with gemma.cpp:
- SBS conversion: `convert_from_safetensors.py` assumes the PaliGemma VLM format and requires vision-tower tensors that TranslateGemma doesn't have
- Loading: `ConfigGemma3_4B()` returns a VLM config with `vit_config.image_size=896`, causing a `Tensor enc_norm_bias is required but not found in file` error
- No LM-only dispatch: `ConfigGemma3_4B_LM()` exists in the code but is never used as the primary config
What We Did (Workarounds)
We successfully ran TranslateGemma on gemma.cpp with these changes:
1. Convert script modifications
- Skip `vision_tower.*` and `multi_modal_projector.*` tensors during loading
- Fix `vocab_size` (262144 instead of PaliGemma's 257152+64 trim)
- Add QK norm tensors (`query_norm`, `key_norm`) to the layer config as BF16
- Zero out `vit_config` in the SBS metadata before writing
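The script changes above can be sketched roughly as follows. This is a minimal, illustrative Python sketch, not the actual `convert_from_safetensors.py` code; the Hugging Face tensor names (`model.layers.N.self_attn.q_norm.weight`, etc.) and the helper names are assumptions for illustration:

```python
# Sketch of the text-only conversion workarounds (hypothetical helpers,
# not the real convert_from_safetensors.py implementation).

# Full Gemma 3 vocab; no PaliGemma-style 257152+64 trim.
TRANSLATEGEMMA_VOCAB_SIZE = 262144

# VLM-only tensor prefixes that TranslateGemma checkpoints do not ship.
SKIP_PREFIXES = ("vision_tower.", "multi_modal_projector.")


def filter_text_only_tensors(tensors):
    """Drop vision tensors so a text-only checkpoint converts cleanly."""
    return {
        name: t for name, t in tensors.items()
        if not name.startswith(SKIP_PREFIXES)
    }


def qk_norm_names(num_layers=34):
    """Per-layer QK norm tensor names to register as BF16 (assumed HF layout)."""
    names = []
    for i in range(num_layers):
        names.append(f"model.layers.{i}.self_attn.q_norm.weight")
        names.append(f"model.layers.{i}.self_attn.k_norm.weight")
    return names
```

With 34 layers this yields 68 QK norm tensors (two per layer), which matches the count we verified after conversion.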
2. C++ changes needed
- `configs.cc`: Dispatch `GEMMA3_4B` to `ConfigGemma3_4B_LM()` when no VIT tensors are present
- `tensor_info.cc`: Guard VIT tensor registration with `if (config.vit_config.image_size > 0)`
- `weights.h`: Conditional VIT `MatPtr` initialization
- `python/configs.cc`: Add the missing Gemma 3 model enums (`GEMMA3_1B`, `GEMMA3_4B`, `GEMMA3_12B`, `GEMMA3_27B`)
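The dispatch change in `configs.cc` boils down to the following logic, shown here as a language-neutral Python sketch; the string return values stand in for the actual C++ config functions, and the `vision_tower.` prefix check stands in for however the loader detects VIT tensors:

```python
def choose_gemma3_4b_config(tensor_names):
    """Pick the LM-only config when the checkpoint carries no VIT tensors.

    Mirrors the configs.cc workaround: dispatch GEMMA3_4B to
    ConfigGemma3_4B_LM() when vision-tower tensors are absent.
    The returned strings stand in for the real C++ config functions.
    """
    has_vit = any(n.startswith("vision_tower.") for n in tensor_names)
    return "ConfigGemma3_4B" if has_vit else "ConfigGemma3_4B_LM"
```

The same check generalizes to the other Gemma 3 sizes, which is what the feature request below asks for.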
3. Result
- TranslateGemma 4B runs successfully on CPU in the SFP 8-bit format
- 4.3 GB SBS file; translation works across 55+ languages
- All 34 layers + QK norms load correctly
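A quick way to sanity-check a converted file is to confirm that every layer's QK norm tensors made it in. This is a hypothetical Python sketch; the `layer_N.query_norm` naming here is illustrative, not gemma.cpp's actual SBS tensor scheme:

```python
def verify_qk_norms(tensor_names, num_layers=34):
    """Return (layer, kind) pairs whose QK norm tensor is missing.

    tensor_names: set of tensor names present in the converted file
    (illustrative naming, not gemma.cpp's real SBS scheme).
    """
    missing = []
    for i in range(num_layers):
        for kind in ("query_norm", "key_norm"):
            if f"layer_{i}.{kind}" not in tensor_names:
                missing.append((i, kind))
    return missing
```

An empty result means all 34 layers carry both norms, matching what we observed after the workarounds above.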
Feature Request
- Auto-detect LM-only vs. VLM: when VIT tensors are absent from the SBS file, use `ConfigGemma3_*_LM()` instead of the VLM config
- Update `convert_from_safetensors.py` to support text-only Gemma 3 models (not just PaliGemma)
- Add Gemma 3 4B/12B/27B to the Python enum in `python/configs.cc`
Environment
- gemma.cpp: latest main (April 2026)
- CPU: AMD EPYC (AVX-512 VNNI)
- OS: Ubuntu 24.04
- Model: `google/translategemma-4b-it`