
Support TranslateGemma 4B text-only inference (Gemma 3 4B LM-only, no VIT) #888

@Selami79

Description


Summary

TranslateGemma 4B (google/translategemma-4b-it) is Google's translation-specific model built on the Gemma 3 4B architecture. It is a text-only model (no vision encoder is needed for inference), but gemma.cpp currently supports only the VLM variant of Gemma 3 4B.

Problem

When running TranslateGemma 4B with gemma.cpp:

  1. SBS conversion: convert_from_safetensors.py assumes the PaliGemma VLM format and requires vision-tower tensors that TranslateGemma doesn't have
  2. Loading: ConfigGemma3_4B() returns a VLM config with vit_config.image_size=896, causing a "Tensor enc_norm_bias is required but not found in file" error
  3. No LM-only dispatch: ConfigGemma3_4B_LM() exists in the code but is never used as the primary config

What We Did (Workarounds)

We successfully ran TranslateGemma on gemma.cpp with these changes:

1. Convert script modifications

  • Skip vision_tower.* and multi_modal_projector.* tensors during loading
  • Fix vocab_size (262144, rather than PaliGemma's trimmed 257152+64)
  • Add QK norm tensors (query_norm, key_norm) to layer config as BF16
  • Zero out vit_config in SBS metadata before writing

2. C++ changes needed

  • configs.cc: Dispatch GEMMA3_4B to ConfigGemma3_4B_LM() when no VIT tensors present
  • tensor_info.cc: Guard VIT tensor registration with if (config.vit_config.image_size > 0)
  • weights.h: Conditional VIT MatPtr initialization
  • python/configs.cc: Add missing Gemma 3 model enums (GEMMA3_1B, GEMMA3_4B, GEMMA3_12B, GEMMA3_27B)

3. Result

  • TranslateGemma 4B runs successfully on CPU with SFP 8-bit format
  • 4.3 GB SBS file; translation works across 55+ languages
  • All 34 layers + QK norms correctly loaded

Feature Request

  1. Auto-detect LM-only vs VLM — when VIT tensors are absent in SBS, use ConfigGemma3_*_LM() instead of VLM config
  2. Update convert_from_safetensors.py to support text-only Gemma 3 models (not just PaliGemma)
  3. Add Gemma 3 4B/12B/27B to Python enum in python/configs.cc

Environment

  • gemma.cpp: latest main (April 2026)
  • CPU: AMD EPYC (AVX-512 VNNI)
  • OS: Ubuntu 24.04
  • Model: google/translategemma-4b-it
