Optimize Memory Usage by Using Exported Program Instead of PyTorch Module in Converter Signature #201

KennethanCeyer · 2024-09-06T09:03:39Z

Related Issue

OOM Error in Gemma 2 2B TFLite Conversion with Quantization on 80GB RAM #192

Overview

This change reduces memory usage during model conversion by having the converter signature hold the exported program instead of the PyTorch module.

Because the signature no longer needs the PyTorch module after it is defined, the model can be deleted and garbage-collected before the TFLite conversion starts, which reduces memory usage by approximately 7-11%. The conversion script then looks like this:

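import gc

import ai_edge_torch
from ai_edge_torch.generative.examples.gemma import gemma2
from ai_edge_torch.generative.quantize import quant_recipes
import torch

# checkpoint_path, kv_cache_max_len, prefill_seq_len, quantize and output_path
# are supplied by the surrounding conversion script.
pytorch_model = gemma2.build_2b_model(
    checkpoint_path, kv_cache_max_len=kv_cache_max_len
)
# Tensors used to trace the model graph during conversion.
prefill_tokens = torch.full((1, prefill_seq_len), 0, dtype=torch.long)
prefill_input_pos = torch.arange(0, prefill_seq_len)
decode_token = torch.tensor([[0]], dtype=torch.long)
decode_input_pos = torch.tensor([0], dtype=torch.int64)

quant_config = quant_recipes.full_int8_dynamic_recipe() if quantize else None
converter = (
    ai_edge_torch
    .signature("prefill", pytorch_model, (prefill_tokens, prefill_input_pos))
    .signature("decode", pytorch_model, (decode_token, decode_input_pos))
)

# Both signatures are defined, so drop the last reference to the PyTorch
# module and collect it before the memory-heavy conversion step.
del pytorch_model
gc.collect()

edge_model = converter.convert(quant_config=quant_config)
edge_model.export(output_path)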
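
To illustrate why `del pytorch_model` only helps once the signature stops holding the module, here is a minimal standard-library sketch. It does not use ai_edge_torch or PyTorch: `StandInModel`, `ModuleSignature`, `ExportedSignature` and `model_survives` are hypothetical names made up for this illustration, and the "export" step is just a placeholder dictionary. It only shows the reference-counting behavior the change relies on.

```python
import gc
import weakref


class StandInModel:
    """Hypothetical stand-in for a large PyTorch module."""

    def __init__(self, size: int) -> None:
        self.weights = bytearray(size)


class ModuleSignature:
    """Keeps a reference to the model itself until conversion time."""

    def __init__(self, name: str, model: StandInModel) -> None:
        self.name = name
        self.model = model


class ExportedSignature:
    """Keeps only an artifact derived from the model at definition time."""

    def __init__(self, name: str, model: StandInModel) -> None:
        self.name = name
        # Placeholder for an export step: record what conversion needs,
        # without keeping a reference to the model object.
        self.exported = {"num_bytes": len(model.weights)}


def model_survives(signature_cls) -> bool:
    """Return True if the model is still alive after `del` + gc.collect()."""
    model = StandInModel(1024)
    model_ref = weakref.ref(model)
    signature = signature_cls("prefill", model)
    del model
    gc.collect()
    alive = model_ref() is not None
    del signature  # The signature stayed alive up to this point.
    return alive


print(model_survives(ModuleSignature))    # True: the signature pins the model
print(model_survives(ExportedSignature))  # False: the model was released
```

In the real converter, how much memory is actually released depends on what the exported program shares with the original module, so this sketch should be read as an explanation of the mechanism, not as a measurement.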

KennethanCeyer added 3 commits September 6, 2024 16:52

feat: replace pytorch.nn.Module to ExportedProgram in the Signature

4ba7683

Merge branch 'google-ai-edge:main' into feat/perf-optimize-memory

1176a0f

lint: apply format

f0f1a66

KennethanCeyer requested a review from a team as a code owner September 6, 2024 09:03

chunnienc requested a review from majiddadashi September 6, 2024 16:04
