Skip to content

Meta-Llama-3 model text-generation example output is unexpected on 2 nodes  #1451

@aslanxie

Description

@aslanxie

System Info

deepspeed                 0.14.4+hpu.synapse.v1.18.0
optimum-habana            1.14.0

docker image: vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

  1. Setup 2 nodes for test
  2. Run text generation example
    python3 ../gaudi_spawn.py --hostfile hostfile --use_deepspeed --world_size 16 --master_port 29500 \ run_generation.py \ --model_name_or_path /data1/zhixue/Llama-3.1-70B-Instruct/ \ --bf16 \ --batch_size 1 \ --use_hpu_graphs --limit_hpu_graphs \ --max_new_tokens 512
  3. The generation output looks like:
    10.233.108.205: Input/outputs: 10.233.108.205: input 1: ('DeepSpeed is a machine learning framework',) 10.233.108.205: output 1: ('DeepSpeed is a machine learning framework!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!',)

Expected behavior

If test with model Llama-2-7b-hf, the output is below. I found the issue on the latest meta-llama-3 and meta-llama-3.1 with 2 nodes inference.
10.233.108.205: input 1: ('DeepSpeed is a machine learning framework',) 10.233.108.205: output 1: ('DeepSpeed is a machine learning framework for deep learning. It is designed to be fast and efficient, while also being easy to use. DeepSpeed is based on the TensorFlow framework, and it uses the TensorFlow Lite library to run on mobile devices.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is designed to be easy to use and to provide a high level of performance.\nDeepSpeed is a deep learning framework that is designed to be fast and efficient. It is based on the TensorFlow framework and uses the TensorFlow Lite library to run on mobile devices. DeepSpeed is',)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions