Skip to content
This repository was archived by the owner on Jul 9, 2025. It is now read-only.
This repository was archived by the owner on Jul 9, 2025. It is now read-only.

Segmentation fault when train the net #17

@Harvey-Mei

Description

@Harvey-Mei

python train.py --batch_size 24 --experiment_name shapenet-ldif
--model_directory $models --model_type "ldif"
--dataset_directory $dataset
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
INFO: Making dataset...
INFO: Optimized dataset detected at ./shapenet/optimized
INFO: Mapping...
INFO: is_invalid vs lower_coords: [24, 32, 1] vs [24, 32, 3]
INFO: Post-where lower_coords: [24, 32, 3]
INFO: is_invalid vs sdf coords: [24, 32, 1] vs [24, 32, 1]
INFO: In-out image summaries have been removed.
INFO: The 0-th GPU has 22390 MB free.
INFO: TensorFlow can use up to 93.1397945511389% of the total GPU memory.
INFO: Initializing variables...
INFO: No previous checkpoint detected, training from scratch.
Fatal Python error: Segmentation fault

Thread 0x00007fd78cff9700 (most recent call first):
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/threading.py", line 302 in wait
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/queue.py", line 170 in get
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/summary/writer/event_file_writer.py", line 159 in run
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/threading.py", line 932 in _bootstrap_inner
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/threading.py", line 890 in _bootstrap

Thread 0x00007fd9b5258340 (most recent call first):
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1441 in _call_tf_sessionrun
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1349 in _run_fn
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1365 in _do_call
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1358 in _do_run
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 1179 in _run
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/tensorflow_core/python/client/session.py", line 955 in run
File "train.py", line 263 in main
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/absl/app.py", line 258 in _run_main
File "/home/mayo/anaconda3/envs/tf-1.15/lib/python3.8/site-packages/absl/app.py", line 312 in run
File "train.py", line 283 in
./reproduce_shapenet_autoencoder.sh: line 50: 1295263 Segmentation fault (core dumped) python train.py --batch_size 24 --experiment_name shapenet-ldif --model_directory $models --model_type "ldif" --dataset_directory $dataset

I have generated the dataset from raw ShapnetCoreV1/03001627 models, by converting .obj file to .ply and then generating watertight .ply file using gaps tools. After that I used the command in the script named reproduce_shapenet_autoencoder.sh to make dataset, everything done successfully. But when I tried to train the net with the dataset, it failed and got the log showed above.

BTW, the enviroment with my computer: ubuntu20.04 with RTX3090, cuda version = 11.1, and I run the code on tensorflow-1.15.
Could you give me some advice for this issue?
Thank you!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions