- CUDA Toolkit (with cuBLAS)
- NVIDIA GPU
- g++ (host C++ compiler, invoked by nvcc for host code)
nvcc -o resnet_main resnet_main.cu -lcublas
./resnet_main
- This is a minimal demo: it loads no dataset or pretrained weights and runs on random values only.
- Only a single residual block is implemented for demonstration.
- Uses im2col for convolution and cuBLAS for matrix multiplication.
- Extend as needed for full ResNet.
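
The im2col-plus-matrix-multiplication approach mentioned in the notes above can be illustrated with a small host-only sketch. It is not the code in `resnet_main.cu`: the function names `im2col`, `matmul`, and `conv2d`, the single-image CHW layout, and the naive CPU loop standing in for the cuBLAS call are all assumptions made for illustration. The idea is that each receptive field is unrolled into one column, so the whole convolution becomes a single `weights x columns` product.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unroll one CHW image into a (C*K*K) x (out_h*out_w) row-major matrix.
// Positions that fall into the zero padding stay 0.
std::vector<float> im2col(const std::vector<float>& input,
                          int channels, int height, int width,
                          int kernel, int stride, int pad,
                          int& out_h, int& out_w) {
    assert(input.size() == static_cast<std::size_t>(channels) * height * width);
    out_h = (height + 2 * pad - kernel) / stride + 1;
    out_w = (width + 2 * pad - kernel) / stride + 1;
    const int rows = channels * kernel * kernel;
    const int cols = out_h * out_w;
    std::vector<float> col(static_cast<std::size_t>(rows) * cols, 0.0f);
    for (int c = 0; c < channels; ++c)
        for (int ky = 0; ky < kernel; ++ky)
            for (int kx = 0; kx < kernel; ++kx) {
                const int row = (c * kernel + ky) * kernel + kx;
                for (int oy = 0; oy < out_h; ++oy)
                    for (int ox = 0; ox < out_w; ++ox) {
                        const int iy = oy * stride - pad + ky;
                        const int ix = ox * stride - pad + kx;
                        if (iy >= 0 && iy < height && ix >= 0 && ix < width)
                            col[row * cols + oy * out_w + ox] =
                                input[(c * height + iy) * width + ix];
                    }
            }
    return col;
}

// Naive row-major GEMM, C (m x n) = A (m x k) * B (k x n).
// CPU stand-in for the matrix multiplication the demo hands to cuBLAS.
std::vector<float> matmul(const std::vector<float>& a,
                          const std::vector<float>& b,
                          int m, int k, int n) {
    std::vector<float> c(static_cast<std::size_t>(m) * n, 0.0f);
    for (int i = 0; i < m; ++i)
        for (int p = 0; p < k; ++p)
            for (int j = 0; j < n; ++j)
                c[i * n + j] += a[i * k + p] * b[p * n + j];
    return c;
}

// Convolution as one GEMM: weights are (out_c) x (in_c*K*K), row-major.
// The result is an out_c x out_h x out_w feature map (bias omitted).
std::vector<float> conv2d(const std::vector<float>& input,
                          const std::vector<float>& weights,
                          int in_c, int height, int width, int out_c,
                          int kernel, int stride, int pad,
                          int& out_h, int& out_w) {
    const int patch = in_c * kernel * kernel;
    assert(weights.size() == static_cast<std::size_t>(out_c) * patch);
    const std::vector<float> col =
        im2col(input, in_c, height, width, kernel, stride, pad, out_h, out_w);
    return matmul(weights, col, out_c, patch, out_h * out_w);
}
```

For a 3x3 input holding 1 through 9 and a 2x2 kernel of ones (stride 1, no padding), `conv2d` returns the four window sums 12, 16, 24, 28. When replacing `matmul` with `cublasSgemm`, keep in mind that cuBLAS assumes column-major storage, so row-major buffers like these are usually handled by swapping the operand order.
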
- core/: Tensor, im2col, cuBLAS wrappers
- layers/: Conv2D, BatchNorm, ReLU
- blocks/: BasicBlock (residual)
- models/: ResNet34
- utils/: Utilities
- main.cpp: Entry point
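
As a rough picture of how the `layers/` and `blocks/` pieces listed above fit together, the following host-only sketch shows a basic residual block's forward pass with an identity shortcut. It is an illustration under assumptions, not the code in `blocks/basic_block.cpp`: the `Tensor` and `Layer` aliases and the functions `batchnorm`, `relu`, and `basic_block_forward` are hypothetical names, tensors are flat CHW buffers, and each convolution-plus-batch-norm stage is passed in as a callable so the example stays self-contained.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Tensor = std::vector<float>;                   // flat CHW buffer (hypothetical alias)
using Layer = std::function<Tensor(const Tensor&)>;  // any stage mapping tensor to tensor

// Inference-mode batch normalization, applied per channel:
// y = gamma * (x - mean) / sqrt(var + eps) + beta
Tensor batchnorm(const Tensor& x, int channels,
                 const std::vector<float>& gamma, const std::vector<float>& beta,
                 const std::vector<float>& mean, const std::vector<float>& var,
                 float eps = 1e-5f) {
    assert(channels > 0 && x.size() % static_cast<std::size_t>(channels) == 0);
    const std::size_t spatial = x.size() / static_cast<std::size_t>(channels);
    Tensor y(x.size());
    for (int c = 0; c < channels; ++c) {
        const float scale = gamma[c] / std::sqrt(var[c] + eps);
        for (std::size_t i = 0; i < spatial; ++i) {
            const std::size_t idx = static_cast<std::size_t>(c) * spatial + i;
            y[idx] = scale * (x[idx] - mean[c]) + beta[c];
        }
    }
    return y;
}

// Element-wise ReLU: negative values are clamped to zero.
Tensor relu(Tensor x) {
    for (float& v : x) v = v > 0.0f ? v : 0.0f;
    return x;
}

// Basic block with identity shortcut:
// out = ReLU(stage2(ReLU(stage1(x))) + x)
// where each stage is assumed to be a convolution followed by batch norm
// that preserves the tensor shape.
Tensor basic_block_forward(const Tensor& x,
                           const Layer& conv_bn1, const Layer& conv_bn2) {
    Tensor out = relu(conv_bn1(x));
    out = conv_bn2(out);
    assert(out.size() == x.size());  // identity shortcut needs matching shapes
    for (std::size_t i = 0; i < out.size(); ++i) out[i] += x[i];
    return relu(std::move(out));
}
```

Blocks that change the channel count or spatial size would need a projection on the shortcut path (typically a strided 1x1 convolution), which this sketch leaves out.
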
nvcc -o resnet34 main.cpp \
core/tensor.cpp core/im2col.cu core/cublas_utils.cpp \
layers/conv2d.cpp layers/batchnorm.cpp layers/relu.cpp \
blocks/basic_block.cpp models/resnet34.cpp \
-lcublas
./resnet34
- All weights and the input are randomly initialized.
- Only the forward pass is implemented.
- The output is the first value of the final feature map.