Support gpu by xllgit · Pull Request #42 · asu-cactus/cactusdb

xllgit · 2024-03-19T01:43:53Z

This branch adds GPU support. It requires installing CUDA and switching from libtorch-cpu to libtorch-gpu.

lixi-zhou

The CUDA code looks good to me. Please see the comments for the library configuration, thank you.

lixi-zhou · 2024-03-19T17:10:05Z

In Makefile:

 endif

-NUM_THREADS ?= $(shell getconf _NPROCESSORS_CONF 2>/dev/null || echo 1)
+#NUM_THREADS ?= $(shell getconf _NPROCESSORS_CONF 2>/dev/null || echo 1)


Please discard the hard-coded NUM_THREADS; it should be set automatically.

lixi-zhou · 2024-03-19T17:10:57Z

In velox/experimental/gpu/tests/CMakeLists.txt:

 target_link_libraries(velox_gpu_hash_table_test Folly::folly gflags::gflags)
 set_target_properties(velox_gpu_hash_table_test PROPERTIES CUDA_ARCHITECTURES
-                                                           native)
+                                                           75)


Could you please clarify the reason for this change, as well as for the similar changes that follow?

lixi-zhou · 2024-03-19T17:19:24Z


In velox/ml_functions/CMakeLists.txt:

 find_package(Torch REQUIRED)
 find_package(xgboost REQUIRED)
+find_package(CUDA REQUIRED)


This may lead to a compilation error when compiling with a CPU-only option. I see that Velox provides a flag, VELOX_ENABLE_GPU. I think it would be better to tie this configuration code to that flag.
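One way the suggestion above could look is sketched below. It only guards the new `find_package(CUDA REQUIRED)` line with the `VELOX_ENABLE_GPU` flag named in the comment. The exact placement inside velox/ml_functions/CMakeLists.txt is an assumption, not the change that was eventually made.

```cmake
# Sketch only: VELOX_ENABLE_GPU is the Velox flag named in the review comment;
# where this block sits in the file is an assumption.
find_package(Torch REQUIRED)
find_package(xgboost REQUIRED)

if(VELOX_ENABLE_GPU)
  # Require CUDA only when a GPU build is requested, so a CPU-only
  # configuration does not stop at this line.
  find_package(CUDA REQUIRED)
endif()
```

With this guard, a configuration that leaves `VELOX_ENABLE_GPU` off never looks for CUDA. Any target that compiles `.cu` sources would need the same guard.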

lixi-zhou · 2024-03-19T17:21:48Z

In velox/ml_functions/tests/GPUFunctions.cu:

@@ -0,0 +1,53 @@
+#include "velox/ml_functions/gpufunctions.h"


Please move your .cu file to the ml_functions folder.

xllgit · 2024-03-19T21:17:31Z

Hi Lixi,

In the Makefile, I hard-coded NUM_THREADS because I ran into an out-of-memory problem when I used all the threads. I uploaded this file by mistake, and I can change it back.

 set_target_properties(velox_gpu_hash_table_test PROPERTIES CUDA_ARCHITECTURES
-                                                           native)
+                                                           75)

I changed "native" to "75" because CMake does not support "native" before version 3.24. The CMakeLists.txt does not need to change if you use CMake 3.24 or above. https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html

+find_package(CUDA REQUIRED)

For now, we use the 'use_gpu' flag and mix the CUDA code and the CPU code, so the CUDA package is necessary to compile the GPU function. If we want to compile with a CPU-only option, I think one way is to separate the matrix multiply class into two versions (CPU and GPU). Another way is to compile the GPU function first and provide it as a library.


lixi-zhou · 2024-03-19T23:17:08Z

> Hi Lixi, In the Makefile, I hard-coded NUM_THREADS because I ran into an out-of-memory problem when I used all the threads. I uploaded this file by mistake, and I can change it back.
>
>  set_target_properties(velox_gpu_hash_table_test PROPERTIES CUDA_ARCHITECTURES
> -                                                           native)
> +                                                           75)
>
> I changed "native" to "75" because CMake does not support "native" before version 3.24. The CMakeLists.txt does not need to change if you use CMake 3.24 or above. https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html

Thanks for the feedback. In Velox, I think we can leverage the code at CMakeLists.txt#L371 to set the CUDA architecture automatically.

> +find_package(CUDA REQUIRED)
>
> For now, we use the 'use_gpu' flag and mix the CUDA code and the CPU code, so the CUDA package is necessary to compile the GPU function. If we want to compile with a CPU-only option, I think one way is to separate the matrix multiply class into two versions (CPU and GPU). Another way is to compile the GPU function first and provide it as a library.

Yes, I agree with you. Let me think about this and get back to you later about how to resolve it more conveniently.
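As a rough illustration of the second option raised above (building the GPU function separately and providing it as a library), the sketch below puts the CUDA source in its own target and links it only in GPU builds. The target names `velox_ml_gpu_functions` and `velox_ml_functions` and the definition `VELOX_ML_USE_GPU` are hypothetical. The sketch also assumes that `velox_ml_functions` is an existing target and that `GPUFunctions.cu` sits next to this CMakeLists.txt.

```cmake
# Sketch only: target names and the VELOX_ML_USE_GPU definition are
# hypothetical; velox_ml_functions is assumed to be defined earlier.
if(VELOX_ENABLE_GPU)
  # CUDA sources can only be compiled once the CUDA language is enabled.
  enable_language(CUDA)

  # Build the GPU kernels as their own library.
  add_library(velox_ml_gpu_functions GPUFunctions.cu)

  # Let the CPU-side code detect the GPU build and link the kernels in.
  target_compile_definitions(velox_ml_functions PUBLIC VELOX_ML_USE_GPU)
  target_link_libraries(velox_ml_functions PUBLIC velox_ml_gpu_functions)
endif()
```

In a layout like this, the C++ code would wrap its GPU calls in `#ifdef VELOX_ML_USE_GPU`, so a CPU-only build never references CUDA symbols.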

xllgit · 2024-03-20T04:05:51Z

> Thanks for the feedback. In Velox, I think we can leverage the code at CMakeLists.txt#L371 to set the CUDA architecture automatically.

Yeah, but it did not work for me. CMake did not set the CUDA architecture, so I hard-coded it.
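The version constraint discussed in this thread could also be handled without editing the file for each CMake version. The sketch below uses `native` when the running CMake is 3.24 or newer and falls back to the hard-coded `75` otherwise. The variable name `GPU_TEST_CUDA_ARCH` is hypothetical, and this is not the approach used at CMakeLists.txt#L371.

```cmake
# Sketch only: GPU_TEST_CUDA_ARCH is a hypothetical variable name.
if(CMAKE_VERSION VERSION_GREATER_EQUAL "3.24")
  # CMake 3.24 and later accept "native" for CUDA_ARCHITECTURES.
  set(GPU_TEST_CUDA_ARCH native)
else()
  # Older CMake: fall back to the value hard-coded in this PR.
  set(GPU_TEST_CUDA_ARCH 75)
endif()

set_target_properties(velox_gpu_hash_table_test
                      PROPERTIES CUDA_ARCHITECTURES ${GPU_TEST_CUDA_ARCH})
```

The fallback value `75` targets compute capability 7.5 only, so anyone building on older CMake for a different GPU would still need to override it.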

xllgit added 2 commits March 14, 2024 18:27

Update for gpu matrix multiply support

27cba60

add gpu support for torchDNN_two_layers

4954a22

lixi-zhou reviewed Mar 19, 2024

xllgit wants to merge 2 commits into main from support_GPU
