Update CUDA runtime version to 8.0 #4
…y allocated memory per block for the computeGradient kernel
|
This is awesome, thanks for doing this. Let me look at your work here... BTW - why do you want to squash the commits before merging? |
|
Great work, and here are some concerns:
|
|
@bryancatanzaro I just thought these changes would be better represented in the log if they were presented as one single "Updated CUDA runtime to 8.0" rather than 16 separate "Updated <module1>", "Updated <module2>", etc. Most projects do squash before merging afaik, but it's up to the maintainer of the repo. EDIT: Changed 7.5 to 8.0 |
|
@hao-lh Correction from my side - this code is compatible with runtime version 8.0, not 7.5. I shall fix the title of the pull request. As for the scope for improvement, I thought this repo implemented everything that was discussed in "Efficient, High-Quality Image Contour Detection" by Catanzaro et al. Is there any specific optimization that you're referring to? |
|
@AdithyaBenny Most of Bryan's code was written more than five years ago. Since parallel computing and CUDA have been evolving rapidly in recent years, I was wondering whether there are methods that would give better performance. No offense intended toward Bryan's original algorithm or your work; I just want this code to run faster :) |
|
Hi @AdithyaBenny, I used the code you committed and still hit cudaErrorIllegalAddress. The error message is: CUDA error at lanczos.cu:217 code=77(cudaErrorIllegalAddress) "cudaMemcpy(devVector, d_aVectorQQ, nPixels * sizeof(float), cudaMemcpyDeviceToDevice)". Could you help me? I'm using a Titan X on Ubuntu 14.04, and I downloaded ACML 5.3.0. Thanks a lot. |
|
And here is the complete output: Eig 9 Tol 0.001000 Texton 1
Max time: 0.000406 seconds
|
I've migrated all API calls to the new CUDA SDK, and fixed the illegal memory access issue (#2). The program runs successfully on Ubuntu 14.04 with a Tesla K20 GPU. The run time is about 2 seconds when given the default Polynesia image.
For it to work, I had to create two symlinks: a file at ./lib/libblas.so, which points to /usr/lib/libblas.so.3 since BLAS wasn't being detected, and a directory at ./lib/acml, which points to the location of the uncompressed ACML package. I also had to set the dynamic library path so that ACML is detected, by adding export LD_LIBRARY_PATH="$HOME/acml5.3.1/ifort64/lib/:$LD_LIBRARY_PATH" to my bashrc.
Do remember to squash the commits before merging :)
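For anyone reproducing the symlink and library-path setup described above, here is a minimal sketch as a shell script. It assumes the commands are run from the repository root and that ACML was unpacked to $HOME/acml5.3.1 (the path used in the export line). The ACML_DIR variable is a hypothetical convenience introduced here, not something the build itself reads.

```shell
#!/bin/sh
# Assumption: ACML was unpacked to $HOME/acml5.3.1; override ACML_DIR otherwise.
# ACML_DIR is a helper variable for this sketch only.
ACML_DIR="${ACML_DIR:-$HOME/acml5.3.1}"

# Assumption: the current directory is the repository root.
mkdir -p lib

# Point ./lib/libblas.so at the system BLAS, which was not detected otherwise.
ln -sfn /usr/lib/libblas.so.3 lib/libblas.so

# Point ./lib/acml at the uncompressed ACML package.
ln -sfn "$ACML_DIR" lib/acml

# Let the dynamic loader find ACML for this shell session.
# Add this line to ~/.bashrc to make it persistent.
export LD_LIBRARY_PATH="$ACML_DIR/ifort64/lib/:$LD_LIBRARY_PATH"
```

The symlinks are created with -sfn so the script can be re-run without failing on existing links. If your distribution installs BLAS somewhere other than /usr/lib/libblas.so.3, adjust that target accordingly.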