So far, I've found that the version of rocm (6.0.2) in the official repositories is broken and out of date.
Trying to run Stable Diffusion with it leads to memory corruption and GPU resets.
So I've taken it upon myself (foolishly) to try to compile it. I found a useful template over at https://www.ixip.net/rocm/ which is a recent-ish collection of scripts for compiling ROCm on Slackware.
Using that as a basis, I tried compiling piece by piece, executing the scripts line by line (though using the latest git version rather than 6.0 or 6.1, as I wanted to see if I could somehow bisect), but I got stuck on the HIP compilation.
First, there's the hip_prof_gen.py script, which is badly documented. Does anyone know how to use that thing? This is what I do now, but it's wrong somewhere.
OPT_PROF_API=""
PROF_API_STR="$BASEDIR/09_hip/clr/hipamd/include/hip/amd_detail/hip_prof_str.h"
PROF_API_HDR="$BASEDIR/09_hip/hip/include/hip/hip_runtime_api.h"
PROF_API_SRC="$BASEDIR/09_hip/clr/hipamd/src"
PROF_API_LOG="$BASEDIR/09_hip/build/hip_prof_gen_log.txt"
PROF_API_STR_NEW="$BASEDIR/09_hip/clr/hipamd/include/hip/amd_detail/hip_prof_str.h.new"
# set(PROF_API_CMD "${PROF_API_GEN} -v ${OPT_PROF_API} ${PROF_API_HDR} ${PROF_API_SRC} ${PROF_API_STR} >${PROF_API_LOG}")
python $BASEDIR/09_hip/clr/hipamd/src/hip_prof_gen.py -v ${OPT_PROF_API} ${PROF_API_HDR} ${PROF_API_SRC} ${PROF_API_STR} ${PROF_API_STR_NEW} >${PROF_API_LOG}
This doesn't produce anything, so I'm still stuck with a circular reference in hip_prof_str.h. I get various warnings depending on how I call this thing, for example:
Warning: bad args: args_str: 'amd::Os::FileDesc fdesc,size_t fsize,const void** image,const std::vector<std::string>& device_names,std::vector<std::pair<const void* ,size_t>>& code_objs' arg_pair: 'std::vector<std::pair<const void* ', file '/home/aphid/src/rocm/09_hip/clr/hipamd/src/hip_code_object.cpp', line (498)
or this one
/home/aphid/src/rocm/09_hip/clr/hipamd/src/hip_prof_gen.py Warning: implementation not found: hipGetErrorString
/home/aphid/src/rocm/09_hip/clr/hipamd/src/hip_prof_gen.py Warning: 1 API calls missing in interception layer
In this case it does seem to output a new file. But building with the new file gets me the same problem as the old one, which is the reason I set out on all this trouble in the first place:
ninja: error: dependency cycle: hipamd/include/hip/amd_detail/hip_prof_str.h -> hipamd/include/hip/amd_detail/hip_prof_str.h
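Since the generator sometimes writes a file and sometimes doesn't, it may help to check that explicitly on each attempt instead of reading it off the build. Below is a minimal sketch; `run_and_check` is a hypothetical helper of my own, not part of ROCm. It assumes the warnings may go to stderr (the invocation above only redirects stdout), so it keeps both streams in the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of ROCm): run a command, keep stdout and
# stderr together in one log, and report whether the expected file exists
# and is non-empty afterwards.
# usage: run_and_check <expected_output> <log_file> <command> [args...]
run_and_check() {
    expected=$1
    log=$2
    shift 2
    "$@" >"$log" 2>&1
    status=$?
    if [ -s "$expected" ]; then
        echo "wrote $expected (exit status $status)"
    else
        echo "no output at $expected (exit status $status), see $log"
    fi
    return "$status"
}

# Example call with the variables from the post. It is commented out here
# because the paths only exist in that source tree:
# run_and_check "$PROF_API_STR_NEW" "$PROF_API_LOG" \
#     python "$BASEDIR/09_hip/clr/hipamd/src/hip_prof_gen.py" -v \
#     "$PROF_API_HDR" "$PROF_API_SRC" "$PROF_API_STR" "$PROF_API_STR_NEW"
```

The directory for the log file has to exist before the call, otherwise the redirection itself fails and nothing is run.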
By default, without any modifications, the build gets stuck trying to double-include/redefine a bunch of stuff from cmake, including really basic math functions like 'min' and 'max'. I've made a couple of patches at https://github.com/AphidGit/rocm_compile/tree/main to get around the redefinitions.
Last edited by Aphid_ARCH (2024-07-20 05:31:29)
Offline
So far, I've found that the version of rocm (6.0.2) in the official repositories is broken and out of date.
You can always try opencl-amd as an alternative ROCm library.
Offline
Oh, also, this:
pacman -Sy rocminfo
PATH=$PATH:/opt/rocm:/opt/rocm/bin
rocminfo
#> rocminfo: error while loading shared libraries: librocprofiler-register.so.0: cannot open shared object file: No such file or directory
# Okay, let's try installing that.
pkgfile librocprofiler-register.so.0
#>
# Crickets...
Out of the box it can't even work without hacky workarounds (compiling missing libraries yourself).
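For anyone hitting the same loader error, here is a small sketch for listing every unresolved library of a binary in one go, rather than finding them one failed launch at a time. `missing_libs` is a hypothetical helper name; it only filters the `not found` lines that `ldd` prints. The `pacman -F` call at the end is an alternative to `pkgfile` and needs the files database synced first with `pacman -Fy`.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print only the names
# of the shared libraries that the loader could not resolve.
# usage: ldd /path/to/binary | missing_libs
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Example (commented out, since it depends on the local installation):
# ldd "$(command -v rocminfo)" | missing_libs
#
# Search the synced repositories for a package that ships the library:
# pacman -F librocprofiler-register.so.0
```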
Offline
Oh, also, this:
You probably have extra-staging repository enabled which has updated rocminfo to the latest version.
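A quick way to check this, as a rough sketch: list the repository sections that are not commented out in pacman.conf. `list_repos` is a hypothetical helper, and it assumes the default config location unless a path is given; it does not follow `Include` lines.

```shell
#!/bin/sh
# Hypothetical helper: print the names of the uncommented [section] headers
# in a pacman.conf, skipping [options], i.e. the enabled repositories.
# usage: list_repos [path/to/pacman.conf]
list_repos() {
    conf=${1:-/etc/pacman.conf}
    sed -n 's/^[[:space:]]*\[\([^]]*\)\][[:space:]]*$/\1/p' "$conf" |
        grep -v '^options$'
}

# Example (commented out, depends on the local system):
# list_repos
```

`pacman -Si rocminfo` also shows a Repository field, which tells which repository the package would be installed from.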
Last edited by Luciddream (2024-08-16 06:45:00)
Offline