This changes the number of available hardware threads reported to the user to match the size of the OpenMP team.
It is needed in order for OpenMP applications to work on GPUs with DeviceRTL MPI.
Also add getMaxTeamWarps, which returns the number of active warps/wavefronts available in the current team/workgroup/thread block (taking SPMD mode into account).
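A minimal sketch of how such a warp count could be derived, not the code in this patch. It assumes that in generic (non-SPMD) mode one warp is reserved for the main thread and is therefore not counted, and that the block holds at least one warp in that mode. The names `sketchMaxTeamThreads` and `sketchMaxTeamWarps` and their parameters are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: threads usable by the team.
// Assumption: generic mode sets aside one warp for the main thread.
std::uint32_t sketchMaxTeamThreads(std::uint32_t blockSize,
                                   std::uint32_t warpSize, bool isSPMD) {
  return isSPMD ? blockSize : blockSize - warpSize;
}

// Hypothetical helper: warps/wavefronts needed to hold those threads,
// rounding up so a partially filled warp still counts.
std::uint32_t sketchMaxTeamWarps(std::uint32_t blockSize,
                                 std::uint32_t warpSize, bool isSPMD) {
  std::uint32_t threads = sketchMaxTeamThreads(blockSize, warpSize, isSPMD);
  return (threads + warpSize - 1) / warpSize;
}
```

With a block of 256 threads and a wavefront size of 64, this sketch gives 4 in SPMD mode and 3 in generic mode.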
Try to add support for vprintf using RPC without implementing variadic arguments.
Use a builtin to transform fprintf calls into vfprintf calls, then replace each vfprintf call with a call to __omp_vprintf.
This is the method currently used to make printf work using RPCs.
fprintf is not recognized as a builtin for now, so it does not work.
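For illustration only, the relationship between fprintf and vfprintf that the rewrite relies on can be sketched in host C++. The patch performs this transformation in the compiler because the device side lacks variadic support. Here `sketch_vprintf` is a hypothetical stand-in for `__omp_vprintf` and simply calls the host `vfprintf` instead of issuing an RPC.

```cpp
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-in for the non-variadic entry point. The real
// __omp_vprintf would forward the request to the host over RPC.
int sketch_vprintf(std::FILE *stream, const char *format, std::va_list args) {
  return std::vfprintf(stream, format, args);
}

// Variadic front end: bundle the arguments into a va_list and forward
// them, which is the fprintf -> vfprintf step described above.
int sketch_fprintf(std::FILE *stream, const char *format, ...) {
  std::va_list args;
  va_start(args, format);
  int written = sketch_vprintf(stream, format, args);
  va_end(args);
  return written;
}
```

The only function that needs real work is the `va_list` one, so a single non-variadic entry point is enough to serve every printf-style call.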
Cast constants declared in address space 1 to address space 0.
This is needed for RPC calls, because a function call
cannot take a pointer in address space 1 as an argument.
Change the Offload HSA queue to use HSA_QUEUE_COOPERATIVE instead of HSA_QUEUE_MULTI.
This is needed in order to get fair scheduling between teams/workgroups/thread blocks.
(This does not appear to work as specified in HSA-SysArch-1.2.)
- add missing headers in rpc.h.def
- add an opcode in libc rpc to handle gpu first host function calls
- fix pointer casting
- fix generated function to account for AMD address space
- remove duplicate libc FILE declarations
- remove global variable to allow asynchronous rpc calls
Summary:
This patch adds new extensions that allow us to export the RPC interface
from `libc` to other programs. This should allow external users of the
GPU `libc` to interface with the RPC client (which more or less behaves
like syscalls in this context). This is done by wrapping the interface
into a C-style function call.
Obviously, this approach is far less safe than the carefully crafted C++
interface. For example, we now expose the internal packet buffer, and it
is theoretically possible to open a single port with conflicting opcodes
and break the whole interface. So, extra care will be needed when
interacting with this. However, the usage is very similar as shown by
the new test.
This somewhat stretches the concept of `libc` just doing `libc` things,
but I think this is important enough to justify it. It is difficult to
split this out, as we do not want to have the situation where we have
multiple RPC clients running at one time, so for now it makes sense to
just leave it in `libc`.
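As a rough illustration of the wrapping technique only, the following sketch exposes a C++ object through C-style functions and an opaque handle. None of these names come from `libc`: `sketch::Port`, `sketch_port_t`, and the `sketch_*` functions are hypothetical, and the real RPC interface differs. It does show the trade-off mentioned above: the caller gets raw access to the packet buffer with no type safety.

```cpp
#include <cstdint>

namespace sketch {
// Hypothetical C++ side: a port bound to one opcode with a small
// fixed-size packet buffer.
class Port {
public:
  explicit Port(std::uint16_t opcode) : opcode_(opcode) {}
  std::uint16_t opcode() const { return opcode_; }
  std::uint64_t *buffer() { return data_; }

private:
  std::uint16_t opcode_;
  std::uint64_t data_[8] = {};
};
} // namespace sketch

extern "C" {
// Opaque handle handed to C callers; the layout of Port stays hidden.
struct sketch_port_t {
  void *handle;
};

sketch_port_t sketch_open_port(std::uint16_t opcode) {
  return {new sketch::Port(opcode)};
}

std::uint16_t sketch_port_opcode(sketch_port_t port) {
  return static_cast<sketch::Port *>(port.handle)->opcode();
}

// Exposes the internal buffer directly, which is where safety is lost.
std::uint64_t *sketch_port_buffer(sketch_port_t port) {
  return static_cast<sketch::Port *>(port.handle)->buffer();
}

void sketch_close_port(sketch_port_t port) {
  delete static_cast<sketch::Port *>(port.handle);
}
}
```

A caller would open a port, read or write the buffer returned by `sketch_port_buffer`, and then close the port; nothing stops it from misusing the buffer, which is why extra care is needed.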
```
%1:gr64 = COPY $eflags
OP1 may update eflags
$eflags = COPY %1
OP2 may use eflags
```
To use eflags as an input at the 4th instruction, we need to use SETcc to
preserve eflags before the 2nd instruction, and update the source condition
of OP2 according to the value in GPR %1.
In this patch, we support CCMP/CTEST as OP2.
```
.irp foo,1
nop
.endr
nop
```
expands with an excess EOL between the two nop lines. Other loop directives
and .macro have the same issue.
`Lex()` at "Jump to the macro instantiation and prime the lexer"
requires that there is one single \n token in CurTok. Therefore, we
cannot consume the trailing \n when parsing the macro(-like) body.
(commit c6e787f771 (reverted by
1e5f29af81))
Instead, skip the potential \n after jumpToLoc at handleMacroExit.
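The effect can be modelled with a toy string-based sketch. This is not the parser code: `expandLoop` and its parameters are hypothetical, and the model only assumes that the input remaining after `.endr` starts with the newline that ended that line.

```cpp
#include <string>

// Toy model: emit `body` `count` times, then continue with `rest`, the
// input that follows the `.endr` line. `rest` starts with the newline
// that terminated `.endr`. When `skipNewlineOnExit` is set, that one
// newline is dropped on exit, mirroring the fix described above.
std::string expandLoop(const std::string &body, int count, std::string rest,
                       bool skipNewlineOnExit) {
  std::string out;
  for (int i = 0; i < count; ++i)
    out += body;
  if (skipNewlineOnExit && !rest.empty() && rest[0] == '\n')
    rest.erase(0, 1);
  return out + rest;
}
```

In this model, `expandLoop("nop\n", 1, "\nnop\n", false)` leaves a blank line between the two nop lines, while passing `true` produces the two nop lines back to back.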
The purpose of this PR is to remove the 'etime-function.mlir' file that
I mistakenly committed in
https://github.com/llvm/llvm-project/pull/92571. This file is not
necessary in source code control, and its presence may cause confusion
or misunderstanding.