WiP: resolved some minor errors that raised from the initial clone of the branch #45

Draft
wants to merge 33 commits into
base: feature/openacc-library-routines
33 commits
ca701ff
resolved minor errors in forked repo
monoatamd Nov 4, 2022
1fad76b
additional examples to test acc rtlib
monoatamd Nov 4, 2022
08076e0
calling gpufortrt instead of acc
monoatamd Nov 7, 2022
2929de3
deleted parts not compliant with gpufort
monoatamd Nov 7, 2022
c040997
fixed issue with acc_copy and gpufortrt_copy
monoatamd Nov 7, 2022
1706325
Fixed incorrect templates for copy
monoatamd Nov 7, 2022
3af85dc
acc_is_present impl. with type(*) and dimension
monoatamd Nov 8, 2022
59d4293
avoided camelCase & added implicit none explicitly
monoatamd Nov 8, 2022
af7087e
minor update
monoatamd Nov 8, 2022
a69f450
fixed renaming conflict
monoatamd Nov 8, 2022
aa4c16a
added list of openacc rtlib routines
monoatamd Nov 9, 2022
611c40f
Reverted frontend chngs, added impl. for 4 apis
monoatamd Nov 14, 2022
cfd4e14
added more high priority APIs
monoatamd Nov 16, 2022
423d1ac
added more APIs
monoatamd Nov 16, 2022
486acd3
added acc_get_property
monoatamd Nov 17, 2022
081b5ec
added wait and async APIs
monoatamd Nov 17, 2022
e92324d
making APIs look identical to acc
monoatamd Nov 18, 2022
750bda9
added copyin and copyout
monoatamd Nov 18, 2022
0797f5f
Used sizeof instead of size to get num of bytes
monoatamd Nov 21, 2022
aa63276
added set and get default async
monoatamd Nov 21, 2022
497afdf
added acc_create and test programs
monoatamd Nov 21, 2022
927a09c
added acc_delete and a test program
monoatamd Nov 21, 2022
71a809f
added acc_update/device and a test program
monoatamd Nov 21, 2022
9da7378
added deviceptr and updated implementation_status
monoatamd Nov 22, 2022
bec8050
added acc_get_num_devices and a test program
monoatamd Nov 22, 2022
0562abc
added gpufortrt_is_present and added test programs
monoatamd Nov 23, 2022
26587db
LOG_ERROR updated in gpufortrt_is_present
monoatamd Nov 23, 2022
84fb3fe
added acc_malloc and a test program
monoatamd Nov 23, 2022
487f985
updated acc_malloc & test program, added acc_free
monoatamd Nov 24, 2022
93abafa
added acc_map_data and a test program
monoatamd Nov 25, 2022
3cad2b6
Remved templats & macros. use_device needs revison
monoatamd Dec 6, 2022
246a629
added use_device
monoatamd Dec 8, 2022
871c6b5
removed optional attrib. from non-optional args.
monoatamd Dec 15, 2022
1 change: 1 addition & 0 deletions .gitignore
@@ -13,3 +13,4 @@ gpufort.h
gpufort_reductions.h
render*.template.*
render.py.in
+.vscode/settings.json
1 change: 1 addition & 0 deletions python/gpufort/scanner/parser.py
@@ -629,6 +629,7 @@ def is_end_statement_(tokens, kind):
current_linemap["statements"]):
try:
expand_statement_functions_(current_statement)
+original_statement = current_statement["body"]
Collaborator:
I won't accept frontend changes for now, as I am currently changing the frontend on a separate development branch.
I suggest keeping a local copy and not including this change in this PR.

Author:
Please feel free to reject the changes. I have local copies.

original_statement_lower = current_statement["body"].lower()
util.logging.log_debug4(opts.log_prefix,"parse_file","parsing statement '{}' associated with lines [{},{}]".format(original_statement_lower.rstrip(),\
current_linemap["lineno"],current_linemap["lineno"]+len(current_linemap["lines"])-1))
1 change: 0 additions & 1 deletion python/gpufort/util/parsing/parsing.py
@@ -1942,7 +1942,6 @@ def is_declaration(tokens):
def is_blank_line(statement):
return not len(statement.strip())


def is_fortran_directive(statement,modern_fortran):
"""If the statement is a directive."""
return len(statement) > 2 and (modern_fortran and statement.lstrip()[0:2] == "!$"
4 changes: 2 additions & 2 deletions runtime/gpufortrt/Makefile
@@ -31,8 +31,8 @@ $(CXX_OBJ): %.cpp.o: %.cpp
$(F_OBJ): %.o: %.f90
$(FC) -c $< $(FCFLAGS)

-codegen:
-python3 codegen.py src/gpufortrt_api.template.f90 -d 7
+# codegen:
+# python3 codegen.py src/gpufortrt_api.template.f90 -d 7

clean_all:
rm -f *.o *.mod *.a
26 changes: 23 additions & 3 deletions runtime/gpufortrt/include/gpufortrt_api.h
@@ -9,13 +9,18 @@ extern "C" {

void gpufortrt_set_device_num(int dev_num);
int gpufortrt_get_device_num();

// Explicit Fortran interfaces that assume device number starts from 1
void gpufortrt_set_device_num_f(int dev_num);
int gpufortrt_get_device_num_f();

size_t gpufortrt_get_property(int dev_num,
gpufortrt_device_property_t property);
const
char* gpufortrt_get_property_string(int dev_num,
gpufortrt_device_property_t property);

size_t gpufortrt_get_property_f(int dev_num, gpufortrt_device_property_t property);
const char* gpufortrt_get_property_string_f(int dev_num, gpufortrt_device_property_t property);

void gpufortrt_init();
void gpufortrt_shutdown();

@@ -84,6 +89,10 @@ extern "C" {
void* gpufortrt_present(
void* hostptr,
std::size_t num_bytes);

bool gpufortrt_is_present(
void* hostptr,
std::size_t num_bytes);

void* gpufortrt_create(
void* hostptr,
@@ -140,13 +149,24 @@ extern "C" {
bool if_arg);
void gpufortrt_wait_all_async(int* async_arg,int num_async,
bool if_arg);

void gpufortrt_wait_device(int* wait_arg, int num_wait,
int dev_num, bool if_arg);
void gpufortrt_wait_device_async(int* wait_arg, int num_wait,
int* async_arg, int num_async,
int dev_num, bool if_arg);
void gpufortrt_wait_all_device(int dev_num, bool if_arg);
void gpufortrt_wait_all_device_async(int* async_arg, int num_async,
int dev_num, bool if_arg);
int gpufortrt_async_test(int wait_arg);
int gpufortrt_async_test_device(int wait_arg, int dev_num);
int gpufortrt_async_test_all(void);
int gpufortrt_async_test_all_device(int dev_num);

gpufortrt_queue_t gpufortrt_get_stream(int async_arg);
void* gpufortrt_malloc(size_t bytes);
void gpufortrt_free(void* data_dev);
void gpufortrt_map_data(void* data_arg, void* data_dev,
size_t bytes);

/** \return device pointer associated with `hostptr`, or nullptr.
* First searches through the structured region stack and then
2 changes: 1 addition & 1 deletion runtime/gpufortrt/include/gpufortrt_types.h
@@ -12,7 +12,7 @@ extern "C" {
extern int gpufortrt_async_sync;
extern int gpufortrt_async_default;

-enum gpufortrt_property_t {
+enum gpufortrt_device_property_t {
gpufortrt_property_memory = 0,//>integer, size of device memory in bytes
gpufortrt_property_free_memory,//>integer, free device memory in bytes
gpufortrt_property_shared_memory_support,//>integer, nonzero if the specified device supports sharing memory with the local thread
36 changes: 22 additions & 14 deletions runtime/gpufortrt/include/openacc.h
@@ -2,10 +2,11 @@
// Copyright (c) 2020-2022 Advanced Micro Devices, Inc. All rights reserved.
#ifndef OPENACC_LIB_H
#define OPENACC_LIB_H
-#include "gpufortrt_types.h"
#ifdef __cplusplus
extern "C" {
#endif
+#include "gpufortrt_types.h"


extern int acc_async_noval;
extern int acc_async_sync;
@@ -16,18 +17,17 @@ extern int acc_async_default;

/** \note Enum values assigned according to `acc_set_device_num` description.*/
enum acc_device_t {
-acc_device_default = -1,
-acc_device_all = 0,
-acc_device_none = 1,
+acc_device_none = 0,
+acc_device_default,
acc_device_host,
-acc_device_current,
acc_device_not_host,
-acc_device_hip = acc_device_not_host,
-acc_device_radeon = acc_device_hip,
-acc_device_nvidia = acc_device_hip
+acc_device_current,
+acc_device_hip,
+acc_device_radeon,
+acc_device_nvidia
};

-enum acc_property_t {
+enum acc_device_property_t {
acc_property_memory = 0,//>integer, size of device memory in bytes
acc_property_free_memory,//>integer, free device memory in bytes
acc_property_shared_memory_support,//>integer, nonzero if the specified device supports sharing memory with the local thread
@@ -51,7 +51,9 @@ void acc_set_device_type(acc_device_t dev_type);
acc_device_t acc_get_device_type(void);

void acc_set_device_num(int dev_num, acc_device_t dev_type);
+void acc_set_device_num_f(int dev_num, acc_device_t dev_type);
int acc_get_device_num(acc_device_t dev_type);
+int acc_get_device_num_f(acc_device_t dev_type);

size_t acc_get_property(int dev_num,
acc_device_t dev_type,
@@ -61,7 +63,14 @@ char* acc_get_property_string(int dev_num,
acc_device_t dev_type,
acc_device_property_t property);

-void acc_init(acc_on_device_t dev_type);
+size_t acc_get_property_f(int dev_num,
+acc_device_t dev_type,
+acc_device_property_t property);
+const
+char* acc_get_property_string_f(int dev_num,
+acc_device_t dev_type,
+acc_device_property_t property);
+void acc_init(acc_device_t dev_type);
void acc_shutdown(acc_device_t dev_type);

int acc_async_test(int wait_arg);
@@ -72,9 +81,8 @@ int acc_async_test_all_device(int dev_num);
void acc_wait(int wait_arg);
void acc_wait_device(int wait_arg, int dev_num);
void acc_wait_async(int wait_arg, int async_arg);
-void acc_wait_device_async(int wait_arg, int async_arg,
-int dev_num);
-void acc_wait_all(void);
+void acc_wait_device_async(int wait_arg, int async_arg, int dev_num);
+void acc_wait_all();
void acc_wait_all_device(int dev_num);
void acc_wait_all_async(int async_arg);
void acc_wait_all_device_async(int async_arg, int dev_num);
@@ -141,7 +149,7 @@ void acc_memcpy_from_device(h_void* data_host_dest,
d_void* data_dev_src, size_t bytes);
void acc_memcpy_from_device_async(h_void* data_host_dest,
d_void* data_dev_src, size_t bytes,
-int async_arg)
+int async_arg);

void acc_attach(h_void** ptr_addr);
void acc_attach_async(h_void** ptr_addr, int async_arg);
57 changes: 57 additions & 0 deletions runtime/gpufortrt/openacc_library_routines.md
@@ -0,0 +1,57 @@
---
geometry: margin=2cm
---

# Implemented API

| API | Lang\* | OpenACC | GPUFORTRT\*\* | Priority\*\*\* |
|-----|--------|---------|-------------|----------|
|acc\_get\_num\_devices|C/C++, Fortran|implemented|implemented|high|
|acc\_set\_device\_type|C/C++, Fortran|implemented|implemented|high|
|acc\_get\_device\_type|C/C++, Fortran|implemented|implemented|high|
|acc\_set\_device\_num|C/C++, Fortran|implemented|implemented||
|acc\_get\_device\_num|C/C++, Fortran|implemented|implemented||
|acc\_get\_property|C/C++, Fortran|implemented|implemented||
|acc\_init|C/C++, Fortran|implemented|implemented||
|acc\_shutdown|C/C++, Fortran|implemented|implemented||
|acc\_async\_test|C/C++, Fortran|implemented|implemented||
|acc\_async\_test\_device|C/C++, Fortran|implemented|implemented||
|acc\_async\_test\_all|C/C++, Fortran|implemented|implemented||
|acc\_async\_test\_all\_device|C/C++, Fortran|implemented|implemented||
|acc\_wait|C/C++, Fortran|implemented|implemented||
|acc\_wait\_device|C/C++, Fortran|implemented|implemented|high|
|acc\_wait\_async|C/C++, Fortran|implemented|implemented||
|acc\_wait\_device\_async|C/C++, Fortran|implemented|implemented|high|
|acc\_wait\_all|C/C++, Fortran|implemented|implemented||
|acc\_wait\_all\_device|C/C++, Fortran|implemented|implemented|high|
|acc\_wait\_all\_async|C/C++, Fortran|implemented|implemented||
|acc\_wait\_all\_device\_async|C/C++, Fortran|implemented|implemented|high|
|acc\_get\_default\_async|C/C++, Fortran|implemented|implemented||
|acc\_set\_default\_async|C/C++, Fortran|implemented|implemented||
|acc\_on\_device||||low|
|acc\_malloc||||low|
|acc\_free||||low|
|acc\_copyin|C/C++, Fortran|implemented|implemented||
|acc\_create|C/C++, Fortran|implemented|implemented||
|acc\_copyout|C/C++, Fortran|implemented|implemented||
|acc\_delete|C/C++, Fortran|implemented|implemented||
|acc\_update\_device|C/C++, Fortran|implemented|implemented||
|acc\_update\_self|C/C++, Fortran|implemented|implemented||
|acc\_map\_data||||low|
|acc\_unmap\_data||||low|
|acc\_deviceptr|C/C++||implemented||
|acc\_hostptr|C/C++|||low|
|acc\_is\_present|||implemented||
|acc\_memcpy\_to\_device||||low|
|acc\_memcpy\_from\_device||||low|
|acc\_memcpy\_device||||low|
|acc\_attach||||low|
|acc\_detach||||low|
|acc\_memcpy\_d2d||||low|

Remarks:

* \* While the OpenACC standard exposes some APIs only to C, `GPUFORTRT` may also expose some of these C interfaces to Fortran. An \* indicates that `GPUFORTRT` exposes the feature to Fortran although the OpenACC standard does not require it.
* \*\* `GPUFORTRT` signatures are prefixed by `gpufortrt_` instead of `acc_`, and the number and meaning of
  arguments may differ from the OpenACC signature.
* \*\*\* Current priorities for implementing missing APIs. This column will disappear as soon as all are implemented.
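
The second remark can be illustrated with `acc_wait_device`. The sketch below is only an assumption about how such a wrapper might forward to the runtime. Both signatures are taken from the headers in this PR, but the wrapper body, the `last_*` recording globals, and the stub body of `gpufortrt_wait_device` are hypothetical and exist only so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping so the forwarded arguments can be inspected. */
int last_num_wait = -1;
int last_wait_arg = -1;
int last_dev_num = -1;
bool last_if_arg = false;

/* Stub with the signature declared in gpufortrt_api.h.
 * The real routine would synchronize queues; this stand-in
 * only records the arguments it receives. */
void gpufortrt_wait_device(int* wait_arg, int num_wait,
                           int dev_num, bool if_arg) {
  assert(num_wait == 0 || wait_arg != NULL);
  last_num_wait = num_wait;
  last_wait_arg = (num_wait > 0) ? wait_arg[0] : -1;
  last_dev_num = dev_num;
  last_if_arg = if_arg;
}

/* Assumed wrapper (not the PR's implementation): the OpenACC routine
 * takes a single queue id, while the runtime routine takes an array of
 * queue ids, its length, and an `if` condition. */
void acc_wait_device(int wait_arg, int dev_num) {
  gpufortrt_wait_device(&wait_arg, 1, dev_num, true);
}
```

In this sketch, calling `acc_wait_device(3, 0)` hands the stub a one-element wait list, the unchanged device number, and `if_arg = true`.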
2 changes: 1 addition & 1 deletion runtime/gpufortrt/rules.mk
@@ -2,7 +2,7 @@ HIP_PLATFORM ?= amd
LIBGPUFORTRT = libgpufortrt_$(HIP_PLATFORM).a

FC = gfortran -fmax-errors=5
-FCFLAGS ?= -std=f2008 -ffree-line-length-none -cpp
+FCFLAGS ?= -ffree-line-length-none -cpp

#FCFLAGS += -g -ggdb -O0 -fbacktrace -fmax-errors=5 # -DDEBUG=3
