Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI improvements #51

Draft
wants to merge 110 commits into
base: master
Choose a base branch
from
Draft

CI improvements #51

wants to merge 110 commits into from

Conversation

janheinrichmerker
Copy link
Contributor

@janheinrichmerker janheinrichmerker commented Mar 11, 2025

@janheinrichmerker janheinrichmerker added the enhancement New feature or request label Mar 11, 2025
@janheinrichmerker janheinrichmerker self-assigned this Mar 11, 2025
@janheinrichmerker janheinrichmerker marked this pull request as draft March 11, 2025 01:20
@janheinrichmerker janheinrichmerker linked an issue Mar 11, 2025 that may be closed by this pull request
Copy link

codecov bot commented Mar 11, 2025

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 78.27%. Comparing base (7e33d4a) to head (9649f0c).
Report is 56 commits behind head on master.

Files with missing lines Patch % Lines
...y/src/main/kotlin/io/tira/tirex/tracker/Tracker.kt 0.00% 0 Missing and 1 partial ⚠️
python/tirex_tracker/__init__.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master      #51      +/-   ##
============================================
- Coverage     78.36%   78.27%   -0.09%     
  Complexity       20       20              
============================================
  Files             3        3              
  Lines          1151     1151              
  Branches         53       54       +1     
============================================
- Hits            902      901       -1     
  Misses          208      208              
- Partials         41       42       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@TheMrSheldon TheMrSheldon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the changes so far! Though this is still a draft PR, I have reviewed some parts of it to give early feedback on some concerns I have.

Note also that the switch to C++20 is quite drastic and I believe that, the code should at least not run on MSVC (even with the configuration errors ammended that I pointed out in my review). Some language features that are used here were introduced in C++20. E.g., designated initializers which are used in various places). I know that g++ and clang++ are quite leanient with these but some compilers (AFAIK especially MSVC) are not and we can't claim C++17 compatibility yet.

for designated initializers I suggest that we simply comment out the variable names to keep readability intact.

Comment on lines +93 to +94
FetchContent_Declare(doxawesome GIT_REPOSITORY https://github.com/jothepro/doxygen-awesome-css GIT_TAG v2.3.4)
FetchContent_MakeAvailable(doxawesome)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

${doxawesome_SOURCE_DIR} above is populated with FetchContent. As such, FetchContent must precede it.

Comment on lines +68 to +77
set(CPACK_GENERATOR "DEB" CACHE STRING "" FORCE)
set(CPACK_PACKAGE_NAME "tirex-tracker" CACHE STRING "" FORCE)
set(CPACK_SET_DESTDIR TRUE CACHE PATH "" FORCE)
set(CPACK_DEBIAN_PACKAGE_MAINTAINER "Tim Hagen" CACHE STRING "" FORCE)
set(CPACK_PACKAGE_VERSION "${TIREX_TRACKER_VERSION}" CACHE STRING "" FORCE)
set(CPACK_DEBIAN_PACKAGE_DESCRIPTION "Automatic resource and metadata tracking for information retrieval experiments." CACHE STRING "" FORCE)
set(CPACK_RESOURCE_FILE_README "${CMAKE_CURRENT_SOURCE_DIR}/../../README.md" CACHE PATH "" FORCE)
set(CPACK_RESOURCE_FILE_LICENSE "${CMAKE_CURRENT_SOURCE_DIR}/../../LICENSE" CACHE PATH "" FORCE)
set(CPACK_DEBIAN_PACKAGE_DEPENDS "" CACHE STRING "" FORCE)
set(CPACK_PACKAGE_VENDOR "" CACHE STRING "" FORCE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be checked but I don't think they need to be forced and caching them should also not be necessary.

if(NOT TIREX_TRACKER_ONLY_DOCS AND (CMAKE_PROJECT_NAME STREQUAL PROJECT_NAME) AND TIREX_TRACKER_BUILD_DEB)
# Include helper module.
include(CPack)
include(GNUInstallDirs) # FIXME: Is this needed?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this used?

# The Debian Packaging build target is only available if this is the root project
if(NOT TIREX_TRACKER_ONLY_DOCS AND (CMAKE_PROJECT_NAME STREQUAL PROJECT_NAME) AND TIREX_TRACKER_BUILD_DEB)
# Include helper module.
include(CPack)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The include must come after the configuration. See here:

Before including this CPack module in your CMakeLists.txt file, there are a variety of variables that can be set to customize the resulting installers.

##################
# The Debian Packaging build target is only available if this is the root project
if(NOT TIREX_TRACKER_ONLY_DOCS AND (CMAKE_PROJECT_NAME STREQUAL PROJECT_NAME) AND TIREX_TRACKER_BUILD_DEB)
# Include helper module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment seems redundant to me. It does not really explain anything that is not directly obvious by the line below.

@@ -37,8 +36,8 @@ struct tirexMeasureHandle_st final {
monitorthread.join();

// Stop measuring
for (auto& provider : providers | std::views::reverse)
provider->stop();
for (std::size_t i = providers.size() - 1; providers.size() > i; --i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally prefer a reverse iterator directly instead of relying on a random access container.

Comment on lines +81 to +86
[key, &result](auto&& arg) {
using T = std::decay_t<decltype(arg)>;
if constexpr (std::is_same_v<T, std::string>)
result.emplace_back(key, std::move(arg));
else if constexpr (std::is_same_v<T, tirex::TimeSeries<unsigned>>)
result.emplace_back(key, toYAML(arg));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the advantage over overloading. Especially since the type is then again checked within the function. This doubles the number of conditions per call and looks entirely odd, wheras overloading is a standard design pattern. If overloading is not preferred, the visit should be made explicit via get_if.

struct overloaded : Ts... {
using Ts::operator()...;
};
static std::string toYAML(const tirex::TimeSeries<unsigned>& timeseries) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this only work for tirex::TimeSeries<unsigned>?

Comment on lines +44 to +68
std::stringstream timestampsStream;
timestampsStream << "[";
bool firstTImestamp = true;
for (auto& elem : timestamps) {
if (!firstTImestamp)
timestampsStream << ",";
timestampsStream << "\"";
timestampsStream << elem.count();
timestampsStream << "\"";
firstTImestamp = false;
}
timestampsStream << "]";

template <class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;
std::stringstream valuesStream;
valuesStream << "[";
bool firstValue = true;
for (auto& elem : values) {
if (!firstValue)
valuesStream << ",";
valuesStream << "\"";
valuesStream << elem;
valuesStream << "\"";
firstValue = false;
}
valuesStream << "]";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be split into its own function that is called once for timestamps and once for values.

Comment on lines +429 to +446
if (unified) {
if (!first)
cachesStream << ",";
cachesStream << _fmt::format("\"l{}\": {}", cacheIdx, unified);
first = false;
}
if (instruct) {
if (!first)
cachesStream << ",";
cachesStream << _fmt::format("\"l{}i\": {}", cacheIdx, instruct);
first = false;
}
if (data) {
if (!first)
cachesStream << ",";
cachesStream << _fmt::format("\"l{}d\": {}", cacheIdx, data);
first = false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To remove all the conditions, I would preferr a delim_seq that is initially "" and the set to ",". Maybe something like this:

Suggested change
if (unified) {
if (!first)
cachesStream << ",";
cachesStream << _fmt::format("\"l{}\": {}", cacheIdx, unified);
first = false;
}
if (instruct) {
if (!first)
cachesStream << ",";
cachesStream << _fmt::format("\"l{}i\": {}", cacheIdx, instruct);
first = false;
}
if (data) {
if (!first)
cachesStream << ",";
cachesStream << _fmt::format("\"l{}d\": {}", cacheIdx, data);
first = false;
}
if (unified) {
cachesStream << delim << _fmt::format("\"l{}\": {}", cacheIdx, unified);
delim = ",";
}
if (instruct) {
cachesStream << delim << _fmt::format("\"l{}i\": {}", cacheIdx, instruct);
delim = ","
}
if (data) {
cachesStream << delim << _fmt::format("\"l{}d\": {}", cacheIdx, data);
delim = ",";
}

_fmt::format may also not be necessary anymore since we use stringstream here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important consideration: Note that ir_metadata always gives string values where the unit is appended seoarated by a space (see for example for RAM here). Our previous extension of ir_metadata respected this design decision and adhered to it. It would be removed with the changes proposed here.

Comment on lines +359 to +366
bool isBigEndian() {
// Store 0x0001 in memory.
uint16_t word = 1;
// Get a pointer to the first byte of the word.
uint8_t* firstByte = (uint8_t*)&word;
// Check if the first byte is zero.
return !(*firstByte);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will continue the discussion from 25d4ea6#r153677577 here so that it is more visible. The proper way in C++ would be bit_cast(https://en.cppreference.com/w/cpp/numeric/bit_cast) though this is a C++20 feature. reinterpret_cast is typically used here though it would not be standard conformant though as far as casts are concered, I would say it is the closest one that is semantically correct without using bit_cast. union is also not standard compliant as only one member of a union may be active at any one time and the rest is UB. So the cleanest solution for a runtime check outside of C++20 would be writing an own bit_cast using memcpy. But endianness should be known at compile time (since constants are also written to the binary) so that the endianness macros of typical compilers can be used (in the end, std::endian only unifies this interface). Such an implementation is given here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Build Matrix as discussed in #34
2 participants