Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimal CCM integration #1225

Merged
merged 13 commits into from
Feb 11, 2025

Conversation

Lorak-mmk
Copy link
Collaborator

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

dkropachev and others added 12 commits February 11, 2025 14:21
Cleanup unused dependancies
It was released over 6 months ago so it is ok to use it.
It allows us to use LazyCell.
This module exposes a simple abstraction that allows executing a
command and logs its arguments, environment, output and exit code
using tracing.

It will be used to run ccm commands.

Co-authored-by: Dmitry Kropachev <[email protected]>
In order to programatically manipulate by scylla.yaml and cassandra.yaml
We need to have some struct that provides API for that.
It also needs to represent it in the way that ccm updateconf understand.

Co-authored-by: Dmitry Kropachev <[email protected]>
This cfg will be put on all tests that use CCM. This is because we want
to avoid running them in normal workflow, but Cargo doesn't really have
any way to filter out a single integration test module.
CCM is cli tool that manages Scylla / Cassandra clusters.
Using it in integration suit will help to expand test coverage.

Co-authored-by: Dmitry Kropachev <[email protected]>
Co-authored-by: Mikołaj Uzarski <[email protected]>
Co-authored-by: Dmitry Kropachev <[email protected]>
For now it will run on each PR. If at some point it becomes too slow
we can switch it to running manually and before release.
Few things happen in this commit
1. Move NetPrefix from cluster.rs to ip_allocator.rs

2. Introduce IpAllocator (with LocalSubnetIdentifier)
LocalSubnetIdentifier is a utility wrapper over two ip octets, by which
we identify a local subnet (127.x.y.0/24). It can be converted to Ipv4Addr
and vice-versa. For now, we only support ipv4 (even though NetPrefix)
supports ipv6 as well. This may change in the future.

IpAllocator does exactly what we would expect - it keeps track of the
ip address pool and allows to allocate/free addresses (subnet ids) from this pool.
Upon construction, it scans the /proc/net/tcp file to find out which subnets
are used before the tests start. This way, the pool is initialized.

We introduce a global IpAllocator for test purposes. It's fine, because
IpAllocator is thread-safe. The only assumption we make, is that the ip pool
is exclusive to the tests when they run. This is true for CI, and for most
local use cases. There are some niche cases which break this assumption,
but we do not care about it now. This solution works just fine for hackathon purposes.

I introduced two new unit tests for Ipv4Addr and LocalSubnetIdentifier conversions.

3. Adjust cluster.rs to the allocator.
The `alloc_ip_prefix` is called where NetPrefix::sniff_ip_prefix was originally called.

The `free_ip_prefix` is called in `Cluster` destructor.
Copy link

github-actions bot commented Feb 11, 2025

cargo semver-checks found no API-breaking changes in this PR! 🎉🥳
Checked commit: 729a898

Copy link
Contributor

@muzarski muzarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When removing the node_config.rs in the last commit, we forgot to remove serde_yaml dev-dependency. IDK if it matters that much.

NodeConfig-related functionalities were not yet used anywhere, and we
are in hurry during the hackathon. Let's give others the ccm framework
without NodeConfig and add NodeConfig later if it appears to be useful.
@Lorak-mmk
Copy link
Collaborator Author

When removing the node_config.rs in the last commit, we forgot to remove serde_yaml dev-dependency. IDK if it matters that much.

Done

Comment on lines +64 to +67
tracing::info!("{:15} -> status = {}", format!("exited[{}]", run_id), code);
}
None => {
tracing::info!("{:15} -> status = unknown", format!("exited[{}]", run_id));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format is needless here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tracing::info!("{:15} -> status = {}", format!("exited[{}]", run_id), code);
}
None => {
tracing::info!("{:15} -> status = unknown", format!("exited[{}]", run_id));
tracing::info!("{:15} -> status = {}", format_args!("exited[{}]", run_id), code);
}
None => {
tracing::info!("{:15} -> status = unknown", format_args!("exited[{}]", run_id));

Comment on lines +72 to +77
return Err(Error::msg(format!(
"Command `{}` failed: {}, stderr: \n{}",
command_with_args,
status,
std::str::from_utf8(tmp)?
)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ About usage of ?: do we really want to propagate the str::from_utf8 conversion error instead of returning the error that the command failed?

Err(e) => {
tracing::info!(
"{:15} -> failed to wait on child process: = {}",
format!("exited[{}]", run_id),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_args!

Comment on lines +94 to +101
command: C,
args: A,
opts: RunOptions,
) -> Result<ExitStatus, Error>
where
A: IntoIterator<Item = B> + Clone,
B: AsRef<OsStr> + Clone,
C: AsRef<OsStr> + Clone,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Let's use better names for type parameters. A, B, C don't say much. I suggest Args, Arg, Cmd.

Comment on lines +131 to +135
tracing::info!(
"{:15} -> {}",
format!("started[{}]", run_id),
command_with_args,
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_args!

let mut stderr: Vec<u8> = Vec::new();
let stderr_task = Self::stream_reader(
child.stderr.take().expect("Failed to capture stderr"),
format!("{:15} -> ", format!("stderr[{}]", run_id)),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inner format! -> format_args!

Comment on lines +161 to +173
match buffer {
Some(buffer) => {
while let Some(line) = lines.next_line().await.ok().flatten() {
tracing::debug!("{} {}", prefix, line);
buffer.extend_from_slice(line.as_bytes());
}
}
None => {
while let Some(line) = lines.next_line().await.ok().flatten() {
tracing::debug!("{} {}", prefix, line);
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 match should be inside while let, not outside, to avoid duplication.

Comment on lines +15 to +16
pub(crate) static CLUSTER_VERSION: LazyLock<String> =
LazyLock::new(|| std::env::var("SCYLLA_TEST_CLUSTER").unwrap_or("release:6.2.2".to_string()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Let's extract DEFAULT_SCYLLADB_RELEASE = release:6.2.2, for easier updates in the future.

Comment on lines +19 to +20
std::env::var("TEST_KEEP_CLUSTER_ON_FAILURE")
.unwrap_or("".to_string())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::env::var("TEST_KEEP_CLUSTER_ON_FAILURE")
.unwrap_or("".to_string())
std::env::var("TEST_KEEP_CLUSTER_ON_FAILURE")
.as_deref()
.unwrap_or("")

pub(crate) static CLUSTER_VERSION: LazyLock<String> =
LazyLock::new(|| std::env::var("SCYLLA_TEST_CLUSTER").unwrap_or("release:6.2.2".to_string()));

static TEST_KEEP_CLUSTER_ON_FAILURE: LazyLock<bool> = LazyLock::new(|| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ What is this going to be useful for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can inspect the contents of the test tables, or the general state of the cluster, after the test fails.
With docker we had this automatically, now we don't.

@Lorak-mmk Lorak-mmk merged commit f1de757 into scylladb:branch-hackathon Feb 11, 2025
12 checks passed
@Lorak-mmk
Copy link
Collaborator Author

We'll address the comments when preparing a better version of this integration to be merged into main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants