Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrepancy between structure factors #1306

Open
joema10 opened this issue Feb 4, 2025 · 8 comments
Open

Discrepancy between structure factors #1306

joema10 opened this issue Feb 4, 2025 · 8 comments

Comments

@joema10
Copy link

joema10 commented Feb 4, 2025

I am using freud to calculate the structure factor from a trajectory computed with LAMMPS. I am trying to reproduce the results for the structure factor in figure 4 of this article: https://pubs.rsc.org/en/content/articlehtml/2021/sm/d1sm00445j

This is schematically my code using freud:

# Load x,y,z (positions)  and box limits (infLx, supLx)*(infLy, supLy)*(infLz, supLz)
# Transform coordinates to get the box size is (0,Lx)*(0,Ly)*(0,Lz)

points = np.column_stack((x, y, z))

deltaKbin = 2*np.pi/frame.Lx  
kmax = 8.16 
kmin = deltaKbin  
nbin = int((kmax-kmin)/deltaKbin)+1 

sf = freud.diffraction.StaticStructureFactorDirect(bins=nbin, k_max=kmax, k_min=kmin)
sf.compute((freud.box.Box(frame.Lx, frame.Ly, frame.Lz), points),reset=False)

Everything works fine for most of the cases (e.g. rho=0.155, 0.252 and T=0.3), but for rho=0.407 and T=0.3 I don't get the same result. In this case the g(r) is similar, but S(k) is totally different using freud (see below)

I have made my own code (very inefficient) and calculating S(k) sampling randomly values of \vec{k} gives similar results compare to the reported S(k). Also, using ovito I have calculated S(k) and it seems to come out similar as well. I don't know if this particular case is difficult because of some of the assumptions freud makes internally and I just haven't found the right set of parameters or if something is wrong

Please find attached screenshots of ovito, the comparison between my code and freud, as well as one of the frames of the simulation I am analysing

Any help is welcome

I am using jupyter-notebook in Ubuntu 18.04.6 LTS with:

  • python v3.9.21
  • freud v3.1.0

Frame_last.txt

Image

Image

@DomFijan
Copy link
Contributor

Have you tried using StaticStructureFactorDebye?
How do the results compare between Debye vs. Direct vs. your code?
It looks like the number of bins between what you use in your code and in Freud is also different. This is probably not the cause of the issue, but could you set the number of bins such that they are equivalent for easier comparison?
What happens if you set num_sampled_k_points in Direct to say 10 000? Do you get the same results? ( this also shouldn't matter)

@joema10
Copy link
Author

joema10 commented Feb 12, 2025

This is the result for all three methods. As you can see between 1 and 2 there is a maximum in both my code and Debye. That maximum is also observed in the given reference, but Direct does not show that maximum. For this figure I have used num_sampled_k_points=10000, but there doesn't seem to be any change from the default value.

Image

@DomFijan
Copy link
Contributor

Can you confirm that all the particles are inside the box? You can always wrap them for good measure.

Could you also share the full code you use to run the example, including extraction of box parameters, and coordinates from the file? With that I could try and take a closer look as well.

@joema10
Copy link
Author

joema10 commented Feb 15, 2025

Thanks for the suggestion. It seems that the particles are wrapped correctly

Attached is the python script I am using for this comparison. The framework I am using to calculate S(q) here is attached above. If you need anything else from me, please let me know

COMPUTE_Sq_from_LAMMPSdump.txt

@janbridley
Copy link
Contributor

Image

When opening Frame_last.py in OVITO, the particles form layers in an odd way. Is this intended/your experience as well?

@joema10
Copy link
Author

joema10 commented Feb 20, 2025

Yes, that the system that I am studing

You can find this behaviour for particles interacting via SALR potential, like in this reference: https://pubs.rsc.org/en/content/articlehtml/2021/sm/d1sm00445j

@joaander
Copy link
Member

I don't have time to try recreating your plots. But I would like to point out that a 1D averaged structure factor throws away a LOT of information from the full 3D S($\vec{k}$). With your lamellar type system, the peak would only show up at the one specific $\vec{k}$ that is commensurate with the direction the lamella repeat. All other S($\vec{k}$) at the same magnitude $k$ will likely be 0. Since StaticStructureFactorDirect samples randomly selected $\vec{k}$ values, it might miss the specific ones that match your system's direction, thus leading to the 0's where you expect a "peak" (which is the average of many 0's and a few non-zero values).

Nominally, StaticStructureFactorDirect states that when num_sampled_k_points==0 (the default) it will use all k points. However, this is not clear in the implementation. The code contains a complex process to select k points. Setting num_sampled_k_points==0 appears to only change the selection criteria for the random sampling:

std::vector<vec3<float>> StaticStructureFactorDirect::reciprocal_isotropic(const box::Box& box, float k_max,
float k_min,
unsigned int num_sampled_k_points)
{
const auto box_matrix = box_to_matrix(box);
// B holds "crystallographic" reciprocal box vectors that lack the factor of 2 pi.
const auto B = box_matrix.transpose().inverse();
const auto q_max = k_max / freud::constants::TWO_PI;
const auto q_max_sq = q_max * q_max;
const auto q_min = k_min / freud::constants::TWO_PI;
const auto q_min_sq = q_min * q_min;
const auto dq_x = B.row(0).norm();
const auto dq_y = B.row(1).norm();
const auto dq_z = B.row(2).norm();
const auto q_volume = dq_x * dq_y * dq_z;
// Above the pruning distance, the grid of k points is sampled isotropically
// at a lower density.
const auto q_prune_distance = get_prune_distance(num_sampled_k_points, q_max, q_volume);
const auto q_prune_distance_sq = q_prune_distance * q_prune_distance;
const auto bx = freud::constants::TWO_PI * vec3<float>(B(0, 0), B(0, 1), B(0, 2));
const auto by = freud::constants::TWO_PI * vec3<float>(B(1, 0), B(1, 1), B(1, 2));
const auto bz = freud::constants::TWO_PI * vec3<float>(B(2, 0), B(2, 1), B(2, 2));
const auto N_kx = static_cast<unsigned int>(std::ceil(q_max / dq_x));
const auto N_ky = static_cast<unsigned int>(std::ceil(q_max / dq_y));
const auto N_kz = static_cast<unsigned int>(std::ceil(q_max / dq_z));
// The maximum number of k points is a guideline. The true number of sampled
// k points can be less or greater than num_sampled_k_points, depending on the
// result of the random pruning procedure. Therefore, we cannot allocate a
// fixed size for the data. Also, reserving capacity for the concurrent
// vector had no measureable effect on performance.
// NOLINTNEXTLINE(misc-include-cleaner)
tbb::concurrent_vector<vec3<float>> k_points;
// This is a 3D loop but we parallelize in 1D because we only need to seed
// the random number generators once per block and there is no benefit of
// locality if we parallelize in 2D or 3D.
util::forLoopWrapper(0, N_kx, [&](size_t begin, size_t end) {
// Set up thread-local random number generator for k point pruning.
const auto thread_start = static_cast<unsigned int>(begin);
std::random_device rd;
std::seed_seq seed {thread_start, rd(), rd(), rd()};
std::mt19937 rng(seed);
std::uniform_real_distribution<float> base_dist(0, 1);
auto random_prune = [&]() { return base_dist(rng); };
const auto add_all_k_points = std::isinf(q_prune_distance);
for (unsigned int kx = begin; kx < end; ++kx)
{
const auto k_vec_x = static_cast<float>(kx) * bx;
for (unsigned int ky = 0; ky < N_ky; ++ky)
{
const auto k_vec_xy = k_vec_x + static_cast<float>(ky) * by;
// Solve the quadratic equation for kz to limit which kz values we must sample:
// k_min^2 <= |k_vec_xy|^2 + kz^2 |bz|^2 - 2 kz (k_vec_xy \cdot bz) <= k_max^2
// 0 <= kz^2 (|bz|^2) + kz (-2 (k_vec_xy \cdot bz)) + (|k_vec_xy|^2 - k_min^2)
// 0 >= kz^2 (|bz|^2) + kz (-2 (k_vec_xy \cdot bz)) + (|k_vec_xy|^2 - k_max^2)
// This step improves performance significantly when k_min > 0
// by eliminating a large portion of the search space. Likewise,
// it eliminates the portion of search space outside a sphere
// with radius k_max. We round kz_min down and kz_max up to
// ensure that we don't accidentally throw out valid k points in
// the range (k_min, k_max) due to rounding error.
const auto coef_a = dot(bz, bz);
const auto coef_b = -2 * dot(k_vec_xy, bz);
const auto coef_c_min = dot(k_vec_xy, k_vec_xy) - k_min * k_min;
const auto coef_c_max = dot(k_vec_xy, k_vec_xy) - k_max * k_max;
const auto b_over_2a = coef_b / (2 * coef_a);
const auto kz_min = static_cast<unsigned int>(
std::floor(-b_over_2a + std::sqrt(b_over_2a * b_over_2a - coef_c_min / coef_a)));
const auto kz_max = static_cast<unsigned int>(
std::ceil(-b_over_2a + std::sqrt(b_over_2a * b_over_2a - coef_c_max / coef_a)));
for (unsigned int kz = kz_min; kz < std::min(kz_max, N_kz); ++kz)
{
const auto k_vec = k_vec_xy + static_cast<float>(kz) * bz;
const auto q_distance_sq
= dot(k_vec, k_vec) / freud::constants::TWO_PI / freud::constants::TWO_PI;
// The k vector is kept with probability min(1, (q_prune_distance / q_distance)^2).
// This sampling scheme aims to have a constant density of k vectors with respect to
// radial distance.
if (q_distance_sq <= q_max_sq && q_distance_sq >= q_min_sq)
{
const auto prune_probability = q_prune_distance_sq / q_distance_sq;
if (add_all_k_points || prune_probability > random_prune())
{
k_points.emplace_back(k_vec);
}
}
}
}
}
});
return std::vector<vec3<float>>(k_points.cbegin(), k_points.cend());
}

It is unfortunate that StaticStructureFactorDirect does not provide the full 3D S($\vec{k}$). The 1D version of it throws away so much information, and is really only valid when you have a massive sample with randomly oriented grains (powder diffraction).

Note that:

  • OVITO computes its structure factor with FFTs and will therefore not miss any peaks.
  • StaticStructureFactorDebye uses an approximation that is not valid at low values of $k$ as stated in the documentation. You should not expect it to give accurate results at low $k$.
  • Your first brute force plot appears to show arbitrarily selected $\vec{k}$ values. $\vec{k}$ that are not commensurate with the periodic boundary conditions will introduce aliasing artifacts.

Could you please provide a complete minimum working example that loads your data and plots the result so that we do not need to spend our time recreating it. @DomFijan - the S(K) computation code and binning is so simple it is hard to imagine something going wrong there. If I had to guess, I would say the bug might be in the complex reciprocal_isotropic method. We should check that all k vectors are properly generated. We should also consider writing a simple and clearly correct code path when num_sampled_k_points==0. All we should need is a triple for loop without the optimizations to limit the search space. Although, if the existing code is missing k-points, then the bug will still be present when num_sampled_k_points > 0.

@DomFijan
Copy link
Contributor

Re num_sampled_k_points==0 : this sets q_prune_distance_sq = std::numeric_limits<float>::infinity(); which in turn makes the if condition for sampling points always true. So setting num_sampled_k_points==0 always emplaces back all the k_vec values, however it could still be true that these are not generated correctly - and thus making the whole num_sampled_k_points==0 path incorrect as you mentioned @joaander.

I used @joema10 script to test if turning off parallelization would show different results, but that doesn't seem to change results at all. I also played with number of bins and it seems that the brute force approach is very susceptible to bin width in this range as well (23 bins originally vs 48 bins plotted here).

Image

One last thing I want to try is to plot the diffraction pattern. My thinking was that diffraction pattern is based on discrete FT and shouldn't suffer from any of the sampling shenanigans. I used a view orientation similar to what @janbridley posted and rotated the camera accordingly. One thing that I'm not 100% certain about is what correct orientation to pass to freud's compute - it seems that viewport API in ovito was changed recently and now does not provide viewMatrix, but rather only view orientation so I'm not 100% sure if did the rotation right. I'm not 100% sure if there is a peak at around k=1.25 or not, might have to play with this more.

Image

I will try to implement the triple loop approach for num_sampled_k_points==0 as @joaander at some point this week to try and see if there are any issues with the sampling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants