Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add String::concat and string.extend functions to core for efficient String concatenation. #99929

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Ivorforce
Copy link
Contributor

@Ivorforce Ivorforce commented Dec 2, 2024

Small optimization and readability improvement.

String::concat and string.extend enter the arena to join strings, and are the fastest method among operator+, StringBuilder and StringBuffer for known-count string concatenations. They use compile-time logic to avoid needing RAM allocations for bookkeeping (like StringBuilder or multiple reallocs (like StringBuffer) and is thus always faster than either.

I have also changed all Variant conversions to String to use the String::concat function. This is not because of an urgent need to optimize these functions, but they serve as exemplary cases about how to use the function, and how this improves performance (1.6x in the case of Projection). Future PRs should address the various uses of repeated A + B + C + D + E... throughout the codebase where appropriate for optimization.

I am open to renaming or moving the function.

Imo #100293 should be merged before this one, to avoid unnecessary additional renames.

Explanation

Take the example case Vector4. This has previously compiled like:

// Actual Code
	return "(" + String::num_real(x, true) + ", " + String::num_real(y, true) + ", " + String::num_real(z, true) + ", " + String::num_real(w, true) + ")";

// Compiled like
// Technically even each addition is hidden behind a function, but that's not that relevant to performance.
String s1 = "(";
s1 += String::num_real(x, true);  // resize

String s2 = s1;  // Cow copy
s2 += String::num_real(y, true);  // resize

String s3 = s2;  // Cow copy
s3 += String::num_real(z, true);  // resize

String s4 = s3;  // Cow copy
s4 += String::num_real(w, true);  // resize

String s5 = s4;  // Cow copy
s5 += ")";  // resize

return s5;

This will now compile like:

// Actual code
	return String::concat(
		"(",
		String::num_real(x, true), ", ",
		String::num_real(y, true), ", ",
		String::num_real(z, true), ", ",
		String::num_real(w, true),
		")"
	);

// Compiled like
String s1 = String::num_real(x, true);
String s2 = String::num_real(y, true);
String s3 = String::num_real(z, true);
String s4 = String::num_real(w, true);

String string;
string.resize(1 + s1.length() + 2 + s2.length() + 2 + s3.length() + 2 + s4.length() + 1);  // resize
const char32_t *ptr;

*(ptr++) = '(';

memcpy(ptr, s1.ptr(), s1.length() * sizeof(char32_t));
ptr += s1.length();

// Technically these loops are likely unrolled by the optimizing compiler, but you get the gist.
for (int i = 0; i < 2; i++) (*ptr++) = ", "[i];

memcpy(ptr, s2.ptr(), s2.length() * sizeof(char32_t));
ptr += s2.length();

for (int i = 0; i < 2; i++) (*ptr++) = ", "[i];

memcpy(ptr, s3.ptr(), s3.length() * sizeof(char32_t));
ptr += s3.length();

for (int i = 0; i < 2; i++) (*ptr++) = ", "[i];

memcpy(ptr, s4.ptr(), s4.length() * sizeof(char32_t));
ptr += s4.length();

*(ptr++) = ')';

*ptr = '\n';

return string;

Benchmark

I ran the following benchmark:

	Projection p;

	auto t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 1000000; i ++) {
		// Test
		String s = p.operator String();
	}
	auto t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";

This prints:

  • 4141ms on master (a40fc23)
  • 2540ms on this PR (b7f85fec53d9fc539f270fdde06563fe920bacef)

Which showcases an improvement of 1.6x in this particular case.

@Calinou
Copy link
Member

Calinou commented Dec 4, 2024

Should this method be exposed to scripting?

Also, I wonder if the method name should be shorter if it's going to be used often. concatenate() alone or even concat() would do.

@Ivorforce
Copy link
Contributor Author

Ivorforce commented Dec 4, 2024

Hrm, well the function itself isn't compatible with GDScript since it's a template function; quite optimal for C++ but parameters must be known at compile time. I think such a function would be better suited for a separate PR.

@Ivorforce Ivorforce force-pushed the string-concatenate-function branch from d6740c3 to 9c57fe9 Compare December 11, 2024 15:35
@Ivorforce Ivorforce changed the title Add concatenate_strings function. Add concatenate_strings function to core for efficient String concatenation. Dec 11, 2024
@Ivorforce Ivorforce force-pushed the string-concatenate-function branch 2 times, most recently from b7f85fe to 0c9ef0f Compare December 11, 2024 16:27
@Ivorforce Ivorforce marked this pull request as ready for review December 11, 2024 17:12
@Ivorforce Ivorforce requested a review from a team as a code owner December 11, 2024 17:12
@Ivorforce
Copy link
Contributor Author

Ivorforce commented Dec 11, 2024

I rebased the PR on top of master, and simplified some of the changes. I think it's ready for review now. Not 100% happy with the implementation and name necessarily though, so I'm open to input for that.

@Ivorforce Ivorforce force-pushed the string-concatenate-function branch from 0c9ef0f to c222d0d Compare December 12, 2024 23:55
@Ivorforce Ivorforce changed the title Add concatenate_strings function to core for efficient String concatenation. Add String::concat function to core for efficient String concatenation. Dec 12, 2024
@Ivorforce Ivorforce changed the title Add String::concat function to core for efficient String concatenation. Add String::concat and string.extend functions to core for efficient String concatenation. Jan 4, 2025
@Ivorforce Ivorforce force-pushed the string-concatenate-function branch from c222d0d to 648e382 Compare January 4, 2025 10:14
@Ivorforce
Copy link
Contributor Author

Ivorforce commented Jan 4, 2025

I have added string.extend. The reason is that String::concat can trivially be implemented by using string.extend, and string.extend can be used in even more functions where previously repeated += might have been used.
This also allowed for an easy add of a String::concat specialization for a first-argument rvalue ref String that can just be extended in-place, avoiding an unnecessary copy. This optimizes a case where string concatenation via + might have previously been faster than String::concat.

@Ivorforce Ivorforce force-pushed the string-concatenate-function branch from 648e382 to 3b43532 Compare January 4, 2025 10:21
…enations where the contributors to the final string are well-known at the time of call.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants