Skip to content

RUM-6171 use FileObserver to limit calls to filesystem #2718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

xgouchet
Copy link
Contributor

What does this PR do?

Use the file observer to keep an up to date list of batch files instead of calling listFiles each time.

Motivation

Performance improvement

@xgouchet xgouchet requested review from a team as code owners June 11, 2025 11:32
mariusc83
mariusc83 previously approved these changes Jun 11, 2025
Copy link
Member

@mariusc83 mariusc83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice !!

}

init {
fileObserver.startWatching()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we stop watching at some point ? not sure when though or if needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory yes but not sure when either

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use Object.finalize, but here it is Kotlin Any, so no counterpart method.

We probably need to still find a way to stop watching, because if Datadog.stop is called, we will leak this subscription.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently we should be good, because there is stopWatching call in the FileObserver.finalize (link).

ambushwork
ambushwork previously approved these changes Jun 11, 2025
@xgouchet xgouchet dismissed stale reviews from ambushwork and mariusc83 via a467af2 June 11, 2025 14:24
@xgouchet xgouchet force-pushed the xgouchet/RUM-6171/FileObserver branch from 5c82f6f to a467af2 Compare June 11, 2025 14:24
@xgouchet xgouchet requested review from mariusc83 and ambushwork June 11, 2025 14:40
ambushwork
ambushwork previously approved these changes Jun 11, 2025
Copy link
Member

@0xnm 0xnm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change! It will reduce CPU utilization for the case when we have many files on the disk. I added a few comments.

rootDir.listFilesSafe(internalLogger)?.toMutableSet() ?: mutableSetOf()

@Suppress("DEPRECATION") // Recommended constructor only available in API 29 Q
internal val fileObserver = object : FileObserver(rootDir.path, CREATE or DELETE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mask is saying that only CREATE or DELETE events will be watched, but MOVED_TO, MOVED_FROM are mentioned in the code as well.

MOVED_FROM, DELETE -> {
if (!name.isNullOrEmpty() && name.isBatchFileName) {
synchronized(knownFiles) {
knownFiles.removeAll { it.name == name }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that knownFiles is a Set<File> and File.equals relies on the path comparison anyway (link), won't it be simpler to use .remove call instead of iterating?

}

init {
fileObserver.startWatching()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use Object.finalize, but here it is Kotlin Any, so no counterpart method.

We probably need to still find a way to stop watching, because if Datadog.stop is called, we will leak this subscription.

Comment on lines +57 to +58
private val knownFiles: MutableSet<File> =
rootDir.listFilesSafe(internalLogger)?.toMutableSet() ?: mutableSetOf()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized the it may bring the following concern: this will be called on the user thread during SDK initialization and since it is I/O call, it may slow down SDK start (especially if there is a lot of files on the disk).

Instead, we maybe can to initial read in the getWritableFile / getReadableFile methods (well, on any initial access of this property), this will shift it to the worker thread improving SDK initialization.

Same may apply to the root folder creation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants