Monitoring actual GPU memory usage #1407

Closed
@ctuluhu

Description

Describe the problem the feature is intended to solve

I have several models loaded and I'm not sure how to tell whether TensorFlow still has GPU memory left. Using nvidia-smi I can check how much memory TensorFlow has allocated overall, but I couldn't find a way to check the usage of each loaded model.
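For context, the overall allocation mentioned above can be read from nvidia-smi's CSV query mode. A minimal sketch, assuming the command `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits`; the sample output string here is illustrative so the parsing is self-contained (in practice you would capture the command's output via subprocess):

```python
# Sketch: parsing nvidia-smi's CSV query output to get per-GPU memory usage.
# Real invocation (not run here):
#   nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
# SAMPLE_OUTPUT is a hypothetical one-GPU result in MiB.

SAMPLE_OUTPUT = "11019, 11178\n"

def parse_gpu_memory(csv_output):
    """Return a list of (used_mib, total_mib) tuples, one per GPU line."""
    usage = []
    for line in csv_output.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        usage.append((used, total))
    return usage

for used, total in parse_gpu_memory(SAMPLE_OUTPUT):
    print(f"GPU memory: {used} MiB used of {total} MiB")
```

Note this only shows what the whole TensorFlow process has reserved, not how that memory is split between loaded models, which is the gap this request is about.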

Describe the solution

TensorFlow could expose Prometheus metrics reporting the actual GPU memory used by each loaded model.
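A sketch of what such a metric might look like in Prometheus's text exposition format. The metric name `model_gpu_memory_bytes`, the label, and the byte counts are all hypothetical, not an existing TensorFlow metric:

```python
# Sketch of a hypothetical per-model GPU memory gauge rendered in
# Prometheus's text exposition format. The metric name and the byte
# values are illustrative assumptions, not an existing TensorFlow metric.

def render_gpu_memory_metrics(per_model_bytes):
    """Format {model_name: bytes_used} as Prometheus exposition text."""
    lines = [
        "# HELP model_gpu_memory_bytes GPU memory used by each loaded model.",
        "# TYPE model_gpu_memory_bytes gauge",
    ]
    for model, used in sorted(per_model_bytes.items()):
        lines.append(f'model_gpu_memory_bytes{{model="{model}"}} {used}')
    return "\n".join(lines) + "\n"

# Hypothetical model names and sizes, purely for illustration.
print(render_gpu_memory_metrics({"resnet50": 512 * 1024**2, "bert": 1024**3}))
```

A per-model label like this would let Prometheus sum usage across models and alert when the total approaches the GPU's capacity.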

Describe alternatives you've considered

None.

Additional context

I am not sure whether this is actually a feature request, or whether it can already be done somehow.
