Tensor — immutable, chainable wrapper around a libtorch tensor.
Every tensor owns its C++ handle and frees it on drop. This is the entire VRAM management story — no GC, no scopes, no finalizers.
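The drop-based ownership described above can be illustrated with a minimal, self-contained sketch. It does not use this crate's internals: MockTensor, LIVE_HANDLES, and live_handles are hypothetical names, and the integer field stands in for the native pointer a real binding would hold.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical global counter of live handles (illustration only,
// not the crate's actual bookkeeping).
static LIVE_HANDLES: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a wrapper that owns a native (C++) tensor handle.
struct MockTensor {
    // A real binding would store a raw pointer obtained over FFI.
    _handle: usize,
}

impl MockTensor {
    /// Take ownership of a handle and count it as live.
    fn new(handle: usize) -> Self {
        LIVE_HANDLES.fetch_add(1, Ordering::SeqCst);
        MockTensor { _handle: handle }
    }
}

impl Drop for MockTensor {
    fn drop(&mut self) {
        // A real binding would call the native free function here.
        LIVE_HANDLES.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Number of mock handles created but not yet dropped.
fn live_handles() -> usize {
    LIVE_HANDLES.load(Ordering::SeqCst)
}

fn main() {
    let a = MockTensor::new(1);
    {
        let _b = MockTensor::new(2);
        println!("inside scope: {}", live_handles());
    } // _b is dropped here, releasing its handle
    println!("after scope: {}", live_handles());
    drop(a); // explicit early release
    println!("after drop: {}", live_handles());
}
```

Because release is tied to Drop, memory is returned at a deterministic point (end of scope or an explicit drop) rather than whenever a collector happens to run.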
Operations are chainable and return Result<Tensor>:
    let z = a.add(&b)?.relu()?.sum()?;
Structs§
- DeviceInfo - Information about a CUDA device.
- GradAccumulatorHandle - Opaque strong-reference handle to a leaf tensor’s AccumulateGrad node. Dropping it frees the node (unless a backward pass still holds its own reference).
- RnnParams - Persistent cache of RNN parameter tensors on the C++ side.
- Tensor - A tensor wrapping a libtorch C++ tensor.
- TensorError - Error type for tensor operations.
- TensorOptions - Options for tensor creation.
Enums§
- DType - Element data type of a tensor. Maps to PyTorch’s torch.dtype.
- Device - Device represents where a tensor’s data lives.
Functions§
- cuda_active_bytes - Query bytes actively used by tensors on device 0.
- cuda_active_bytes_idx - Query bytes actively used by tensors on a specific device.
- cuda_allocated_bytes - Query bytes reserved by the CUDA caching allocator on device 0.
- cuda_allocated_bytes_idx - Query bytes reserved by the CUDA caching allocator on a specific device.
- cuda_available - Returns true if CUDA is available.
- cuda_compute_capability - Query compute capability (major, minor) for a CUDA device.
- cuda_device_count - Returns the number of CUDA devices.
- cuda_device_name - Returns the GPU device name for device 0 (e.g. “NVIDIA GeForce GTX 1060 6GB”).
- cuda_device_name_idx - Returns the GPU device name for the given index (e.g. “NVIDIA GeForce GTX 1060 6GB”).
- cuda_devices - Enumerate all available CUDA devices.
- cuda_empty_cache - Release all unused cached memory from the CUDA caching allocator. Equivalent to torch.cuda.empty_cache().
- cuda_manual_seed_all - Seed all CUDA device RNGs. No-op when built without CUDA.
- cuda_memory_info - Query CUDA memory usage for device 0. Returns (used_bytes, total_bytes) or an error if CUDA is not available.
- cuda_memory_info_idx - Query CUDA memory usage for a specific device. Returns (used_bytes, total_bytes) or an error if CUDA is not available.
- cuda_peak_active_bytes - Peak bytes allocated to tensors since the last cuda_reset_peak_stats() on device 0.
- cuda_peak_active_bytes_idx - Peak bytes allocated to tensors since the last cuda_reset_peak_stats() on a specific device.
- cuda_peak_reserved_bytes - Peak bytes reserved by the CUDA caching allocator since the last cuda_reset_peak_stats() on device 0.
- cuda_peak_reserved_bytes_idx - Peak bytes reserved by the CUDA caching allocator since the last cuda_reset_peak_stats() on a specific device.
- cuda_reset_peak_stats - Reset peak memory statistics for device 0.
- cuda_reset_peak_stats_idx - Reset peak memory statistics for a specific device. Equivalent to torch.cuda.reset_peak_memory_stats().
- cuda_synchronize - Synchronize a CUDA device (wait for all pending work to complete).
- cuda_utilization - Query GPU utilization percentage (0-100) via NVML. Returns None if NVML is not available or the query fails.
- cuda_utilization_idx - Query GPU utilization percentage for a specific device (0-100) via NVML.
- current_cuda_device - Get the current CUDA device index.
- hardware_summary - One-line hardware summary for dashboard headers.
- live_tensor_count - Number of live C++ Tensor handles (created but not yet dropped). If this grows over time during training, there is a handle leak. If it stays stable but RSS grows, the leak is inside libtorch.
- malloc_trim - Ask glibc to return free memory to the OS (Linux only).
- manual_seed - Seed all libtorch RNGs (CPU + CUDA) for reproducible tensor ops.
- probe_device - Probe whether a CUDA device can execute compute kernels under the current libtorch build. Returns Ok(()) if the device works, or an error describing why it cannot (e.g. missing kernel image for sm_61).
- rss_kb - Read current process RSS in kilobytes (Linux only). Returns 0 on non-Linux or if /proc/self/statm is unreadable.
- set_cudnn_benchmark - Enable or disable cuDNN benchmark mode.
- set_current_cuda_device - Set the current CUDA device.
- usable_cuda_devices - Return all CUDA devices that can run compute kernels, with warnings for any excluded devices printed to stderr.
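
The leak heuristic given for live_tensor_count, together with the /proc/self/statm source named for rss_kb, might be sketched as follows. This is an illustration using only the standard library, not the crate's implementation: parse_statm_rss_kb, read_rss_kb, classify, LeakHint, and the rss_slack_kb tolerance are hypothetical names, the 4 KiB page size is an assumption, and the sample numbers in main are made up.

```rust
use std::fs;

/// Convert the resident-set field (second column, in pages) of a
/// /proc/self/statm line to kilobytes. Returns 0 if it cannot be parsed.
fn parse_statm_rss_kb(statm: &str, page_size_bytes: u64) -> u64 {
    statm
        .split_whitespace()
        .nth(1)
        .and_then(|pages| pages.parse::<u64>().ok())
        .map(|pages| pages * page_size_bytes / 1024)
        .unwrap_or(0)
}

/// Read current RSS in kilobytes; 0 if the file is unreadable.
/// Assumes 4 KiB pages (an assumption; query the OS in real code).
fn read_rss_kb() -> u64 {
    match fs::read_to_string("/proc/self/statm") {
        Ok(contents) => parse_statm_rss_kb(&contents, 4096),
        Err(_) => 0,
    }
}

#[derive(Debug, PartialEq)]
enum LeakHint {
    /// Handle count grew: wrappers are being kept alive somewhere.
    HandleLeak,
    /// Handle count stable but RSS grew: suspect the native library.
    NativeLeak,
    /// Neither signal exceeded its threshold.
    NoneObserved,
}

/// Compare (start, end) samples of handle count and RSS.
/// `rss_slack_kb` is a hypothetical tolerance for normal RSS noise.
fn classify(handles: (u64, u64), rss_kb: (u64, u64), rss_slack_kb: u64) -> LeakHint {
    if handles.1 > handles.0 {
        LeakHint::HandleLeak
    } else if rss_kb.1 > rss_kb.0 + rss_slack_kb {
        LeakHint::NativeLeak
    } else {
        LeakHint::NoneObserved
    }
}

fn main() {
    println!("rss now: {} kB", read_rss_kb());
    // Made-up samples: handle count stable, RSS grew by about 4 GB.
    let hint = classify((120, 120), (800_000, 4_800_000), 50_000);
    println!("{:?}", hint);
}
```

In a training loop, the two (start, end) pairs would be sampled at the same points, for example once per epoch, so that growth in one signal can be compared against the other.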