Module tensor

Tensor — immutable, chainable wrapper around a libtorch tensor.

Every tensor owns its C++ handle and frees it on drop. This is the entire VRAM management story — no GC, no scopes, no finalizers.
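The ownership rule above can be illustrated without libtorch. The following is a minimal sketch, not the crate's implementation: `Handle`, `LIVE`, and `live_handles` are hypothetical names, and a `Vec<f32>` stands in for the C++ tensor. It shows only the pattern: a counter goes up on creation and down in `Drop`, so leaving a scope releases the resource with no collector involved.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for a process-wide count of live native handles.
static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a tensor that owns a native resource.
pub struct Handle {
    // A real wrapper would hold a pointer to a C++ tensor; a Vec plays that role here.
    _data: Vec<f32>,
}

impl Handle {
    /// Allocate the stand-in resource and record one more live handle.
    pub fn new(len: usize) -> Self {
        LIVE.fetch_add(1, Ordering::SeqCst);
        Handle { _data: vec![0.0; len] }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // A real wrapper would call into C++ to free the tensor at this point.
        // The Vec is freed automatically after this body runs.
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Number of `Handle` values created but not yet dropped.
pub fn live_handles() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

Creating two `Handle` values inside a block raises `live_handles()` by two, and the count returns to its previous value as soon as the block ends.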

Operations are chainable and return Result<Tensor>:

let z = a.add(&b)?.relu()?.sum()?;
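To show why returning a `Result` from every operation still chains cleanly, here is a self-contained sketch using a toy type. `MiniTensor`, `ShapeError`, `MiniResult`, and `pipeline` are all hypothetical and are not part of this module; the real operations run in libtorch and may fail for other reasons. Each method borrows `self` and returns a new value, so the inputs are never mutated, and `?` stops the chain at the first error.

```rust
use std::fmt;

/// Hypothetical error for the sketch: operand lengths differ.
#[derive(Debug, PartialEq)]
pub struct ShapeError {
    left: usize,
    right: usize,
}

impl fmt::Display for ShapeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "length mismatch: {} vs {}", self.left, self.right)
    }
}

/// Hypothetical alias playing the role of a module-level result type.
pub type MiniResult<T> = std::result::Result<T, ShapeError>;

/// Toy 1-D tensor; every operation returns a new value.
#[derive(Debug, Clone, PartialEq)]
pub struct MiniTensor {
    data: Vec<f32>,
}

impl MiniTensor {
    pub fn new(data: Vec<f32>) -> Self {
        MiniTensor { data }
    }

    pub fn data(&self) -> &[f32] {
        &self.data
    }

    /// Element-wise addition; fails if the lengths differ.
    pub fn add(&self, other: &MiniTensor) -> MiniResult<MiniTensor> {
        if self.data.len() != other.data.len() {
            return Err(ShapeError { left: self.data.len(), right: other.data.len() });
        }
        let data = self.data.iter().zip(&other.data).map(|(a, b)| a + b).collect();
        Ok(MiniTensor { data })
    }

    /// Clamp negative elements to zero.
    pub fn relu(&self) -> MiniResult<MiniTensor> {
        Ok(MiniTensor { data: self.data.iter().map(|x| x.max(0.0)).collect() })
    }

    /// Reduce to a single-element tensor holding the sum.
    pub fn sum(&self) -> MiniResult<MiniTensor> {
        Ok(MiniTensor { data: vec![self.data.iter().sum::<f32>()] })
    }
}

/// Same shape as the chain above: the first failing step short-circuits the rest.
pub fn pipeline(a: &MiniTensor, b: &MiniTensor) -> MiniResult<MiniTensor> {
    a.add(b)?.relu()?.sum()
}
```

With `a = [1, -2, 3]` and `b = [-3, 1, 1]`, `pipeline` adds to `[-2, -1, 4]`, clamps to `[0, 0, 4]`, and returns a one-element tensor holding `4`. Passing operands of different lengths returns the `ShapeError` from `add`, and neither `relu` nor `sum` runs.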

Structs§

DeviceInfo: Information about a CUDA device.
GradAccumulatorHandle: Opaque strong-reference handle to a leaf tensor’s AccumulateGrad node. Dropping it frees the node (unless a backward pass still holds its own reference).
RnnParams: Persistent cache of RNN parameter tensors on the C++ side.
Tensor: A tensor wrapping a libtorch C++ tensor.
TensorError: Error type for tensor operations.
TensorOptions: Options for tensor creation.

Enums§

DType: Element data type of a tensor. Maps to PyTorch’s torch.dtype.
Device: Device represents where a tensor’s data lives.

Functions§

cuda_active_bytes: Query bytes actively used by tensors on device 0.
cuda_active_bytes_idx: Query bytes actively used by tensors on a specific device.
cuda_allocated_bytes: Query bytes reserved by the CUDA caching allocator on device 0.
cuda_allocated_bytes_idx: Query bytes reserved by the CUDA caching allocator on a specific device.
cuda_available: Returns true if CUDA is available.
cuda_compute_capability: Query compute capability (major, minor) for a CUDA device.
cuda_device_count: Returns the number of CUDA devices.
cuda_device_name: Returns the GPU device name for device 0 (e.g. “NVIDIA GeForce GTX 1060 6GB”).
cuda_device_name_idx: Returns the GPU device name for the given index (e.g. “NVIDIA GeForce GTX 1060 6GB”).
cuda_devices: Enumerate all available CUDA devices.
cuda_empty_cache: Release all unused cached memory from the CUDA caching allocator. Equivalent to torch.cuda.empty_cache().
cuda_manual_seed_all: Seed all CUDA device RNGs. No-op when built without CUDA.
cuda_memory_info: Query CUDA memory usage for device 0. Returns (used_bytes, total_bytes) or an error if CUDA is not available.
cuda_memory_info_idx: Query CUDA memory usage for a specific device. Returns (used_bytes, total_bytes) or an error if CUDA is not available.
cuda_peak_active_bytes: Peak bytes allocated to tensors on device 0 since the last call to cuda_reset_peak_stats().
cuda_peak_active_bytes_idx: Peak bytes allocated to tensors on a specific device since the last call to cuda_reset_peak_stats().
cuda_peak_reserved_bytes: Peak bytes reserved by the CUDA caching allocator on device 0 since the last call to cuda_reset_peak_stats().
cuda_peak_reserved_bytes_idx: Peak bytes reserved by the CUDA caching allocator on a specific device since the last call to cuda_reset_peak_stats().
cuda_reset_peak_stats: Reset peak memory statistics for device 0.
cuda_reset_peak_stats_idx: Reset peak memory statistics for a specific device. Equivalent to torch.cuda.reset_peak_memory_stats().
cuda_synchronize: Synchronize a CUDA device (wait for all pending work to complete).
cuda_utilization: Query GPU utilization percentage (0-100) via NVML. Returns None if NVML is not available or the query fails.
cuda_utilization_idx: Query GPU utilization percentage for a specific device (0-100) via NVML.
current_cuda_device: Get the current CUDA device index.
hardware_summary: One-line hardware summary for dashboard headers.
live_tensor_count: Number of live C++ Tensor handles (created but not yet dropped). If this grows over time during training, there is a handle leak. If it stays stable but RSS grows, the leak is inside libtorch.
malloc_trim: Ask glibc to return free memory to the OS (Linux only).
manual_seed: Seed all libtorch RNGs (CPU + CUDA) for reproducible tensor ops.
probe_device: Probe whether a CUDA device can execute compute kernels under the current libtorch build. Returns Ok(()) if the device works, or an error describing why it cannot (e.g. missing kernel image for sm_61).
rss_kb: Read current process RSS in kilobytes (Linux only). Returns 0 on non-Linux or if /proc/self/statm is unreadable.
set_cudnn_benchmark: Enable or disable cuDNN benchmark mode.
set_current_cuda_device: Set the current CUDA device.
usable_cuda_devices: Return all CUDA devices that can run compute kernels, with warnings for any excluded devices printed to stderr.
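The leak-triage rule described for live_tensor_count and rss_kb can be sketched in plain Rust. This is an illustration under stated assumptions, not this module's code: `parse_statm_rss_kb`, `rss_kb_sketch`, `LeakHint`, and `triage` are hypothetical names, the page size is assumed to be 4 KiB (common on Linux, but not guaranteed), and the tolerance is an arbitrary parameter supplied by the caller. The second field of /proc/self/statm is the resident set size in pages.

```rust
use std::fs;

/// Assumed page size in KiB. A real implementation would query the OS instead.
const ASSUMED_PAGE_KB: u64 = 4;

/// Parse the contents of /proc/self/statm and return resident memory in KiB.
/// Returns None if the second field (resident pages) is missing or not a number.
pub fn parse_statm_rss_kb(statm: &str) -> Option<u64> {
    let resident_pages: u64 = statm.split_whitespace().nth(1)?.parse().ok()?;
    Some(resident_pages * ASSUMED_PAGE_KB)
}

/// Read the current process RSS in KiB, or 0 if the file is unreadable
/// (for example on non-Linux systems).
pub fn rss_kb_sketch() -> u64 {
    fs::read_to_string("/proc/self/statm")
        .ok()
        .and_then(|s| parse_statm_rss_kb(&s))
        .unwrap_or(0)
}

/// Hypothetical classification of two samples taken some training steps apart.
#[derive(Debug, PartialEq)]
pub enum LeakHint {
    /// The live handle count grew: handles are being kept alive on the Rust side.
    HandleLeak,
    /// Handles are stable but RSS grew beyond the tolerance: look inside the native library.
    NativeLeak,
    /// Neither metric grew noticeably.
    Stable,
}

/// Compare (handle count, RSS) samples; `rss_tolerance_kb` absorbs normal noise.
pub fn triage(
    handles_before: u64,
    handles_after: u64,
    rss_before_kb: u64,
    rss_after_kb: u64,
    rss_tolerance_kb: u64,
) -> LeakHint {
    if handles_after > handles_before {
        LeakHint::HandleLeak
    } else if rss_after_kb > rss_before_kb.saturating_add(rss_tolerance_kb) {
        LeakHint::NativeLeak
    } else {
        LeakHint::Stable
    }
}
```

A single pair of samples is a crude signal. In practice one would sample at the same point of each epoch and look for a trend across many epochs before drawing a conclusion.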

Type Aliases§

Result
