[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1243":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":8,"rankLanguage":8,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":8,"pushedAt":8,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":14,"starSnapshotCount":14,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},1243,"utilyze","systalyze\u002Futilyze","systalyze",null,"Go",363,27,1,6,0,4,7,57,12,4.34,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:00:25","# Utilyze\n\nUtilyze measures how efficiently your GPU is doing useful work, not just whether it's busy. It runs live against your workload with negligible overhead.\n\n![utlz in action](.\u002Fassets\u002Futlz.png)\n\nStandard tools like `nvidia-smi` and `nvtop` only check whether a kernel is running on the GPU. They can show 100% while your workload is using a tiny fraction of the hardware's real capacity. \n\nUtilyze reads GPU performance counters directly to show what's actually being used, and provides an estimate of how far you can push utilization given a workload, model, and hardware. 
To learn more, read [our blog post](https:\u002F\u002Fsystalyze.com\u002Futilyze).\n\nUtilyze is created by [Systalyze](https:\u002F\u002Fsystalyze.com).\n\n**Read this in other languages:** [中文](.\u002FREADME.zh-CN.md)\n\n## Requirements\n\n- Linux amd64 (arm64 support coming soon)\n- NVIDIA Ampere or newer GPU (A100, H100, H200, B200, RTX 3000+)\n- CUDA Toolkit 11.0+\n- `sudo` or `CAP_SYS_ADMIN` (see below), or a privileged container\n\n## Installation\n\n```bash\n# macOS\u002FLinux\ncurl -sSfL https:\u002F\u002Fsystalyze.com\u002Futilyze\u002Finstall.sh | sh\n\n# Windows\niex (curl.exe -L https:\u002F\u002Fsystalyze.com\u002Futilyze\u002Finstall.ps1 | Out-String)\n```\n\nFor macOS and Windows versions, **Utilyze acts as a client for another Utilyze process running on a remote Linux machine with profiling capabilities.** These versions require neither root nor any native libraries. On Windows, you may need to add a Windows Defender exclusion for the executable path and then reinstall Utilyze:\n\n```powershell\nAdd-MpPreference -ExclusionPath \u003CINSTALL_DIR>\niex (curl.exe -L https:\u002F\u002Fsystalyze.com\u002Futilyze\u002Finstall.ps1 | Out-String)\n```\n\nDepending on your host configuration (see below), Utilyze will likely require root for profiling capabilities, and the installer will prompt for your password to install it system-wide.\n\nIf CUPTI 12+ is not found, `utlz` will prompt you to install the latest release from PyPI on first run.\n\n## Usage\n\nOn a Linux machine with profiling capabilities, you can:\n```bash\n# monitor all GPUs for SOL metrics\nsudo utlz\n\n# monitor specific GPUs\nsudo utlz --devices 0,2\n\n# show discovered inference server endpoints per GPU\nsudo utlz --endpoints\n```\nThis starts a WebSocket server that listens for connections from other Utilyze processes on port 8079 by default. 
Further instances will automatically connect to the same server.\n\nOn a macOS\u002FWindows machine, you can connect to a running server with:\n```bash\nutlz --connect \u003CSERVER_URL>\n```\n\nNote that a single device ID can only be monitored by a single instance of `utlz`. This is due to the way NVIDIA's Perf SDK API handles device access.\n\n### Attainable SOL\n\nUtilyze discovers running inference servers to detect which model is loaded on each GPU. It computes an attainable compute SOL ceiling (your realistic peak given that model and hardware).\n\nCurrently Utilyze only supports vLLM as a backend, with more (e.g. SGLang) coming soon. We are expanding model and hardware coverage over time; at present we support a subset of models on H100-80G and A100-80G GPUs within a node (up to 8 GPUs).\n\nTo enable this, Utilyze anonymously sends GPU configuration data to Systalyze's servers. Disable with `UTLZ_DISABLE_METRICS=1`.\n\n### Running without sudo\n\nBy default, NVIDIA restricts GPU profiling counters to admin users. To allow non-root access, disable the restriction on the host and reboot:\n\n```bash\necho 'options nvidia NVreg_RestrictProfilingToAdminUsers=0' | sudo tee \u002Fetc\u002Fmodprobe.d\u002Fnvidia-profiling.conf\nsudo reboot\n```\n\nAfter this, `utlz` can run without sudo. 
If `utlz` warns about missing capabilities, you can disable the warning via `UTLZ_DISABLE_PROFILING_WARNING=1` (see Options).\n\n### Options\n\nFlags (most have environment variable equivalents):\n\n- `--endpoints`: show discovered inference server endpoints per GPU\n- `--devices` \u002F `UTLZ_DEVICES`: monitor specific GPUs (comma-separated list of device IDs)\n- `--log` \u002F `UTLZ_LOG`: a file to write logs to (default: no logging)\n- `--log-level` \u002F `UTLZ_LOG_LEVEL`: set the log level (default: `INFO`, other options: `DEBUG`, `WARN`, `ERROR`)\n- `--version`: show the version\n\nEnvironment variables only:\n\n- `UTLZ_DISABLE_PROFILING_WARNING`: disable the warning about GPU profiling capabilities on startup\n- `UTLZ_BACKEND_URL`: set the backend URL for Systalyze's roofline SOL metrics API (default: `https:\u002F\u002Fapi.systalyze.com\u002Fv1\u002Futilyze`)\n- `UTLZ_DISABLE_METRICS`: disable workload detection and the Systalyze roofline SOL metrics API\n\n## Build from source\n\nTo build from source you'll need:\n\n- Go 1.25+ for the CLI\n- Docker for building the native library with wide compatibility\n- CUDA Toolkit (linked against 13.1 by default; override via `CUDA_VERSION`)\n\n```bash\n# build the native library and the CLI\nmake all\n\n# build and package the native library via Docker\nmake dist-tarball-docker\n\n# build the CLI only\nmake utlz\n```\n\nThere is experimental support for ARM64 builds using the sbsa-linux CUDA target.\n","Utilyze is a tool for measuring how efficiently a GPU is doing real work. By reading GPU performance counters directly, it provides more accurate utilization data than standard tools such as `nvidia-smi`, and it can estimate the maximum attainable utilization for a given workload, model, and hardware. Written in Go and low-overhead, it suits scenarios that require deep insight into GPU resource usage, especially environments with demanding performance-optimization requirements. It supports Linux amd64 and NVIDIA Ampere or newer GPUs, is easy to install, and can be deployed quickly from the command line.",2,"2026-06-11 02:42:34","CREATED_QUERY"]