Last updated: 17 September 2025
Use your browser’s find (Ctrl/Cmd + F) to jump to a term.
auto top‑up — An optional setting that adds prepaid credits when your balance drops below a threshold.
benchmark — A measurement that compares performance (e.g., tokens/sec for inference or throughput on GPUs).
billing (per‑second) — Compute with Hivenet charges only while an instance is running, with usage calculated per second.
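A quick sketch of the arithmetic, using an illustrative hourly rate (not an actual Hivenet price):

```python
# Hypothetical example: estimating the cost of per-second billing.
# The hourly rate is illustrative, not a real Hivenet price.
hourly_rate = 1.20                      # example price per hour
per_second_rate = hourly_rate / 3600

runtime_seconds = 2 * 3600 + 15 * 60    # instance ran for 2 h 15 min
cost = per_second_rate * runtime_seconds
print(f"Estimated cost: {cost:.4f}")    # 2.7000
```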
Compute with Hivenet — Hivenet’s compute platform for running workloads on GPUs/CPUs.
Compute account (Hivenet) — A separate account used only for Compute with Hivenet. Has its own balance and billing, even if you already have a general Hivenet account.
concurrency (inference) — How many requests your server can handle at the same time.
context length — The maximum number of tokens a model accepts in one request (prompt + response window).
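For example, a minimal check that a request fits the context window, with illustrative token counts:

```python
# Sketch: verifying a request fits within a model's context length.
# Numbers are illustrative; real counts come from the model's tokenizer.
context_length = 8192     # maximum tokens the model accepts per request
prompt_tokens = 6000      # tokens in your prompt
max_new_tokens = 1024     # tokens you ask the model to generate

if prompt_tokens + max_new_tokens > context_length:
    raise ValueError("Request exceeds the model's context length")
```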
credits (prepaid) — Funds you add in advance and spend across Hivenet services.
custom template — A saved instance configuration (image, packages, ports, environment variables) that you can reuse.
delete account — Permanently removes your Compute account and associated data after a grace period. Not the same as terminating an instance.
distributed cloud — Hivenet’s model that uses underutilized devices (not central data centers) to provide storage and compute.
egress — Data that leaves Hivenet to the internet (downloads, API responses). May be subject to limits or fees depending on the feature.
endpoint (HTTPS) — The public URL where your running service (e.g., vLLM) is reachable over TLS.
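A sketch of calling such an endpoint, assuming a vLLM server with its OpenAI‑compatible API; the URL and model name are placeholders for your own instance:

```python
# Sketch: calling a vLLM server over its public HTTPS endpoint.
import requests

ENDPOINT = "https://your-instance.example.com"   # hypothetical endpoint URL

resp = requests.post(
    f"{ENDPOINT}/v1/chat/completions",           # vLLM's OpenAI-compatible route
    json={
        "model": "your-model-name",              # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```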
encryption at rest / in transit — Protecting data stored on devices (at rest) and moving across networks (in transit).
GPU instance — A virtual machine with a dedicated GPU for your workload.
GPU memory (VRAM) — Memory on the GPU used to hold model weights, activations, and KV cache.
HTTPS — Encrypted HTTP. All public endpoints on Compute are served over HTTPS by default.
Hivenet ID — The email you use to sign in. Also your owner identity for teams and access.
image (OS / container) — The base software environment an instance boots with (e.g., Ubuntu, a CUDA image, or a vLLM template).
inference (LLM) — Running a trained model to generate outputs. On Compute, you can run a vLLM server.
instance — A running virtual machine on Compute. Billed while running; can be stopped or terminated.
instance logs — Console/application logs you can view or export for debugging or support.
instance template — A saved recipe for new instances (image, ports, startup commands, etc.). See custom template.
KV cache — Key‑value cache used by LLM servers to speed up generation by reusing attention states.
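A back‑of‑the‑envelope estimate of KV cache size for a single sequence; the model dimensions below are illustrative, and real servers such as vLLM manage this memory in pages:

```python
# Rough KV-cache size estimate for a transformer LLM (a sketch).
num_layers = 32        # transformer layers
num_kv_heads = 8       # KV heads (fewer than query heads with GQA)
head_dim = 128         # dimension per head
seq_len = 4096         # cached tokens
bytes_per_value = 2    # FP16

# Factor of 2 for the separate key and value tensors.
kv_bytes = 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value
print(f"{kv_bytes / 1024**3:.2f} GiB per sequence")  # 0.50 GiB
```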
memory fraction (vLLM) — How much of the GPU VRAM the server is allowed to use.
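vLLM exposes this as the `gpu_memory_utilization` parameter; the model and fraction below are examples only:

```python
# Sketch: capping how much VRAM vLLM may use.
from vllm import LLM

llm = LLM(
    model="facebook/opt-125m",     # small example model
    gpu_memory_utilization=0.85,   # use at most ~85% of the GPU's VRAM
)
```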
model (LLM) — The neural network you run for inference, defined by name and weights.
network ports — You can expose TCP/UDP ports on an instance. Limits apply. See instance networking.
on-demand instances — Standard pricing. Start and stop whenever you need.
organization (team) — A shared workspace in Hivenet for multiple members, with separate billing and permissions.
prepaid credits — See credits. Add funds in advance to control spend and enable features like auto top‑up.
private key / public key (SSH) — Credentials you use to securely connect to a running instance.
quantization — Techniques to reduce model precision (e.g., INT8/FP8) to fit in VRAM and improve throughput.
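A rough sketch of why precision matters for VRAM, using a hypothetical 7B‑parameter model:

```python
# Back-of-the-envelope weight memory at different precisions (a sketch;
# runtime overhead, activations, and KV cache come on top of this).
params = 7e9  # a 7B-parameter model

for name, bytes_per_param in [("FP16", 2), ("INT8/FP8", 1), ("INT4", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:.1f} GiB of VRAM for weights")
# FP16: ~13.0 GiB, INT8/FP8: ~6.5 GiB, INT4: ~3.3 GiB
```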
RAM (system memory) — Memory on the instance’s CPU side. Separate from GPU VRAM.
rate limits (API) — Caps on how often you can call an endpoint within a window.
SSH — Secure Shell. Use it to access your instance’s terminal.
stop / start (instances) — Stop pauses your instance and billing for compute; start brings it back. Data on the instance disk stays unless you terminate.
storage (instance) — Local disk on the instance for your OS, packages, and temporary data. Not the same as Store.
temperature (LLM) — Controls randomness of generated text. Higher values produce more diverse outputs.
terminate (instance) — Permanently ends an instance and erases its local storage. Different from stop.
top‑k / top‑p (LLM) — Decoding settings that limit the token selection space during generation.
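A sketch showing how temperature, top‑k, and top‑p are set together, using vLLM's `SamplingParams`; the values and model are examples only:

```python
# Sketch: combining decoding settings in vLLM.
from vllm import LLM, SamplingParams

sampling = SamplingParams(
    temperature=0.7,   # below 1.0 sharpens the distribution; above flattens it
    top_k=50,          # sample only from the 50 most likely tokens
    top_p=0.9,         # ...further limited to 90% cumulative probability
    max_tokens=128,    # cap on generated tokens
)

llm = LLM(model="facebook/opt-125m")  # small example model
outputs = llm.generate(["Write a haiku about GPUs."], sampling)
print(outputs[0].outputs[0].text)
```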
UDP / TCP — Transport protocols. TCP is connection‑oriented and used by most APIs; UDP is connectionless and used by some services and tools.
usage (billing) — Your historical spend and activity for Store, Compute, and Send.
vCPU — Virtual CPU cores available to your instance.
vLLM — An open‑source LLM serving engine optimized for throughput and memory efficiency.
VRAM — See GPU memory.
web console — The browser UI for managing Hivenet services, billing, and settings.