Local Avatar
This tutorial applies to both deployment options:
- Managed LiveKit Server
- Self-hosted LiveKit Server
Prerequisites
- Credentials for a LiveKit server
- GPU server
- Python 3.12 or higher
Set Up Python Environment
note
This tutorial uses the uv command-line tool from astral.sh to create a Python virtual environment and install the required packages. Using uv is optional; any other method of creating a virtual environment and installing the packages works just as well.
uv venv --python 3.12
uv pip install "livekit-agents[silero,turn-detector]" livekit-plugins-noise-cancellation python-dotenv git+https://github.com/avaluma-ai/avaluma-livekit-plugin.git
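If you prefer not to use uv, the same environment can be set up with the standard venv module and pip. A minimal sketch, assuming a python3.12 interpreter is available on your PATH:
python3.12 -m venv .venv
source .venv/bin/activate
pip install "livekit-agents[silero,turn-detector]" livekit-plugins-noise-cancellation python-dotenv git+https://github.com/avaluma-ai/avaluma-livekit-plugin.git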
Assemble Agent
The following script shows how to assemble an agent with LiveKit and an Avaluma avatar. The avatar-specific lines are highlighted.
info
livekit.agents.inference is only available when using LiveKit Cloud.
For the self-hosted version, stt, llm, and tts have to be replaced with model plugins; a sketch of this swap follows the script below.
agent_with_local_avatar.py
from avaluma_livekit_plugin import LocalAvatarSession
from dotenv import load_dotenv
from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    JobProcess,
    RoomInputOptions,
    RoomOutputOptions,
    WorkerOptions,
    cli,
    inference,
)
from livekit.plugins import noise_cancellation, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

load_dotenv(".env.local")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful virtual AI assistant. The user is interacting with you via voice, even if you perceive the conversation as text.
            You eagerly assist users with their questions by providing information from your extensive knowledge.
            Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols.
            You are curious, friendly, and have a sense of humor.""",
        )


async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {
        "room": ctx.room.name,
    }

    session = AgentSession(
        stt=inference.STT(model="assemblyai/universal-streaming", language="en"),
        llm=inference.LLM(model="openai/gpt-4.1-mini"),
        tts=inference.TTS(
            model="cartesia/sonic-3", voice="9626c31c-bec5-4cca-baa8-f8ba9e84c8bc"
        ),
        turn_detection=MultilingualModel(),
        vad=ctx.proc.userdata["vad"],
        preemptive_generation=True,
    )

    avatar = LocalAvatarSession(
        license_key="",  # Your License Key
        avatar_id="AVATAR_ID",  # Avatar identifier (Name of .hvia file)
        assets_dir="assets/avatars",  # Path to the assets directory with .hvia files
    )

    # Start the avatar and wait for it to join
    await avatar.start(agent_session=session, room=ctx.room)

    await session.start(
        agent=Assistant(),
        room=ctx.room,
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.BVC(),
        ),
        room_output_options=RoomOutputOptions(audio_enabled=False),
    )

    await ctx.connect()


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint, prewarm_fnc=prewarm, job_memory_warn_mb=4096))
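For a self-hosted LiveKit server (see the info note above), the inference-based stt, llm, and tts have to be swapped for model plugins. The following is a minimal sketch of the replacement AgentSession inside entrypoint, assuming the deepgram, openai, and cartesia plugins are installed (for example via uv pip install "livekit-agents[deepgram,openai,cartesia]") and that DEEPGRAM_API_KEY, OPENAI_API_KEY, and CARTESIA_API_KEY are set in .env.local; any other STT/LLM/TTS plugins can be used the same way:

# Sketch: plugin-based replacement for the AgentSession above (self-hosted setup).
from livekit.plugins import cartesia, deepgram, openai

session = AgentSession(
    stt=deepgram.STT(model="nova-3", language="en"),  # STT plugin instead of inference.STT
    llm=openai.LLM(model="gpt-4.1-mini"),  # LLM plugin instead of inference.LLM
    tts=cartesia.TTS(),  # TTS plugin instead of inference.TTS
    turn_detection=MultilingualModel(),
    vad=ctx.proc.userdata["vad"],
    preemptive_generation=True,
)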
Test Agent with LiveKit Playground
Project:
project
├── .env.local
├── assets
│ └── *.hvia
└── src
    └── agent_with_local_avatar.py
- Copy your LiveKit credentials into .env.local (see the example below).
- Download the models for speaker detection: uv run python src/agent_with_local_avatar.py download-files
- Run the agent: uv run python src/agent_with_local_avatar.py start
- Open the LiveKit Agent Playground to test your agent.
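The .env.local file from the first step could look like this; the values below are placeholders for your own LiveKit credentials, and for a self-hosted server LIVEKIT_URL points to your own deployment instead of LiveKit Cloud:
# .env.local -- replace the placeholder values with your LiveKit credentials
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret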
Additional Resources
- Agent Starter Example from LiveKit