Hello World with LiteLLM
Last updated Oct 22, 2025
LiteLLM is a Python library for calling LLMs. It gives access to many providers, including OpenAI, Anthropic, and Google, and makes it easy to switch between them.
This recipe mirrors the Basic Python recipe, but swaps the OpenAI SDK for LiteLLM. The workflow still delegates LLM calls to an Activity, letting Temporal coordinate retries and durability, while LiteLLM forwards those calls to your configured provider.
Key points:
- A reusable Activity that wraps litellm.acompletion and keeps retries in Temporal.
- The most common LiteLLM parameters are on LiteLLMRequest, ensuring type checking and IDE completion. Others may be passed via the extra_options dictionary, which functions as kwargs for litellm.acompletion.
- The Activity returns the full LiteLLM response for processing by the workflow.
Create the Activity
activities/models.py
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Type, Union
@dataclass
class LiteLLMRequest:
    model: str
    messages: List[Dict[str, Any]]
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None
    timeout: Optional[Union[float, int]] = None
    response_format: Optional[Union[dict, Type[Any]]] = None
    extra_options: Dict[str, Any] = field(default_factory=dict)
    def to_acompletion_kwargs(self) -> Dict[str, Any]:
        kwargs = {
            "model": self.model,
            "messages": self.messages,
        }
        optional_values = {
            "temperature": self.temperature,
            "max_tokens": self.max_tokens,
            "timeout": self.timeout,
            "response_format": self.response_format,
        }
        for key, value in optional_values.items():
            if value is not None:
                kwargs[key] = value
        if self.extra_options:
            kwargs.update(self.extra_options)
        return kwargs
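To see how a request maps onto keyword arguments, here is a minimal sketch. It repeats a trimmed copy of LiteLLMRequest so that it runs on its own, and the top_p and stop values are illustrative choices rather than part of the recipe.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LiteLLMRequest:
    # Trimmed copy of the dataclass above, repeated so this snippet is standalone.
    model: str
    messages: List[Dict[str, Any]]
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None
    extra_options: Dict[str, Any] = field(default_factory=dict)

    def to_acompletion_kwargs(self) -> Dict[str, Any]:
        kwargs: Dict[str, Any] = {"model": self.model, "messages": self.messages}
        # Only forward typed fields that were actually set.
        for key in ("temperature", "max_tokens"):
            value = getattr(self, key)
            if value is not None:
                kwargs[key] = value
        # Extras are merged last, so they win over typed fields of the same name.
        kwargs.update(self.extra_options)
        return kwargs


request = LiteLLMRequest(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.2,
    # Illustrative extras, forwarded untouched to litellm.acompletion.
    extra_options={"top_p": 0.9, "stop": ["\n\n"]},
)
kwargs = request.to_acompletion_kwargs()
print(sorted(kwargs))
```

Fields left as None (max_tokens here) are omitted, so the provider's defaults apply. Because extra_options is merged last, a key placed there overrides the typed field of the same name.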
activities/litellm_completion.py
from typing import Any, Dict
import litellm
from temporalio import activity
from temporalio.exceptions import ApplicationError
from activities.models import LiteLLMRequest
@activity.defn(name="activities.litellm_completion.create")
async def create(request: LiteLLMRequest) -> Dict[str, Any]:
    kwargs = request.to_acompletion_kwargs()
    # Temporal owns retries, so turn off LiteLLM's built-in retry helper.
    kwargs["num_retries"] = 0
    try:
        response = await litellm.acompletion(**kwargs)
    except (
        litellm.AuthenticationError,
        litellm.BadRequestError,
        litellm.InvalidRequestError,
        litellm.UnsupportedParamsError,
        litellm.JSONSchemaValidationError,
        litellm.ContentPolicyViolationError,
        litellm.NotFoundError,
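    ) as exc:
        # Retrying cannot fix these errors, so fail the Activity without retries.
        raise ApplicationError(
            str(exc),
            type=exc.__class__.__name__,
            non_retryable=True,
        ) from exc
    except litellm.APIError:
        # Re-raise unchanged so Temporal's retry policy can retry the Activity.
        raise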
    return response
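The except clauses above amount to a classification by exception type: errors that a retry cannot fix become non-retryable, and everything else propagates for Temporal to retry. The sketch below illustrates that decision without importing LiteLLM or Temporal. The is_non_retryable helper and the stand-in exception classes are hypothetical and exist only for this illustration.

```python
# Names of the exception classes the Activity treats as permanent failures.
NON_RETRYABLE_NAMES = frozenset({
    "AuthenticationError",
    "BadRequestError",
    "InvalidRequestError",
    "UnsupportedParamsError",
    "JSONSchemaValidationError",
    "ContentPolicyViolationError",
    "NotFoundError",
})


def is_non_retryable(exc: BaseException) -> bool:
    """Return True if exc's class, or any of its base classes, is on the list."""
    return any(cls.__name__ in NON_RETRYABLE_NAMES for cls in type(exc).__mro__)


# Hypothetical stand-ins for provider errors, defined so the snippet runs alone.
class AuthenticationError(Exception):
    pass


class BadRequestError(Exception):
    pass


class PromptTooLongError(BadRequestError):
    pass


class RateLimitError(Exception):
    pass


for error in (AuthenticationError("bad key"), PromptTooLongError("too long"), RateLimitError("slow down")):
    decision = "fail permanently" if is_non_retryable(error) else "let Temporal retry"
    print(f"{type(error).__name__}: {decision}")
```

Walking the class hierarchy mirrors how an except clause also matches subclasses, which is why the stand-in PromptTooLongError is treated like its BadRequestError parent.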
LiteLLM supports many providers. Configure credentials via environment variables (for example OPENAI_API_KEY) before running the Activity. For Google-hosted models (Vertex AI or Gemini), the sample relies on the google-cloud-aiplatform and google-auth dependencies included in pyproject.toml; set the usual Google application credentials (GOOGLE_APPLICATION_CREDENTIALS, GOOGLE_CLOUD_PROJECT, VERTEXAI_LOCATION, etc.) so LiteLLM can obtain an access token.
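A missing credential otherwise only shows up when the first Activity runs, so a small pre-flight check before starting the worker can help. This is a minimal sketch and not part of the recipe: the missing_env_vars helper is hypothetical, and the list of variable names is an assumption you would adapt to your provider.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Example for an OpenAI-backed run; adjust the names for your provider.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing env explicitly, as in missing_env_vars(["A"], env={"A": "x"}), makes the helper easy to exercise without touching the real environment.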
Create the Workflow
workflows/hello_world_workflow.py
from datetime import timedelta
from temporalio import workflow
from activities.models import LiteLLMRequest
@workflow.defn
class HelloWorld:
    @workflow.run
    async def run(self, input: str) -> str:
        messages = [
            {"role": "system", "content": "You only respond in haikus."},
            {"role": "user", "content": input},
        ]
        response = await workflow.execute_activity(
            "activities.litellm_completion.create",
            LiteLLMRequest(
                # LiteLLM lets you keep the same code and swap models/providers.
                # model="gpt-4o-mini",
                model="gemini-2.5-flash-lite",
                messages=messages,
            ),
            start_to_close_timeout=timedelta(seconds=30),
        )
        message = response["choices"][0]["message"]["content"]
        if isinstance(message, list):
            message = "".join(
                part.get("text", "")
                for part in message
                if isinstance(part, dict)
            )
        return message
Temporal manages Activity retries, so LiteLLM's retry helper is disabled via num_retries=0. Use the extra_options escape hatch on LiteLLMRequest if you need to surface additional LiteLLM parameters without editing the sample.
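The last lines of the workflow's run method handle two shapes of message content: a plain string, or a list of parts that each carry a "text" field. The sketch below pulls that logic into a standalone function so its behavior is easy to check. The flatten_content name and the sample inputs are illustrative and not part of the recipe.

```python
from typing import Any


def flatten_content(content: Any) -> Any:
    """Join the "text" of dict parts when content is a list; otherwise return it unchanged."""
    if isinstance(content, list):
        return "".join(
            part.get("text", "") for part in content if isinstance(part, dict)
        )
    return content


# A plain string passes through untouched.
plain = flatten_content("Code calls itself")

# A list of parts is concatenated; entries that are not dicts are skipped.
parts = flatten_content(
    [
        {"type": "text", "text": "Code calls "},
        {"type": "text", "text": "itself"},
        "ignored non-dict part",
    ]
)
print(plain == parts)
```

Parts without a "text" key contribute an empty string, so non-text parts are dropped rather than raising an error.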
Create the Worker
worker.py
import asyncio
from temporalio.client import Client
from temporalio.contrib.pydantic import pydantic_data_converter
from temporalio.worker import Worker
from activities import litellm_completion
from workflows.hello_world_workflow import HelloWorld
async def main():
    client = await Client.connect(
        "localhost:7233",
        data_converter=pydantic_data_converter,
    )
    worker = Worker(
        client,
        task_queue="hello-world-python-task-queue",
        workflows=[
            HelloWorld,
        ],
        activities=[
            litellm_completion.create,
        ],
    )
    await worker.run()
if __name__ == "__main__":
    asyncio.run(main())
Create the Workflow Starter
start_workflow.py
import asyncio
from temporalio.client import Client
from temporalio.contrib.pydantic import pydantic_data_converter
from workflows.hello_world_workflow import HelloWorld
async def main():
    client = await Client.connect(
        "localhost:7233",
        data_converter=pydantic_data_converter,
    )
    result = await client.execute_workflow(
        HelloWorld.run,
        "Tell me about recursion in programming.",
        id="my-workflow-id",
        task_queue="hello-world-python-task-queue",
    )
    print(f"Result: {result}")
if __name__ == "__main__":
    asyncio.run(main())
Running
Start the Temporal Dev Server:
temporal server start-dev
Install dependencies:
uv sync
Set the appropriate environment variables before launching the worker (for example export OPENAI_API_KEY=... or export GEMINI_API_KEY=...) so LiteLLM can reach your chosen provider.
Run the worker:
uv run python -m worker
Start the workflow:
uv run python -m start_workflow