Korai Docs

Llama Usage with OpenRouter

The AI backend uses a large language model (LLM) for complex tasks such as identifying compelling moments in a video transcript. To keep the model choice flexible and the integration scalable, we call a Llama model (meta-llama/llama-4-scout) served through OpenRouter.

What is OpenRouter?

OpenRouter is a service that provides a unified API for accessing a wide variety of LLMs from different providers. Instead of integrating with each model provider individually, we point a single client at OpenRouter and select models such as Llama or GPT by name.

Initializing the OpenRouter Client

In the load_model method of the AiPodcastClipper class, we initialize the OpenRouter client using the API key stored in Modal secrets.

        print("Creating OpenRouter client...")
        self.openrouter_client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=os.environ["OPENROUTER_API_KEY"],
        )
        print("Created OpenRouter client...")

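OpenRouter exposes an OpenAI-compatible API, so the client here is the standard OpenAI class from the OpenAI Python library with base_url pointed at OpenRouter. This makes it easy to integrate into our existing code.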
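
As a minimal sketch of the initialization step, the helper below builds the constructor arguments and fails with a readable message when the key is missing, rather than raising a bare KeyError. The function name openrouter_client_kwargs and the error wording are hypothetical and not part of the backend; only the base URL and the OPENROUTER_API_KEY variable come from the code above.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_client_kwargs(env=None):
    """Build keyword arguments for OpenAI(...) that point it at OpenRouter.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict in tests.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        # Fail early with a hint instead of a bare KeyError.
        raise RuntimeError(
            "OPENROUTER_API_KEY is not set; check that the secret is attached."
        )
    return {"base_url": OPENROUTER_BASE_URL, "api_key": api_key}


# Possible usage inside load_model (requires the openai package):
#   from openai import OpenAI
#   self.openrouter_client = OpenAI(**openrouter_client_kwargs())
```

Keeping the environment lookup separate from the client construction makes the configuration testable without network access or the openai package installed.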

Identifying Viral Moments with Llama

The identify_moments method does the core work. It takes the video transcript and uses the Llama model (meta-llama/llama-4-scout) to find the most engaging and potentially viral segments.

    def identify_moments(self, transcript: dict, source_language: str, custom_prompt: Optional[str] = None):
        # ... (prompt construction)

        completion = self.openrouter_client.chat.completions.create(
            extra_headers={
                "HTTP-Referer": os.environ.get("OPENROUTER_REFERRER_URL", ""),
                "X-Title": os.environ.get("OPENROUTER_SITE_NAME", ""),
            },
            model="meta-llama/llama-4-scout",
            messages=[
                {
                    "role": "user",
                    "content": base_prompt,
                }
            ]
        )
        # ...

A detailed prompt is constructed to guide the Llama model. This prompt instructs the model to act as a viral clip finder, providing specific criteria for what makes a clip compelling (e.g., emotional hook, storytelling, educational value). The model is then asked to return a JSON array of the identified moments, including a title, summary, and a "virality score" for each.
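
The reply still has to be turned into structured data. The sketch below shows one way to do that, assuming the model returns a JSON array whose objects use the keys title, summary, and virality_score. Those key names and the parse_moments helper are assumptions for illustration; the actual prompt may request different field names.

```python
import json
import re


def parse_moments(raw_reply):
    """Extract a list of moment dicts from the model's text reply.

    Hypothetical helper; the key names below are assumed, not confirmed.
    """
    # Models sometimes wrap JSON in prose, so take the span from the first
    # "[" to the last "]" instead of parsing the whole reply.
    match = re.search(r"\[.*\]", raw_reply, flags=re.DOTALL)
    if match is None:
        return []
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []

    moments = []
    for item in data:
        # Skip entries that are not objects or lack a required key.
        if not isinstance(item, dict):
            continue
        if not all(k in item for k in ("title", "summary", "virality_score")):
            continue
        try:
            score = float(item["virality_score"])
        except (TypeError, ValueError):
            continue
        moments.append(
            {
                "title": str(item["title"]),
                "summary": str(item["summary"]),
                "virality_score": score,
            }
        )
    # Highest-scoring moments first.
    moments.sort(key=lambda m: m["virality_score"], reverse=True)
    return moments
```

With the OpenAI client, the reply text would typically come from completion.choices[0].message.content before being passed to a parser like this one.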

Benefits of Using Llama with OpenRouter

Large Context Length for Long Videos: Llama models such as Llama 4 Scout offer a large context window, though the exact limit depends on the model and on the provider serving it. This matters for our use case because we process long video transcripts. With a larger context, the model can read the entire conversation in one request, which helps it identify compelling moments and produce more coherent summaries.
Strong Multilingual Capabilities: Llama models are trained on multilingual data. This is important for our translation features, since it helps the model translate video content between many languages in a way that reads naturally and fits the context.
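
Even with a large context window, it can be useful to check whether a transcript fits before sending it. The following is a rough sketch, not the backend's actual logic: it assumes the transcript is available as a list of text segments and uses a crude estimate of about four characters per token. Both function names, the segment format, and the ratio are assumptions; a real count depends on the model's tokenizer.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; real counts depend on the tokenizer."""
    return max(1, len(text) // chars_per_token)


def split_by_token_budget(segments, max_tokens):
    """Group consecutive segments into chunks of at most max_tokens each.

    A single segment larger than the budget still becomes its own chunk,
    so callers should treat max_tokens as a soft limit.
    """
    chunks, current, used = [], [], 0
    for segment in segments:
        cost = estimate_tokens(segment)
        # Start a new chunk when this segment would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(segment)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

If the whole transcript fits within the budget, the function returns a single chunk, which corresponds to the single-request case described above.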

Combining the Llama model with the flexibility of OpenRouter lets our AI backend analyze video content and automatically identify and extract the most valuable moments.