llm.ts

May 4, 2023

There are over 100 different LLMs, with more shipping every day. They differ in their architectures and the data they were trained on, but all of them do text completion. It's the APIs that are fragmented: OpenAI uses a "completions" endpoint with parameters like "top_p" and "stop"; Cohere uses a "generate" endpoint with "p" and "stop_sequences"; HuggingFace uses "max_new_tokens" instead of "max_tokens."
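To make the fragmentation concrete, here's a sketch (not llm.ts internals, just an illustration) of how one logical request has to be reshaped for each provider's parameter names:

```typescript
// The same logical request, spelled three different ways.
// Endpoint parameter names below reflect the providers' APIs circa 2023.
type UnifiedParams = {
    prompt: string;
    maxTokens: number;
    topP: number;
    stop: string[];
};

function toOpenAI(p: UnifiedParams) {
    // OpenAI "completions": max_tokens, top_p, stop
    return { prompt: p.prompt, max_tokens: p.maxTokens, top_p: p.topP, stop: p.stop };
}

function toCohere(p: UnifiedParams) {
    // Cohere "generate": max_tokens, p, stop_sequences
    return { prompt: p.prompt, max_tokens: p.maxTokens, p: p.topP, stop_sequences: p.stop };
}

function toHuggingFace(p: UnifiedParams) {
    // HuggingFace inference API: inputs, plus max_new_tokens under "parameters"
    return { inputs: p.prompt, parameters: { max_new_tokens: p.maxTokens, top_p: p.topP } };
}
```

Three bodies, one meaning. A unified client only needs to do this translation once.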

How do you test prompts across different models? How do you call different models without managing 10 different client libraries? Why deal with APIs that are slightly different but logically the same?

I just published llm.ts, an open-source, MIT-licensed TypeScript library that lets you call over 30 different LLMs from a single API. Send multiple prompts to multiple LLMs and get the results back in a single response. It has zero dependencies and clocks in at under 10kB minified. Bring your own API keys and call the models directly over HTTPS. It supports most LLM parameters (presence_penalty, stop_sequences, top_p, top_k, temperature, max_tokens, frequency_penalty).
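A request that tunes sampling might look like the following. This is a sketch: I'm assuming the option spellings mirror the parameter list above, so treat the exact names as illustrative rather than authoritative.

```typescript
// Hypothetical completion options (names assumed from the parameter list above)
const options = {
    prompt: ['Write a haiku about the sea.'],
    model: ['text-ada-001'],
    temperature: 0.7,
    max_tokens: 64,
    top_p: 0.9,
    stop_sequences: ['\n\n'],
};
```

Whatever the provider calls these knobs, you set them once and the library translates.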

If you’re interested in adding new models or hosting providers, don’t hesitate to send a pull request. See the list of supported models and installation instructions on GitHub.

How does it work? Here’s an example that multiplexes two prompts across three different models.

import { LLM, MODEL } from 'llm.ts';

(async function () {
    await new LLM({
        apiKeys: {
            openAI: process.env.OPENAI_API_KEY ?? '',
            cohere: process.env.COHERE_API_KEY ?? '',
            huggingface: process.env.HF_API_TOKEN ?? '',
        }
    }).completion({
        prompt: [
            'Repeat the following sentence: "I am a robot."',
            'Repeat the following sentence: "I am a human."',
        ],
        model: [
            // use the model name
            'text-ada-001',

            // or specify a specific provider
            'cohere/command-nightly',

            // or use enums to avoid typos
            MODEL.HF_GPT2,
        ],
    }).then(resp => {
        console.log(resp);
    });
})();

The results are returned in an OpenAI-compatible JSON response.

{
  "created": 1683079463217,
  "choices": [
    {
      "text": "\n\nI am a robot.",
      "index": 0,
      "model": "text-ada-001",
      "promptIndex": 0,
      "created": 1683079462
    },
    {
      "text": "\n\nI am a human.",
      "index": 1,
      "model": "text-ada-001",
      "promptIndex": 1,
      "created": 1683079462
    },
    {
      "text": "\nI am a robot.",
      "index": 2,
      "model": "command-nightly",
      "promptIndex": 0,
      "created": 1683079463217
    },
    {
      "text": "\nI am a human.",
      "index": 3,
      "model": "command-nightly",
      "promptIndex": 1,
      "created": 1683079463216
    },
    {
      "text": " \"Is that your question? I was expecting the answer.\" \"Then why do you think you are being asked!\" 1. \"What are you?\" \"What are you?\" \"Why are you",
      "index": 4,
      "model": "gpt2",
      "promptIndex": 0,
      "created": 1683079463088
    },
    {
      "text": " — this quote is most often cited in reference to the Qur'an. (e.g. Ibn `Allaahu `udayyyih, Al-Rai`an, Al",
      "index": 5,
      "model": "gpt2",
      "promptIndex": 1,
      "created": 1683079463091
    }
  ]
}
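Since every choice carries both a model and a promptIndex, regrouping the flat choices array by prompt is a one-liner loop. A small sketch, where the Choice type below mirrors the fields in the response above:

```typescript
// Mirrors one entry of the "choices" array in the response above.
type Choice = {
    text: string;
    index: number;
    model: string;
    promptIndex: number;
    created: number;
};

// Bucket choices by the prompt they answer, so each prompt's
// completions from every model end up side by side.
function groupByPrompt(choices: Choice[]): Map<number, Choice[]> {
    const groups = new Map<number, Choice[]>();
    for (const c of choices) {
        const bucket = groups.get(c.promptIndex) ?? [];
        bucket.push(c);
        groups.set(c.promptIndex, bucket);
    }
    return groups;
}
```

Grouping by model instead is the same loop keyed on c.model, which is handy when diffing one prompt's output across providers.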