Caching
LangChain provides an optional caching layer for LLMs. This is useful for two reasons:
- It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.
- It can speed up your application by avoiding those same round trips to the LLM provider.
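Both benefits follow from the same mechanism: identical requests are answered from a local store instead of hitting the provider. The sketch below is a toy illustration of the idea, not LangChain's actual implementation; `fakeCompletion` is a hypothetical stand-in for a billable LLM call.

```typescript
// Minimal sketch of a prompt-keyed cache. Each `fakeCompletion` call
// represents one billable, slow API request to the provider.
let apiCalls = 0;

function fakeCompletion(prompt: string): string {
  apiCalls += 1; // in real life: one API call's worth of cost and latency
  return `response to: ${prompt}`;
}

const memo = new Map<string, string>();

function cachedCompletion(prompt: string): string {
  const hit = memo.get(prompt);
  if (hit !== undefined) return hit; // cache hit: no API call, no latency
  const result = fakeCompletion(prompt);
  memo.set(prompt, result);
  return result;
}

const first = cachedCompletion("Tell me a joke");
const second = cachedCompletion("Tell me a joke");
console.log(first === second, apiCalls); // true 1
```

The second request never reaches the provider, which is exactly the behavior the timing comparison below demonstrates with a real model.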
import { OpenAI } from "langchain/llms/openai";
// To make the caching really obvious, let's use a slower model.
const model = new OpenAI({
modelName: "text-davinci-002",
cache: true,
n: 2,
bestOf: 2,
});
In Memory Cache
The default cache is stored in-memory. This means that if you restart your application, the cache will be cleared.
// The first time, it is not yet in cache, so it should take longer
const res = await model.predict("Tell me a joke");
console.log(res);
/*
CPU times: user 35.9 ms, sys: 28.6 ms, total: 64.6 ms
Wall time: 4.83 s
"\n\nWhy did the chicken cross the road?\n\nTo get to the other side."
*/
// The second time it is, so it goes faster
const res2 = await model.predict("Tell me a joke");
console.log(res2);
/*
CPU times: user 238 µs, sys: 143 µs, total: 381 µs
Wall time: 1.76 ms
"\n\nWhy did the chicken cross the road?\n\nTo get to the other side."
*/
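A detail worth knowing: LangChain keys cache entries by both the prompt and the model's configuration, so changing a parameter such as the model name or temperature results in a fresh completion rather than a stale hit. The snippet below sketches that idea with a hypothetical `buildCacheKey` helper; it is illustrative, not LangChain's exact key format.

```typescript
// Hypothetical sketch: the cache key is derived from the prompt plus a
// stable serialization of the model parameters, so different settings
// never collide on the same cached completion.
interface ModelParams {
  modelName: string;
  temperature: number;
}

function buildCacheKey(prompt: string, params: ModelParams): string {
  return JSON.stringify({ prompt, ...params });
}

const a = buildCacheKey("Tell me a joke", {
  modelName: "text-davinci-002",
  temperature: 0,
});
const b = buildCacheKey("Tell me a joke", {
  modelName: "text-davinci-002",
  temperature: 0.7,
});
console.log(a !== b); // different params → different cache entries
```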
Caching with Momento
LangChain also provides a Momento-based cache. Momento is a distributed, serverless cache that requires zero setup or infrastructure maintenance. Momento works in Node.js, browser, and edge environments, so be sure to install the package that matches your environment.
To install for Node.js:
- npm
- Yarn
- pnpm
npm install @gomomento/sdk
yarn add @gomomento/sdk
pnpm add @gomomento/sdk
To install for browser/edge workers:
- npm
- Yarn
- pnpm
npm install @gomomento/sdk-web
yarn add @gomomento/sdk-web
pnpm add @gomomento/sdk-web
Next you'll need to sign up and create an API key. Once you've done that, pass a cache option when you instantiate the LLM like this:
import { OpenAI } from "langchain/llms/openai";
import { MomentoCache } from "langchain/cache/momento";
import {
CacheClient,
Configurations,
CredentialProvider,
} from "@gomomento/sdk"; // `from "@gomomento/sdk-web";` for browser/edge
// See https://github.com/momentohq/client-sdk-javascript for connection options
const client = new CacheClient({
configuration: Configurations.Laptop.v1(),
credentialProvider: CredentialProvider.fromEnvironmentVariable({
environmentVariableName: "MOMENTO_API_KEY",
}),
defaultTtlSeconds: 60 * 60 * 24,
});
const cache = await MomentoCache.fromProps({
client,
cacheName: "langchain",
});
const model = new OpenAI({ cache });
API Reference:
- OpenAI from langchain/llms/openai
- MomentoCache from langchain/cache/momento
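The `defaultTtlSeconds: 60 * 60 * 24` setting above means each cached completion expires a day after it is written, after which the next identical request goes back to the provider. The toy class below illustrates that expiry behavior with an injected clock; it is a sketch of the concept, not the Momento SDK's API.

```typescript
// Sketch of TTL-based expiry like `defaultTtlSeconds` above. A fake,
// injectable clock stands in for wall time so expiry is deterministic.
class TtlCache {
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(private ttlSeconds: number, private now: () => number) {}

  set(key: string, value: string): void {
    this.store.set(key, {
      value,
      expiresAt: this.now() + this.ttlSeconds * 1000,
    });
  }

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // expired entries behave like cache misses
      return undefined;
    }
    return entry.value;
  }
}

let clock = 0;
const ttlCache = new TtlCache(60 * 60 * 24, () => clock); // one-day TTL
ttlCache.set("Tell me a joke", "Why did the chicken cross the road?");
const fresh = ttlCache.get("Tell me a joke");
clock += 60 * 60 * 24 * 1000 + 1; // advance the clock past the TTL
const expired = ttlCache.get("Tell me a joke");
console.log(fresh !== undefined, expired === undefined); // true true
```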
Caching with Redis
LangChain also provides a Redis-based cache. This is useful if you want to share the cache across multiple processes or servers. To use it, you'll need to install the ioredis package:
- npm
- Yarn
- pnpm
npm install ioredis
yarn add ioredis
pnpm add ioredis
Then, you can pass a cache option when you instantiate the LLM. For example:
import { OpenAI } from "langchain/llms/openai";
import { RedisCache } from "langchain/cache/ioredis";
import { Redis } from "ioredis";
// See https://github.com/redis/ioredis for connection options
const client = new Redis({});
const cache = new RedisCache(client);
const model = new OpenAI({ cache });
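The reason a remote store like Redis can be shared across processes is that cached completions are serialized to strings before storage, so any process that derives the same key reads back the same result. The round-trip below sketches that idea; the `Generation` shape here is illustrative, not LangChain's exact serialization format.

```typescript
// Toy sketch: values are stringified on write and parsed on read, so a
// plain string store (here a Map standing in for Redis) can hold them.
interface Generation {
  text: string;
}

const remoteStore = new Map<string, string>(); // stand-in for Redis

function cacheSet(key: string, generations: Generation[]): void {
  remoteStore.set(key, JSON.stringify(generations));
}

function cacheGet(key: string): Generation[] | undefined {
  const raw = remoteStore.get(key);
  return raw === undefined ? undefined : (JSON.parse(raw) as Generation[]);
}

cacheSet("Tell me a joke", [{ text: "Why did the chicken cross the road?" }]);
const restored = cacheGet("Tell me a joke");
console.log(restored?.[0].text); // the completion survives the round trip
```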
Caching with Upstash Redis
LangChain provides an Upstash Redis-based cache. Like the Redis-based cache, this cache is useful if you want to share the cache across multiple processes or servers. The Upstash Redis client uses HTTP and supports edge environments. To use it, you'll need to install the @upstash/redis package:
- npm
- Yarn
- pnpm
npm install @upstash/redis
yarn add @upstash/redis
pnpm add @upstash/redis
You'll also need an Upstash account and a Redis database to connect to. Once you've done that, retrieve your REST URL and REST token.
Then, you can pass a cache option when you instantiate the LLM. For example:
import { OpenAI } from "langchain/llms/openai";
import { UpstashRedisCache } from "langchain/cache/upstash_redis";
// See https://docs.upstash.com/redis/howto/connectwithupstashredis#quick-start for connection options
const cache = new UpstashRedisCache({
config: {
url: "UPSTASH_REDIS_REST_URL",
token: "UPSTASH_REDIS_REST_TOKEN",
},
});
const model = new OpenAI({ cache });
API Reference:
- OpenAI from langchain/llms/openai
- UpstashRedisCache from langchain/cache/upstash_redis
You can also directly pass in a previously created @upstash/redis client instance:
import { Redis } from "@upstash/redis";
import https from "https";
import { OpenAI } from "langchain/llms/openai";
import { UpstashRedisCache } from "langchain/cache/upstash_redis";
// const client = new Redis({
// url: process.env.UPSTASH_REDIS_REST_URL!,
// token: process.env.UPSTASH_REDIS_REST_TOKEN!,
// agent: new https.Agent({ keepAlive: true }),
// });
// Or simply call Redis.fromEnv() to automatically load the UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables.
const client = Redis.fromEnv({
agent: new https.Agent({ keepAlive: true }),
});
const cache = new UpstashRedisCache({ client });
const model = new OpenAI({ cache });
API Reference:
- OpenAI from langchain/llms/openai
- UpstashRedisCache from langchain/cache/upstash_redis
Caching with Cloudflare KV
This integration is only supported in Cloudflare Workers.
If you're deploying your project as a Cloudflare Worker, you can use LangChain's Cloudflare KV-powered LLM cache.
For information on how to set up KV in Cloudflare, see the official documentation.
Note: If you are using TypeScript, you may need to install types if they aren't already present:
- npm
- Yarn
- pnpm
npm install -S @cloudflare/workers-types
yarn add @cloudflare/workers-types
pnpm add @cloudflare/workers-types
import type { KVNamespace } from "@cloudflare/workers-types";
import { OpenAI } from "langchain/llms/openai";
import { CloudflareKVCache } from "langchain/cache/cloudflare_kv";
export interface Env {
KV_NAMESPACE: KVNamespace;
OPENAI_API_KEY: string;
}
export default {
async fetch(_request: Request, env: Env) {
try {
const cache = new CloudflareKVCache(env.KV_NAMESPACE);
const model = new OpenAI({
cache,
modelName: "gpt-3.5-turbo-instruct",
openAIApiKey: env.OPENAI_API_KEY,
});
const response = await model.invoke("How are you today?");
return new Response(JSON.stringify(response), {
headers: { "content-type": "application/json" },
});
} catch (err: any) {
console.log(err.message);
return new Response(err.message, { status: 500 });
}
},
};
API Reference:
- OpenAI from langchain/llms/openai
- CloudflareKVCache from langchain/cache/cloudflare_kv
Caching on the File System
This cache is not recommended for production use. It is only intended for local development.
LangChain provides a simple file system cache. By default the cache is stored in a temporary directory, but you can specify a custom directory if you want.
import { LocalFileCache } from "langchain/cache/file_system";
const cache = await LocalFileCache.create();
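Conceptually, a file system cache just maps each cache key to a file on disk, which is why entries survive restarts but the cache shouldn't be trusted for production workloads (no eviction, no cross-machine sharing). The sketch below illustrates that mechanism with Node's standard library; the hashed-filename scheme is an assumption for illustration, not LocalFileCache's actual on-disk layout.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { createHash } from "node:crypto";

// Toy sketch of a file-backed cache: each entry becomes a JSON file named
// by a hash of its key, inside a freshly created temporary directory.
const dir = mkdtempSync(join(tmpdir(), "llm-cache-"));

function pathForKey(key: string): string {
  return join(dir, createHash("sha256").update(key).digest("hex") + ".json");
}

function fileCacheSet(key: string, value: string): void {
  writeFileSync(pathForKey(key), JSON.stringify(value), "utf8");
}

function fileCacheGet(key: string): string | undefined {
  const path = pathForKey(key);
  if (!existsSync(path)) return undefined; // no file → cache miss
  return JSON.parse(readFileSync(path, "utf8")) as string;
}

fileCacheSet("Tell me a joke", "Why did the chicken cross the road?");
const cachedValue = fileCacheGet("Tell me a joke");
console.log(cachedValue); // read back from disk, not recomputed
```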