Creating a Personal AI Assistant with LangChain, Pinecone, and Next.js

by Didin J. on Jul 15, 2025

Build a personal AI assistant using LangChain, Pinecone, and Next.js with OpenAI integration, vector search, and RAG for intelligent Q&A and memory.

The rise of large language models (LLMs) like OpenAI’s GPT-4 has made it possible to build smart, conversational AI assistants tailored to individual needs. In this tutorial, we’ll walk you through creating a personal AI assistant using three powerful tools:

  • LangChain – a framework for chaining LLMs with tools like vector stores and memory.

  • Pinecone – a managed vector database for semantic search and retrieval.

  • Next.js – a modern full-stack React framework with powerful routing and API capabilities.

We’ll also integrate OpenAI to power the assistant's brain and use Retrieval-Augmented Generation (RAG) to give your assistant memory, allowing it to answer questions from your data or documents.

By the end of this guide, you’ll have a working AI assistant that can:

  • Ingest and store documents as vector embeddings,

  • Answer natural language questions based on that custom knowledge,

  • Stream intelligent responses via a modern Next.js 14 frontend.

Let’s get started!


Project Setup

1. Create a New Next.js 14 Project

We'll use the App Router (introduced in Next.js 13+) and TypeScript for scalability.

npx create-next-app@latest ai-assistant --typescript --app
cd ai-assistant

Accept the defaults for the remaining prompts (ESLint, Tailwind CSS, src directory, and import alias). Tailwind CSS is used for the chat UI later in this tutorial.

2. Install Dependencies

We need packages for LangChain, Pinecone, OpenAI, and vector store integration.

npm install langchain @langchain/openai @langchain/community openai @pinecone-database/pinecone dotenv @langchain/pinecone @langchain/core --legacy-peer-deps

Also, install types and other useful packages:

npm install -D @types/node

3. Project Structure

Let’s follow a clean structure:

ai-assistant/
├── app/
│   └── chat/                  # Chat UI and logic
├── lib/
│   ├── langchain/             # LangChain config and chains
│   └── pinecone/              # Pinecone client setup
├── pages/api/ingest.ts        # API route to upload documents
├── env.local                  # Environment variables
└── ...

4. Configure Environment Variables

Create a .env.local file at the root:

OPENAI_API_KEY=your_openai_api_key
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_INDEX_NAME=your_index_name

🔐 Keep this file private and avoid committing it to version control.
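If you want the app to fail fast when one of these keys is missing, you can add a small validation helper and call it from your server-side code before using the clients. This is just a minimal sketch; the lib/env.ts path and assertEnv name are only examples, not part of any library's API:

// lib/env.ts (example helper, not required by any library)
const REQUIRED_ENV_VARS = [
  'OPENAI_API_KEY',
  'PINECONE_API_KEY',
  'PINECONE_INDEX_NAME',
] as const;

export function assertEnv(): void {
  for (const key of REQUIRED_ENV_VARS) {
    if (!process.env[key]) {
      throw new Error(`Missing required environment variable: ${key}`);
    }
  }
}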


Set Up Pinecone Vector Database

To enable semantic search over custom documents, we’ll use Pinecone as our vector store. LangChain will store and retrieve document embeddings from Pinecone.

1. Create a Pinecone Account and Index

  1. Go to https://www.pinecone.io/ and sign up or log in.

  2. Navigate to the Console and click Create Index.

  3. Use the following settings:

    • Name: ai-assistant

    • Dimension: 1536 (for OpenAI embeddings like text-embedding-3-small)

    • Metric: cosine

    • Pods: 1 (starter tier is fine for testing)

📌 Note: The dimension depends on the embedding model. You can check OpenAI's docs for model-specific dimensions.
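If you prefer to script the setup instead of clicking through the console, the Pinecone Node.js SDK can also create the index for you. The sketch below assumes a recent SDK version and a serverless index on AWS us-east-1; the script path is just an example, and you should adjust the spec (serverless vs. pods, cloud, region) to match your Pinecone plan:

// scripts/create-index.ts (example one-off script)
import { config } from 'dotenv';
import { Pinecone } from '@pinecone-database/pinecone';

// Load the same variables Next.js reads from .env.local
config({ path: '.env.local' });

async function main() {
  const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

  // Create the index once; skip this if you already created it in the console
  await pinecone.createIndex({
    name: process.env.PINECONE_INDEX_NAME ?? 'ai-assistant',
    dimension: 1536, // must match the embedding model's output dimension
    metric: 'cosine',
    spec: {
      serverless: { cloud: 'aws', region: 'us-east-1' },
    },
  });
}

main().catch(console.error);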

2. Install and Configure the Pinecone Client

Pinecone provides an official Node.js SDK:

npm install @pinecone-database/pinecone

Create a new file lib/pinecone/client.ts:

import { Pinecone } from '@pinecone-database/pinecone';

export const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY!
});

Create a helper in lib/pinecone/index.ts to retrieve the index:

import { pinecone } from './client';

export const getPineconeIndex = async () => {
  return pinecone.Index(process.env.PINECONE_INDEX_NAME!);
};
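To confirm everything is wired up, you can run a quick sanity check against the index. This is only a sketch (the script path is hypothetical); it uses the SDK's describeIndexStats() call and a dynamic import so the environment variables are loaded before the Pinecone client is constructed:

// scripts/check-index.ts (example one-off script)
import { config } from 'dotenv';

async function main() {
  // Load the same variables Next.js reads from .env.local
  config({ path: '.env.local' });

  // Import after the env vars are loaded so the Pinecone client sees them
  const { getPineconeIndex } = await import('../lib/pinecone');

  const index = await getPineconeIndex();

  // Prints dimension, namespaces, and record counts for the index
  const stats = await index.describeIndexStats();
  console.log(stats);
}

main().catch(console.error);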


Embedding Documents with LangChain

Now, let’s parse and embed documents, store the vectors in Pinecone, and prepare for retrieval.

1. Create a LangChain Embed and Vector Store Setup

First, configure the OpenAI embedding model and Pinecone integration in a new file, lib/langchain/embeddings.ts:

import { OpenAIEmbeddings } from '@langchain/openai';
import { getPineconeIndex } from '@/lib/pinecone/index';
import { PineconeStore } from '@langchain/pinecone';

export const embedDocuments = async (texts: string[], namespace = 'default') => {
    const index = await getPineconeIndex();

    const vectorStore = await PineconeStore.fromTexts(
        texts,
        texts.map(() => ({})), // metadata (optional)
        new OpenAIEmbeddings(),
        {
            pineconeIndex: index,
            namespace,
        }
    );

    return vectorStore;
};

2. Load and Ingest Documents (TXT/MD/PDF)

You can use LangChain's community loaders. Let’s demonstrate with simple .txt files.

// pages/api/ingest.ts
import { NextApiRequest, NextApiResponse } from 'next';
import fs from 'fs/promises';
import path from 'path';
import { embedDocuments } from '@/lib/langchain/embeddings';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  try {
    const filePath = path.join(process.cwd(), 'docs', 'example.txt');
    const content = await fs.readFile(filePath, 'utf-8');

    await embedDocuments([content]);

    res.status(200).json({ message: 'Document embedded successfully.' });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: 'Embedding failed.' });
  }
}
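Embedding a whole file as a single vector works for short notes, but longer documents retrieve better when split into smaller chunks so each vector covers one focused passage. Below is a minimal sketch using LangChain's RecursiveCharacterTextSplitter; the chunk sizes are only example values, and depending on your LangChain version the splitter may need to be imported from @langchain/textsplitters instead:

import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { embedDocuments } from '@/lib/langchain/embeddings';

export const embedLongDocument = async (content: string, namespace = 'default') => {
  // Split into overlapping chunks so retrieval returns focused passages
  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,   // max characters per chunk
    chunkOverlap: 200, // overlap preserves context across chunk boundaries
  });

  const chunks = await splitter.splitText(content);

  // Reuse the existing helper to embed and store every chunk
  return embedDocuments(chunks, namespace);
};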


Implement Retrieval-Augmented Generation (RAG)

RAG allows your AI assistant to “remember” and reason over your data by retrieving relevant information from Pinecone and injecting it into the LLM’s prompt context.

1. Build the RAG Chain with LangChain

We’ll create a reusable LangChain chain that:

  • Accepts a user question,

  • Retrieves relevant chunks from Pinecone,

  • Crafts a smart prompt,

  • Gets an answer from OpenAI. 

Create a new file lib/langchain/rag.ts:

import { ChatOpenAI } from '@langchain/openai';
import { PromptTemplate } from '@langchain/core/prompts';
import { RunnableMap, RunnableSequence } from '@langchain/core/runnables';
import { getPineconeIndex } from '@/lib/pinecone/index';
import { PineconeStore } from '@langchain/pinecone';
import { OpenAIEmbeddings } from '@langchain/openai';

export const createRAGChain = async () => {
    const index = await getPineconeIndex();

    const vectorStore = await PineconeStore.fromExistingIndex(
        new OpenAIEmbeddings(),
        {
            pineconeIndex: index,
            namespace: 'default',
        }
    );

    const retriever = vectorStore.asRetriever({ k: 4 });

    const prompt = PromptTemplate.fromTemplate(`
You are a helpful assistant. Use the following context to answer the question.
If you don't know the answer, just say "I don't know".

Context:
{context}

Question:
{question}
  `);

    const llm = new ChatOpenAI({
        modelName: 'gpt-4',
        temperature: 0.5,
    });

    const chain = RunnableSequence.from([
        RunnableMap.from({
            context: async (input: { question: string }) => {
                const docs = await retriever.invoke(input.question);
                return docs.map((doc) => doc.pageContent).join('\n\n');
            },
            question: (input: { question: string }) => input.question,
        }),
        prompt,
        llm,
    ]);

    return chain;
};

2. Create a Query API Route in Next.js

Now let’s expose an API endpoint that receives a user question and returns a response using our chain.

// pages/api/query.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import { createRAGChain } from '@/lib/langchain/rag';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
    if (req.method !== 'POST') {
        return res.status(405).json({ error: 'Method not allowed' });
    }

    try {
        const { question } = req.body;

        if (!question) {
            return res.status(400).json({ error: 'Question is required' });
        }

        const chain = await createRAGChain();

        const response = await chain.invoke({ question });

        res.status(200).json({ answer: response.content });
    } catch (err) {
        console.error('RAG error:', err);
        res.status(500).json({ error: 'Failed to retrieve answer' });
    }
}

3. Example Request from Frontend (Client Side)

const askQuestion = async (question: string) => {
  const res = await fetch('/api/query', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ question }),
  });

  const data = await res.json();
  return data.answer;
};

At this point, your backend can:

  • Accept a question from the frontend,

  • Retrieve matching documents from Pinecone,

  • Use OpenAI via LangChain to answer contextually.


Frontend Chat UI with Next.js 14 (App Router)

Folder Structure (Recap)

We’ll put the chat interface in a dedicated route:

app/
├── chat/
│   ├── page.tsx       <-- Main chat UI
│   └── components/
│       └── ChatBox.tsx

1. app/chat/page.tsx – Chat Page Entry

import ChatBox from './components/ChatBox';

export default function ChatPage() {
  return (
    <main className="flex min-h-screen flex-col items-center justify-center p-6 bg-gray-100">
      <div className="w-full max-w-2xl">
        <h1 className="text-3xl font-bold mb-6 text-center">💬 Personal AI Assistant</h1>
        <ChatBox />
      </div>
    </main>
  );
}

2. app/chat/components/ChatBox.tsx – Chat Component

'use client';

import { useState } from 'react';

type Message = {
  sender: 'user' | 'assistant';
  content: string;
};

export default function ChatBox() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [input, setInput] = useState('');
  const [loading, setLoading] = useState(false);

  const sendMessage = async () => {
    if (!input.trim()) return;

    const userMessage: Message = { sender: 'user', content: input };
    setMessages((prev) => [...prev, userMessage]);
    setInput('');
    setLoading(true);

    try {
      const res = await fetch('/api/query', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ question: input }),
      });

      const data = await res.json();

      const aiMessage: Message = {
        sender: 'assistant',
        content: data.answer ?? 'Sorry, I couldn’t find an answer.',
      };

      setMessages((prev) => [...prev, aiMessage]);
    } catch (err) {
      console.error(err);
    } finally {
      setLoading(false);
    }
  };

  const handleKeyDown = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault();
      sendMessage();
    }
  };

  return (
    <div className="bg-white p-4 rounded-xl shadow-md">
      <div className="h-[400px] overflow-y-auto space-y-3 mb-4 px-2">
        {messages.map((msg, i) => (
          <div
            key={i}
            className={`p-3 rounded-lg max-w-[80%] ${
              msg.sender === 'user'
                ? 'bg-blue-100 self-end ml-auto text-right'
                : 'bg-gray-200 self-start'
            }`}
          >
            {msg.content}
          </div>
        ))}
        {loading && (
          <div className="italic text-gray-500">AI is thinking...</div>
        )}
      </div>
      <textarea
        className="w-full border rounded-md p-2 resize-none"
        rows={3}
        placeholder="Ask me anything..."
        value={input}
        onChange={(e) => setInput(e.target.value)}
        onKeyDown={handleKeyDown}
      />
      <button
        onClick={sendMessage}
        disabled={loading}
        className="mt-2 w-full bg-blue-600 hover:bg-blue-700 text-white font-semibold py-2 px-4 rounded disabled:opacity-50"
      >
        Send
      </button>
    </div>
  );
}

Preview Example

Ask it questions like:

“What does my example.txt say about company policies?”

And it should return a context-aware response using the Pinecone-backed vector search + GPT.

Optional Improvements (Later)

  • ✅ Add streaming responses via a ReadableStream (see the sketch after this list)

  • ✅ Add animations with Tailwind CSS or Framer Motion

  • ✅ Show document sources

  • ✅ Enable file upload (PDF/TXT ingestion)
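As an example of the first item, streaming can be added with an App Router route handler that pipes LangChain's .stream() output into a ReadableStream. This is only a rough sketch; the app/api/query/stream/route.ts path is an assumption, and the existing /api/query route stays unchanged:

// app/api/query/stream/route.ts (example streaming endpoint)
import { createRAGChain } from '@/lib/langchain/rag';

export async function POST(req: Request) {
  const { question } = await req.json();

  if (!question) {
    return new Response('Question is required', { status: 400 });
  }

  const chain = await createRAGChain();

  // .stream() yields message chunks as the model generates tokens
  const stream = await chain.stream({ question });

  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        // For chat models each chunk carries the incremental text in .content
        controller.enqueue(encoder.encode(String(chunk.content)));
      }
      controller.close();
    },
  });

  return new Response(readable, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}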


Conclusion

In this tutorial, you built a Personal AI Assistant using modern AI and web technologies:

  • LangChain for chaining LLMs with retrieval and memory,

  • Pinecone as a high-performance vector database for semantic search,

  • OpenAI GPT-4 to generate human-like, contextual answers,

  • Next.js 14 (App Router) for a responsive full-stack frontend and backend.

You learned how to:

  • Embed and store custom documents as vector embeddings,

  • Retrieve relevant context using vector similarity,

  • Build a Retrieval-Augmented Generation (RAG) chain with LangChain LCEL,

  • Query the chain through a clean API,

  • And finally, create a sleek chat UI to interact with your assistant.

🚀 This setup gives you the foundation to build more advanced features like file uploads, authentication, session memory, voice input, or even multimodal AI interfaces.

You can get the full source code on our GitHub.

That's just the basics. If you want to go deeper into Next.js, consider taking a dedicated course or exploring the official documentation.

Thanks!