Error creating Knowledge Base from PDFs directory using llama3.2

This is the code I used:

from phi.agent import Agent
from phi.knowledge.pdf import PDFKnowledgeBase, PDFReader
from phi.vectordb.pgvector import PgVector
from phi.embedder.ollama import OllamaEmbedder

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# Create a knowledge base with the PDFs from the specified directory
knowledge_base = PDFKnowledgeBase(
    path=r"C:\Users\ducng\Downloads\Phidata\knowledge",
    vector_db=PgVector(
        table_name="333",
        db_url=db_url,
        embedder=OllamaEmbedder(model="llama3.2", dimensions=4096),
    ),
    reader=PDFReader(chunk=True),
)

knowledge_base.load(recreate=True)

agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
)

agent.print_response("What are the key concepts in the knowledge base?", markdown=True)

And I got this error:

(env) C:\Users\ducng\Downloads\Phidata>python pdf.py
INFO     Dropping collection
INFO     Table 'ai.333' dropped successfully.
INFO     Creating collection
INFO     Loading knowledge base
INFO     Reading: An Introduction to Relativity - Jayant V
ERROR    Error with batch starting at index 0: (builtins.ValueError) expected 4096 dimensions, not 3072

Does anyone know how to fix this? Thanks in advance.

Hi @Seitoku
Thank you for reaching out and using Phidata! I’ve tagged the relevant engineers to assist you with your query. We aim to respond within 24 hours.
If this is urgent, please feel free to let us know, and we’ll do our best to prioritize it.
Thanks for your patience!

Hey @Seitoku, can you try deleting your table 333 and try again? I think the embeddings are set as 4096 dimensions in the table, but the Ollama llama3.2 model is outputting 3072-dimension vectors.

Thanks for your suggestion. I fixed the error by removing model="llama3.2", dimensions=4096, so the embedder will use openhermes for embedding :disguised_face:.
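
For anyone who would rather keep llama3.2 as the embedding model: the error above says the table column expects 4096 values while the embedder returned 3072, so the other way out is to make the configured dimensions match what the model really produces. Below is a minimal sketch of a pre-flight check, not part of Phidata itself. The names detect_dimensions, check_dimensions and fake_embed are hypothetical helpers, and fake_embed is only a stand-in that returns a 3072-value vector like the one in the log.

```python
from typing import Callable, Sequence

# Hypothetical type alias: any function that maps text to an embedding vector.
EmbedFn = Callable[[str], Sequence[float]]


def detect_dimensions(embed: EmbedFn, probe: str = "dimension probe") -> int:
    """Embed a short probe string and return the length of the resulting vector."""
    return len(embed(probe))


def check_dimensions(embed: EmbedFn, configured: int) -> int:
    """Raise early if the configured size differs from what the embedder returns."""
    actual = detect_dimensions(embed)
    if actual != configured:
        raise ValueError(
            f"embedder returns {actual} dimensions, but {configured} are configured"
        )
    return actual


def fake_embed(text: str) -> Sequence[float]:
    """Stand-in embedder (assumption): mimics a model that outputs 3072 values."""
    return [0.0] * 3072


if __name__ == "__main__":
    # Find out what the model actually produces before creating the table.
    print("detected dimensions:", detect_dimensions(fake_embed))

    # Reproduce the mismatch from the log in a friendlier form.
    try:
        check_dimensions(fake_embed, configured=4096)
    except ValueError as exc:
        print("mismatch:", exc)
```

In the real script you would presumably pass the embedder's own embedding function in place of fake_embed (I believe Phidata embedders expose get_embedding, but please check your version), then set dimensions to the detected value before calling knowledge_base.load(recreate=True).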
