RunResponse with output tokens: 1 when connecting to LangChainKnowledgeBase that exists in ElasticSearch vector store with ollama embeddings

Hi,
When I try to connect to a LangChain knowledge base with the llama3.1:8b model on Ollama, I get an empty response with no explanation.
How can I get a response? How can I debug what's wrong? Also, after exporting os.environ["PHI_API_KEY"]="…",
I don't see any sessions in the Phi UI Sessions page.
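On the missing sessions: one thing worth checking (a guess, not confirmed from the post) is that the key has to be set in the environment of the process *before* the agent runs, or monitoring silently does nothing. A minimal sketch, using a placeholder key:

```python
import os

# Set the monitoring key before the agent is created/run.
# "phi-xxxx" is a placeholder; use your real key from the Phi UI.
os.environ["PHI_API_KEY"] = "phi-xxxx"

# Confirm the key is actually visible to this process.
print(os.environ.get("PHI_API_KEY"))
```

If you export it in the shell instead, make sure it is exported in the same shell session that launches the script.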

code:

# Imports assumed from the langchain-ollama, langchain-elasticsearch, and phidata packages
from langchain_ollama import OllamaEmbeddings
from langchain_elasticsearch import ElasticsearchStore
from phi.agent import Agent
from phi.knowledge.langchain import LangChainKnowledgeBase
from phi.model.ollama import Ollama

embeddings = OllamaEmbeddings(model="nomic-embed-text", base_url=base_url)
vector_store = ElasticsearchStore(
    es_url=es_url,
    vector_query_field="document_vector",
    index_name=index_name,
    strategy=ElasticsearchStore.ExactRetrievalStrategy(),
    embedding=embeddings,
)

retriever = vector_store.as_retriever()
kb_knowledge_base = LangChainKnowledgeBase(retriever=retriever)
kb_agent = Agent(
    model=Ollama(id="llama3.1:8b", host=base_url),
    knowledge=kb_knowledge_base,
    show_tool_calls=True,
    markdown=True,
    add_references_to_prompt=True,
    verbose=True,
    debug_mode=True,
    monitoring=True,
    instructions="Only answer the query from your knowledge base. Try to make answers verbose and detailed.",
)

response:

RunResponse(content='', content_type='str', event='RunResponse', messages=[Message(role='system', content='## Instructions\n- Only answer the query from your knowledge base. Try to make answers verbose and detailed. Mention the reason when no response is generated with output token: 1\n- Use markdown to format your answers.', name=None, tool_call_id=None, tool_calls=None, audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={}, references=None, created_at=1736171286), Message(role='assistant', content='', name=None, tool_call_id=None, tool_calls=[], audio=None, images=None, videos=None, tool_name=None, tool_args=None, tool_call_error=None, stop_after_tool_call=False, metrics={'time': 3.9188525699996717, 'input_tokens': 92, 'output_tokens': 1, 'total_tokens': 93}, references=None, created_at=1736171290)], metrics=defaultdict(<class 'list'>, {'time': [3.9188525699996717], 'input_tokens': [92], 'output_tokens': [1], 'total_tokens': [93]}), model='llama3.1:8b', run_id='efd93f90-b1ea-47e8-a35d-c96c3da7c269', agent_id='e5faa3d5-e325-4b51-b169-98ae0b445aae', session_id='b3d17a2c-40a7-424a-880b-e11c490fe27c', workflow_id=None, tools=None, images=None, videos=None, audio=None, response_audio=None, extra_data=None, created_at=1736168142)
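A detail worth noting when debugging: `output_tokens: 1` with empty content usually means the model emitted only a single stop/end-of-turn token, i.e. it decided to stop immediately (often a prompt/template mismatch for that model). A small self-contained check along these lines (`diagnose_run` is a hypothetical helper for illustration, not part of phidata):

```python
def diagnose_run(content: str, metrics: dict) -> str:
    """Classify an empty run from its content and token metrics.

    Hypothetical debugging helper, not a phidata API.
    """
    out = metrics.get("output_tokens", 0)
    if not content and out <= 1:
        # A single output token with no text is almost certainly just the
        # model's end-of-turn token: the model stopped immediately.
        return "empty: model stopped immediately (only a stop token emitted)"
    if not content:
        # Tokens were generated but no text came back - a plumbing issue.
        return f"empty: content lost despite {out} output tokens"
    return "ok"

# Applied to the metrics from the RunResponse above:
print(diagnose_run("", {"input_tokens": 92, "output_tokens": 1, "total_tokens": 93}))
```

A useful next step is to bypass the agent entirely and query the retriever and the raw Ollama model separately, to see which half of the pipeline produces nothing.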

The same setup works when OpenAI GPT models are used with the same vector store.

@shubhangi We don't support LangChain's knowledge base, and it will give you weird errors that you won't be able to debug. I would suggest using our own knowledge base for seamless integration.