Summary generation in a formatted fashion

I’m currently working on an online-meeting summarization task (for now mostly Russian-language meetings): speech-to-text with Deepgram, then the transcript is fed into a llama3.1 model from Groq through the Phidata interface, using two assistants:
1. split the text into chunks and summarize the main points of each chunk
2. summarize all the chunk summaries into a final summary and output it in a proper format

My first (and current) approach was simply to pass the report template via add_to_system_prompt on both summarizer Assistants, which is now deprecated:

# keyword argument passed to the Assistant constructor (dedent is textwrap.dedent)
add_to_system_prompt=dedent(
    """
    # {Specify the meeting title}

    ## Meeting report

    **Meeting title:** {Specify the meeting title}
    **Date:** {Specify the meeting date}
    **Location:** {Specify the meeting location}
    **Participants:** {List all participants}

    **Meeting goal:** {Briefly describe the main goal of the meeting}

    **Topics discussed:**
    1. Topic 1: [Brief description of the discussion], Decision: [Description of the decision, if applicable], Responsible: [Name of the person responsible]
    2. Topic 2: [Brief description of the discussion], Decision: [Description of the decision, if applicable], Responsible: [Name of the person responsible]
    ...

    **Action items:**
    - Task 1: [Task description], Responsible: [Name], Due: [Completion date]
    - Task 2: [Task description], Responsible: [Name], Due: [Completion date]
    ...

    **Key takeaways and future plans:**
    - Takeaway 1: [Brief description], Responsible: [Name]
    - Takeaway 2: [Brief description], Responsible: [Name]
    ...
    """
)

The prompt and the transcripts are in Russian, but the overall idea is that I want the model to output the data in exactly this kind of format.
The problem, unfortunately, is that the response doesn’t follow the requested format: the structure varies slightly from response to response. Even with temperature=0 it still doesn’t follow the format completely. All the summarization examples I can find online deal with many different documents and don’t enforce a final output format. How can I overcome this issue?

Possibilities:
1. First, there has to be a way to move from the Assistants to the Agents interface. From my first tests the migration itself is quite straightforward, but the agent’s response is now a different kind of object and I don’t know how to parse the output text properly; the agent seems to return a strange tuple now. This is how everything was done in the main file (get_chunk_summarizer and get_text_summarizer are the assistants), and a sketch of my first migration attempt follows the code:

import logging
import os

logger = logging.getLogger(__name__)


def split_text_into_chunks(text, chunk_size, overlap_size):
    """Split text into chunks of chunk_size words, with overlap_size words
    shared between consecutive chunks (overlap_size must be < chunk_size)."""
    words = text.split()
    chunks = []
    i = 0
    while i < len(words):
        chunks.append(" ".join(words[i: i + chunk_size]))
        i += chunk_size - overlap_size
    return chunks


def summarize_text(text, model=None, chunker_limit=None, overlap_size=None) -> str:
    # Fall back to environment variables when arguments are not provided
    if model is None:
        model = os.getenv("MODEL", "llama-3.1-70b-versatile")
    if chunker_limit is None:
        chunker_limit = int(os.getenv("CHUNKER_LIMIT", 4500))
    if overlap_size is None:
        overlap_size = int(os.getenv("OVERLAP_SIZE", 500))

    logger.info(f"Using model: {model}")

    try:
        chunks = split_text_into_chunks(text, chunker_limit, overlap_size)
        num_chunks = len(chunks)

        if num_chunks > 1:
            logger.info(f"Text is split into {num_chunks} chunks")
            # Stage 1: summarize each chunk separately
            chunk_summaries = []
            for i in range(num_chunks):
                chunk_summary = ""
                chunk_summarizer = get_chunk_summarizer(model=model)
                chunk_info = f"Text chunk {i + 1}:\n\n{chunks[i]}\n\n"
                # Assistant.run() streams the response as string deltas
                for delta in chunk_summarizer.run(chunk_info):
                    chunk_summary += delta  # type: ignore
                chunk_summaries.append(chunk_summary)

            # Stage 2: merge the chunk summaries into the final report
            summary = ""
            text_info = "Summaries:\n\n"
            for i, chunk_summary in enumerate(chunk_summaries, start=1):
                text_info += f"Chunk {i}:\n\n{chunk_summary}\n\n"
                text_info += "---\n\n"

            for delta in get_text_summarizer(model=model).run(text_info):
                summary += delta  # type: ignore
        else:
            logger.info("Text is short enough to summarize in one go")
            summary = ""
            text_info = f"Text:\n\n{text}\n\n"

            for delta in get_text_summarizer(model=model).run(text_info):
                summary += delta  # type: ignore

        return summary

    except Exception as e:
        logger.exception(f"An error occurred during text summarization: {e}")
        raise
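For completeness, this is roughly how far I got with the migration before getting stuck (just a sketch; I’m assuming these are the right imports for the new Agent interface):

from phi.agent import Agent
from phi.model.groq import Groq

# My naive one-to-one replacement of the Assistant factory
chunk_summarizer = Agent(model=Groq(id="llama-3.1-70b-versatile"))

chunk_info = "Text chunk 1:\n\n..."  # same prompt construction as before
result = chunk_summarizer.run(chunk_info)
# result is no longer an iterator of string deltas; iterating over it
# yields tuple-like pairs, and I don't see how to get the text out.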

2. While reading the documentation I found the response_model feature, which looks very useful for my program (or maybe not, if I’ve misunderstood its purpose). The problem is: how do I use it properly for report generation? Pydantic’s documentation is very extensive and working through it is a challenge. Also: how do I attach headings to the Fields? How do I make sure the model doesn’t generate just one line per class field, but adapts dynamically to the processed text (topics, tasks etc. in the 2nd class, which is why I used List)? And how do I then get this response as JSON for front-end processing and debugging, e.g. accepting a POST request with the raw speech-to-text transcript and returning the JSON-formatted summary?
This is what I’ve come up with so far, but it’s definitely not how it’s meant to be implemented (a rough sketch of the usage I have in mind follows the models below):

from typing import List

from pydantic import BaseModel, Field


class ChunkSummarizer(BaseModel):
    topics: str = Field(..., description="Briefly describe the topics discussed in the chunk")
    solutions: str = Field(..., description="State the decisions taken, if any were mentioned")
    persons: List[str] = Field(..., description="List the people responsible")


class TextSummarizer(BaseModel):
    name: str = Field(..., description="Specify the meeting title")
    date: str = Field(..., description="Specify the meeting date")
    place: str = Field(..., description="Specify the meeting location. If not available, state whether it was an online or offline meeting")
    persons: List[str] = Field(..., description="List all participants")
    reason: str = Field(..., description="Briefly describe the main goal of the meeting")
    topics: List[str] = Field(..., description="Brief description of the topic discussed. Description of the decision, if applicable. Responsible: name")
    tasks: List[str] = Field(..., description="Description of the task to be done. Responsible: name. Due: completion date")
    results: List[str] = Field(..., description="Brief description of the key takeaways and future plans. Responsible: name")
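And this is roughly the usage I have in mind, though I’m not sure it’s the intended way. A sketch, assuming nested models are how to get dynamic multi-entry fields, that run.content holds the parsed object when response_model is set, and that the Groq import path below is correct (model_dump_json is standard Pydantic v2):

from typing import List, Optional

from phi.agent import Agent
from phi.model.groq import Groq
from pydantic import BaseModel, Field


class Topic(BaseModel):
    description: str = Field(..., description="Brief description of the topic discussed")
    decision: Optional[str] = Field(None, description="Decision taken, if one was mentioned")
    person: Optional[str] = Field(None, description="Name of the person responsible")


class MeetingReport(BaseModel):
    name: str = Field(..., description="Meeting title")
    topics: List[Topic] = Field(..., description="One entry per topic actually discussed")
    # ...the remaining fields from TextSummarizer above...


summarizer = Agent(
    model=Groq(id="llama-3.1-70b-versatile"),
    description="You summarize Russian meeting transcripts into a structured report.",
    response_model=MeetingReport,
)

run = summarizer.run("Text:\n\n...")
report = run.content                             # hopefully a MeetingReport instance
json_payload = report.model_dump_json(indent=2)  # JSON for the front end

With List[Topic] instead of List[str] the model could emit as many structured entries as the transcript actually contains, and the JSON would come for free from Pydantic. But again, I’m not sure this is right.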

I would appreciate any help with any of these questions. I’m new to all of this and would like to learn more; I can’t find similar examples anywhere.

Hi @Olegario228 ,
Thank you for reaching out and using Phidata! I’ve tagged the relevant engineers to assist you with your query. We aim to respond within 24 hours.
If this is urgent, please feel free to let us know, and we’ll do our best to prioritize it.
Thanks for your patience!

Hey @Olegario228 !

We created a doc to help you migrate from Assistants to Agents. Please take a look at: Upgrade to v2.5.0 - Phidata

You are spot on, response_model would work great for your use case. A combination of detailed descriptions and a system prompt (via instructions) is the best way to implement your application and make sure the model follows the format. The run function now returns a Pydantic object with the following structure:

class RunResponse(BaseModel):
    """Response returned by Agent.run() or Workflow.run() functions"""

    content: Optional[Any] = None
    content_type: str = "str"
    event: str = RunEvent.run_response.value
    messages: Optional[List[Message]] = None
    metrics: Optional[Dict[str, Any]] = None
    model: Optional[str] = None
    run_id: Optional[str] = None
    agent_id: Optional[str] = None
    session_id: Optional[str] = None
    workflow_id: Optional[str] = None
    tools: Optional[List[Dict[str, Any]]] = None
    extra_data: Optional[RunResponseExtraData] = None
    created_at: int = Field(default_factory=lambda: int(time()))

The content attribute contains the actual output from the model; you can extract it from the result of the run function to display it on the front end.
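For example (a minimal sketch; adapt the model id and prompt construction to your setup), your chunk loop could become the following. Note that if you iterate over a RunResponse directly, Pydantic yields (field_name, value) tuples, which is likely the "strange tuple" you were seeing:

from phi.agent import Agent, RunResponse
from phi.model.groq import Groq

chunk_summarizer = Agent(model=Groq(id="llama-3.1-70b-versatile"))
chunk_info = "Text chunk 1:\n\n..."

# Non-streaming: run() returns a single RunResponse
response: RunResponse = chunk_summarizer.run(chunk_info)
chunk_summary = response.content  # a str, or your response_model instance

# Streaming: run(stream=True) yields RunResponse chunks, not raw strings
chunk_summary = ""
for chunk in chunk_summarizer.run(chunk_info, stream=True):
    chunk_summary += chunk.content or ""

When response_model is set, prefer the non-streaming call; content will then be your Pydantic object, which you can return directly from a FastAPI POST endpoint for your front end, since FastAPI serializes Pydantic models to JSON natively.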

Please let me know if you have any further questions