I'm currently working on an online meeting summarization task (mostly focused on Russian-language meetings for now): speech-to-text with Deepgram, then the transcript is fed into a llama3.1 model on Groq through the Phidata interface, using two assistants:
1) the first splits the text into chunks and summarizes the main points of each chunk
2) the second summarizes all the chunk summaries into a final result and outputs it in the required format
My first (and current) approach was simply to use add_to_system_prompt on the Assistants for both summarizers, which is now deprecated:
add_to_system_prompt=dedent(
"""
# {Укажите название совещания}
## Отчет о совещании
**Название совещания:** {Укажите название совещания}
**Дата:** {Укажите дату совещания}
**Место проведения:** {Укажите место проведения совещания}
**Участники:** {Перечислите всех участников}
**Цель совещания:** {Кратко опишите основную цель совещания}
**Обсуждаемые темы:**
1. Тема 1: [Краткое описание обсуждения], Решение: [Описание решения, если применимо], Ответственный: [Имя ответственного]
2. Тема 2: [Краткое описание обсуждения], Решение: [Описание решения, если применимо], Ответственный: [Имя ответственного]
...
**Задачи к выполнению:**
- Задача 1: [Описание задачи], Ответственный: [Имя], Срок: [Дата выполнения]
- Задача 2: [Описание задачи], Ответственный: [Имя], Срок: [Дата выполнения]
...
**Основные выводы и планы на будущее:**
- Вывод 1: [Краткое описание], Ответственный: [Имя]
- Вывод 2: [Краткое описание], Ответственный: [Имя]
...
"""
)
The text is in Russian, but the overall idea is that I want the model to output the data in this kind of format.
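For context, both summarizers are currently created roughly like this (a simplified sketch of my own setup; the chunk summarizer uses an analogous chunk-level template, and the descriptions/extra instructions are omitted):

```python
from textwrap import dedent
from phi.assistant import Assistant
from phi.llm.groq import Groq

# Simplified sketch of the (now deprecated) Assistant-based setup;
# the real get_text_summarizer also sets a description and instructions.
def get_text_summarizer(model: str = "llama-3.1-70b-versatile") -> Assistant:
    return Assistant(
        llm=Groq(model=model),
        add_to_system_prompt=dedent(
            """
            ...the report template shown above...
            """
        ),
    )
```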
The problem, unfortunately, is that the response doesn't follow this format consistently: the outputs vary slightly from response to response, even for the same kind of input. Even with temperature=0 it still doesn't follow the template completely. All the summarization examples I can find on the internet deal with many different kinds of documents and don't enforce a strict final format. How can I overcome this issue?
Possibilities:
1. First of all, there must be a way to migrate from the Assistants interface to the Agents interface. From my first tests the migration itself is quite straightforward, but the agent's response is now a bit different, and how to properly parse the output text is still a question for me. The agent now seems to output a strange tuple that I don't know how to parse (see the sketch right after the code below). This is how everything was done in the main file (get_chunk_summarizer and get_text_summarizer are the assistants):
def split_text_into_chunks(text, chunk_size, overlap_size):
    words = text.split()
    chunks = []
    i = 0
    while i < len(words):
        chunks.append(" ".join(words[i: i + chunk_size]))
        i += chunk_size - overlap_size
    return chunks


def summarize_text(text, model=None, chunker_limit=None, overlap_size=500) -> str:
    if model is None:
        model = os.getenv("MODEL", "llama-3.1-70b-versatile")
    if chunker_limit is None:
        chunker_limit = int(os.getenv("CHUNKER_LIMIT", 4500))
    overlap_size = int(os.getenv("OVERLAP_SIZE", 500))
    logger.info(f"Using model: {model}")
    try:
        chunks = split_text_into_chunks(text, chunker_limit, overlap_size)
        num_chunks = len(chunks)
        if num_chunks > 1:
            logger.info(f"Text is split into {num_chunks} chunks")
            chunk_summaries = []
            for i in range(num_chunks):
                chunk_summary = ""
                chunk_summarizer = get_chunk_summarizer(model=model)
                chunk_info = f"Text chunk {i + 1}:\n\n"
                chunk_info += f"{chunks[i]}\n\n"
                for delta in chunk_summarizer.run(chunk_info):
                    chunk_summary += delta  # type: ignore
                chunk_summaries.append(chunk_summary)

            summary = ""
            text_info = "Summaries:\n\n"
            for i, chunk_summary in enumerate(chunk_summaries, start=1):
                text_info += f"Chunk {i}:\n\n{chunk_summary}\n\n"
                text_info += "---\n\n"
            for delta in get_text_summarizer(model=model).run(text_info):
                summary += delta  # type: ignore
        else:
            logger.info("Text is short enough to summarize in one go")
            summary = ""
            text_info = "Text:\n\n"
            text_info += f"{text}\n\n"
            for delta in get_text_summarizer(model=model).run(text_info):
                summary += delta  # type: ignore
        return summary
    except Exception as e:
        logger.exception(f"An error occurred during text summarization: {e}")
        raise
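For reference, this is roughly what I imagine the Agent version of that loop would look like. I'm guessing about the API here: as far as I can tell, Agent.run() returns a RunResponse object (and with stream=True, an iterator of RunResponse chunks), and iterating over a plain RunResponse iterates over a Pydantic model, which yields (field_name, value) tuples; that may be the strange tuple I'm seeing. The names below (get_text_summarizer_agent, summarize_with_agent) are just placeholders:

```python
from phi.agent import Agent, RunResponse
from phi.model.groq import Groq

# Hypothetical Agent equivalent of the text summarizer assistant;
# in reality it would carry the same report template as instructions.
def get_text_summarizer_agent(model: str) -> Agent:
    return Agent(
        model=Groq(id=model),
        instructions=["...same report template as above..."],
    )

def summarize_with_agent(text_info: str, model: str) -> str:
    agent = get_text_summarizer_agent(model)

    # Non-streaming: run() returns a single RunResponse; the text is in .content.
    # Iterating over the RunResponse itself yields (field_name, value) tuples,
    # which is probably the "strange tuple" mentioned above.
    response: RunResponse = agent.run(text_info)
    summary = response.content

    # Streaming alternative, closest to the old delta-accumulation loop:
    # summary = ""
    # for chunk in agent.run(text_info, stream=True):
    #     summary += chunk.content or ""

    return summary
```

Is that the right way to parse the Agent output, or is there a more idiomatic approach?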
2. After checking some documentation, I found that the response_model feature could be very useful for my program (or maybe not, if I've misunderstood its purpose). The problem is how to use it properly for report generation. Pydantic's documentation is very extensive and working through it is a challenging task. Also: how do I attach headings to the Fields? How do I make sure the model doesn't generate just one line per class field, but adapts dynamically to the processed text (topics, tasks, etc. in the second class; that's why I used List there)? And how do I then get this response in JSON format for front-end processing and debugging, i.e. accept a POST request with the raw speech-to-text output and return the JSON-formatted summary?
I've come up with the following, but it's definitely not the proper way to implement it (see the sketch after this code):
class ChunkSummarizer(BaseModel):
    topics: str = Field(..., description="Кратко опишите темы, обсуждавшиеся в чанке")
    solutions: str = Field(..., description="Укажите принятые решения, если они были упомянуты")
    persons: List[str] = Field(..., description="Перечислите ответственных лиц")


class TextSummarizer(BaseModel):
    name: str = Field(..., description="Укажите название совещания")
    date: str = Field(..., description="Укажите дату совещания")
    place: str = Field(..., description="Укажите место проведения совещания. Если не доступно, то укажите, что это онлайн или оффлайн совещание")
    persons: List[str] = Field(..., description="Перечислите всех участников")
    reason: str = Field(..., description="Кратко опишите основную цель совещания")
    topics: List[str] = Field(..., description="Краткое описание обсуждаемой темы. Описание решения, если применимо. Ответственный: имя")
    tasks: List[str] = Field(..., description="Описание задачи к выполнению. Ответственный: имя. Срок: дата выполнения")
    results: List[str] = Field(..., description="Краткое описание основных выводов и планов на будущее. Ответственный: имя")
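What I'm imagining instead (only a sketch based on my reading of the docs; the nested models Topic, Task, MeetingReport, the helpers get_report_agent/summarize_to_json and the /summarize endpoint below are my own invention, not verified against Phidata): replace the flat List[str] fields with lists of nested models, so the agent can return as many topics/tasks as the transcript actually contains, each with its own named sub-fields. My understanding is that the Field descriptions end up in the schema passed to the model and so act as per-field instructions, but I'm not sure about that either.

```python
from typing import List, Optional

from pydantic import BaseModel, Field
from phi.agent import Agent, RunResponse
from phi.model.groq import Groq

# Nested sub-models instead of List[str], so every topic/task keeps its own
# named fields and the lists can grow with the content of the transcript.
class Topic(BaseModel):
    description: str = Field(..., description="Краткое описание обсуждаемой темы")
    decision: Optional[str] = Field(None, description="Принятое решение, если применимо")
    responsible: Optional[str] = Field(None, description="Имя ответственного")

class Task(BaseModel):
    description: str = Field(..., description="Описание задачи к выполнению")
    responsible: Optional[str] = Field(None, description="Имя ответственного")
    deadline: Optional[str] = Field(None, description="Срок выполнения")

class MeetingReport(BaseModel):
    name: str = Field(..., description="Название совещания")
    date: str = Field(..., description="Дата совещания")
    place: str = Field(..., description="Место проведения (или онлайн/оффлайн)")
    persons: List[str] = Field(..., description="Перечислите всех участников")
    reason: str = Field(..., description="Основная цель совещания")
    topics: List[Topic] = Field(..., description="Обсуждаемые темы")
    tasks: List[Task] = Field(..., description="Задачи к выполнению")
    results: List[str] = Field(..., description="Основные выводы и планы на будущее")

# If I understand response_model correctly, the agent then returns the parsed
# MeetingReport in RunResponse.content instead of free-form text.
def get_report_agent(model: str) -> Agent:
    return Agent(
        model=Groq(id=model),
        response_model=MeetingReport,
    )

def summarize_to_json(text_info: str, model: str) -> str:
    response: RunResponse = get_report_agent(model).run(text_info)
    report: MeetingReport = response.content
    # Pydantic v2: model_dump_json() produces the JSON string for the front-end.
    return report.model_dump_json()
```

For the front-end part, I'm picturing a minimal FastAPI endpoint along these lines (the endpoint path and request model are made up for illustration), continuing from the models above:

```python
from fastapi import FastAPI

app = FastAPI()

class TranscriptIn(BaseModel):
    text: str  # raw speech-to-text output

@app.post("/summarize", response_model=MeetingReport)
def summarize_endpoint(payload: TranscriptIn) -> MeetingReport:
    response = get_report_agent("llama-3.1-70b-versatile").run(payload.text)
    # FastAPI serializes the returned Pydantic model to JSON automatically.
    return response.content
```

Is something like this the intended way to use response_model, and would it also help with the formatting consistency problem described above?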
I would appreciate any help with any of these questions. I'm new to all of this and would like to learn more; I haven't been able to find similar examples for this use case.