
Shrinked.ai Second Epic [public]

Jobs Module Implementation

**Goal**: Implement a robust background task processing system using BullMQ, Redis, and MongoDB. The API provides a comprehensive system for processing and managing content transformations through an asynchronous pipeline: it accepts various inputs (files and URLs), processes them using a configurable multi-step workflow backed by BullMQ and Redis for reliable queuing, tracks progress and handles errors through MongoDB storage, authenticates users and manages access through API keys with rate limiting, and delivers processed content as structured outputs, with detailed monitoring and administrative capabilities throughout the entire lifecycle. (@Andrew Balitsky)
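As a rough sketch of the queuing core only (the queue name, payload shape, and connection values are assumptions, and the top-level await presumes an ESM context):

```typescript
import { Queue, Worker } from "bullmq";

// Hypothetical Redis connection; real values would come from service config.
const connection = { host: "localhost", port: 6379 };

// Producer side: the Jobs API enqueues one job per accepted file/URL.
const jobsQueue = new Queue("jobs", { connection });
await jobsQueue.add("process-content", {
  sourceUrl: "https://example.com/audio.mp3",
  lang: "en",
});

// Consumer side: a worker runs the multi-step scenario for each job.
const worker = new Worker(
  "jobs",
  async (job) => {
    await job.updateProgress(10); // progress is mirrored into MongoDB tracking
    // ...transcription, metadata extraction, report generation, compilation...
    return { status: "done" };
  },
  { connection },
);

// Failed jobs stay visible for monitoring and admin tooling.
worker.on("failed", (job, err) => {
  console.error(`Job ${job?.id} failed: ${err.message}`);
});
```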
[Audio/Video]
[Transcription Service] → Raw Text
[Metadata Extractor] → Title/Chapters/Refs
[Content/Report Generator] → Contributors/Intro/Conclusion
[Result Compiler] → Normalized JSON
[Storage Service] → MongoDB/R2
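Hypothetically, that flow could be expressed as an ordered list of step handlers the Jobs Module folds an input through; the names and return shapes below are illustrative, not the real module's API:

```typescript
// Hypothetical step shape; names mirror the pipeline stages above.
type StepHandler = (input: unknown) => Promise<unknown>;

const pipeline: Array<{ name: string; run: StepHandler }> = [
  { name: "transcribe", run: async (_media) => ({ rawText: "" }) },       // Audio/Video -> raw text
  { name: "extract-metadata", run: async (_text) => ({ title: "", chapters: [] }) },
  { name: "generate-report", run: async (_meta) => ({ contributors: [], intro: "", conclusion: "" }) },
  { name: "compile-result", run: async (_parts) => ({ normalized: {} }) }, // normalized JSON
  { name: "store", run: async (_doc) => ({ mongoId: "", r2Key: "" }) },   // MongoDB / R2
];

// Executing the scenario is a left-to-right pass over the steps.
async function runPipeline(input: unknown): Promise<unknown> {
  let current = input;
  for (const step of pipeline) {
    current = await step.run(current);
  }
  return current;
}
```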

→ Platogram Req Update (@Петренко Євгеній)

The new architecture will shift LLM processing into configurable job steps handled by the Jobs Module, where LLMService acts as an abstraction layer over multiple providers. Instead of direct API calls from the Python layer, requests will flow through the Jobs API, which uses ScenarioFactory to generate the appropriate processing steps for the selected provider. This requires updating the job schema to accommodate provider-specific configurations and moving prompt templates from Python code into API-level configurations.
The migration involves extracting the LLM operations currently in anthropic.py into the Jobs Module's StepHandlers, adding provider selection to the job creation flow, and implementing provider-specific processing logic within the API layer. This maintains compatibility with the Jobs Module's asynchronous processing pattern while allowing for flexible provider selection at runtime.
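A minimal sketch of what that abstraction layer could look like; the interface and provider names are assumptions, not the final API:

```typescript
// Hypothetical provider abstraction; concrete classes wrap each vendor SDK.
interface LLMProvider {
  name: "anthropic" | "openai";
  complete(prompt: string, config?: Record<string, unknown>): Promise<string>;
}

class LLMService {
  constructor(private providers: Map<string, LLMProvider>) {}

  // StepHandlers call this instead of hitting vendor APIs directly,
  // so provider selection can happen at job-creation time.
  async run(providerName: string, prompt: string): Promise<string> {
    const provider = this.providers.get(providerName);
    if (!provider) throw new Error(`Unknown LLM provider: ${providerName}`);
    return provider.complete(prompt);
  }
}
```

Registering a new vendor would then mean adding one `LLMProvider` implementation, with no changes to the step handlers themselves.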
Proposed Changes:
- Move LLM processing logic from the Python layer to the Jobs Module as a configurable step
- Create an LLMService that supports multiple providers (Anthropic, OpenAI, etc.) on the API/Shrinked back side
- Add provider selection to the Shrinked.ai API's step configurator
- Update the job schema to include provider-specific configurations (see the schema sketch below)
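A hypothetical shape for the extended job document; all field names are illustrative:

```typescript
// Hypothetical extension of the job document stored in MongoDB.
interface JobDocument {
  _id: string;
  status: "queued" | "processing" | "completed" | "failed";
  steps: Array<{
    name: string;                       // e.g. "llm-generate-report"
    provider?: "anthropic" | "openai";  // selected at job creation time
    providerConfig?: {
      model?: string;                   // provider-specific model id
      maxTokens?: number;
      promptTemplate?: string;          // templates move from Python to API config
    };
  }>;
  createdAt: Date;
}
```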
Screenshot 2025-01-30 at 02.05.21.png
What are all these flags for?
Sending params to Plato:
```bash
#!/usr/bin/env bash
# Usage: ./plato.sh <url> [--lang <code>] [--verbose] [--images]
set -e

URL="$1"
LANG="en"
VERBOSE="false"
IMAGES="false"

while [[ $# -gt 0 ]]; do
  case $1 in
    --lang)
      LANG="$2"
      shift
      shift
      ;;
    --verbose)
      VERBOSE="true"
      shift
      ;;
    --images)
      IMAGES="true"
      shift
      ;;
    *)
      shift  # skip positional args (the URL was already captured from $1)
      ;;
  esac
done
```
Questions:
Why should the URL be included at all times?
Can we include multiple URLs?

Ideally add:

`--merge doc1.md doc2.md`: a format in which we add two already structured/templated markdowns to be processed as Parts (like a zip archive) included in the input, every part being a URL to an md file. Each part should be referenced in the output like [timestamp + part number]. The output report uses the same scripted rewrite approach. Find relevant chapters based on chapter similarity scores.

`--assemblyai-api-key`: a modified version of this flag that will allow adding custom data/service/LLM providers. Basically we need a wrapper.
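A rough sketch of how merged parts could be fetched and tagged for the [timestamp + part number] references, assuming Node 18+ global fetch and hypothetical function and marker names:

```typescript
// Hypothetical merge step: fetch each part's markdown and tag it so the
// output report can cite [timestamp + part number].
async function mergeParts(partUrls: string[]): Promise<string> {
  const parts = await Promise.all(
    partUrls.map(async (url, i) => {
      const res = await fetch(url);
      const markdown = await res.text();
      // Reference marker combines a timestamp with the 1-based part number.
      return `<!-- [${new Date().toISOString()} part ${i + 1}] -->\n${markdown}`;
    }),
  );
  return parts.join("\n\n");
}
```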

And later we'd want to try adding these blocks (short ones like abstract/contributors):

## Adjusted abstract reflecting the prompt given for the report
## Expanded chapters [every chapter's title + a very short summary]
Custom blocks [requested by clients, more to be added]:

## sentiment analysis
## intent detection

## key actions
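Those blocks could be modeled as simple config entries the report compiler iterates over; the types below are an assumption, not an existing schema:

```typescript
// Hypothetical block registry; clients toggle blocks per report.
type BlockType =
  | "abstract"
  | "expanded-chapters"
  | "sentiment-analysis"
  | "intent-detection"
  | "key-actions";

interface ReportBlock {
  type: BlockType;
  enabled: boolean;
  prompt?: string; // client-supplied override for custom blocks
}
```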
EXAMPLE_STRUCTURE:
Screenshot 2025-01-30 at 4.50.48 PM.png

EXAMPLE EXTRA FLAGS:
Screenshot 2025-01-30 at 4.52.24 PM.png
