Pre Processor Documentation

API Documentation

Overview

This API is a Pub/Sub subscriber that listens to a specific topic and performs image preprocessing tasks. It receives a message containing the bucket name, job ID, and image directory, then processes the images in the specified directory and publishes the results to another Pub/Sub topic.


Input Format

The API expects a POST, GET, or PUT request with a JSON body. The JSON body should contain a “message” object with a “data” field. The “data” field should contain a base64-encoded string that, when decoded, is a JSON object with the following fields:

bucketName: The name of the Google Cloud Storage bucket where the images are stored.

jobId: A unique identifier for the job.

imageDir: The directory in the bucket where the images are stored.

Example:

{
  "message": {
    "data": "eyJidWNrZXROYW1lIjogIm15LWJ1Y2tldCIsICJqb2JJZCI6ICIxMjM0NSIsICJpbWFnZURpciI6ICJpbWFnZXMvIn0="
  }
}

The base64 encoded string in the “data” field decodes to:

{
  "bucketName": "my-bucket",
  "jobId": "12345",
  "imageDir": "images/"
}
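
To make the encoding concrete, here is a minimal Python sketch that decodes the example “data” string and wraps a payload in the same envelope format. The helper names decode_data and encode_data are illustrative assumptions and are not part of the service.

```python
import base64
import json


def decode_data(envelope: dict) -> dict:
    """Decode the base64 "data" field of the envelope into a dict."""
    raw = base64.b64decode(envelope["message"]["data"])
    return json.loads(raw)


def encode_data(payload: dict) -> dict:
    """Wrap a payload dict in the {"message": {"data": ...}} envelope."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"message": {"data": data}}


# The example request body from this section.
example = {
    "message": {
        "data": (
            "eyJidWNrZXROYW1lIjogIm15LWJ1Y2tldCIsICJqb2JJZCI6ICIxMjM0NSIs"
            "ICJpbWFnZURpciI6ICJpbWFnZXMvIn0="
        )
    }
}

payload = decode_data(example)
print(payload)
# {'bucketName': 'my-bucket', 'jobId': '12345', 'imageDir': 'images/'}
```

Encoding a payload with encode_data and decoding it again with decode_data returns the original dict, which is a quick way to check a hand-built request body.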

Functionality

When the API receives a request, it performs the following steps:

Decodes the base64 encoded string in the “data” field of the “message” object.

Validates the decoded data against the ArgsModel schema. If the data is not valid, it returns a 400 error with a description of the validation error.

Lists the images in the specified directory of the specified bucket.

For each image, it detects faces using the detect_face function and records whether detection succeeded. Images that fail face detection are deleted so that they do not add noise to the training data.

Publishes a message to the TOPIC_NAME Pub/Sub topic with the job ID and the results of the face detection.

Returns a 200 response with a JSON body containing a “success” field set to true.
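
The steps above can be sketched in Python as follows. This is a hedged outline, not the service's actual code: the Args dataclass is a hypothetical stand-in for the ArgsModel schema, the storage, detection, and publishing operations are passed in as callables with assumed signatures, and the shapes of the 400 response body and of the published message are assumptions.

```python
import base64
import json
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass
class Args:
    """Hypothetical stand-in for the ArgsModel schema."""
    bucketName: str
    jobId: str
    imageDir: str


def parse_args(envelope: dict) -> Args:
    """Decode the "data" field and check that the required fields are strings."""
    try:
        decoded = json.loads(base64.b64decode(envelope["message"]["data"]))
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"invalid envelope: {exc!r}") from exc
    if not isinstance(decoded, dict):
        raise ValueError("data must decode to a JSON object")
    fields = ("bucketName", "jobId", "imageDir")
    bad = [name for name in fields if not isinstance(decoded.get(name), str)]
    if bad:
        raise ValueError(f"missing or non-string fields: {bad}")
    return Args(**{name: decoded[name] for name in fields})


def handle(
    envelope: dict,
    list_images: Callable[[str, str], Iterable[str]],  # assumed signature
    detect_face: Callable[[str, str], bool],           # assumed signature
    delete_image: Callable[[str, str], None],          # assumed signature
    publish: Callable[[dict], None],                   # assumed signature
) -> Tuple[Dict, int]:
    """Return a (response body, HTTP status) pair for one request."""
    try:
        args = parse_args(envelope)
    except ValueError as exc:
        return {"error": str(exc)}, 400  # assumed error body shape

    results = {}
    for image in list_images(args.bucketName, args.imageDir):
        found = bool(detect_face(args.bucketName, image))
        results[image] = found
        if not found:
            # Failed detections are removed so they do not reach training.
            delete_image(args.bucketName, image)

    # Assumed message shape: the job ID plus per-image detection results.
    publish({"jobId": args.jobId, "results": results})
    return {"success": True}, 200
```

Passing the external operations in as callables keeps the sketch free of cloud dependencies, so the control flow can be exercised with simple stubs before wiring in real storage and Pub/Sub clients.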

Testing

To test the API, you can send a POST, GET, or PUT request to the API’s URL with a JSON body in the format described above. You can use tools like curl or Postman to send the request.
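
As an alternative to curl or Postman, a test request can be built with the Python standard library. The sketch below is a minimal example under stated assumptions: the URL http://localhost:8080/ is a placeholder for wherever the API is running, and build_request is an illustrative helper name.

```python
import base64
import json
import urllib.request


def build_request(url: str, payload: dict, method: str = "POST") -> urllib.request.Request:
    """Build an HTTP request whose JSON body carries the base64-encoded payload."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    body = json.dumps({"message": {"data": data}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method=method,
    )


# Placeholder URL: replace with the address of the running API.
request = build_request(
    "http://localhost:8080/",
    {"bucketName": "my-bucket", "jobId": "12345", "imageDir": "images/"},
)

# To send the request against a running instance, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.loads(response.read()))
```

Building the request separately from sending it makes it easy to inspect the body before it goes over the network.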
