API Documentation
Overview
This API is a Pub/Sub subscriber that listens to a specific topic and performs image preprocessing tasks. It receives a message containing the bucket name, job ID, and image directory, then processes the images in the specified directory and publishes the results to another Pub/Sub topic.
Input Format
The API expects a POST, GET, or PUT request with a JSON body. The JSON body should contain a “message” object with a “data” field. The “data” field should contain a base64 encoded string, which when decoded, should be a JSON object with the following fields:
- bucketName: The name of the Google Cloud Storage bucket where the images are stored.
- jobId: A unique identifier for the job.
- imageDir: The directory in the bucket where the images are stored.
Example:
{
"message": {
"data": "eyJidWNrZXROYW1lIjogIm15LWJ1Y2tldCIsICJqb2JJZCI6ICIxMjM0NSIsICJpbWFnZURpciI6ICJpbWFnZXMvIn0="
}
}
The base64 encoded string in the “data” field decodes to:
{
"bucketName": "my-bucket",
"jobId": "12345",
"imageDir": "images/"
}
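A minimal sketch of how a request body in this format could be unpacked, using only the Python standard library. The helper name decode_envelope and the REQUIRED_FIELDS tuple are illustrative assumptions, not part of the API, and the check here only confirms that the three documented fields are present.

```python
import base64
import json

# Field names taken from the list above.
REQUIRED_FIELDS = ("bucketName", "jobId", "imageDir")


def decode_envelope(body: dict) -> dict:
    """Return the decoded job arguments from a request body (hypothetical helper)."""
    try:
        encoded = body["message"]["data"]
        # b64decode returns bytes; json.loads accepts bytes directly.
        args = json.loads(base64.b64decode(encoded))
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"malformed envelope: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("decoded data is not a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in args]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return args


example = {
    "message": {
        "data": "eyJidWNrZXROYW1lIjogIm15LWJ1Y2tldCIsICJqb2JJZCI6ICIxMjM0NSIsICJpbWFnZURpciI6ICJpbWFnZXMvIn0="
    }
}
print(decode_envelope(example)["jobId"])  # prints 12345
```

Raising ValueError for every malformed case keeps the caller's error handling to a single branch, which maps naturally onto a single error response.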
Functionality
When the API receives a request, it performs the following steps:
1. Decodes the base64-encoded string in the “data” field of the “message” object.
2. Validates the decoded data against the ArgsModel schema. If the data is not valid, it returns a 400 error with a description of the validation error.
3. Lists the images in the specified directory of the specified bucket.
4. Detects faces in each image using the detect_face function and records whether each detection succeeded. Images that fail face detection are deleted so that they do not add noise to the training data.
5. Publishes a message to the TOPIC_NAME Pub/Sub topic with the job ID and the results of the face detection.
6. Returns a 200 response with a JSON body containing a “success” field set to true.
Testing
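The steps listed under Functionality can also be exercised locally, without Cloud Storage or Pub/Sub, by passing stand-ins for each external call. The sketch below is hypothetical: process_job and the injected list_images, delete_image, and publish callables are illustrative names, and the two-argument detect_face signature is an assumption, since its real signature is not documented here. It covers only the steps that follow validation.

```python
from typing import Callable, Iterable


def process_job(
    args: dict,
    list_images: Callable[[str, str], Iterable[str]],
    detect_face: Callable[[str, str], bool],
    delete_image: Callable[[str, str], None],
    publish: Callable[[dict], None],
) -> dict:
    """Run face detection over a job's images and publish the outcome."""
    bucket, image_dir = args["bucketName"], args["imageDir"]
    results = {}
    for path in list_images(bucket, image_dir):
        found = detect_face(bucket, path)
        results[path] = found
        if not found:
            # Images without a detected face are removed from the bucket.
            delete_image(bucket, path)
    publish({"jobId": args["jobId"], "results": results})
    return {"success": True}


# In-memory stand-ins: one image passes detection, one fails.
images = {"images/a.jpg": True, "images/b.jpg": False}
deleted, published = [], []
response = process_job(
    {"bucketName": "my-bucket", "jobId": "12345", "imageDir": "images/"},
    list_images=lambda bucket, prefix: list(images),
    detect_face=lambda bucket, path: images[path],
    delete_image=lambda bucket, path: deleted.append(path),
    publish=published.append,
)
```

Passing the external calls in as arguments is one way to keep the flow checkable in isolation; the deployed service may wire these together differently.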
To test the API, you can send a POST, GET, or PUT request to the API’s URL with a JSON body in the format described above. You can use tools like curl or Postman to send the request.
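As an alternative to curl or Postman, a test request can be assembled in Python. This is a sketch using only the standard library. The URL is a placeholder assumption, and build_request and send are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_request(url: str, bucket_name: str, job_id: str, image_dir: str) -> urllib.request.Request:
    """Wrap the job arguments in the message/data envelope described above."""
    args = {"bucketName": bucket_name, "jobId": job_id, "imageDir": image_dir}
    data = base64.b64encode(json.dumps(args).encode("utf-8")).decode("ascii")
    body = json.dumps({"message": {"data": data}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> tuple:
    """Send the request and return (status code, response text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Placeholder URL; replace with the address where the API is actually served.
request = build_request("http://localhost:8080/", "my-bucket", "12345", "images/")
```

Calling send(request) against a running instance should return the status code and response body. Note that urlopen raises urllib.error.HTTPError for error statuses such as 400, so a validation failure surfaces as an exception rather than a return value.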