L1 Call Topic Prediction Table
Where to find predictions
Predictions for the L1 call topic model can be found in the following BigQuery table inside the pg-zinnia-data-production-v1 project:
pg-zinnia-data-production-v1.zowie_ml.topic_model_predictions
Technical Context
This table has predictions for all call IDs found in the pg-zinnia-data-production-v1.call_center_v1.call_transcriptions_ner table.
There is also an Airflow DAG that pulls new transcriptions daily and updates the topic_model_predictions table with new L1 call topic predictions. Production Airflow DAG can be found . The model used inside the DAG was deployed to Vertex AI.
Table Context
The call topic model originally output only the top call topic prediction for a given transcription. On October 25th, 2024, we updated the model to output the top 3 call topic predictions (link to ). This means that:
For backward compatibility, we kept the original topic_pred_id field, which corresponds to the top call topic prediction.
With the model update, we added 6 new fields: topic_pred_id_n and topic_pred_id_n_name, for n = 1, 2, 3. Note that topic_pred_id and topic_pred_id_1 are essentially predicting the same thing.
A full backfill on all historical transcriptions was also done after the model update. This means that all historical call IDs have duplicate entries in the table: 1 row with the top call topic prediction only, and 1 row with the top 3 call topic predictions.
To get predictions from the most recent model, filter the topic_model_predictions table to rows created on or after 2024-10-26. Ex:
SELECT * FROM `pg-zinnia-data-production-v1.zowie_ml.topic_model_predictions`
WHERE created_at >= '2024-10-26'
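The duplicate-row situation described above can also be handled after rows are fetched. The following is a minimal sketch, not the team's implementation: it assumes each row is a dict, that the call identifier lives under a hypothetical `call_id` key (the actual column name is not given on this page), and that `created_at` arrives as a naive `datetime`. It keeps, for each call ID, the newest row created on or after the cutoff used in the query.

```python
from datetime import datetime

# Cutoff taken from the example query; rows created on or after this
# date are assumed to come from the top-3 model.
TOP3_CUTOFF = datetime(2024, 10, 26)


def latest_top3_rows(rows, cutoff=TOP3_CUTOFF):
    """Return one row per call ID: the newest row created at or after cutoff.

    `call_id` is a hypothetical column name used for illustration only;
    `created_at` is assumed to be a naive datetime.
    """
    newest = {}
    for row in rows:
        if row["created_at"] < cutoff:
            continue  # row written before the cutoff (older top-1 model)
        current = newest.get(row["call_id"])
        if current is None or row["created_at"] > current["created_at"]:
            newest[row["call_id"]] = row
    return list(newest.values())


# Illustrative rows only, not real data.
rows = [
    {"call_id": "a", "created_at": datetime(2024, 9, 1), "topic_pred_id": 4},
    {"call_id": "a", "created_at": datetime(2024, 10, 27),
     "topic_pred_id": 4, "topic_pred_id_1": 4},
    {"call_id": "b", "created_at": datetime(2024, 10, 28),
     "topic_pred_id": 7, "topic_pred_id_1": 7},
]
deduped = latest_top3_rows(rows)
```

Filtering in the query itself, as in the example above, is usually preferable because it avoids pulling the older rows at all; the Python version is only useful when the unfiltered rows are already in memory.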
L1 Topic Model Prediction Table
L2 Call Topic Prediction Table
Where to find predictions
Predictions for the L2 call topic model can be found in the following BigQuery table inside the pg-zinnia-data-production-v1 project:
pg-zinnia-data-production-v1.zowie_ml.topic_model_l2_predictions
Technical Context
This table has predictions for the more specific L2 topics generated by clustering algorithms. All the L2 topics can be found in the following . This model runs as a daily batch prediction job and updates the topic_model_l2_predictions table. Implementation details can be found in this . Unlike the L1 topic model, this model was not deployed to Vertex AI. Instead, the job loads the model from GCS and makes offline predictions.
Table Context
Because of how the L2 model is coded, L2 topics depend on L1 topics: predictions from the L2 model depend directly on the L1 topic_model_predictions table. This is why L1 predictions also appear in the L2 table.
This batch prediction job was created on October 23rd, 2024, and no backfill has yet been run on historical transcripts. This means we only have L2 predictions for call IDs from October 23rd, 2024 onward.
After realizing that the original L2 topic names were not very clean or meaningful, we refined the L2 call topic names, which can all be found in this . The update was made on 2024-10-31, so the newer L2 topic names are available from that date forward.
L2 Topic Model Prediction Table
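The two dates in the L2 table context split the data into three periods. As a rough sketch, the helper below encodes that timeline; the function name and the returned labels are hypothetical, and it assumes the date passed in is whichever date you treat as the prediction date for a row (this page does not say which column that is).

```python
from datetime import date

# Dates taken from the table context above.
L2_JOB_START = date(2024, 10, 23)      # batch job created; no backfill before this
L2_REFINED_NAMES = date(2024, 10, 31)  # refined L2 topic names introduced


def l2_name_status(prediction_date):
    """Classify a date by which L2 topic names, if any, are expected.

    The return labels are illustrative, not values stored in the table.
    """
    if prediction_date < L2_JOB_START:
        return "no_l2_prediction"
    if prediction_date < L2_REFINED_NAMES:
        return "original_names"
    return "refined_names"
```

For example, `l2_name_status(date(2024, 10, 25))` falls in the window where L2 predictions exist but still carry the original, unrefined topic names.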