Compile a list of these videos in the following manner:
1. If it is a text-to-video model, take a prompt, run it through these approaches (CameraCtrl and MotionCtrl), and evaluate the translation and rotation metrics. This is needed for an apples-to-apples comparison.
2. If the model takes an image as input, the above comparison will be easier, as the generated scenes are likely to be similar.
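Rough sketch of how the translation and rotation metrics could be computed once camera poses are available for both the target trajectory and the generated video. Assumptions (not from these notes): rotation error is the per-frame geodesic angle summed over frames, translation error is the per-frame L2 distance summed over frames, and both pose sequences are already in the same coordinate frame and scale. The exact definitions used by CameraCtrl and MotionCtrl should be checked before reporting numbers. The function names and the `rot_z` helper are hypothetical.

```python
import math


def rotation_error(rots_gt, rots_gen):
    """Sum over frames of the angle (radians) between two 3x3 rotations.

    Assumed definition: arccos((trace(R_gen @ R_gt^T) - 1) / 2) per frame.
    """
    total = 0.0
    for r_gt, r_gen in zip(rots_gt, rots_gen):
        # trace(A @ B^T) equals the sum of element-wise products of A and B.
        trace = sum(r_gen[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
        # Clamp to [-1, 1] to guard against floating-point drift.
        cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
        total += math.acos(cos_angle)
    return total


def translation_error(trans_gt, trans_gen):
    """Sum over frames of the L2 distance between translation vectors."""
    return sum(math.dist(t_gt, t_gen) for t_gt, t_gen in zip(trans_gt, trans_gen))


def rot_z(theta):
    """Hypothetical helper: rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


if __name__ == "__main__":
    # Toy two-frame trajectory: the generated camera drifts slightly.
    gt_rots = [rot_z(0.0), rot_z(0.2)]
    gen_rots = [rot_z(0.0), rot_z(0.3)]
    gt_trans = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    gen_trans = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0)]
    print("rotation error:", rotation_error(gt_rots, gen_rots))
    print("translation error:", translation_error(gt_trans, gen_trans))
```

Note that poses recovered from generated videos by structure-from-motion are only defined up to scale, so the translation metric is meaningful only after the trajectories are normalized or aligned the same way for every method being compared.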
9/25
1. Create a table of existing approaches with their pros, cons, evaluation metrics, and datasets used, plus research directions that we can explore.