Correct for different camera viewpoints without needing manual calibration.
A natural-language query such as "Find the new building" or "Highlight the moved chair."
The scene with a change (e.g., a car moved, a building added).
A visual "heatmap" or mask overlaying the video, showing that the AI successfully located the change requested in the text.

Technical Significance