CAST: Modeling Visual State Transitions for Consistent Video Retrieval | Signal Canvas | ScienceToStartup