We introduce the Open Sequential Repetitive Action Counting (OSRAC) task, which aims to count all repetitions and locate the transition boundaries of sequential actions in large-scale video data without relying on predefined action categories. Unlike the Repetitive Action Counting (RAC) task, which rests on a single-action assumption, OSRAC handles the diverse, alternating repetitive action sequences found in real-world scenarios and is therefore fundamentally more challenging. To this end, we propose UniCount, a universal system capable of counting multiple sequential repetitive actions in video data. Specifically, UniCount comprises three primary modules: the Universal Repetitive Pattern Learner (URPL), which captures general repetitive patterns in alternating actions; the Temporal Action Boundary Discriminator (TABD), which locates action transition boundaries; and the Dual Density Map Estimator (DDME), which performs action counting and repetition segmentation. We also design a novel actionness loss to improve the detection of action transitions. To support this task, we conduct an in-depth analysis of existing RAC datasets and construct several OSRAC benchmarks (i.e., MUCFRep, MRepCount, and MInfiniteRep) by developing a data processing and mining pipeline. We further perform comprehensive experiments to evaluate the effectiveness of UniCount. On MInfiniteRep, UniCount improves the Off-By-One Accuracy (OBOA) from 0.39 to 0.78 and reduces the Mean Absolute Error (MAE) from 0.29 to 0.14 compared with competing methods. UniCount also achieves superior performance on open-set data, demonstrating its universality.
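The abstract reports OBOA and MAE without defining them. The sketch below assumes the definitions commonly used in the RAC literature: MAE as the absolute count error normalized by the ground-truth count, and OBOA as the fraction of videos whose predicted count is within one of the ground truth. The function name `counting_metrics` and the zero-count guard are illustrative assumptions, not part of UniCount.

```python
import numpy as np

def counting_metrics(pred_counts, true_counts):
    """Return (MAE, OBOA) for per-video repetition counts.

    Assumed definitions (not stated in the abstract):
      MAE  = mean(|pred - true| / true)
      OBOA = mean(|pred - true| <= 1)
    """
    pred = np.asarray(pred_counts, dtype=float)
    true = np.asarray(true_counts, dtype=float)
    abs_err = np.abs(pred - true)
    # Guard against division by zero for videos with no repetitions
    # (an assumption made for this sketch).
    mae = float(np.mean(abs_err / np.maximum(true, 1.0)))
    oboa = float(np.mean(abs_err <= 1.0))
    return mae, oboa

# Hypothetical counts for three videos.
mae, oboa = counting_metrics([4, 10, 7], [5, 10, 10])
```

Under these definitions a lower MAE and a higher OBOA are better, which matches the direction of the improvements reported above.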
The accurate segmentation of medical images is crucial to medical care and research; however, many efficient supervised image segmentation methods require sufficient pixel-level labels. Such a requirement is difficult to meet in practice and even impossible in some cases, e.g., rare Pathoma images. Inspired by traditional unsupervised methods, we propose a novel Chan-Vese model based on the Markov chain for unsupervised medical image segmentation. It combines the local information provided by superpixels with the global difference between the target tissue and the background. Building on the Chan-Vese model, we use weight maps generated by the Markov chain to formulate the segmentation problem and solve it iteratively with the min-cut algorithm at the superpixel level. Our method exploits abundant boundary and local region information and can thus handle images with intensity inhomogeneity and object sparsity. It also lets users fine-tune parameters to achieve a satisfactory result for each segmentation, whereas the results of deep-learning-based methods are rigid. The performance of our method is assessed on four Computerized Tomography (CT) datasets. Experimental results show that the proposed method outperforms traditional unsupervised segmentation techniques.
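For readers unfamiliar with the Chan-Vese model, the sketch below illustrates only its region-fitting idea: each pixel is assigned to whichever of two regions has the closer mean intensity, and the two means are then re-estimated. It is a deliberately simplified pixel-level illustration, not the method proposed in the abstract: it omits the contour-length regularizer of the classical model as well as the superpixels, Markov chain weight maps, and min-cut solver described above. The function name `two_phase_region_fit` and the mean-threshold initialization are assumptions made for this example.

```python
import numpy as np

def two_phase_region_fit(image, n_iter=50):
    """Alternate between estimating two region means and reassigning pixels.

    Returns (mask, c1, c2), where mask marks the region with mean c1.
    """
    img = np.asarray(image, dtype=float)
    # Initialization by thresholding at the global mean (an assumption).
    mask = img > img.mean()
    c1 = c2 = float(img.mean())
    for _ in range(n_iter):
        # Stop if one region is empty; its mean would be undefined.
        if mask.all() or not mask.any():
            break
        c1 = float(img[mask].mean())    # mean intensity inside the region
        c2 = float(img[~mask].mean())   # mean intensity outside the region
        # Assign each pixel to the region whose mean fits it better.
        new_mask = (img - c1) ** 2 < (img - c2) ** 2
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return mask, c1, c2

# Hypothetical test image: a bright square on a dark background.
demo = np.zeros((8, 8))
demo[2:6, 2:6] = 1.0
mask, c1, c2 = two_phase_region_fit(demo)
```

Without a boundary term this reduces to two-class clustering on intensity, which is why it struggles with intensity inhomogeneity, the case the abstract's boundary and local region information is meant to address.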