Person detection and person re-identification are rapidly growing research areas in computer vision. The two tasks are independent but related: the output of person detection is the input to person re-identification. Solutions exist for each task individually, but currently no solution combines them into an integrated working pipeline. To fill this gap, we propose a highly modular and structured framework that provides cross-language invocation and a pipeline execution mechanism, as well as viewer, device, tracker, detector, and recognizer abstractions. We instantiate the proposed framework to achieve our goal of tracking the same person across multiple cameras, which is essentially the combination of person detection and person re-identification. Besides the main task of person re-identification, we also support skeleton tracking, as well as camera calibration, image alignment, and green-screen imaging, features that commonly come with a computer vision framework. We evaluate the proposed solution against its requirements and usage scenarios, and report the major metrics used by the research community for the person detection and person re-identification tasks, respectively.