Description
Falls among elderly people can lead to serious injuries and significantly reduce their quality of life. This research proposes a vision-based fall prediction and detection system that uses human pose estimation (HPE) and runs on an edge computing device, the Xilinx Kria KV260 Vision AI development platform. The system comprises an Intel® RealSense™ D455 infrared (IR) stereo-based range-sensing camera connected to the KV260 platform. The camera captures synchronized RGB and depth frames of dimensions 640x480x3 (HxWxC) and 640x480, respectively, at 60 frames per second for real-time processing. The KV260 board has a quad-core Arm® Cortex®-A53 processor and a DPU, configured in the programmable logic (PL) and synthesized with three cores, each of which can perform 1024 operations per clock cycle. We designed a parallel three-stage pipeline of machine learning models on the board that uses the available resources of the SoC to increase the efficiency and speed of the overall process. All of the machine learning models are trained on a GPU server and quantized with the Xilinx Vitis AI quantizer for deployment on the KV260 platform. The first model in the pipeline is YOLOX, an object detection model trained on the CrowdHuman dataset, for which we achieved a quantized accuracy of 74%. The YOLOX model takes an RGB frame of dimensions 640x640x3 and produces one bounding box per human in the frame, after which the RGB frame is discarded to preserve privacy. The second model in the pipeline is a ResNet50-based Anchor-to-Joint (A2J) regression network trained on the MP-3DHP (Multi-Person 3D Human Pose) dataset and the ITOP dataset, for which we achieved a quantized accuracy of 83%. The A2J model accepts a depth input of dimensions 288x288: for each human in the frame, the depth region inside the bounding box is cropped, rescaled, and normalized. The model outputs 15 joint keypoints for each detected human.
The final model in the pipeline is a binary classifier that takes informative joint coordinates (x, y, z, t) and predicts and/or detects human fall activity. The KV260 also has a dual-core Arm Cortex-R5F MPCore real-time processor, which invokes an alerting system whenever the pipeline infers a fall.
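
The crop, rescale, and normalize step that prepares the 288x288 depth input for the A2J model can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `crop_depth_patch`, the `(x_min, y_min, x_max, y_max)` box layout, the nearest-neighbour resize, and the zero-mean, unit-variance normalization over non-zero depths are all assumptions chosen for illustration.

```python
import numpy as np


def crop_depth_patch(depth, box, out_size=288):
    """Crop a depth frame to a bounding box, resize it, and normalize it.

    depth: 2-D array of depth values (rows x columns).
    box: (x_min, y_min, x_max, y_max) in pixels (hypothetical layout).
    """
    rows, cols = depth.shape
    x0, y0, x1, y1 = box

    # Clamp the box to the frame so the slice is never empty or out of range.
    x0 = int(np.clip(x0, 0, cols - 1))
    x1 = int(np.clip(x1, x0 + 1, cols))
    y0 = int(np.clip(y0, 0, rows - 1))
    y1 = int(np.clip(y1, y0 + 1, rows))
    patch = depth[y0:y1, x0:x1].astype(np.float32)

    # Nearest-neighbour resize: pick one source row/column per output pixel.
    src_rows = np.arange(out_size) * patch.shape[0] // out_size
    src_cols = np.arange(out_size) * patch.shape[1] // out_size
    resized = patch[src_rows][:, src_cols]

    # Assumption: a depth of 0 means "no measurement" and is left at 0;
    # the remaining depths are scaled to zero mean and unit variance.
    valid = resized > 0
    if valid.any():
        mean = resized[valid].mean()
        std = resized[valid].std()
        resized = np.where(valid, (resized - mean) / (std + 1e-6), 0.0)
    return resized.astype(np.float32)
```

A call such as `crop_depth_patch(depth_frame, (50, 50, 250, 250))` returns a 288x288 `float32` array for one detected person. A deployed pipeline would more likely use an interpolating resize and the normalization constants used during training.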