
[Paper Review] Project Starline: A high-fidelity telepresence system

bona.0 2023. 2. 3. 15:51

  • Pod 
    • RGB camera for texture 
    • a pair of monochrome NIR cameras for stereo
    • Depth maps are created at 60 Hz by incorporating information from overlapping time windows of 5 NIR image pairs

  • 3D Face Tracking
    • Eye locations: determine stereo viewpoints for rendering
    • Mouth position: enable beamforming in audio capture
    • 4 synchronized monochrome cameras detect the face and locate 34 facial landmarks
    • Determine the 2D locations of face features (eyes, mouth, and ears) as weighted combinations of nearby landmarks
    • For each feature, they use triangulation to obtain its 3D position
    • Mitigate the latency
      • Extrapolate the 3D positions of the tracked features
      • Apply double exponential smoothing
      • Remove the remaining small noise using a “change band” hysteresis filter
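
The latency-mitigation steps above can be sketched for a single coordinate of one tracked feature. This is only a minimal illustration, not the paper's implementation: the class names, the `alpha`, `beta`, and `band` values, and the dead-band form of the hysteresis filter are all assumptions made here. Double exponential smoothing (Holt's method) keeps a level and a trend, so the position can be extrapolated a few frames ahead, and the hysteresis filter holds its output still until the input leaves a small band around it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DoubleExponentialSmoother:
    """Holt's double exponential smoothing for one coordinate (illustrative)."""
    alpha: float = 0.5  # level smoothing factor (assumed value)
    beta: float = 0.5   # trend smoothing factor (assumed value)
    level: Optional[float] = None
    trend: float = 0.0

    def update(self, x: float) -> float:
        """Feed one measurement and return the smoothed value."""
        if self.level is None:
            self.level = x  # first sample initializes the level
            return x
        prev = self.level
        self.level = self.alpha * x + (1 - self.alpha) * (prev + self.trend)
        self.trend = self.beta * (self.level - prev) + (1 - self.beta) * self.trend
        return self.level

    def predict(self, steps: float) -> float:
        """Extrapolate `steps` frames ahead along the current trend."""
        if self.level is None:
            raise ValueError("no samples yet")
        return self.level + steps * self.trend


@dataclass
class ChangeBandFilter:
    """Dead-band hysteresis: output moves only when input leaves the band."""
    band: float = 0.5  # half-width of the band (assumed value)
    output: Optional[float] = None

    def update(self, x: float) -> float:
        if self.output is None:
            self.output = x
        elif x > self.output + self.band:
            self.output = x - self.band  # drag the band up to the input
        elif x < self.output - self.band:
            self.output = x + self.band  # drag the band down to the input
        return self.output


if __name__ == "__main__":
    smoother = DoubleExponentialSmoother()
    band = ChangeBandFilter(band=0.5)
    for x in [0.0, 1.1, 1.9, 3.2, 4.0, 5.1]:
        smoother.update(x)
        # Predict two frames ahead, then suppress small jitter.
        print(band.update(smoother.predict(2.0)))
```

In a real tracker each of the three coordinates of each feature would get its own smoother and filter, and the lookahead would be chosen to match the measured end-to-end latency.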

 

  • Compression
    • They transmit multiple color images and stereo-reconstructed depth maps using traditional video compression
    • Both the color and depth streams are encoded using the H.265 codec with YUV420 chroma subsampling
    • Encoding and decoding latency is reduced by omitting bidirectionally encoded frames
    • The color and depth streams are sent separately, and their “fusion” is delayed until the rendering (Section 4.6 of the paper) of the left and right eye views in the receiving client

 

  • Transmission
    • For each frame, they gather the encoded video packets from all 7 video streams (as well as the tracked face points) into a single data payload, and transmit it using WebRTC
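
To make the "single data payload" idea concrete, here is a rough sketch of bundling one frame's encoded packets and face points into one byte string and splitting it again on the receiver. The wire layout (a small header followed by length-prefixed packets and float32 triples) and the names `pack_frame` and `unpack_frame` are hypothetical and chosen only for illustration; the notes above do not describe the actual format, and the WebRTC transport itself is left out.

```python
import struct
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]

# Hypothetical header: frame id (uint32), stream count and point count (uint16 each).
_HEADER = "<IHH"


def pack_frame(frame_id: int,
               stream_packets: Sequence[bytes],
               face_points: Sequence[Point3]) -> bytes:
    """Bundle all per-frame data into one payload (assumed layout)."""
    parts = [struct.pack(_HEADER, frame_id, len(stream_packets), len(face_points))]
    for packet in stream_packets:
        parts.append(struct.pack("<I", len(packet)))  # length prefix
        parts.append(packet)
    for x, y, z in face_points:
        parts.append(struct.pack("<3f", x, y, z))
    return b"".join(parts)


def unpack_frame(payload: bytes) -> Tuple[int, List[bytes], List[Point3]]:
    """Inverse of pack_frame."""
    frame_id, n_streams, n_points = struct.unpack_from(_HEADER, payload, 0)
    offset = struct.calcsize(_HEADER)
    streams = []
    for _ in range(n_streams):
        (size,) = struct.unpack_from("<I", payload, offset)
        offset += 4
        streams.append(payload[offset:offset + size])
        offset += size
    points = []
    for _ in range(n_points):
        points.append(struct.unpack_from("<3f", payload, offset))
        offset += 12  # three float32 values
    return frame_id, streams, points


if __name__ == "__main__":
    # 7 dummy "encoded" packets standing in for the 4 color + 3 depth streams.
    packets = [bytes([i]) * (i + 1) for i in range(7)]
    points = [(0.5, 1.25, -2.0), (0.0, 0.25, 1.0)]
    payload = pack_frame(42, packets, points)
    print(len(payload), unpack_frame(payload)[0])
```

Sending everything for a frame as one unit means the receiver never has to re-synchronize streams that arrived at different times, which is presumably the motivation for the single payload.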

 

  • Rendering
    • On the receiving client, decompress the 3 depth maps and 4 color images
    • (1) For each of the 4 color cameras, compute a shadow map using raycasting by finding for each ray the first intersection with a surface fused from the input depth maps.
    • (2) For each of the 2 user views (left and right eye), compute an output depth map using the same raycasting algorithm.
    • (3) For each output depth map point, compute a weighted color blend of the images determined visible by the shadow maps computed in step 1.
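
Step 3 can be illustrated for a single output point. The sketch below assumes that, for each color camera, we already have the color sampled at the point's projection, a blend weight, the point's depth as seen from that camera, and the first-surface depth stored in that camera's shadow map. The `CameraSample` type, the `tolerance` value, and the way weights are supplied are assumptions made here; the notes do not say how the paper chooses the weights.

```python
from typing import NamedTuple, Sequence, Tuple

Color = Tuple[float, float, float]


class CameraSample(NamedTuple):
    """Per-camera inputs for one output point (hypothetical structure)."""
    color: Color         # color sampled from this camera's image
    weight: float        # blend weight (how it is chosen is an assumption)
    point_depth: float   # depth of the output point as seen from this camera
    shadow_depth: float  # first-surface depth from this camera's shadow map


def blend_visible(samples: Sequence[CameraSample],
                  tolerance: float = 0.01,
                  fallback: Color = (0.0, 0.0, 0.0)) -> Color:
    """Weighted average of the colors from cameras that actually see the point."""
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for s in samples:
        # The point is occluded if something nearer to the camera was hit first.
        if s.point_depth > s.shadow_depth + tolerance:
            continue
        total += s.weight
        for i in range(3):
            acc[i] += s.weight * s.color[i]
    if total == 0.0:
        return fallback  # no camera sees this point
    return (acc[0] / total, acc[1] / total, acc[2] / total)


if __name__ == "__main__":
    samples = [
        CameraSample((1.0, 0.0, 0.0), 1.0, 2.00, 2.00),  # visible
        CameraSample((0.0, 0.0, 1.0), 3.0, 1.50, 1.50),  # visible
        CameraSample((0.0, 1.0, 0.0), 5.0, 2.50, 1.00),  # occluded
    ]
    print(blend_visible(samples))
```

The shadow-map test is what keeps a camera from painting its color onto a surface it cannot see, for example the side of the face hidden from that camera.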

 

Reference)

https://research.google/pubs/pub50903/

 
