NRT Trainee Team Presentations

Fall 2021 Presentation Abstracts


An acoustic and computer graphics perspective on HRTF
personalization for spatial audio in AR/VR

Yuxiang Wang and Neil Zhang

The Head-Related Transfer Function (HRTF) describes how humans perceive spatial audio. HRTFs vary with source direction and are individual to each listener. Most current spatial audio displays provide spatial cues with a generic HRTF, which can result in vague or misplaced spatial perception. HRTF personalization has wide applications in AR/VR games and virtual scene reconstruction.
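At its core, rendering a source at a given direction amounts to filtering a mono signal with that direction's HRTF, measured in the time domain as a pair of head-related impulse responses (HRIRs). Below is a minimal sketch of this idea, assuming the HRIRs for the desired direction are already available; the function name and array layout are illustrative, not taken from the project:

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Spatialize a mono signal for one source direction by convolving it
    with the left- and right-ear head-related impulse responses (HRIRs),
    i.e. the time-domain HRTFs for that direction."""
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right], axis=0)  # (2, n_samples) binaural signal
```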

Achieving this goal efficiently requires research background in both acoustics and machine learning, as well as input from computer vision. We combine our strengths across these fields to develop a unique approach to this research question.

We first found efficient representations that reduce the dimensionality of both the HRTF data and the head-torso geometry. We then built a machine learning framework that relates the compact representations of the acoustic and geometric data. By combining these methods, we were able to predict global HRTFs with reasonably low error across all spatial and temporal dimensions.
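As one illustration of such a pipeline (not the exact models used in this project), a common instantiation pairs PCA for the compact acoustic representation with a small regressor mapping geometric features to the PCA coefficients; all shapes, hyperparameters, and data below are placeholders:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

# Placeholder data: one flattened HRTF magnitude vector per subject,
# plus a vector of head-torso geometry features per subject.
rng = np.random.default_rng(0)
hrtfs = rng.random((100, 2048))     # stand-in for measured HRTF data
geometry = rng.random((100, 32))    # stand-in for head-torso features

# Step 1: compact representation of the acoustic data.
pca = PCA(n_components=16)
hrtf_codes = pca.fit_transform(hrtfs)

# Step 2: learn the relation between geometry and the compact codes.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(geometry, hrtf_codes)

# Step 3: predict a personalized HRTF from a new subject's geometry.
predicted_hrtf = pca.inverse_transform(model.predict(geometry[:1]))
```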


Augmented Pianoroll

Frank Cwitkowitz, Eleni Patelaki, & Jeremy Goodsell

Learning to play the piano can be a difficult, lengthy, and frustrating task. What’s more, the process typically involves learning to read sheet music, which many people find cumbersome and unintuitive. In the digital era, the need to read sheet music has been circumvented by animated “pianoroll”, a binary time-frequency representation that indicates which piano keys should be active over time. This representation makes playing the piano more accessible, but it remains disconnected from the instrument, since one must follow the animation on a separate device. In this project, we overlay animated pianoroll directly atop a physical piano using the HoloLens 2. The application is general: through a user interface, the user can select any MIDI file in the device’s storage to display as augmented pianoroll, which is anchored to the piano via QR code tracking. Furthermore, we provide real-time feedback on the user’s performance by comparing the expected key activity to the key activity estimated by an on-device piano transcription model. The application was successful, and we plan further development in preparation for an eventual release.
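For concreteness, the two core data operations, turning a MIDI file into the binary pianoroll to animate and scoring a performance against it, can be sketched offline in Python. The on-device implementation differs; the frame rate, function names, and use of the pretty_midi library here are assumptions for illustration:

```python
import numpy as np
import pretty_midi

def expected_pianoroll(midi_path, fs=20):
    """Render a MIDI file as a binary pianoroll: rows are the 128 MIDI
    pitches, columns are frames sampled at fs frames per second."""
    midi = pretty_midi.PrettyMIDI(midi_path)
    return (midi.get_piano_roll(fs=fs) > 0).astype(np.uint8)

def performance_feedback(expected, estimated):
    """Compare expected key activity with the activity estimated by a
    piano transcription model; returns the fraction of matching cells."""
    frames = min(expected.shape[1], estimated.shape[1])
    return float(np.mean(expected[:, :frames] == estimated[:, :frames]))
```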


An AR Solution for Blind Navigation

Narges Mohammadi and Shadi Sartipi

This project, entitled “Developing AR/VR solutions for blind people navigation,” develops a user-friendly application that translates the visual information captured by the device’s cameras and sensors into a mix of audio signals, giving the user awareness of the surrounding environment. The application requires scene understanding via spatial meshing, machine vision for object detection, and spatial audio generation. In effect, this work gives a sound to silent objects in the environment and conveys the semantic information of the surroundings through combinations of spatial audio cues, so that users can not only avoid obstacles but also locate objects of interest, all without cognitive overload.
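As a simplified illustration of giving a sound to a silent object, a detected object's position can be mapped to a stereo cue. The project itself would use full HRTF-based spatialization on-device; the coordinate convention, function name, and constant-power panning below are assumptions standing in for that:

```python
import numpy as np

def sonify_object(position, earcon):
    """Map a detected object's 3-D position (metres, camera frame:
    x right, y up, z forward) to a simple stereo cue, using azimuth-based
    constant-power panning and 1/distance loudness as coarse stand-ins
    for full HRTF spatialization."""
    x, _, z = position
    azimuth = np.arctan2(x, z)                  # left/right angle of the object
    distance = max(float(np.linalg.norm(position)), 0.1)
    gain = 1.0 / distance                       # nearer objects sound louder
    pan = (np.sin(azimuth) + 1.0) / 2.0         # 0 = hard left, 1 = hard right
    left = earcon * gain * np.sqrt(1.0 - pan)
    right = earcon * gain * np.sqrt(pan)
    return np.stack([left, right], axis=0)      # (2, n_samples) stereo signal
```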