Publications

For further stats and details, check out my Google Scholar Profile.

Peer Reviewed Articles

  1. "AudioChat: Unified Audio Storytelling, Editing, and Understanding with Transfusion Forcing", William Chen, Prem Seetharaman, Rithesh Kumar, Oriol Nieto, Shinji Watanabe, Justin Salamon, Zeyu Jin, (under review), 2026. arXiv Demo
  2. "TAC: Timestamped Audio Captioning", Sonal Kumar, Prem Seetharaman, Ke Chen, Oriol Nieto, Jiaqi Su, Zhepei Wang, Rithesh Kumar, Dinesh Manocha, Nicholas J. Bryan, Zeyu Jin, Justin Salamon, (under review), 2026. arXiv Demo
  3. "Generative Audio Extension and Morphing", Prem Seetharaman,* Oriol Nieto,* Justin Salamon, Proc. of the 51st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Barcelona, Spain, 2026. arXiv Demo
  4. "Mix2Morph: Learning Sound Morphing From Noisy Mixes", Annie Chu, Hugo Flores García, Oriol Nieto, Justin Salamon, Bryan Pardo, Prem Seetharaman, Proc. of the 51st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Barcelona, Spain, 2026. arXiv Demo
  5. "PromptSep: Generative Audio Separation Via Multimodal Prompting", Yutong Wen, Ke Chen, Prem Seetharaman, Oriol Nieto, Jiaqi Su, Rithesh Kumar, Minje Kim, Paris Smaragdis, Zeyu Jin, Justin Salamon, Proc. of the 51st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Barcelona, Spain, 2026. arXiv Demo
  6. "AudioCards: Structured Metadata Improves Audio Language Models For Sound Design", Sripathi Sridhar, Prem Seetharaman, Oriol Nieto, Mark Cartwright, Justin Salamon, Proc. of the 51st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Barcelona, Spain, 2026. arXiv Demo
  7. "Multi-Domain Audio Question Answering Benchmark Toward Acoustic Content Reasoning", Chao-Han Huck Yang, Sreyan Ghosh, Qing Wang, Jaeyeon Kim, Hengyi Hong, Sonal Kumar, Guirui Zhong, Zhifeng Kong, S. Sakshi, Vaibhavi Lokegaonkar, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Gunhee Kim, Jun Du, Rafael Valle, Bryan Catanzaro, Proc. of the 51st International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Barcelona, Spain, 2026. arXiv
  8. "SoundStager: Interactive Design of Story-Driven GenAI Soundscapes for Video", Suhyeon Yoo, Adolfo Hernandez-Sebastian, Prem Seetharaman, Justin Salamon, Oriol Nieto, Anh Truong, Proc. of the ACM Conference on Human Factors in Computing Systems (CHI). Barcelona, Spain, 2026. PDF Video
  9. "SILA: Signal-to-Language Augmentation for Enhanced Control in Text-to-Audio Generation", Sonal Kumar, Prem Seetharaman, Justin Salamon, Dinesh Manocha, Oriol Nieto, Proc. of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Tahoe City, CA, USA, 2025. arXiv Demo
  10. "FLAM: Frame-Wise Language-Audio Modeling", Yusong Wu, Christos Tsirigotis, Ke Chen, Cheng-Zhi Anna Huang, Aaron Courville, Oriol Nieto, Prem Seetharaman, Justin Salamon, Proc. of the 42nd International Conference on Machine Learning (ICML). Vancouver, BC, Canada, 2025. arXiv Code Demo
  11. "Video-Guided Foley Sound Generation with Multimodal Controls", Ziyang Chen, Prem Seetharaman, Bryan Russell, Oriol Nieto, David Bourgin, Andrew Owens, Justin Salamon, Proc. of the IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR). Nashville, TN, USA, 2025. arXiv Demo
  12. "Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs", Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha, Proc. of the 13th International Conference on Learning Representations (ICLR). Singapore, 2025. arXiv Demo
  13. 🏆 Top 5.1% conference paper (spotlighted)
    "MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark", S. Sakshi, Utkarsh Tyagi, Sonal Kumar, Ashish Seth, Ramaneswaran Selvakumar, Oriol Nieto, Ramani Duraiswami, Sreyan Ghosh, Dinesh Manocha, Proc. of the 13th International Conference on Learning Representations (ICLR). Singapore, 2025. arXiv Code Demo
  14. "Sketch2Sound: Controllable Audio Generation via Time-Varying Signals and Sonic Imitations", Hugo Flores García, Oriol Nieto, Justin Salamon, Bryan Pardo, Prem Seetharaman, Proc. of the 50th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Hyderabad, India, 2025. arXiv Demo
  15. "ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds", Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Proc. of the 50th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Hyderabad, India, 2025. arXiv Code
  16. "Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning", Ilaria Manco, Justin Salamon, Oriol Nieto, Proc. of the 25th International Society for Music Information Retrieval Conference (ISMIR). San Francisco, CA, USA, 2024. arXiv
  17. 🏆 Top 5% conference paper (oral)
    "GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities", Sreyan Ghosh, Sonal Kumar, Ashish Seth, Chandra Kiran Reddy Evuru, Utkarsh Tyagi, S. Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Proc. of the 19th Empirical Methods in Natural Language Processing Conference (EMNLP). Miami, Florida, USA, 2024. arXiv Code Demo
  18. "CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models", Sreyan Ghosh, Ashish Seth, Sonal Kumar, Utkarsh Tyagi, Chandra Kiran Evuru, S. Ramaneswaran, S. Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha, Proc. of the 12th International Conference on Learning Representations (ICLR). Vienna, Austria, 2024. arXiv Code Demo
  19. "Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries", Julia Wilkins, Justin Salamon, Magdalena Fuentes, Juan Pablo Bello, Oriol Nieto, Proc. of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). New Paltz, NY, USA, 2023. arXiv Demo
  20. "Efficient Spoken Language Recognition via Multilabel Classification", Oriol Nieto, Zeyu Jin, Franck Dernoncourt, Justin Salamon, Proc. of the 24th InterSpeech Conference. Dublin, Ireland, 2023. arXiv
  21. 🏆 Top 10% conference paper (highlighted)
    "Language-Guided Audio-Visual Source Separation via Trimodal Consistency", Reuben Tan, Andrea Burns, Arijit Ray, Bryan A. Plummer, Oriol Nieto, Justin Salamon, Bryan Russell, Kate Saenko, Proc. of the IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR). Vancouver, BC, Canada, 2023. arXiv Code
  22. "Audio-Text Models Do Not Yet Leverage Natural Language", Ho-Hsiang Wu, Oriol Nieto, Juan Pablo Bello, Justin Salamon, Proc. of the 48th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Rhodes, Greece, 2023. arXiv
  23. "Music Enhancement Via Image Translation and Vocoding", Nikhil Kandpal, Oriol Nieto, Zeyu Jin, Proc. of the 47th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Singapore, 2022. arXiv Code
  24. "Deep Embeddings and Section Fusion Improve Music Segmentation", Justin Salamon, Oriol Nieto, Nicholas J. Bryan, Proc. of the 22nd International Society for Music Information Retrieval Conference (ISMIR), pp. 594-601, 2021. PDF
  25. "Multimodal Metric Learning for Tag-Based Music Retrieval", Minz Won, Sergio Oramas, Oriol Nieto, Fabien Gouyon, Xavier Serra, Proc. of the 46th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Toronto, Canada, 2021. arXiv
  26. "Audio-Based Music Structure Analysis: Current Trends, Open Challenges, and Applications", Oriol Nieto, Gautham J. Mysore, Cheng-i Wang, Jordan B. L. Smith, Jan Schlüter, Thomas Grill, Brian McFee, Transactions of the International Society for Music Information Retrieval (TISMIR), 3(1), pp. 246-263, 2020. DOI: 10.5334/tismir.54. PDF
  27. "Mood Classification Using Listening Data", Filip Korzeniowski, Oriol Nieto, Matthew McCallum, Minz Won, Sergio Oramas, Erik Schmidt, Proc. of the 21st International Society for Music Information Retrieval Conference (ISMIR). Montreal, Quebec, Canada, 2020. arXiv
  28. "Data-Driven Harmonic Filters For Audio Representation Learning", Minz Won, Sanghyuk Chun, Oriol Nieto, Xavier Serra, Proc. of the 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Barcelona, Spain, 2020. PDF
  29. "The Harmonix Set: Beats, Downbeats, and Functional Segment Annotations of Western Popular Music", Oriol Nieto, Matthew McCallum, Matthew Davies, Andrew Robertson, Adam Stark, Eran Egozy, Proc. of the 20th International Society for Music Information Retrieval Conference (ISMIR). Delft, The Netherlands, 2019. PDF Code
  30. "Investigating Musical Pattern Ambiguity in a Human Annotated Dataset", Iris Yuping Ren, Oriol Nieto, Hendrik Vincent Koops, Anja Volk, Wouter Swierstra, Proc. of the 15th International Conference on Music Perception and Cognition (ICMPC). Graz, Austria, 2018. PDF
  31. 🏆 Best Student Paper
    "End-to-End Learning for Music Audio Tagging at Scale", Jordi Pons, Oriol Nieto, Matthew Prockup, Erik Schmidt, Andreas Ehmann, Xavier Serra, Proc. of the 19th International Society for Music Information Retrieval Conference (ISMIR). Paris, France, 2018. arXiv
  32. "Multimodal Deep Learning for Music Genre Classification", Sergio Oramas, Francesco Barbieri, Oriol Nieto, Xavier Serra, Transactions of the International Society for Music Information Retrieval (TISMIR), 2018. arXiv
  33. "Predicting Audio Advertisement Quality", Samaneh Ebrahimi, Hossein Vahabi, Matthew Prockup, Oriol Nieto, Proc. of the 11th ACM International Conference on Web Search and Data Mining (WSDM), 2018. arXiv
  34. "A Deep Multimodal Approach for Cold-start Music Recommendation", Sergio Oramas, Oriol Nieto, Mohamed Sordo, Xavier Serra, Proc. of the 2nd Workshop on Deep Learning for Recommender Systems (DLRS), at RecSys. Como, Italy, 2017. arXiv
  35. "Evaluating Hierarchical Structure in Music Annotations", Brian McFee, Oriol Nieto, Morwaread M. Farbood, Juan Pablo Bello, Frontiers in Psychology, 8, 2017. DOI: 10.3389/fpsyg.2017.01337. PDF
  36. 🏆 Best Presentation
    "Multi-label Music Genre Classification from Audio, Text, and Images Using Deep Features", Sergio Oramas, Oriol Nieto, Francesco Barbieri, Xavier Serra, Proc. of the 18th International Society for Music Information Retrieval Conference (ISMIR). Suzhou, China, 2017. arXiv
  37. "Systematic Exploration of Computational Music Structure Research", Oriol Nieto, Juan Pablo Bello, Proc. of the 17th International Society for Music Information Retrieval Conference (ISMIR). New York City, NY, USA, 2016. PDF Code
  38. "Hierarchical Evaluation of Segment Boundary Detection", Brian McFee, Oriol Nieto, Juan Pablo Bello, Proc. of the 16th International Society for Music Information Retrieval Conference (ISMIR). Málaga, Spain, 2015. PDF
  39. "librosa: Audio and Music Signal Analysis in Python", Brian McFee, Colin Raffel, Dawen Liang, Daniel P. W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, Proc. of the 14th Python in Science Conference (SciPy). Austin, TX, USA, 2015. PDF
  40. "Music Segment Similarity Using 2D-Fourier Magnitude Coefficients", Oriol Nieto, Juan Pablo Bello, Proc. of the 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Florence, Italy, 2014. PDF
  41. 🏆 Best Poster Presentation
    "MIR_EVAL: A Transparent Implementation of Common MIR Metrics", Colin Raffel, Brian McFee, Eric J. Humphrey, Justin Salamon, Oriol Nieto, Dawen Liang, Daniel P. W. Ellis, Proc. of the 15th International Society for Music Information Retrieval Conference (ISMIR). Taipei, Taiwan, 2014. PDF
  42. "Identifying Polyphonic Patterns from Audio Recordings Using Music Segmentation Techniques", Oriol Nieto, Morwaread M. Farbood, Proc. of the 15th International Society for Music Information Retrieval Conference (ISMIR). Taipei, Taiwan, 2014. PDF
  43. "Embodying Theoretical Research in Music Cognition: Four Proposals for Theory-Driven Experimentation", Andreu Ballús, Eric Arnau, Oriol Nieto, Frederic Font, Alba G. Torrents, Proc. of the Annual Meeting of the Cognitive Science Society. Quebec City, Quebec, Canada, 2014. PDF
  44. "Convex Non-Negative Matrix Factorization for Automatic Music Structure Identification", Oriol Nieto, Tristan Jehan, Proc. of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Vancouver, BC, Canada, 2013. PDF
  45. "Data Driven and Discriminative Projections for Large-Scale Cover Song Identification", Eric J. Humphrey, Oriol Nieto, Juan Pablo Bello, Proc. of the 14th International Society for Music Information Retrieval Conference (ISMIR). Curitiba, Brazil, 2013. PDF
  46. "Unsupervised Clustering of Extreme Vocal Effects", Oriol Nieto, Proc. of the 10th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research (AQL). Cincinnati, OH, USA, 2013. PDF
  47. "Fortissimo: Force-Feedback for Mobile Devices", Tae Hong Park, Oriol Nieto, Proc. of the 13th International Conference on New Interfaces for Musical Expression (NIME). Daejeon and Seoul, Korea, 2013. PDF
  48. "Even More Tactile Feedback for Mobile Devices", Tae Hong Park, Langdon Crawford, Oriol Nieto, Proc. of the 39th International Computer Music Conference (ICMC). Perth, Australia, 2013. PDF
  49. "Perceptual Evaluation of Automatically Extracted Musical Motives", Oriol Nieto, Morwaread M. Farbood, Proc. of the 12th International Conference on Music Perception and Cognition (ICMPC), pp. 723-727. Thessaloniki, Greece, 2012. PDF
  50. "Compressing Music Recordings Into Audio Summaries", Oriol Nieto, Eric J. Humphrey, Juan Pablo Bello, Proc. of the 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 313-318. Porto, Portugal, 2012. PDF

Algorithms

  1. "MIREX 2016 Entry: MSAF V0.1.0 Submission", Oriol Nieto, Music Information Retrieval Evaluation eXchange (MIREX). New York City, NY, USA, 2016. PDF Code
  2. "MIREX 2014 Entry: 2D Fourier Magnitude Coefficients", Oriol Nieto, Juan Pablo Bello, Music Information Retrieval Evaluation eXchange (MIREX). Taipei, Taiwan, 2014. PDF Code
  3. "MIREX 2014 Entry: Music Segmentation Techniques and Greedy Path Finder Algorithm to Discover Musical Patterns", Oriol Nieto, Morwaread M. Farbood, Music Information Retrieval Evaluation eXchange (MIREX). Taipei, Taiwan, 2014. PDF Code
  4. "MIREX 2014 Entry: Convex Non-negative Matrix Factorization", Oriol Nieto, Tristan Jehan, Music Information Retrieval Evaluation eXchange (MIREX). Taipei, Taiwan, 2014. PDF Code
  5. "MIREX 2013: Discovering Musical Patterns Using Audio Structural Segmentation Techniques", Oriol Nieto, Morwaread M. Farbood, Music Information Retrieval Evaluation eXchange (MIREX). Curitiba, Brazil, 2013. PDF

Theses

  1. "Discovering Structure in Music: Automatic Approaches and Perceptual Evaluations", Oriol Nieto, New York University. PhD Dissertation, 2015. PDF Slides Video
  2. "Voice Transformations for Extreme Vocal Effects", Oriol Nieto, Pompeu Fabra University. Master's Thesis, 2008. PDF
  3. "Desenvolupament Open Source per a E-Learning-II", Oriol Nieto, Polytechnic University of Catalonia. Undergraduate Thesis, 2007. PDF

Selected Talks

  1. "Project Sound Stager", Oriol Nieto, Adobe MAX Sneaks 2025. Los Angeles, CA, USA, 2025. Video
  2. "GenAI for Sound Design", Oriol Nieto, Conversational AI Reading Group at Mila. Montreal, Quebec, Canada, 2025. Video
  3. "Overview, Challenges, and Applications of Audio-based Music Structure Analysis", Oriol Nieto, Women in Music Information Retrieval Workshop (ISMIR). Virtual, 2021. Slides
  4. "Music Recommendation with Waveform-based Architectures", Oriol Nieto, 4th Global AI Conference. Santa Clara, CA, USA, 2020. Slides
  5. "Spectral Analysis and Detection of Extreme Vocal Effects (with CNNs)", Oriol Nieto, Research Seminar. Universitat Pompeu Fabra. Barcelona, Spain, 2019. Slides
  6. "Spectral Analysis and Detection of Extreme Vocal Effects", Oriol Nieto, 2nd International Symposium on Distorted Voices. São Paulo, Brazil, 2019. Slides
  7. "Recommending Music with Waveform Architectures at Scale (Extended Version)", Oriol Nieto, Seminar Series in Data Science. University of San Francisco. San Francisco, CA, USA, 2019. Slides
  8. "Recommending Music with Waveform Architectures at Scale", Oriol Nieto, Deep Learning Barcelona Symposium. Pompeu Fabra University. Barcelona, Spain, 2018. Slides Video
  9. "Cold-Start Music Recommendation Using Multimodal Deep Architectures", Oriol Nieto, Systematic Approaches to Deep Learning Methods for Audio. Erwin Schrödinger Institute, University of Vienna. Vienna, Austria, 2017. PDF
  10. "Long Tail Music Recommendation Using Deep Architectures", Oriol Nieto, International Workshop on Deep Learning for Music (IJCNN). Anchorage, AK, USA, 2017. PDF
  11. "Deep Learning for Large-Scale Music Recommendation", Oriol Nieto, Data-Driven Research in Music Cognition. Stanford University. Stanford, CA, USA, 2017. PDF
  12. "Deep Learning for Music Recommendation: Machine Listening and Collaborative Filtering", Oriol Nieto, Seminar on Music Knowledge Extraction Using Machine Learning. Pompeu Fabra University. Barcelona, Spain, 2016. PDF
  13. "Deep Learning for Large Scale Music Recommendation", Oriol Nieto, Biostat Seminar. Stanford University. Stanford, CA, USA, 2016. PDF
  14. "Multiple Annotations and Subjectivity in the Identification of Segment Boundaries in Music", Oriol Nieto, Morwaread M. Farbood, Cognitive Music Information Retrieval (CogMIR). Toronto, ON, Canada, 2014. PDF
  15. "Music Segment Similarity Using 2D-Fourier Magnitude Coefficients", Oriol Nieto, Juan Pablo Bello, North East Music Information Special Interest Group (NEMISIG). New York, NY, USA, 2014. PDF
  16. "A Perceptually Based Evaluation of Music Boundaries", Oriol Nieto, Morwaread M. Farbood, Juan Pablo Bello, Cognitive Music Information Retrieval (CogMIR). Toronto, ON, Canada, 2013. PDF
  17. "Music Structure Analysis and New Musical Interfaces", Oriol Nieto, Pompeu Fabra University. Barcelona, Spain, 2013. PDF
  18. "Music Structure Analysis by Matrix Factorization", Oriol Nieto, Tristan Jehan, North East Music Information Special Interest Group (NEMISIG). Boston, MA, USA, 2013. PDF

Music

  1. "La Bossa d'Urina: El Primer Disc", Daniel Bolsa, Oriol Nieto, Published by Record Union, 2022. Pandora Spotify Amazon
  2. "Rumbahía: Casi al Compás", Luis Carlos Cobo, Javier Cardona, Erin E. Grant, Diego Melendo, Oriol Nieto, Published by CDBaby, 2021. Pandora Spotify Amazon
  3. "Rumbahía: Aprendiendo", Luis Carlos Cobo, Javier Cardona, Jess Gallegos, Diego Melendo, Oriol Nieto, Published by CDBaby, 2019. Pandora Spotify Amazon
  4. "La Bossa d'Urina: Merda Fina", Daniel Bolsa, Oriol Nieto, Published by Record Union, 2018. Pandora iTunes Spotify Amazon
  5. "Arkaen: Arkaen", Oriol Nieto, Sean Henson, Joey Nuñez, Eli Remas, Garey Rickher, Published by Record Union, 2017. Pandora iTunes Spotify Amazon
  6. "La Bossa d'Urina: La Bossa d'Urina", Daniel Bolsa, Oriol Nieto, Published by Cydonia Records, 2015. Pandora iTunes Spotify Amazon
  7. "Sargon: Vida", Carles Ferreiro, Jordi Llobet, Oriol Nieto, Marc Prim, Published by Weight Recordings, 2009. Pandora iTunes Spotify Amazon
  8. "Sargon: Transcriptions", Carles Ferreiro, Jordi Llobet, Oriol Nieto, Marc Prim, Published by Big Bang Records, 2005. iTunes Spotify

Other

  1. "Automatic Music Tagging with Harmonic CNN", Minz Won, Sanghyuk Chun, Oriol Nieto, Xavier Serra, Late Breaking Session of the International Society for Music Information Retrieval Conference (ISMIR). Delft, The Netherlands, 2019. PDF
  2. "MSAF: Music Structure Analysis Framework", Oriol Nieto, Juan Pablo Bello, International Society for Music Information Retrieval Conference (ISMIR). Málaga, Spain, 2015. PDF
  3. "2013 Late Break Session on Music Segmentation", Oriol Nieto, Jordan B. L. Smith, Proc. of the 14th International Society for Music Information Retrieval Conference (ISMIR). Curitiba, Brazil, 2013. PDF
  4. "Late-break Session on Music Structure Analysis", Bruno Rocha, Jordan B. L. Smith, Geoffroy Peeters, Joe Cheri Ross, Oriol Nieto, Jan Van Balen, Proc. of the 13th International Society for Music Information Retrieval Conference (ISMIR). Porto, Portugal, 2012. PDF
  5. "Sistemas Operativos: Cuaderno de Laboratorio", Oriol Nieto, Àlex Pajuelo, David López, Amador Millan, A. Heredero, Alex Duran, José R. Herrero, Xavier Verdú, Yolanda Becerra, Enric Morancho, Department of Computer Architecture. Polytechnic University of Catalonia, 2007.