Learning Why We Do What We Do - Understanding Human Actions Using Neurosymbolic AI

This research is supported by the National Research Foundation Fellowship. Duration: April 2022 to March 2027.

Recognizing what we do, or what we will do next, has been well investigated in the action recognition and anticipation literature within the Computer Vision (CV) and Machine Learning (ML) research communities. However, computational learning of why we do what we do remains far less explored. The objective of this project is to develop Artificial Intelligence (AI) models that process videos and learn why humans do what they do, by reducing the gap between neural and symbolic representations through novel neurosymbolic AI. These neurosymbolic AI models can see what we do and then reason about our behavior to interpret, justify, explain, and understand our actions.

Code is also available here!

Publications

2024
CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes
Paritosh Parmar, Eric Peh, Ruirui Chen, Ting En Lam, Yuhan Chen, Elston Tan, Basura Fernando
NeurIPS 2024 (Accepted)
PDF Code
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
Shantanu Jaiswal, Debaditya Roy, Basura Fernando, Cheston Tan
NeurIPS 2024 (Accepted)
PDF Code (coming soon)
RCA: Region Conditioned Adaptation for Visual Abductive Reasoning
Hao Zhang, Yeo Keat Ee, Basura Fernando
ACM MM 2024 (Accepted)
PDF Code
Predicting the Next Action by Modeling the Abstract Goal
Debaditya Roy, Basura Fernando
ICPR 2024 (Accepted)
PDF Code
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Ishaan Singh Rawal, Alexander Matyasko, Shantanu Jaiswal, Basura Fernando, Cheston Tan
ICML 2024 (Accepted)
PDF Code
Who are you referring to? Coreference resolution in image narrations
Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen
ICCV 2023
PDF CIN Dataset Code Bibtex
Semi-supervised multimodal coreference resolution in image narrations
Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen
EMNLP 2023
PDF Bibtex
Energy-based Self-Training and Normalization for Unsupervised Domain Adaptation
Samitha Herath, Basura Fernando, Ehsan Abbasnejad, Munawar Hayat, Shahram Khadivi, Mehrtash Harandi, Hamid Rezatofighi, Reza Haffari
ICCV 2023
PDF Bibtex
ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition
Debaditya Roy, Dhruv Verma, Basura Fernando
WACV 2024
Best results on the SWiG benchmark (2024)
Best results on the imSitu benchmark (2024)
PDF Code Bibtex
A Region-Prompted Adapter Tuning for Visual Abductive Reasoning
Hao Zhang, Yeo Keat Ee, Basura Fernando
Preprint
PDF
Abductive Action Inference
Clement Tan, Chai Kiat Yeo, Cheston Tan, Basura Fernando
Preprint
PDF
Learning to Visually Connect Actions and their Effects
Eric Peh, Paritosh Parmar, Basura Fernando
Preprint
PDF

Past and Current Team Members

Dr. Chen Li (Research Scientist)

Dr. Paritosh Parmar (Research Scientist)

Dr. Hao Zhang (Research Scientist)

Dr. Thanh-Son Nguyen (Research Scientist)

Dr. Alexander Matyasko (Research Scientist)

Dr. Debaditya Roy (Research Scientist)

Eric Peh (Research Engineer)

Dhruv Verma (Research Engineer)

Ee Yeo Keat (Research Engineer)

Yang Hong (Research Engineer)

Shantanu Jaiswal (Research Engineer)

Clement Tan (PhD Student, NTU)

Jonathan Weston Burton-Barr (PhD Student, NTU)

Chinthani Sugandhika (PhD Student, NTU)