publications
publications by category in reverse chronological order. generated by jekyll-scholar.
2025
- [Mech Interp ’25] Group Equivariance Meets Mechanistic Interpretability: Equivariant Sparse Autoencoders. NeurIPS Mechanistic Interpretability and UniReps Workshops, 2025
- [FAccT ’25] Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods. ACM Conference on Fairness, Accountability, and Transparency, 2025
- [MLMP ’25] FreeFlow: Latent Flow Matching for Free Energy Difference Estimation. ICLR Workshop on Machine Learning Multiscale Processes, 2025
2024
- [CANS ’24] SplitOut: Out-of-the-Box Training-Hijacking Detection in Split Learning via Outlier Detection. International Conference on Cryptology and Network Security, 2024
2023
- [GLFrontiers ’23] Poisoning × Evasion: Symbiotic Adversarial Robustness for Graph Neural Networks. New Frontiers in Graph Learning Workshop (at NeurIPS ’23), 2023
- [(SRW) RANLP ’23] Detecting ChatGPT: A Survey of the State of Detecting ChatGPT-Generated Text. In Proceedings of the 8th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing, 2023