Katalina Hernandez - AI Alignment Forum

🔹 Background:
Lawyer by education, researcher by vocation.

Stress-Testing Reality Limited | Katalina Hernández | Substack

Started in data protection & privacy, now specializing in Privacy Engineering and Privacy-Enhancing Technologies (PETs). Pivoted to AI governance and AI Safety-adjacent research after realizing that compliance frameworks alone won’t ensure AI remains contestable and user-controllable.

🔹 Current Work: AI Safety for AI Governance
I work for a multinational as part of the Responsible AI team. I also carry out independent policy research, focusing on bridging the gap between AI regulation and AI Safety research for effective governance.

My main interest is in the intersection of AI governance, privacy engineering, and alignment-adjacent control mechanisms, particularly:

Inference contestability → Users should be able to challenge and correct AI-generated inferences in high-stakes decision-making processes.
Human autonomy → "Automation bias" is making the legal mechanism of "human in the loop" useless. It is also incompatible with any scalable oversight research direction in AI Safety. I argue that, at least for GenAI, users should better understand how the advice AI assistants give them may affect their cognition. Aligned AI should understand the importance of "challenging" human users to use their own reasoning rather than defaulting to its advice all the time.
Mechanistic interpretability & governance → Bridging interpretability research with real-world policy to ensure AI oversight isn’t just transparency theater.
UX for AI autonomy → Moving beyond "explainability" to user-controllable AI decision-making.

🔹 Why LessWrong?

To stress-test the intersection of alignment and control—where governance, interpretability, and UX meet.
To challenge assumptions about how much agency humans will retain over AI-driven cognition.
To explore practical, scalable solutions beyond theory.

🔹 Let’s Connect If:

You work in mechanistic interpretability, AI safety, UX for AI, or privacy-enhancing AI governance.
You have thoughts on how AI governance can mirror alignment research to ensure control remains feasible in advanced AI systems.
You’re skeptical about whether "user control" over AI inferences is technically viable—and want to stress-test the idea.

Yes, yes... Me, a **lawyer**, posting on LW feels like:
