Katalina Hernandez

🔹 Background:
Started in data privacy, specializing in Privacy-Enhancing Technologies (PETs) and AI compliance. Pivoted to AI governance and interpretability-adjacent research after realizing that compliance frameworks alone won’t ensure AI remains contestable and user-controllable.

🔹 Current Work: AI Governance & Autonomy by Design (AbD)
I focus on the intersection of AI governance, privacy engineering, and alignment-adjacent control mechanisms, particularly:

  • Inference contestability → Users should be able to challenge and correct AI-generated inferences.
  • Mechanistic interpretability & governance → Bridging interpretability research with real-world policy to ensure AI oversight isn’t just transparency theater.
  • UX for AI autonomy → Moving beyond "explainability" to user-controllable AI decision-making.

🔹 Interests & Open Questions:
I’m currently exploring how interpretability techniques (like feature steering) could enable real-time human intervention in AI reasoning.

  • Can users meaningfully contest and steer AI-generated inferences in real-world systems?
  • What’s the minimum viable product (MVP) for inference control in GenAI?
  • How do we ensure AI remains contestable at scale before overreliance becomes the norm?

🔹 Why LessWrong?

  • To stress-test the intersection of alignment and control—where governance, interpretability, and UX meet.
  • To challenge assumptions about how much agency humans will retain over AI-driven cognition.
  • To explore practical, scalable solutions beyond theory.

🔹 Let’s Connect If:

  • You work in mechanistic interpretability, AI safety, UX for AI, or privacy-enhancing AI governance.
  • You have thoughts on how AI governance can mirror alignment research to ensure control remains feasible in advanced AI systems.
  • You’re skeptical about whether "user control" over AI inferences is technically viable—and want to stress-test the idea.

💡 Autonomy isn’t just about ensuring AI aligns with human values. It’s about ensuring humans retain the power to challenge, override, and resist AI-driven optimization.

 

Yes, yes... Me, a **lawyer**, posting on LW feels like:
