Structured outputs from LLMs
This is more of a “to-do” than anything, but I really love the idea of inference-time next-token probability-distribution hacking to enforce structured outputs. There’s this neat Python library outlines that I’d love to explore in more depth - it even supports context-free grammars for output guiding!