State of the Field
Recent work on large language models (LLMs) focuses on strengthening their structural understanding and engagement capabilities, addressing critical limitations in current applications. One notable development introduces a specialized token that encapsulates graph structure, enabling better comprehension and reasoning on graph-related tasks, which could significantly benefit fields like data analysis and knowledge representation. Concurrently, iterative improvement processes are being deployed in production social chat applications, yielding measurable increases in user engagement and steerability, both crucial for retaining users on competitive platforms. Techniques for enhancing inter-head interaction in attention mechanisms are also being explored, with the aim of more efficient training and reduced memory usage, which is vital for deploying LLMs in resource-constrained environments. Furthermore, strategies that infuse randomness into prompts are being tested to boost output diversity, a key factor for creative applications. Collectively, these efforts reflect a concerted push toward making LLMs more versatile, efficient, and user-friendly in real-world scenarios.
Papers
<SOG_k>: One LLM Token for Explicit Graph Structural Understanding
Large language models show great potential in unstructured data understanding, but still face significant challenges with graphs due to their structural hallucination. Existing approaches mainly either...
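The abstract above is truncated, so the exact construction of the <SOG_k> token is not visible here. As a rough illustration of the general idea, a single learned embedding that summarizes an entire graph and is prepended to the LLM's token sequence, here is a minimal sketch; the GraphTokenEncoder class, the one-step message passing, and the mean pooling are all assumptions for illustration, not the paper's method.

```python
import torch
import torch.nn as nn

class GraphTokenEncoder(nn.Module):
    """Hypothetical encoder that compresses a whole graph into ONE token
    embedding: one round of message passing, then mean pooling. The real
    <SOG_k> construction is not visible in the truncated abstract."""
    def __init__(self, node_dim: int, llm_dim: int):
        super().__init__()
        self.message = nn.Linear(node_dim, node_dim)
        self.project = nn.Linear(node_dim, llm_dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_nodes, node_dim), adj: (num_nodes, num_nodes)
        neighbors = adj @ self.message(node_feats)      # one message-passing step
        pooled = (node_feats + neighbors).mean(dim=0)   # graph-level summary
        return self.project(pooled)                     # (llm_dim,): one "graph token"

# Prepend the graph token to ordinary token embeddings before the LLM forward pass.
enc = GraphTokenEncoder(node_dim=16, llm_dim=64)
feats = torch.randn(5, 16)                   # toy graph with 5 nodes
adj = (torch.rand(5, 5) > 0.5).float()
graph_tok = enc(feats, adj)                  # (64,)
text_embs = torch.randn(10, 64)              # embeddings of 10 text tokens
inputs = torch.cat([graph_tok.unsqueeze(0), text_embs], dim=0)  # (11, 64)
```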
CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production
This report presents CharacterFlywheel, an iterative flywheel process for improving large language models (LLMs) in production social chat applications across Instagram, WhatsApp, and Messenger. ...
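The pipeline details are cut off above, but a production flywheel of this kind typically alternates deployment, feedback collection, data curation, and fine-tuning. The skeleton below is a hypothetical sketch of that loop; none of the function names come from the report.

```python
# Hypothetical skeleton of one flywheel turn; the actual CharacterFlywheel
# stages are not described in the truncated abstract above.

def flywheel_iteration(model, deploy, collect_feedback, curate, finetune):
    """Serve the model, harvest engagement signals, distill them into
    training data, and return the next model iteration."""
    deploy(model)                       # ship the current model to production chat
    logs = collect_feedback()           # e.g. retention, ratings, steering requests
    train_data = curate(logs)           # filter high-signal conversations for training
    return finetune(model, train_data)  # fine-tune to produce the next iteration

# Repeating the loop means each deployment generates the data
# that drives the next round of improvement.
```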
Explicit Multi-head Attention for Inter-head Interaction in Large Language Models
In large language models built upon the Transformer architecture, recent studies have shown that inter-head interaction can enhance attention performance. Motivated by this, we propose Multi-head Explicit...
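As a hedged sketch of what "explicit inter-head interaction" could look like, the module below adds a learned head-to-head mixing matrix between the per-head attention outputs and the output projection. The head_mix parameter and its placement are assumptions; the paper's actual formulation is not visible in the truncated abstract.

```python
import torch
import torch.nn as nn

class InterHeadAttention(nn.Module):
    """Sketch of multi-head attention with an explicit inter-head step:
    head outputs are linearly mixed across the head axis before the usual
    output projection. The real paper's formulation may differ."""
    def __init__(self, dim: int, heads: int):
        super().__init__()
        assert dim % heads == 0
        self.h, self.dh = heads, dim // heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.head_mix = nn.Parameter(torch.eye(heads))  # learned head-to-head mixing
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape each to (b, heads, t, dh)
        q, k, v = (z.view(b, t, self.h, self.dh).transpose(1, 2) for z in (q, k, v))
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.dh**0.5, dim=-1)
        heads_out = attn @ v                               # (b, h, t, dh)
        # explicit inter-head interaction: mix along the head dimension
        mixed = torch.einsum("gh,bhtd->bgtd", self.head_mix, heads_out)
        return self.out(mixed.transpose(1, 2).reshape(b, t, d))

x = torch.randn(2, 8, 64)
print(InterHeadAttention(64, 4)(x).shape)  # torch.Size([2, 8, 64])
```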
InjectRBP: Steering Large Language Model Reasoning Behavior via Pattern Injection
Reasoning can significantly enhance the performance of large language models. While recent studies have exploited behavior-related prompt adjustments to enhance reasoning, these designs remain largely...
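Only the premise of InjectRBP survives the truncation, so the snippet below merely illustrates the general notion of injecting a reasoning-behavior pattern into a prompt. The pattern strings and the inject_pattern helper are invented for illustration.

```python
# Illustrative only: the truncated abstract suggests steering reasoning by
# injecting behavior patterns; these pattern strings are invented examples.

REASONING_PATTERNS = {
    "decompose": "Break the problem into smaller subproblems and solve each in turn.",
    "verify": "After each step, double-check the intermediate result before continuing.",
    "backtrack": "If a step leads to a contradiction, return to the last valid step.",
}

def inject_pattern(question: str, pattern: str) -> str:
    """Prepend a chosen reasoning-behavior pattern to the user question."""
    return f"{REASONING_PATTERNS[pattern]}\n\nQuestion: {question}\nLet's think step by step."

print(inject_pattern("What is 17 * 24?", "decompose"))
```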
DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing
Reinforcement learning (RL)-based enhancement of large language models (LLMs) often leads to reduced output diversity, undermining their utility in open-ended tasks like creative writing. Current methods...
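The abstract cuts off before describing DPWriter's mechanism, but "diverse planning branching" plausibly means sampling several distinct high-level plans before drafting, so that optimization pressure does not collapse outputs onto one mode. The toy sketch below illustrates that reading only; sample_plans and generate are hypothetical stand-ins for LLM calls.

```python
import random

def sample_plans(premise: str, n_branches: int = 3) -> list[str]:
    """Sample several distinct high-level plans (branches) for one premise."""
    angles = ["a mystery angle", "a comedic angle", "a tragic angle",
              "an epistolary form", "a nonlinear timeline"]
    return [f"Plan: write '{premise}' with {a}" for a in random.sample(angles, n_branches)]

def generate(plan: str) -> str:
    return f"[draft conditioned on -> {plan}]"   # stand-in for an LLM call

# Drafts are conditioned on different plans, keeping the output set diverse.
for plan in sample_plans("a lighthouse keeper's last night"):
    print(generate(plan))
```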
Transport and Merge: Cross-Architecture Merging for Large Language Models
Large language models (LLMs) achieve strong capabilities by scaling model capacity and training data, yet many real-world deployments rely on smaller models trained or adapted from low-resource data. ...
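Cross-architecture merging requires reconciling mismatched hidden dimensions before weights can be combined. One plausible reading of "transport and merge", sketched below under loose assumptions, is to fit a linear map between the two models' hidden spaces from paired activations, carry one model's weights across, and average; this is an illustration of the general recipe, not the paper's algorithm.

```python
import torch

torch.manual_seed(0)
d_src, d_tgt, n = 32, 48, 256

# Paired hidden states (e.g. the two models run on the same inputs).
h_src = torch.randn(n, d_src)
h_tgt = torch.randn(n, d_tgt)

# Transport map T: least-squares solve of  h_src @ T ≈ h_tgt.
T = torch.linalg.lstsq(h_src, h_tgt).solution          # (d_src, d_tgt)

w_src = torch.randn(d_src, d_src)   # a square weight block from the source model
w_tgt = torch.randn(d_tgt, d_tgt)   # the corresponding block in the target model

w_transported = T.T @ w_src @ T     # express source weights in target coordinates
w_merged = 0.5 * (w_tgt + w_transported)
print(w_merged.shape)               # torch.Size([48, 48])
```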
Addressing LLM Diversity by Infusing Random Concepts
Large language models (LLMs) are known to produce outputs with limited diversity. In this work, we study whether infusing random concepts in the prompts can improve the diversity of the generated outputs...
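A minimal sketch of the idea, assuming the mechanism is simply sampling a concept at random and splicing it into the prompt before generation; the concept list and template below are invented examples, not the paper's.

```python
import random

CONCEPTS = ["tides", "origami", "jazz", "glaciers", "clockwork", "migration"]

def infuse(prompt: str, rng: random.Random = random.Random()) -> str:
    """Prepend a randomly sampled concept to nudge generations apart."""
    concept = rng.choice(CONCEPTS)
    return f"Keep the concept of '{concept}' loosely in mind.\n\n{prompt}"

# Each call yields a differently flavored prompt, spreading generations
# across modes that a fixed prompt would rarely reach.
for _ in range(3):
    print(infuse("Write a two-line poem about mornings."))
```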