Estimated build cost: $9K - $13K over 6-10 weeks.

Founder's Pitch

"Introducing a shape-gain decomposition method in neural audio codecs for improved bitrate-distortion performance."

Audio Processing · Score: 2

Commercial Viability Breakdown

Scores are on a 0-10 scale.

High Potential: 0/4 signals, score 0
Quick Build: 3/4 signals, score 7.5
Series A Potential: 1/4 signals, score 2.5

Sources used for this analysis

arXiv Paper: Full-text PDF analysis of the research paper
GitHub Repository: Code availability, stars, and contributor activity
Citation Network: Semantic Scholar citations and co-citation patterns
Community Predictions: Crowd-sourced unicorn probability assessments

Analysis model: GPT-4o · Last scored: 2/17/2026
