Posts in the 'Study' category (Page 2)

Adapting TTS models For New Speakers using Transfer Learning

2024.12.04 · Study/Papers

Interspeech 2022. https://arxiv.org/abs/2110.05798 Contribution: present transfer learning methods and guidelines for finetuning single-speaker TTS models for a new voice; evaluate and provide a detailed analysis with varying amounts of data; demonstrate that transfer learning can substantially reduce the training time and amount of data needed for synthesizing a new voice; open-source framework, provide a ..

Speech ReaLLM - Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time

2024.08.20 · Study/Papers

https://arxiv.org/abs/2406.09569 We introduce Speech ReaLLM, a new ASR architecture that marries "decoder-only" ASR with the RNN-T to make multimodal LLM architectures capable of real-time streaming. This is the first "decoder-only" ASR architecture designed to handle continuous audio wit..

Installing Java 17 on an Apple Silicon (M1) Mac

2024.05.02 · Study/Tutorials

How to install
Install: brew install openjdk@17
Set the PATH (note: if you use bash_profile rather than zshrc, replace ~/.zshrc with ~/.bash_profile):
brew info openjdk@17 # check what was installed
echo 'export PATH="/opt/homebrew/opt/openjdk@17/bin:$PATH"' >> ~/.zshrc
export CPPFLAGS="-I/opt/homebrew/opt/openjdk@17/include" # just in case, for compilers
source ~/.zshrc
java -version
Before the PATH was set, the Java home path pointed to version 14; after applying the change with the source command, checking again shows that 17 is picked up correctly. References..
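The export line takes effect because the shell searches PATH entries from left to right, so a directory placed at the front wins over an older Java elsewhere on the PATH. A minimal Python sketch of that prepend step follows; `prepend_to_path` is a hypothetical helper written for illustration and is not part of Homebrew or the JDK.

```python
# Directory taken from the export line in the post.
JDK17_BIN = "/opt/homebrew/opt/openjdk@17/bin"


def prepend_to_path(path_value: str, new_dir: str) -> str:
    """Put new_dir at the front of a colon-separated PATH string.

    Similar to `export PATH="new_dir:$PATH"`, except that any earlier
    copy of new_dir is dropped so the entry appears only once.
    """
    rest = [p for p in path_value.split(":") if p and p != new_dir]
    return ":".join([new_dir] + rest)


if __name__ == "__main__":
    # The JDK 17 directory now comes before /usr/bin, so its `java` is found first.
    print(prepend_to_path("/usr/bin:/bin", JDK17_BIN))
```

Dropping duplicates is a design choice of this sketch: running the echo line from the post several times would append the same export to ~/.zshrc each time.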

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

2023.06.02 · Study/Papers

Abstract: Proposes a resolution-connected generator and resolution-wise discriminators. In addition, it uses the discrete wavelet transform inside the discriminators to reproduce high-frequency components accurately. Fre-GAN differs from ground-truth audio by only about 0.03 in MOS. 1. Introduction: Autoregressive models perform well but are slow at inference. Flow-based vocoders were proposed to address these structural limitations. Although they generate natural waveforms in real time by converting, in parallel, a noise sequence into a raw wavefor..
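To make the discrete wavelet transform mentioned above concrete, here is a minimal sketch of a single-level Haar DWT in plain Python. The excerpt does not say which wavelet or implementation the paper uses, so the Haar wavelet and the function names `haar_dwt` and `haar_idwt` are assumptions for illustration only. The point it shows is that the transform halves the length of the signal while keeping the high-frequency part in a separate detail band, so the original can be reconstructed exactly.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) bands."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Low-pass band: scaled sums of neighbouring samples.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # High-pass band: scaled differences of neighbouring samples.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    low, high = haar_dwt(x)
    print(low, high)
    print(haar_idwt(low, high))
```

By contrast, plain average pooling would keep only something like the approximation band and discard the detail band.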

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

2023.05.24 · Study/Papers

Paper link: https://arxiv.org/abs/2003.08934 We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-con.. Abstract input ..

RAdam

2023.05.18 · Study

Description: Rectified Adam is an optimizer for updating weights and a variant of Adam. It aims to fix Adam's bad local optima convergence problem (reaching a local optimum so early that almost no further learning takes place). By multiplying Adam's update by a rectification term (a term that keeps the variance consistent), it addresses the bad local optima problem that can occur early in training and improves training stability. Usage: optimizer = RAdam(model.parameters(), lr=learning_rate, betas=(0.9, 0.999), weight_d..
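The rectification term described above can be computed on its own. The sketch below assumes the formulation from the RAdam paper, including its threshold of 4 on the approximated simple moving average length; library implementations may use a different threshold. The function name `rectification_term` is hypothetical, and returning `None` stands for the steps where the adaptive learning rate is switched off.

```python
import math


def rectification_term(step: int, beta2: float = 0.999):
    """Return the RAdam rectification factor r_t, or None when it is undefined.

    Assumes the formulation in the RAdam paper; `step` counts from 1.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    # Maximum length of the approximated simple moving average.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)
    if rho_t <= 4.0:
        # Variance is not tractable yet: fall back to an unadapted update.
        return None
    return math.sqrt(
        ((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
        / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t)
    )


if __name__ == "__main__":
    for t in (1, 100, 1000, 100000):
        print(t, rectification_term(t))
```

With beta2 = 0.999 the factor is undefined at the first step, stays well below 1 early on, and approaches 1 as training proceeds, which matches the description of damping the update only in the early phase.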

assert

2023.05.18 · Study

Description: assert raises an AssertionError if the condition that follows it is not True. It is an assertion statement, used for defensive programming. Usage: assert condition, 'message'. References https://wikidocs.net/21050
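A short sketch of the `assert condition, 'message'` form in use. The function names `mean` and `try_mean` are made up for this example.

```python
def mean(values):
    # Defensive check: an empty list would otherwise cause a ZeroDivisionError.
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


def try_mean(values):
    """Call mean() and report a failed assertion as text instead of crashing."""
    try:
        return mean(values)
    except AssertionError as err:
        return f"AssertionError: {err}"


if __name__ == "__main__":
    print(try_mean([1, 2, 3]))
    print(try_mean([]))
```

Note that Python skips assert statements when run with the -O option, so they suit checks for programming mistakes rather than validation of user input.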

@staticmethod

2023.05.18 · Study

Description: A static method. It can be called directly on the class, without creating an instance. It is used to write pure functions, that is, methods whose execution does not affect external state.
Usage:
class ClassName:
    @staticmethod
    def method(param1, param2):
        code
class Calc:
    @staticmethod
    def add(a, b):
        print(a + b)
    @staticmethod
    def mul(a, b):
        print(a * b)
Calc.add(10, 20) # call the method directly on the class
Calc.mul(10, 20) # call the method directly on the class
References https://dojang.io/mod/page/view.php?id=2379 Python Coding Dojang: 35.2 Static meth..
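The Calc example in the post prints its results. As a variation, and not the version from the referenced page, the sketch below returns the values instead, which is closer to the pure-function idea in the description because the method then has no visible effect other than its return value.

```python
class Calc:
    @staticmethod
    def add(a, b):
        # No self or cls parameter: the method depends only on its arguments.
        return a + b

    @staticmethod
    def mul(a, b):
        return a * b


if __name__ == "__main__":
    print(Calc.add(10, 20))   # called on the class, no instance needed
    print(Calc().mul(10, 20))  # calling through an instance also works
```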
