ift6266h15.wordpress.com
Class Project Description – IFT6266 – H2015 Representation Learning
https://ift6266h15.wordpress.com/class-project
IFT6266 – H2015 Representation Learning. A mostly deep learning course. Project Blogs and Repos. An important part of the learning experience associated with this course (and 40% of the grade) comes from experimenting with the algorithms presented in class. This page describes what is expected from the students. Feel free to ask questions below. Get hands-on experience with some of the algorithms presented in the course. Practice the collaborative competition typically enjoyed by scientists: each studen...
benjaubert.wordpress.com
1) Dataset specifications – IFT6266 Project : Speech Synthesis
https://benjaubert.wordpress.com/2014/02/01/1-dataset-specifications
IFT6266 Project : Speech Synthesis. February 1, 2014. The dataset consists of acoustic samples from 630 speakers (ten English sentences each; more info here). Framerate: 16 kHz (so a 10 ms frame contains 160 values). Mono signal (microphone speech). Example of an acoustic signal with labelled start-end positions. Update (14/02/2014): a more sophisticated plot of the waveform with start-end markers was added by David ( http://davidtob.wordpress.com/2014/02/05/annotated-waveforms/ ). Speech Synthesis Project from the TIMIT dataset.
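The frame arithmetic above (16 kHz, 10 ms frames, 160 samples each) can be sketched in a few lines of numpy; the `split_into_frames` helper is a hypothetical name, not from the project code:

```python
import numpy as np

SAMPLE_RATE = 16000  # 16 kHz, as in the TIMIT specs above
FRAME_MS = 10        # one frame = 10 ms
FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000  # 160 samples per frame

def split_into_frames(signal):
    """Drop the trailing partial frame and reshape to (n_frames, FRAME_LEN)."""
    n_frames = len(signal) // FRAME_LEN
    return signal[:n_frames * FRAME_LEN].reshape(n_frames, FRAME_LEN)

# toy mono signal: 1 second of silence -> 100 frames of 160 samples
signal = np.zeros(SAMPLE_RATE)
frames = split_into_frames(signal)
print(FRAME_LEN, frames.shape)  # 160 (100, 160)
```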
ift6266tr.wordpress.com
First understanding | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/02/10/first-understanding
You’re at the best WordPress.com site ever. I had never studied speech signals at all and didn’t know what a signal is composed of. I have never worked with sound or music; I don’t even play an instrument! What is the format of a sequence? What do we need to match with what? What is in a .wav file? What should we do with it? What are our inputs? As you can understand, I was completely lost and couldn’t manage to focus on what I had to do to move forward, despite the very good explanations provided by. February 10, 2014.
ift6266tr.wordpress.com
February | 2014 | Speech synthesis experiments
https://ift6266tr.wordpress.com/2014/02
Monthly Archives: February 2014. As I said in my previous post, a long time ago I tried what Hubert did: a small neural network with one hidden layer, a tanh activation function, and a linear output with mean squared error as the loss function. This gives 382,721 frames, or examples, to train on, which I split as follows: 80% for the training set and 10% each for the validation and test sets. Learning rate: 0.001. Some acoustic samples generated by the model:. I’ll try Theano, mayb...
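The network described above (one tanh hidden layer, linear output, MSE loss, learning rate 0.001) can be sketched in plain numpy; the hidden size and the 160-sample input window are assumptions for illustration, not the author's actual settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: a 160-sample frame as input, predicting one target value.
n_in, n_hid, n_out, lr = 160, 300, 1, 0.001

W1 = rng.normal(0, 0.01, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.01, (n_hid, n_out)); b2 = np.zeros(n_out)

def forward(x):
    h = np.tanh(x @ W1 + b1)   # tanh hidden layer
    return h, h @ W2 + b2      # linear output

def sgd_step(x, y):
    """One gradient step on the batch; returns the MSE before the update."""
    global W1, b1, W2, b2
    h, yhat = forward(x)
    err = yhat - y                       # gradient of MSE w.r.t. yhat (up to a constant)
    gW2 = h.T @ err; gb2 = err.sum(0)
    dh = (err @ W2.T) * (1 - h ** 2)     # backprop through tanh
    gW1 = x.T @ dh; gb1 = dh.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
    return float(((yhat - y) ** 2).mean())

x = rng.normal(size=(32, n_in)); y = rng.normal(size=(32, n_out))
losses = [sgd_step(x, y) for _ in range(5)]
```

With a step size this small, repeated steps on the same batch should slowly lower the training MSE.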
ift6266tr.wordpress.com
Thomas Rohée | Speech synthesis experiments
https://ift6266tr.wordpress.com/author/mennerve
Author Archives: Thomas Rohée. The plan, at this point, is to use a gammatone dictionary and sparse coding to train more efficiently on the TIMIT dataset. Gammatone functions are filters applied to frequencies; the goal is to keep the frequencies our ears actually process. Joao explains this well in his blog. The idea is to use this sparse-coded version as input to a spike-and-slab RBM, segmented into frames of 160 samples and with the previous, current and next phones...
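A gammatone dictionary atom is just the sampled impulse response of a gammatone filter. A minimal sketch, assuming the standard fourth-order form with the Glasberg & Moore ERB bandwidth (the `gammatone` helper and the chosen centre frequencies are illustrative, not from the project):

```python
import numpy as np

def gammatone(f_c, fs=16000, n=4, duration=0.064):
    """Impulse response of an n-th order gammatone filter centred at
    f_c Hz, with bandwidth from the Glasberg & Moore ERB formula."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * f_c / 1000.0 + 1.0)
    g = (t ** (n - 1)
         * np.exp(-2 * np.pi * 1.019 * erb * t)
         * np.cos(2 * np.pi * f_c * t))
    return g / np.max(np.abs(g))  # normalise peak amplitude to 1

# A small dictionary of atoms at ear-like centre frequencies
centres = [200, 500, 1000, 2000, 4000]
atoms = np.stack([gammatone(f) for f in centres])
print(atoms.shape)  # (5, 1024): 5 atoms of 64 ms at 16 kHz
```

Sparse coding would then approximate each 160-sample frame as a sparse linear combination of (shifted) atoms like these.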
ift6266h14.wordpress.com
Mar27 | Representation Learning - ift6266h14
https://ift6266h14.wordpress.com/2014/03/23/mar27
Representation Learning – ift6266h14. Yoshua Bengio's graduate class on representation learning and deep learning. Posted by Yoshua Bengio. Please study the following material (on auto-encoders) in preparation for the March 27th class: Section 7 of Representation Learning: A Review and New Perspectives, by Y. Bengio, A. Courville and P. Vincent, and videos 6.1 to 6.7 (Auto-encoders) from Hugo Larochelle’s course on neural networks. Please post your questions below, as replies to this post.
benjaubert.wordpress.com
February 2014 – IFT6266 Project : Speech Synthesis
https://benjaubert.wordpress.com/2014/02
IFT6266 Project : Speech Synthesis. February 12, 2014. 3) Linear Prediction Coefficients (LPC). In speech synthesis, a big issue is generating a continuous acoustic signal: we want to predict the next acoustic sample from the previous samples (a). As a first estimate, a linear LPC model can be used. It aims to find coefficients that minimize equation (b), with Yt the sample at time t and K the number of samples in the window preceding Yt. As in my post #2. February 10, 2014. My test code was.
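The LPC idea above (predict Yt as a linear combination of the K preceding samples, minimizing the squared error) can be sketched as an ordinary least-squares fit; `lpc_fit` and `lpc_predict` are hypothetical helper names, not the author's test code:

```python
import numpy as np

def lpc_fit(y, K):
    """Least-squares LPC: find coefficients a so that
    y[t] ~= sum_{k=1..K} a[k-1] * y[t-k]."""
    # Design matrix of lagged samples: column k-1 holds y[t-k] for t = K..len(y)-1
    X = np.column_stack([y[K - k: len(y) - k] for k in range(1, K + 1)])
    target = y[K:]
    a, *_ = np.linalg.lstsq(X, target, rcond=None)
    return a

def lpc_predict(y, a):
    K = len(a)
    X = np.column_stack([y[K - k: len(y) - k] for k in range(1, K + 1)])
    return X @ a  # predictions for y[K:]

# Sanity check: a pure sinusoid is perfectly linearly predictable with K = 2,
# since sin obeys y[t] = 2*cos(w)*y[t-1] - y[t-2].
t = np.arange(400)
y = np.sin(2 * np.pi * 0.05 * t)
a = lpc_fit(y, 2)
err = np.max(np.abs(y[2:] - lpc_predict(y, a)))
print(err < 1e-8)  # True
```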
benjaubert.wordpress.com
Benjamin – IFT6266 Project : Speech Synthesis
https://benjaubert.wordpress.com/author/benjaubert
IFT6266 Project : Speech Synthesis. April 30, 2014. 7) Final Training with a Stacked DAE. The batch is still running… I will post my final results when it finishes (Epoch 591 / 5000). I’m trying to encode a frame of 560 samples (phoneme ‘oe’) to obtain a dimensionality reduction of roughly a factor of ten. The architecture is a stack of three DAEs: 560 inputs / 560 outputs, 560 inputs / 100 outputs, and 100 inputs / 50 outputs. April 28, 2014. 6) Signal reconstruction based on a single auto-encoder level.
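The dimension flow of the 560 → 560 → 100 → 50 encoder stack described above can be sketched as below; the weight initialisation and the tanh nonlinearity are assumptions, and going from 560 to 50 is a factor of 11.2, i.e. "roughly ten":

```python
import numpy as np

rng = np.random.default_rng(0)

# Encoder side of the three-level stack: 560 -> 560 -> 100 -> 50
dims = [560, 560, 100, 50]
Ws = [rng.normal(0, 0.01, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
bs = [np.zeros(b) for b in dims[1:]]

def encode(frame):
    """Push one 560-sample frame through the stacked encoders."""
    h = frame
    for W, b in zip(Ws, bs):
        h = np.tanh(h @ W + b)  # nonlinearity is an assumed choice
    return h

code = encode(rng.normal(size=560))
print(code.shape, 560 / code.shape[0])  # (50,) 11.2
```

In a real stacked DAE each level would first be pretrained to denoise and reconstruct its own input before the codes are passed up the stack.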