Commit

📝 update songstarter post
nateraw committed May 2, 2024
1 parent e11dd9a commit bc1c858
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions posts/training_musicgen_songstarter.ipynb
@@ -11,6 +11,8 @@
"date: \"2024-04-24\"\n",
"categories: [audio, training, code, musicgen]\n",
"image: ../static/training_musicgen_songstarter/thumbnail.jpg\n",
"include-in-header:\n",
" text: <script type=\"module\" src=\"https://gradio.s3-us-west-2.amazonaws.com/4.27.0/gradio.js\"></script>\n",
"---"
]
},
@@ -274,23 +276,21 @@
"\n",
"I struggled with this for a bit until I realized I was just being stupid - the solution was right in front of me. Under the hood, when MusicGen does melody conditioning, it runs stem separation on the audio prompt to remove vocals, as they can make it harder to find a stable signal for conditioning. By simply removing this step, we can prompt with vocals directly! 🔥\n",
"\n",
"Now, unless you've got a voice like Michael's, you likely don't sing with perfect pitch. We reintroduce the problem that stem separation tried to solve. If your vocals are off pitch, or have fast vibrato, the model will have a hard time finding a stable signal to condition against. To try and mitigate that, you can run pitch correction on your vocals before feeding them through to the patched model. I used some modified code from this AWESOME [blogpost](https://t.co/Kpi023sDP6) by [@wilczek_jan](https://twitter.com/wilczek_jan) to do this, and packaged it up into a Gradio app to play with interactively.\n",
"\n",
"Have a listen to the results:\n",
"\n",
"<video width=\"850\" height=\"450\" controls>\n",
" <source src=\"../static/training_musicgen_songstarter/singing_songstarter_v2.mp4\" type=\"video/mp4\">\n",
" Your browser does not support the video tag.\n",
" </source>\n",
"</video>\n",
"\n",
"Now, unless you've got a voice like Michael's, you likely don't sing with perfect pitch. We reintroduce the problem that stem separation tried to solve. If your vocals are off pitch, or have fast vibrato, the model will have a hard time finding a stable signal to condition against. To try and mitigate that, you can run pitch correction on your vocals before feeding them through to the patched model. I used some modified code from this AWESOME [blogpost](https://t.co/Kpi023sDP6) by [@wilczek_jan](https://twitter.com/wilczek_jan) to do this, and packaged it up into a Gradio app to play with interactively.\n",
"---\n",
"\n",
"You can find the code [here](https://github.com/nateraw/singing-songstarter), or play with it [on Hugging Face Spaces](https://huggingface.co/spaces/nateraw/singing-songstarter). I'll embed it below for your convenience:\n",
"\n",
"<iframe\n",
"\tsrc=\"https://nateraw-singing-songstarter.hf.space\"\n",
"\tframeborder=\"0\"\n",
"\twidth=\"850\"\n",
"\theight=\"450\"\n",
"></iframe>"
"<gradio-app src=\"https://nateraw-singing-songstarter.hf.space\"></gradio-app>"
]
},
{