Ask HN: How to remove voice from a music file?

121 points by mettamage 1640 days ago | 46 comments

unlinked_dll 1639 days ago [-]

The technical name for this problem is "blind source separation" [1] and is related to the "cocktail party effect" [2].

[1] http://www.mit.edu/~gari/teaching/6.555/LECTURE_NOTES/ch15_b...

[2] https://en.wikipedia.org/wiki/Cocktail_party_effect

Commercial solutions are out there, but imo/e they're not always that great and the problem is of prime interest for researchers both in industry and academia.

hnaccount141 1639 days ago [-]

There's currently no plug-and-play solution that works well on all types of music, but there's a lot of research happening in this area. If you're interested in digging into some of the cutting-edge source separation algorithms, there's a great Python library called nussl that provides implementations of many of them: https://interactiveaudiolab.github.io/nussl/

paulrpotts 1640 days ago [-]

iZotope RX can do a pretty credible job of this, although it depends a lot on the source file. This is a commercial product, though, so I'm not sure if it is what you are looking for. https://ask.audio/articles/isolating-separating-remixing-usi...

rewgs 1639 days ago [-]

Professional audio person here. RX is the tool to use, but even for someone skilled at this, it's going to be tough and will likely result in some degradation of the rest of the music.

Get in touch and I'll give it a shot, but no promises.

codetrotter 1639 days ago [-]

Note that the e-mail field on HN is not visible to other users. If you want others to be able to contact you by e-mail, you should put your email address in the “about” box as well.

And probably when you do that you might want to type it out as

yourname at example com

Rather than as “yourname@example.com”.

Still, even if you write it that way there are probably some scrapers out there that will recognize it as an email and include it in some list.

You could try to be a bit cryptic like some do though. If for example the local-part of your email address is the same as your HN username, and you are using GMail, you could write something like:

“You can reach me on GMail. My username there is the same as the one I use on HN.”

Just don’t be too cryptic, or you won’t just ensure that scrapers can’t parse it — even people legitimately wanting to get in touch could be unable to understand what your address actually is :P

Alternatively, instead of stating your email address in any form you could put a link to your profile on some other platform where people can reach you.

mettamage 1639 days ago [-]

I'd love to! But I don't see your email. My email is in my profile bio.

abacadaba 1639 days ago [-]

If anyone wants to give it a shot removing Donna vocals from some choice '70s Dead, much obliged! Thanks in advance.

mettamage 1639 days ago [-]

If there's a trial version, I'm willing to try it.

SyneRyder 1639 days ago [-]

Yep, iZotope RX7 has a 10-day free trial. The feature you're looking for is Music Rebalance:

https://www.izotope.com/en/products/rx.html

gumbi_nz 1638 days ago [-]

I downloaded the trial a few weeks ago. The results were not great unfortunately.

robbrown451 1639 days ago [-]

Interestingly, many years ago I had this start happening with my car stereo. The vocals were mostly missing but the other instruments were there. When it got to the instrumental solo, the main instrument was missing. This happened on 90% of songs.

Turned out I had accidentally caused wires to come loose in the trunk, leaving the speakers wired in series, which caused the stereo channel cancelation effect.

Laforet 1639 days ago [-]

It's a common problem with headphones when the wire becomes damaged, allowing two audio channels to become electrically connected. One could even simulate this effect by deliberately not inserting an audio jack all the way in on some devices.

hantusk 1639 days ago [-]

A few relevant links:

https://www.celemony.com/ Celemony Melodyne

https://www.youtube.com/watch?v=FMEk8cHF-OA

https://www.youtube.com/watch?v=zL6ltnSKf9k

https://github.com/f90/Wave-U-Net

superfamicom 1639 days ago [-]

I have tried many options, and https://phonicmind.com/ has had the best results.

erik_p 1639 days ago [-]

It can do a pretty impressive job, but it does help to use high-quality source material. Using MP3 rips from YouTube might get a little more artifact-y. Sometimes vocal-like instruments will get pulled into the vocal rip instead of the "karaoke" instrument part of the extract.

superfamicom 1639 days ago [-]

An interesting side effect I've seen with FLAC vinyl rips is a kind of noise removal on the "karaoke" tracks.

psychometry 1639 days ago [-]

I've always wondered where karaoke bars get vocal-free versions of seemingly every track that would ever get requested. Do record companies make them?

BurningFrog 1639 days ago [-]

A lot are rerecorded by studio musicians, I believe.

dajohnson89 1639 days ago [-]

I'd be curious about the copyright legality here... it must be expensive to pay royalties for a huge library of instrumental tracks.

aidenn0 1639 days ago [-]

Cover versions can always be made under a compulsory license; you have to pay the songwriter, but they can't say no.

So recording a version with a separate vocal track is 100% doable for any song.

However, to actually use it in Karaoke, it's no longer just a cover; even though it's just the words that are shown along with it, it's part of a larger work, so you need the same sort of license you would need for using it in e.g. a TV ad, and that needs to be negotiated with the copyright owner.

[edit]

For more details google "mechanical license" and "synchronization license"; the first is the "I want to record a cover" and the second is what you would need for using in karaoke.

anamexis 1639 days ago [-]

That’s fascinating, thanks for the google prompts.

FussyZeus 1639 days ago [-]

I would assume they're released under different licenses for manufacturers of those products. In my mind this is a lot like how Microsoft Windows Enterprise exists, but is only attainable via volume licensing or piracy; the product exists, and is for sale, but the general public just isn't permitted to buy it.

justzisguyuknow 1639 days ago [-]

Some of them are licensed official tracks from the actual producers, but for the unofficial ones I think the karaoke companies have music writers who just encode synths to roughly match the track.

elamje 1639 days ago [-]

I believe this is only effectively possible, not fully possible. Inherently, music and the voice will share some of the same frequency samples, since they are discrete. I’m sure you might be able to get a solution that works to the human ear, but I don’t know that it’s possible to perfectly strip out one or the other.

sachinsmc 1636 days ago [-]

https://github.com/deezer/spleeter is worth checking out

ojm 1639 days ago [-]

I spent some time researching this (albeit in 2014) for my wife. Here's the best solution I could come up with at the time: https://ojm.co/blog/using-audacity-remove-vocals-audio-free/

abdulhaq 1639 days ago [-]

Voice is often at the centre of a stereo recording so invert the phase of one channel and combine?
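
A minimal sketch of that idea in plain Python, assuming the stereo track is already decoded into two equal-length lists of float samples (the names `left`, `right` and `remove_center` are made up for illustration). Inverting one channel and summing is the same as subtracting it, so anything identical in both channels cancels:

```python
import math

def remove_center(left, right):
    """Invert the right channel and sum it with the left (i.e. left - right).

    Anything that is identical in both channels cancels out.
    The result is a single mono channel.
    """
    return [l - r for l, r in zip(left, right)]

# Toy signal (assumption for the demo): the "vocal" is identical in both
# channels (dead center) and the "guitar" is panned hard left.
n = 100
vocal = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
guitar = [0.5 * math.sin(2 * math.pi * 2 * i / n) for i in range(n)]

left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)

out = remove_center(left, right)  # only the hard-panned guitar survives
```

On real recordings the result is rarely this clean: anything else mixed to the center (often bass and kick drum) cancels too, and stereo reverb on the vocal does not.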

grepfru_it 1639 days ago [-]

that is the basic premise. there are also audio engineering effects to deal with such as reverb, chorus, delay and phasing due to poor input quality. let's not get into the likelihood of layered vocal tracks. TL;DR: you want the original raw vocal to do the best removal, and if you have that then you probably already have the tracked version of the mixdown so you can just mute the vocal layers.

in my DJ days, you could just write a record label and ask for instrumental or vocal-only demo tracks to perform mixes (this is how DJs became popular in the NYC scene). 80% of the time it's a yes, and 90% of those vocal tracks were "audiomarked" copies so you could only make demos (but that didn't stop people from using the non-watermarked part for a 15 second clip in their DJ mix).

anyway, what i'm saying is that your method works for basic recordings. as you move up the production food chain you need to become more creative in your methods

unlinked_dll 1639 days ago [-]

Most of the energy in an audio signal is at the center of the image, so in practice you don't really remove that much.

sellingwebsite 1636 days ago [-]

There is a thread on the homepage that might be useful to you: https://news.ycombinator.com/item?id=21431071

timrichard 1639 days ago [-]

There are several products from Audionamix that might be worth checking out :

https://audionamix.com/shop-adx/

LinuxBender 1639 days ago [-]

This is not a direct answer to the technical aspect of your question, but if you know who produced the music, they might give/sell you a version that is missing the vocal tracks.

mettamage 1639 days ago [-]

In my particular case that is unfortunately not possible. But otherwise, I'd go the social route as well.

monotoSTEREO 1638 days ago [-]

Those interested in removing voice from a music file may wish to check out the many resources available at my website, monotoSTEREO.info (https://www.monotostereo.info). There is also a companion Facebook page (https://www.facebook.com/monotostereo.info) where I post updates and related content. "Like" us on Facebook to follow the page for the updates! Be sure to check out the many examples on the MEDIA pages of the website!

person_of_color 1639 days ago [-]

Welcome to the rabbit hole that is the inverse problem.

conductr 1639 days ago [-]

I always wonder where DJs get the instrumental versions to sample/mix.

peapicker 1639 days ago [-]

A lot of places sell em, look up “DJ Stems” to get the idea.

villmann 1639 days ago [-]

My date was sending me mixed signals, so I did a Fourier analysis.

techload 1639 days ago [-]

What about the inverse, isolating only the voice?

percutaneous 1639 days ago [-]

For a perfect solution, it would simply be a matter of subtracting one waveform from the other. I suspect both isolated voice and isolated music would have significant "noise" left over. This would likely be more noticeable in the voice due to our increased perception of odd vocal sounds.
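
A rough sketch of that subtraction, assuming the mix and an instrumental-only version are sample-aligned lists of floats (all names and the toy signals here are hypothetical). It also shows how an imperfect instrumental estimate leaves residue in the extracted voice:

```python
import math

def subtract(a, b):
    """Sample-by-sample difference of two equal-length signals."""
    return [x - y for x, y in zip(a, b)]

def energy(signal):
    """Sum of squared samples, a crude measure of how much signal is left."""
    return sum(s * s for s in signal)

# Toy signals (assumptions for the demo, not real audio).
n = 50
vocal = [0.8 * math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
music = [0.5 * math.sin(2 * math.pi * 7 * i / n) for i in range(n)]
mix = [v + m for v, m in zip(vocal, music)]

# With a perfect copy of the music, subtraction gives back the vocal.
exact = subtract(mix, music)

# With an estimate that is 10% too quiet, some music leaks into the result.
estimate = [0.9 * m for m in music]
leaky = subtract(mix, estimate)
residue = subtract(leaky, vocal)  # what is left over besides the vocal

print(f"leftover energy: {energy(residue):.4f}")
```

The leftover here is just a quieter copy of the music; with a real separation algorithm the estimate is wrong in less predictable ways, which is where the odd-sounding artifacts come from.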

pvtmert 1639 days ago [-]

If it's like talking, Fourier helps!

mettamage 1639 days ago [-]

Ah, unfortunately, it's singing. But good to know! That knowledge does come in handy :)

karambahh 1639 days ago [-]

Can you explain why Fourier would help on talking and not singing?

akabaka777 1639 days ago [-]

maybe because voice has a very low frequency range so it's easier to separate from the high frequency noises.
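
A toy illustration of that kind of frequency-based separation, using a naive DFT in plain Python (the signals, the cutoff and the function names are assumptions for the demo). It only works because the two components sit in completely separate frequency bins, which is generally not the case for a singer over instruments:

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def low_pass(x, cutoff_bin):
    """Zero every frequency bin above cutoff_bin (and its mirrored twin)."""
    spectrum = dft(x)
    n = len(spectrum)
    kept = [spectrum[k] if min(k, n - k) <= cutoff_bin else 0 for k in range(n)]
    return idft(kept)

# Toy signal: a low "voice" tone in bin 2 plus a high "noise" tone in bin 20.
n = 64
voice = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
noise = [0.5 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
mixed = [v + h for v, h in zip(voice, noise)]

recovered = low_pass(mixed, cutoff_bin=5)  # the high tone is filtered out
```

As soon as the voice and the thing you want to remove share frequency bins, zeroing bins removes part of both, so a plain filter like this stops being enough.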

dx7tnt 1639 days ago [-]

This is a longstanding problem in audio engineering, along the lines of "how does one un-bake a cake to get the eggs?" There are always going to be artefacts and distortion ranging from unpleasant to extreme, hence audio engineers when mixing from stems/channels will do a stereo instrumental mix and a mix with vocals.

jotm 1639 days ago [-]

Yeah, it's impossible to do it perfectly as long as human voices' frequencies overlap with instruments, basically.

It's like asking to recover layers from a PNG/JPG file.
