There is no way to do what Teams does without significant infrastructure. The same goes for Slack and similar platforms.
If you want something that gets reasonably close, look at Jitsi. It's about as complete as you could expect for video and voice alone.
What you may not understand about conferencing platforms is that they are dozens of different hosted services working together to provide a cohesive user experience. Video, SIP, VoIP, auth, identity: these are all separate services, deployed as microservices, that together produce what you see as one product. If you work out the bare minimum of services you actually need, you can probably cobble something together, but you won't get the same experience by simply running one service.
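To make the "bare minimum of services" idea concrete, here is a rough sketch in Python. The feature names, service names, and the mapping between them are all made up for illustration; they are assumptions, not how Teams, Slack, or Jitsi are actually put together. The point is only that each feature you keep pulls in several backend pieces.

```python
# Hypothetical mapping from user-facing features to the backend services
# each one would depend on. All names here are illustrative assumptions,
# not any real product's architecture.
FEATURE_DEPENDENCIES = {
    "video_call": {"media", "signaling", "auth"},
    "voice_call": {"media", "signaling", "auth"},
    "phone_dial_in": {"sip_gateway", "media", "signaling", "auth"},
    "chat": {"messaging", "auth", "identity"},
    "single_sign_on": {"auth", "identity"},
}


def services_needed(features):
    """Return the sorted list of services required for the given features."""
    needed = set()
    for feature in features:
        # Union in every service this feature depends on.
        needed |= FEATURE_DEPENDENCIES[feature]
    return sorted(needed)


if __name__ == "__main__":
    # Video and voice only, roughly the scope suggested above.
    print(services_needed(["video_call", "voice_call"]))
    # Every feature in the table.
    print(services_needed(FEATURE_DEPENDENCIES))
```

Even in this toy version, video and voice alone need three services, and the full feature list needs six. Listing your own must-have features this way is a quick check on how much you would really have to host.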