Show HN: Sonauto API – Generative music for developers

sonauto.ai

36 points by zaptrem 2 hours ago

Hello again HN,

Since our launch ten months ago, my cofounder and I have continued to improve our music model significantly. You can listen to some cool Staff Picks songs from the latest version here https://sonauto.ai/ , listen to an acapella song I made for my housemate here https://sonauto.ai/song/8a20210c-563e-491b-bb11-f8c6db92ee9b , or try the free and unlimited generations yourself.

However, given that there are only two of us right now competing in the "best model and average user UI" race, we haven't had the time to build some of the really neat ideas our users and pro musicians have been dreaming up (e.g., DAW plugins, live performance transition generators, etc.). The hacker musician community has a rich history of taking new tech and doing really cool and unexpected stuff with it, too.

As such, we're opening up an API that gives full access to the features of our underlying diffusion model (e.g., generation, inpainting, extensions, transition generation, inverse sampling). Here are some things our early test users are already doing with it:

- A cool singing-to-video model by our friends at Lemon Slice: https://x.com/LemonSliceAI/status/1894084856889430147 (try it yourself here https://lemonslice.com/studio)

- Open source wrapper written by one of our musician users: https://github.com/OlaFosheimGrostad/networkmusic

- You can also play with all the API features via our consumer UI here: https://sonauto.ai/create

We also have some examples written in Python here: https://github.com/Sonauto/sonauto-api-examples

- Generate a rock song: https://github.com/Sonauto/sonauto-api-examples/blob/main/ro...

- Download two songs from YouTube (e.g., Smash Mouth to Rick Astley) and generate a transition between them: https://github.com/Sonauto/sonauto-api-examples/blob/main/tr...

- Generate a singing telegram video (powered by ours and also Lemon Slice's API): https://github.com/Sonauto/sonauto-api-examples/blob/main/si...

You can check out the full docs/get your key here: https://sonauto.ai/developers

We'd love to hear what you think, and are open to answering any tech questions about our model too! It's still a latent diffusion model, but much larger and with a much better GAN decoder.

zaptrem 2 hours ago

One thing I've been thinking about is how to do a better hobbyist plan system. It would be cool to do a flat rate unlimited plan, but we wouldn't want that to then be abused by larger customers/companies. Are there existing API providers you think solve this particularly well?

  • mvdtnz 40 minutes ago

    Why would a hobbyist need an unlimited plan?

    • zaptrem 38 minutes ago

      E.g., in the case of a future "LibreMusic" open source UI or an integration into their DAW they work with on the weekends. I'd get pretty annoyed if I had to keep putting a coin in the machine to adjust Logic Pro effects.

naltroc 2 hours ago

how did you create this without committing grand theft musica

  • JTyQZSnP3cQGa8B an hour ago

    The first 80s song I heard was a literal copy of Phil Collins. But there are no emotions attached to it (for me), and the lyrics are random. It’s more like supermarket background music IMHO, not something I would pay for, especially when we already have centuries of music to discover. Why make fake stuff like that?

    Edit: I have just heard the funniest most ridiculous metal song ever without a touch of metal inside. Breathe of Death, it’s like a bad joke.

    If that’s the future of anything, I’m going back to plain C (code) when I retire and I’ll never approach the internet ever again.

  • zaptrem an hour ago

    In my opinion, training on all music is no more theft than Taylor Swift listening to the radio growing up (as long as we don't regurgitate existing songs, which would be bad and useless anyway). I think an alternative legal interpretation where all of humanity's musical knowledge and history are controlled by three megacorporations (UMG/Sony/Warner) would be kinda depressing. If the above is true, we might as well shut down OpenAI and delete all LLM weights while we're at it, losing massive value to humanity.

    • gazebo64 21 minutes ago

      The difference being that a musician being influenced by other musicians still has to work to develop the skills necessary to distill those influences into a final product, and colors that output with their own subjective experiences and taste. This feels like a conveniently naive interpretation to justify stealing artists' work and using it to create derivative generative slop. The final line in your comment is pretty telling of how seriously you take this issue (which is near-universally decried by artists) -- some other massive company is doing a bad thing, so why shouldn't I?

      edit: I have to add how disingenuous I find calling out corporations owning "all of humanity's musical knowledge and history" as if generative AI music trained on unlicensed work from artists is somehow a moral good. At least the contracts artists make with these corporations are consensual and have the potential to yield the artist some benefit which is more than you can say for these gen-AI music apps.

      • zaptrem 12 minutes ago

        I don't see how the amount of work that went into it changes the core fact that all art is influenced by that which came before, and we don't call that stealing (unless you truly believe that "all art is theft").

        My point re: LLMs wasn't meant to exclusively be a "they're doing it" one, the hope was to give an example of something many people would agree is super useful and valuable (I work much faster and learned so much more in college thanks to LLMs) that would be impossible in the proposed strict interpretation of copyright.

        • gazebo64 4 minutes ago

          I think we intuitively allow for artists to derive and interpolate from their influences because of a baseline understanding that A) it is impossible to create art without influence and B) that there is an inherent value in a human creating art and expressing themselves. How that relates to someone using unlicensed music from actual humans to train an AI model in order to profit off of the collective work of thousands of actual human artists, I have no idea.

    • wryoak 44 minutes ago

      I’m skeptical about how much value AI art is going to really contribute to humanity but as a lifelong opponent of copyright I have to roll my eyes when I see people arguing against it on behalf of real artists, all of whom are thieves in the best case and imitators in the worst.

      • bongodongobob 40 minutes ago

        Yeah every musician has a story of writing a new song, bringing it to the band, and they say "oh, this sounds just like [song]." It's almost impossible to make something truly novel.

    • Xelynega 27 minutes ago

      Megacorporations owning copyrights to the majority of IPs (music, games, etc.) is a capitalism/monopoly problem. How does getting rid of copyright and allowing your company to profit off other people's work in any way solve that issue?

tombot an hour ago

What is the point of generating this low quality AI slop music, what real use case do you have in mind?

  • patcon 30 minutes ago

    I made little gift songs for friends for a while. It was nice and fun. Making a roadtrip theme song for friends on a vacation is way fun, and kinda locks in the moment.

    I also used it when I was living in New Orleans to help a friend come up with a riff for a live set he had, which had some unusual constraints (only had a singer, drummer and trombone, but no others, in an echoey space). He used the generated song hook as inspiration for that night's arrangement.

    There's lots of stuff, and some of it supports artists who have tight timelines and want creative support.

  • zaptrem an hour ago

    For the consumer stuff: It's fun, and IMO that's enough. Not every song has to be peak artistic quality pushing the world forward, sometimes it's enough to bring a smile to a friend's face by making a song about them. If you think their art is slop you shouldn't have to listen to it (IMO Spotify et al should have an optional "no AI music" filter for now).

    For the API: I think this could be integrated into artists' workflows in lots of ways we can't even imagine right now as it gets better. One example I gave above was generating transitions between songs.

  • JTyQZSnP3cQGa8B an hour ago

    Some kind of Dadaist movement I guess. Listen to Breathe of Death, it’s hilarious and then you cry.

toisanji an hour ago

How is this better or different from Suno besides the API? I'm assuming since you are smaller the quality is not as good and the depth not as wide.

  • zaptrem an hour ago

    Suno's RVQ-token-based language model is tuned to give you an acceptable song that most of their userbase would prefer every single time, but isn't very diverse. Our diffusion model is much more diverse and has higher vocal audio quality, but the results aren't always consistent (just like Flux et al). However, since we have unlimited generations this can be worked around. We're also never going to preference tune our model because I think the stuff that is lost in that process is valuable.