Relaunching Siev
Stabilizing the system
It’s been a hack-fueled few weeks. When Siev initially launched, dozens of corners were cut to get a demo-able idea into a state that could be shown off. Shipping fast is good, but it can result in a stack that’s unstable and hard to manage.
Any code base with shaky foundations gets harder and harder to build on over time, and this seemed like the right time to shore things up, especially since I was still building out a Speaker Identification feature and had to pick up the pieces after accidentally deleting a core piece of infrastructure.
Thus, a Millennium Tower-style drill-to-bedrock operation was in order. It took a few weeks, but the shakiness of the original design has been purged and the system overall is more robust.
It lives! (again)
With that, I’m happy to say that Siev is back online and will actually stay up to date as time moves on, rather than getting progressively more stale.
The engineering efforts from the past month have added a few new features:
A cleaned up topic page, which shows an aggregate view of how the podcast space feels about a given topic.
A rich transcript, allowing you to scan what a show is talking about and the sentiment involved, without having to listen to each show.
And finally, a speaker identification page, which shows all the different times someone’s voice has appeared in the podcast space.
Siev is slowly shaping up to be a next-generation audio processing pipeline, able to take in arbitrary audio and produce summaries and insights, so you don’t have to listen to an entire stream to get an understanding of what’s being talked about. There’s a ton more work to do, but it’s about time to cut back towards exploring use cases and seeing what cool things can be done with this data.