You may have seen the news that Snowflake announced its intent to acquire Streamlit to “empower developers and data scientists to mobilize the world’s data”. This sounds like great news for data engineers and developers alike!
Streamlit says it is “being supercharged for all.”
Snowflake’s goal, it says, is to further “unlock the unrealized potential of data and make it easier to build beautiful applications.” If you’re a developer, data scientist or machine learning engineer, what does this actually mean for you? My immediate reaction is that this could be another important milestone on the road to achieving #TrueDataOps.
First, a little background. Streamlit’s open source framework simplifies and speeds up the creation of data applications. It enables developers and data scientists to build and share data apps more quickly and iteratively without having to be an expert in front-end development. When you combine that capability with the Streamlit user footprint, you can see the attraction: Streamlit has more than eight million downloads, and more than 1.5 million applications have been built by what the company describes as an “incredibly inclusive community of data nerds”. Streamlit’s open source nature is another part of the story: it is, at the end of the day, Python code. We’d expect to see that continue, at least for the basics.
What makes this move particularly interesting is that Streamlit is a very lightweight framework that gets you moving and creating data apps very quickly. However, it also has enough capabilities to enable some very advanced applications and use cases. It takes the pain away so you can focus on what you do best, testing and producing, rather than getting tangled up in admin and process. You can be more productive, up to a point: getting apps prototyped, tested, tweaked and into production. This plays neatly into the wider #TrueDataOps world.
However, we know that any data application is going to be closely tied to the data that it depends on. Developing the data pipeline (ingestion, transformation, publishing, testing) independently from the software pipeline (build, test, deploy) would be incredibly manual, slow and error-prone. What is needed is a way to build, test and deploy data components in Snowflake and our data applications in a single, unified system.
Figure 1: Simplified Development and Deployment workflow
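As a rough illustration of what a single, unified system means, the Python sketch below puts the data stages and the software stages named above into one dependency graph and derives a single run order from it. The stage names come from the text; the dependencies between them (for example, that the app tests wait for the data tests) are assumptions made for illustration, not a description of how DataOps.Live or Snowflake actually wires things up.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on.
# The dependencies are illustrative assumptions, not a real platform's config.
PIPELINE = {
    "data:ingestion": set(),
    "data:transformation": {"data:ingestion"},
    "data:publishing": {"data:transformation"},
    "data:testing": {"data:publishing"},
    "app:build": set(),
    # The app is only tested once the data it depends on has been tested.
    "app:test": {"app:build", "data:testing"},
    "app:deploy": {"app:test"},
}


def run_order(pipeline):
    """Return one valid execution order covering both data and app stages."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for stage in run_order(PIPELINE):
        print(stage)
```

The point of the sketch is that the data and software stages live in one graph, so the app cannot be deployed ahead of the data it relies on.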
You still need all those data pipelines running optimally. You need that end-to-end orchestration and automated testing to get through iterations faster and more productively. You need all that governance and control across an end-to-end process, from ingesting the Snowflake data you need, to delivering the apps you’ve been tasked with delivering.
This acquisition is, in part, about continuing to bring the different pieces of the data (and ops) jigsaw puzzle together. DataOps is about far more than just the data, of course; it’s taking lessons learned from software DevOps and applying them to meet the specific and evolving needs of today’s data world. It also means taking the best from the CloudOps world too. After all, we need to deploy our data applications to some cloud infrastructure, e.g. through AWS or other Kubernetes-based platforms.
All of this needs to be stitched together properly, orchestrated, and governed. In short, via a #TrueDataOps platform like DataOps.Live. As we know, solutions that only focus on the data are no longer enough. What matters is what you can do with the data: operationalizing your apps and getting them tested and deployed into production quickly and in fully compliant ways. You shouldn’t have to deal with the headaches involved in complex co-ordination when something else can do that for you.
So perhaps think of Streamlit as a quicker and easier way to render your data apps, and the DataOps.Live platform as the engine to drive that rendering, adding value in different ways and making the process of building a branch, setting up all the variables and lining up credentials transparent.
Environment management purely in the data part of the equation is challenging enough (and a problem DataOps.live is famous for solving), but to really succeed here we need simultaneous environment management of:
- All of our Data configuration and code
- All of our Software configuration and code
- All of our Cloud Infrastructure configuration and code
- All of these together, integrated in the same pipeline
Figure 2: A more advanced and valuable Development and Deployment workflow
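To make simultaneous environment management a little more concrete, here is a minimal Python sketch that derives matching data, software and cloud infrastructure targets from a single Git branch name. Everything in it is hypothetical: the branch-to-environment mapping, the `FB_` prefix for feature branches, and the database, image and namespace naming schemes are assumptions chosen for illustration, not DataOps.Live conventions.

```python
import re

# Long-lived branches map to fixed environments (hypothetical names).
FIXED_ENVIRONMENTS = {"main": "PROD", "qa": "QA", "dev": "DEV"}


def environment_for(branch):
    """Map a Git branch name to an environment name."""
    if branch in FIXED_ENVIRONMENTS:
        return FIXED_ENVIRONMENTS[branch]
    # Any other branch is treated as a feature branch with its own
    # isolated environment, named after a sanitised form of the branch.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", branch).strip("_").upper()
    return f"FB_{slug}"


def resources_for(branch):
    """Derive data, software and infrastructure targets for one branch."""
    env = environment_for(branch)
    lower = env.lower()
    return {
        "database": f"ANALYTICS_{env}",                          # data
        "app_image_tag": f"data-app:{lower}",                    # software
        "k8s_namespace": f"data-app-{lower.replace('_', '-')}",  # cloud infrastructure
    }


if __name__ == "__main__":
    for branch in ("feature/add-sales-chart", "dev", "qa", "main"):
        print(branch, resources_for(branch))
```

Because all three targets are derived from the same branch name, a feature branch, dev, qa and production each get a consistent set of data, application and infrastructure environments within one pipeline run.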
In the video below I demonstrate all of this in reality. Starting with an existing set of data, software and cloud infrastructure, I take a new requirement that needs modification of each, and complete a full development lifecycle starting with a feature branch and then through dev, qa and finally into production, deploying and testing the new app versions at each stage.
From a Snowflake perspective, the “Powered by Streamlit” approach makes a great deal of sense. There are developments and improvements to be made, for sure. I’d expect to see support for more Git repos, for example, but this is more about the opportunity: for instance, building and porting data-intensive Jupyter apps or simple mobile utility apps, and making them a whole lot more.
This is a highly significant addition to Snowflake, and continues to signpost the huge future for data-centric apps Powered By Snowflake. DataOps.live will continue to be at the forefront of automating, orchestrating and providing observability and development lifecycles for the data, the applications and the cloud infrastructure.
Snowflake will now own an application framework. It’s a bold step, a statement of intent around Snowflake’s vision, and perfectly in keeping with the demands of the age. When you add a #TrueDataOps platform for end-to-end orchestration, automated testing, code management, governance, control and all the rest, the gains in productivity, speed and quality are there for the taking.
Sound like something you’d like to explore further? Connect with us now to be a data leader and not a follower.
Ready to get started?
Sign up for your free 14-day trial of DataOps.Live on Snowflake Partner Connect today!