Mincho Ganchev | CloudOps + DataOps Engineer | DataOps.live | Nov 29, 2022 | 4 min read

Making Data Engineering Work: Clear, Repeatable, + Well-Documented

Make sure to read the first blog in our series featuring data engineers talking about all things data and DataOps, in which Bulgaria-based CloudOps & DataOps Engineer Martin Getov tells his story: Driven by curiosity: making sense of the data puzzle.

In this second blog, CloudOps & DataOps Engineer Mincho Ganchev talks about his route into data engineering and how data catacombs can be avoided by using DataOps.live.

Data engineering is the link between the input a business has and the decisions that need to be made. As data engineers, our role is to collect, curate, and automate data so it’s available to be interpreted. That’s quite a responsibility in today’s whirlwind of tech solutions and methods.

Businesses need to make decisions based on facts, not assumptions. If an organization wants to stay competitive and successful, it has to be data-driven, not gut-driven. That’s why DataOps-driven data engineering is so important.

I wasn’t always a data engineer. For almost a decade, I was a sales representative in the machinery sector, and I wanted to work in a more challenging field. I already understood business requirements from diverse stakeholders and had a longstanding love for coding and solving complex problems, so I brought the two together. I started using a BI tool and learned SQL. One thing led to another: I got into the Jinja templating engine, which made me curious about Python. As an avid fan of Snowflake and dbt, seeing how it all comes together in DataOps.live made me want to be part of the team.

My main expertise lies in analytics and data modeling. But before I started using DataOps.live, there were a number of issues I wasn’t sure how to tackle: essential but mundane, time-consuming day-to-day tasks that you’d love to automate or skip. If I wanted to test a colleague’s code (not just whether it worked, but whether the data results held up over a period of time), I had to switch over to their branch, use the same roles and permissions, use the same target, and hope the source data was up to date. By then, the PROD database had usually moved ahead a few ingestion batches, so I had to re-run that branch’s data; for datasets built on the previous day’s logic, that carried too much overhead. I had to simulate ingestion with flags for the days I lacked, as the sketch below illustrates. Annoying, to say the least.
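To show what that manual catch-up looked like, here is a minimal, hypothetical sketch in Python; the run_ingestion helper is invented for illustration and is not DataOps.live or Snowflake code:

```python
from datetime import date, timedelta

def run_ingestion(day: date) -> None:
    """Hypothetical stand-in for one day's ingestion batch, driven by a date flag."""
    print(f"ingesting batch for {day.isoformat()} ...")

# Replay every day the branch is missing to catch up with PROD.
last_ingested = date(2022, 11, 20)  # where the branch's data stops
today = date(2022, 11, 29)

day = last_ingested + timedelta(days=1)
while day <= today:
    run_ingestion(day)
    day += timedelta(days=1)
```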

Even when everyone has their own database or schema to work in, it quickly turns into a stale and inaccurate representation of the data unless it’s kept up to date, which itself requires resources. You risk what I call the data catacombs (or data swamp, if you prefer): a place of musty old data, with many schemas named ‘CUSTOMERS_JOHN’ or ‘CUSTOMERS_CLARE’ or any other ‘CUSTOMERS_your_colleague_name_here’.

By contrast, the DataOps.live platform provides the automation and simplified processes you need to reach your destination faster and more easily. You get the developer experience you want: for example, an instant, automatic zero-copy clone whenever you branch out. There’s no need to wonder whether ingestion has run, whether the data is valid, and so on. I know from experience that a well-administered DataOps.live project saves you a large amount of time, so you can turn your attention to the more fun stuff: interpreting data rather than administering it. And that’s where the real value lies.
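As a rough illustration of what per-branch zero-copy cloning boils down to, here is a minimal sketch assuming Snowflake and the snowflake-connector-python library; the database names and the CI_COMMIT_REF_NAME variable (set by GitLab-style CI runners) are assumptions, and the platform automates all of this for you:

```python
import os
import snowflake.connector

# Derive a per-branch workspace name from the current git branch.
branch = os.environ.get("CI_COMMIT_REF_NAME", "feature-x")
clone_db = "ANALYTICS_" + branch.upper().replace("-", "_")

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
)
try:
    cur = conn.cursor()
    # A zero-copy clone is a metadata-only operation: the branch gets a
    # full, current copy of production data instantly, with no duplicated storage.
    cur.execute(f"CREATE DATABASE IF NOT EXISTS {clone_db} CLONE ANALYTICS")
finally:
    conn.close()
```

Because the clone shares storage with its source, it is also cheap to drop once the branch is merged, which is what stops per-developer copies from piling up into catacombs.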

I think our customers want answers to two questions: how quickly can we get a data product out, ready to be used by analysts, and how quickly can we onboard new team members? DataOps.live helps you do both. It’s about improving time-to-value without compromising the quality of the resulting data product, building something that is robust and can scale.

When an organization outgrows a small team, you’re no longer in a simple Hello World environment where switching targets or copying schemas is enough. And you face those data catacombs: dangling objects in your cloud instance where no one knows who made them, why they exist, or whether they’re the final version, or the final-final one.

Based on my experience and observations, there are three main requirements for an effective data engineering function. First, idempotence: it can be applied multiple times and should work in all cases; no matter how many times it’s run or for what purpose, it should always do the same thing (see the sketch after this paragraph). Second, simplicity. We sometimes forget that it’s generally people, not machines, reading our code and functions, and we spend most of our time reading code rather than writing it. And third, it should be well documented, so everyone (engineers, product owners, business users) can understand what the data engineering function is for and what it can do.
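To make idempotence concrete, here is a minimal, hypothetical sketch of an idempotent daily load; the table and column names are invented, and cur is a DB-API cursor that binds %s-style parameters (as the Snowflake connector does). In dbt, an incremental model with a unique key gives you similar behavior.

```python
def load_daily_orders(cur, day: str) -> None:
    """Idempotently (re)load one day of orders into ORDERS_CURATED.

    Replacing the day's slice instead of blindly appending means the
    step can run once or ten times and leave the table in the same state.
    """
    # One transaction, so a failed re-run can't leave the day half-loaded.
    cur.execute("BEGIN")
    cur.execute("DELETE FROM ORDERS_CURATED WHERE order_date = %s", (day,))
    cur.execute(
        """
        INSERT INTO ORDERS_CURATED (order_id, customer_id, amount, order_date)
        SELECT order_id, customer_id, amount, order_date
        FROM ORDERS_RAW
        WHERE order_date = %s
        """,
        (day,),
    )
    cur.execute("COMMIT")
```

A plain append-only INSERT, by contrast, would silently duplicate the day’s rows on every re-run, which is exactly the kind of drift that turns into catacombs.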

#TrueDataOps, enabled via the DataOps.live platform, is an opportunity to develop that truly data-driven approach. It’s a mindset, a culture to be shared via champions: people who are inspired by this approach and who can articulate the many benefits to be gained.

Mincho has been with DataOps.live since August 2022. He was previously an Analytics Engineer at Infinite Lambda, a Web Developer at Real Time Games, and an Area Sales Manager at AIGER Engineering Ltd. He studied at the University of National and World Economy. Connect with Mincho here: https://www.linkedin.com/in/mincho-ganchev-426a9777/

Stay tuned for our next blog with CloudOps & DataOps Engineer Aleksandra Shumkoska, where she explains how a more effective developer experience can help you move closer to becoming a data-driven organization.

 
