Egor Dmitriev
As a data engineer and machine learning enthusiast, Egor is always seeking innovative ways to explore the limits of what can be achieved. Having extensive domain knowledge in graph machine learning and database internals, Egor is dedicated to staying on top of emerging technologies. He applies this expertise to help organizations unlock the true potential of their data, with high-quality, resilient data architecture always present in the end product.
Deploying SageMaker Pipelines Using AWS CDK
SageMaker is a loved and feared AWS service. You can do anything with it, from building data pipelines, to training machine learning models, to serving said models to your customers. Because of this, there is a range of approaches...
LLM Series, part 1: A Comprehensive Introduction to Large Language Models
Large language models (LLMs) are all the buzz these days. From big corporations like Microsoft enhancing their office products, to Snapchat having an assistant for entertainment, to high schoolers trying to cheat on their assignments, everyone is trying to incorporate LLMs...
Data Quality Series, part 3: Overview of Data Lineage
In this article, we delve into the often overlooked but crucial aspect of data quality – data lineage. Data lineage records the flow of data and all the transformations throughout its lifecycle, from source to destination. Understanding this is vital...
Data Quality Series, part 2: Data Quality Testing with Deequ in Spark
In this blog, we explore how to ensure data quality in a Spark Scala ETL (Extract, Transform, Load) job. To achieve this, we leverage Deequ, an open-source library, to define and enforce various data quality checks. If you need a...
Data Quality Series, part 1: Introduction to Data Quality
We’ve all heard the phrase “garbage in, garbage out”, which highlights the importance of quality data for data-driven systems. Here, quality data can be interpreted in two ways: firstly, as clean and well-standardized data that meets expectations, and secondly, as...