Entries by napalytics

Data Ingestion Methods

A Data Pipeline is a set of steps that Extract, Load, and Transform data for consumption by the end user. As part of the blog series on Data Pipelines, we spoke about Data Ingestion and the different open-source and commercial players. In this blog, we will talk about different data ingestion methods. The need for […]

Data Ingestion – Extraction part of a Data Pipeline

In the previous blog series, we defined data pipelines, the types of data pipelines, and the data pipeline components. We identified the three main pieces of a data pipeline: Extract, Load, and Transform. In this blog, we focus on “Extract”, also referred to as Data Ingestion. Data Ingestion is the movement of data […]

ETL vs ELT: Why the shift to ELT?

The question of ETL vs ELT is a recurring one in the world of data analytics. Historically, data engineers have provided data consumers with processed data ready for consumption in a data warehouse. The process of delivering such processed data to consumers can be thought of as Extract, Transform, and Load (ETL). This blog demonstrates how […]

What are different types of Data Pipelines

In parts one and two of the Data Pipeline series, we talked about data pipelines and their components. The third part deals with the types of data pipelines. The type of data pipeline is related to the need for fresh data. Data pipeline types are traditional (batch) and real-time. […]

Data Pipelines – An Introduction

A Data Pipeline is a sequence of steps that delivers consumable data to end users. Why do we need a sequence of steps? Today, data comes from diverse sources in different formats. It is the job of a data engineer to make consumable data available to various consumers. Automated orchestration of these […]

Data States Explained

Data exists in one of three states: Data at Rest, Data in Motion, and Data in Use. Understanding these three states lays the foundation for extracting value from that data to support business operations. This blog intends to introduce the concepts and lay a basis for the upcoming data engineering […]