Data Engineering
Our selection
Get Started with PySpark and Jupyter Notebook in 3 Minutes

Spark is a fast and powerful framework.

Automate AWS Tasks Thanks to Airflow Hooks

This article is a step-by-step tutorial showing how to upload a file to an S3 bucket with an Airflow ETL (Extract, Transform, Load) pipeline.

How to Get Certified in Spark by Databricks?

This article aims to prepare you for the Databricks Spark Developer Certification: register, train and succeed, based on my recent experience.

How Apache Airflow Distributes Jobs on Celery workers

The life of a distributed task instance

3 Steps to Improve the Data Quality of a Data lake

From Customising Logs in the Code to Monitoring in Kibana

Publish Data Outside Your Data Lake with a Spark Connector

Feedback on implementing a Spark connector for Tableau.

Git Branch Control when Deploying on AWS with Serverless Framework

Use a Serverless Plugin to check your Git branch before deploying on AWS.

Custom Nested and Validated Forms With React

A guide to building your own complex, validated React forms.

How to Build a Serverless REST API in 15 Minutes on AWS

Use AWS Lambda to build a Serverless REST API, storing data in S3 and querying it with Athena.

Build Your Own Cloud with Kubernetes and Some Raspberry Pi

Managing several Raspberry Pis can be a lot of work. This article will teach you how Kubernetes and Docker can help.

Set Up Your Real Time Chat App On Amazon EC2 With Docker and FeathersJS

We are going to learn how to set up a server to run a real-time chat app with FeathersJS.
