AI: Web Applications and Command-Line Tools for Data Engineering

by Pragmatic AI Labs

Data Engineering Foundations

Course Description

This practical course, "Data Engineering Foundations," equips you with the essential skills of modern data engineering. Through a hands-on approach, you'll work with interactive Jupyter notebooks, cloud-hosted notebook platforms, Python microservices, containerization, and command-line tool development. It is well suited to aspiring data engineers, data scientists, and data analysts who want to broaden their technical toolkit and keep pace with a rapidly evolving field.

What You'll Learn

  • Building and utilizing interactive Jupyter notebooks for data analysis and machine learning
  • Deploying notebooks on popular cloud platforms like Google Colab and AWS SageMaker
  • Developing scalable Python microservices using FastAPI
  • Containerizing and deploying machine learning microservices
  • Creating robust command-line tools in both Python and Rust
  • Implementing automated testing and publishing workflows for data engineering projects

Prerequisites

This is an introductory-level course with no formal prerequisites. However, a basic understanding of programming concepts, particularly in Python, will help you get the most out of the course content.

Course Content

  • Jupyter notebooks for data engineering workflows
  • Cloud notebook deployment on Google Colab and AWS SageMaker
  • FastAPI microservices development
  • Containerization of machine learning microservices
  • Python command-line tool creation
  • Rust CLI app development
  • Automated testing and publishing techniques

Who This Course Is For

  • Aspiring data engineers looking to build a strong foundation in modern tools and techniques
  • Data scientists seeking to expand their skillset in deployment and microservices
  • Data analysts wanting to level up their technical abilities
  • Software developers interested in transitioning into data engineering roles
  • Anyone looking to gain practical, hands-on experience with cutting-edge data engineering tools

Real-World Applications

The skills acquired in this course are directly applicable to real-world data engineering scenarios. Learners will be able to:

  • Develop and deploy interactive data analysis notebooks in cloud environments
  • Build scalable and efficient microservices for data processing and machine learning
  • Create containerized applications for easy deployment and management
  • Develop powerful command-line tools to automate data engineering tasks
  • Implement robust testing and publishing workflows for data projects
  • Collaborate more effectively with cross-functional teams using industry-standard tools and practices

Course Syllabus

Module 1: Jupyter Notebooks (4 hours)

  • Introduction to web applications and command-line tools for data engineering
  • Overview of key concepts
  • Getting started with Jupyter notebooks
  • Code cells and text cells in Jupyter
  • Magics in Jupyter
  • Overview of Jupyter Lab

Module 2: Cloud-Hosted Notebooks (5 hours)

  • Introduction to Google Colab
  • Tour of Colab features
  • Data and documents in Colab
  • Introduction to AWS SageMaker
  • Tour of SageMaker Studio
  • Overview of SageMaker Pipelines

Module 3: Python Microservices (12 hours)

  • Introduction to building Python microservices
  • Benefits of microservices
  • Setting up Python project structure for CI
  • Building a random fruit web app with Python
  • Introduction to Python microservices with FastAPI
  • Building FastAPI microservices for ML predictions
  • Deploying a Python Lambda microservice
  • Introduction to building containerized microservices
  • Why use containers for microservices?
  • Deploying a containerized .NET 6 API
  • Deploying a containerized ML microservice

Module 4: Python Packaging and Rust Command-Line Tools (19 hours)

  • Introduction to Python packaging and command-line tools
  • Getting started with Python projects
  • Overview of command-line tool frameworks
  • Using Click to build a command-line tool
  • Exploring advanced command-line tool features
  • Introduction to packaging and distributing your Python project
  • Working with Python setuptools
  • Uploading to a Python registry
  • Introduction to continuous integration for command-line tools
  • Automating testing and publishing with GitHub Actions
  • Introduction to Rust command-line tools
  • Working with user input, output, and modules in Rust
  • Optimizing Rust command-line tools
  • Big O notation final challenge