This repository follows the assignments in the Udacity Data Engineering course. Each project has its own README, which describes the project scenario along with the technologies and processes used.
The data technologies used include:
- SQL
- Python
- PySpark
- NumPy
- Pandas
- Apache Spark
- Apache Cassandra
- Amazon Redshift
- Amazon RDS
- Amazon S3
- Apache Airflow
The data skills used include:
- ETL pipelines
- Distributed cloud processing
- Parallel computing
- Scheduling data pipelines
The Capstone Project, A Marvel Social Network, simulates a social network among Marvel superhero characters. The project makes "friend" recommendations and compares each character's screen time with their comic book appearances.
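
As a rough illustration of what a "friend" recommendation can look like, the sketch below ranks characters by how many mutual connections they share with a given hero. It is a minimal example under stated assumptions, not the Capstone's actual implementation: the `network` data, the `recommend_friends` function, and the mutual-friends heuristic are all hypothetical and chosen only to show the idea.

```python
from collections import Counter

# Hypothetical data: each hero maps to the set of heroes they are already
# linked to. These names and links are illustrative, not project data.
network = {
    "Iron Man": {"Captain America", "Thor"},
    "Captain America": {"Iron Man", "Thor", "Black Widow"},
    "Thor": {"Iron Man", "Captain America", "Hulk"},
    "Black Widow": {"Captain America", "Hulk"},
    "Hulk": {"Thor", "Black Widow"},
}


def recommend_friends(graph, hero, top_n=3):
    """Rank heroes not yet linked to `hero` by number of mutual friends."""
    friends = graph.get(hero, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the hero themselves and anyone already connected.
            if candidate != hero and candidate not in friends:
                mutual_counts[candidate] += 1
    # Sort by mutual-friend count (descending), then by name for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


print(recommend_friends(network, "Hulk"))
# [('Captain America', 2), ('Iron Man', 1)]
```

On a real dataset the same counting step could be expressed as a self-join on an edge table in Spark or SQL; the in-memory dictionary here just keeps the example self-contained.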