This repository contains a Python script I wrote for personal use to analyze the trading volume data of a cryptocurrency.
The project is an ETL (Extract, Transform, Load) pipeline written in Python for analysis, clustering, and regression on several datasets. For data security and confidentiality, some datasets, specifically the Android and iOS user data, are not included.
Input Data vs Output Data of The Pipeline:
Output data of the project after feature engineering:
The ETL pipeline is structured around five primary classes:
The DataExtractor class provides methods for extracting data from external sources.
This class provides methods for transforming the data obtained by the DataExtractor.
This class provides methods for loading the data it is given.
This class provides methods for Exploratory Data Analysis (EDA) on the available data.
This class provides methods for data validation. Data does not proceed to analysis unless it has been validated by this class.
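The way the five classes described above could fit together can be sketched as follows. This is a minimal illustration, not the repository's actual code: only the name `DataExtractor` comes from this README, while `DataTransformer`, `DataLoader`, `DataAnalyzer`, `DataValidator`, all method names, and the `date`/`volume` columns are hypothetical. The sample rows are made up.

```python
import csv
import io
from statistics import mean


class DataExtractor:
    """Extracts rows from an external source (CSV text stands in for one here)."""

    def from_csv_text(self, text):
        return list(csv.DictReader(io.StringIO(text)))


class DataTransformer:  # hypothetical name
    """Transforms the rows produced by DataExtractor."""

    def cast_volume(self, rows):
        # Convert the assumed "volume" column from str to float.
        return [{**row, "volume": float(row["volume"])} for row in rows]


class DataLoader:  # hypothetical name
    """Loads transformed rows into a destination (an in-memory list here)."""

    def __init__(self):
        self.store = []

    def load(self, rows):
        self.store.extend(rows)
        return len(rows)


class DataValidator:  # hypothetical name
    """Accepts only non-empty data whose volumes are non-negative floats."""

    def validate(self, rows):
        return bool(rows) and all(
            isinstance(r.get("volume"), float) and r["volume"] >= 0 for r in rows
        )


class DataAnalyzer:  # hypothetical name
    """Runs analysis only on data the validator has accepted."""

    def __init__(self, validator):
        self.validator = validator

    def mean_volume(self, rows):
        if not self.validator.validate(rows):
            raise ValueError("data failed validation; analysis skipped")
        return mean(r["volume"] for r in rows)


# Made-up sample input, for illustration only.
RAW = "date,volume\n2023-01-01,100\n2023-01-02,300\n"

rows = DataTransformer().cast_volume(DataExtractor().from_csv_text(RAW))
loader = DataLoader()
loader.load(rows)
analyzer = DataAnalyzer(DataValidator())
average = analyzer.mean_volume(loader.store)
print(average)
```

Passing the validator into the analyzer is one way to enforce the rule that unvalidated data never reaches analysis; the real project may enforce it differently.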
This project aims to answer questions such as:
How do the days cluster based on their trading volume?
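As a rough illustration of clustering days by volume, here is a small one-dimensional k-means (Lloyd's algorithm) in plain Python. The README does not say which clustering method the project uses, so the choice of k-means, the function name `kmeans_1d`, the value `k=2`, and the sample volumes are all assumptions made for this sketch.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's algorithm on 1-D data; returns (labels, centroids)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:  # converged
            break
        centroids = updated
    return labels, centroids


# Made-up daily volumes, for illustration only.
daily_volume = {
    "2023-01-01": 100.0,
    "2023-01-02": 120.0,
    "2023-01-03": 900.0,
    "2023-01-04": 110.0,
    "2023-01-05": 950.0,
}
labels, centroids = kmeans_1d(list(daily_volume.values()), k=2)
clusters = dict(zip(daily_volume, labels))
print(clusters)
```

With this sample, the three low-volume days fall into one cluster and the two high-volume days into the other. On real data, a library implementation and a principled choice of `k` would likely be preferable.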