data-normalization

Here are 147 public repositories matching this topic...

The PyDI framework provides methods for end-to-end data integration. The framework covers all steps of the integration process, including schema matching, data translation, entity matching, and data fusion. The framework offers traditional string-based methods as well as modern LLM- and embedding-based techniques for these tasks.

  • Updated Mar 12, 2026
  • HTML

🔷 Data Cleaning and Insight Generation from Survey Data 🔷 Cleaned and preprocessed Kaggle’s Data Science Survey data, handling missing values, duplicates, and categorical responses. Applied label encoding and normalization to prepare the dataset for analysis. Built 12+ visualizations (pie, scatter, box, line, heatmap, etc.)

  • Updated Sep 30, 2025
  • Jupyter Notebook
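
As a rough illustration of the label encoding and normalization steps mentioned in the description above, here is a minimal sketch in plain Python. It is not taken from the repository: the helper names `label_encode` and `min_max_normalize`, the sample columns, and the choice of min-max scaling as the normalization method are all assumptions made for this example.

```python
def label_encode(values):
    """Map each distinct category to an integer code (categories sorted alphabetically)."""
    mapping = {cat: code for code, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def min_max_normalize(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical survey columns, invented for illustration only.
roles = ["Analyst", "Engineer", "Analyst", "Student"]
years = [2, 10, 6, 0]

codes, mapping = label_encode(roles)
scaled = min_max_normalize(years)

print(codes)    # integer code per response
print(mapping)  # category -> code lookup
print(scaled)   # years of experience rescaled to [0, 1]
```

In practice a project like this would more likely rely on library implementations such as scikit-learn's `LabelEncoder` and `MinMaxScaler`; the hand-written versions here only show what the two transformations do.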

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

  • Updated Aug 19, 2023
  • Jupyter Notebook

This project demonstrates a data migration from Snowflake to Databricks, with a focus on data normalization and standardization. It emphasizes optimized data flow and improved accessibility through standardization, along with a commitment to ethical data practices.

  • Updated Jul 3, 2024
