Awesome Pandas Repositories
Pandas is a prominent tool for people working with data. It is well suited to cleaning, analyzing, and modeling data, and to organizing the results of an analysis into a form suitable for visualization.
Pandas is a software library written in Python for data manipulation and analysis. In particular, it offers data structures and operations for manipulating (numerical) tables. It can import data from various file formats and supports operations such as merging, reshaping, and selecting, as well as data cleaning and data wrangling.
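To make these operations concrete, here is a minimal sketch that imports, cleans, merges, selects, and reshapes a small table. The CSV content, the column names, and the customer table are made up for illustration. Only standard pandas calls (`read_csv`, `fillna`, `merge`, boolean indexing, `pivot_table`) are used.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
sales_csv = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,10,25.0\n"
    "2,11,\n"
    "3,10,40.0\n"
)

# Hypothetical lookup table of customers.
customers = pd.DataFrame({"customer_id": [10, 11], "name": ["Ada", "Grace"]})

# Import: read tabular data from a CSV source.
sales = pd.read_csv(sales_csv)

# Clean: replace the missing amount with 0.0.
sales["amount"] = sales["amount"].fillna(0.0)

# Merge: attach the customer name to each order.
orders = sales.merge(customers, on="customer_id", how="left")

# Select: keep only the orders above a threshold.
large_orders = orders[orders["amount"] > 20.0]

# Reshape: one row per customer name with the summed amount.
totals = orders.pivot_table(index="name", values="amount", aggfunc="sum")

print(large_orders)
print(totals)
```

Real workflows would typically read from files with `pd.read_csv("path/to/file.csv")` or a similar reader. The in-memory `StringIO` buffer is used here only so the sketch runs on its own.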
Below, we list 30+ notable open source projects and repositories related to pandas, which you can use to learn pandas and to level up your skills.
adamerose/pandasgui | A GUI for Pandas DataFrames |
awslabs/aws-data-wrangler | Pandas on AWS |
bukosabino/ta | Technical Analysis Library using Pandas and Numpy |
databricks/koalas | Koalas: pandas API on Apache Spark |
dgerlanc/programming-with-data | Programming with Data: Python and Pandas |
dimenwarper/chainlearn | Mini module with syntax sugar for pandas/sklearn |
donnemartin/data-science-ipython-notebooks | Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. |
firmai/pandasvault | Advanced Pandas Vault Utilities, Functions and Snippets. |
guipsamora/pandas_exercises | Practice your pandas skills! |
huggingface/nlp | nlp: datasets and evaluation metrics for Natural Language Processing in NumPy, Pandas, PyTorch and TensorFlow |
IntelPython/sdc | Intel Scalable Dataframe Compiler for Pandas* |
justmarkham/pandas-videos | Jupyter notebook and datasets from the pandas Q&A video series |
jvns/pandas-cookbook | Recipes for using Python’s pandas library |
KeithGalli/pandas | Data & Code for my video on the Pandas library of Python |
KeithGalli/Pandas-Data-Science-Tasks | Set of real world data science tasks completed using the Python Pandas library |
kunalj101/Data-Science-Hacks | Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and so on. |
man-group/dtale | Flask/React client for visualizing pandas data structures |
mars-project/mars | Mars is a tensor-based unified framework for large-scale data computation which scales Numpy, Pandas and Scikit-learn. |
mjbahmani/10-steps-to-become-a-data-scientist | Ready to learn or review your knowledge! You will learn 10 skills as a data scientist: Python, Machine Learning, Deep Learning, Data Cleaning, EDA, Python packages such as NumPy, Pandas, Seaborn, Matplotlib, Plotly, TensorFlow, Theano…, Linear Algebra, Big Data, and Analysis Tools, and solve some real problems such as predicting house prices. |
mm-mansour/Fast-Pandas | Benchmark for different operations in pandas against various dataframe sizes. |
modin-project/modin | Modin: Speed up your Pandas workflows by changing a single line of code |
pandas-dev/pandas | Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more |
pandas-profiling/pandas-profiling | Create HTML profiling reports from pandas DataFrame objects |
PatrikHlobil/Pandas-Bokeh | Bokeh Plotting Backend for Pandas and GeoPandas |
raviolli77/machineLearning_breastCancer_Python | Machine Learning Applications using Sklearn, matplotlib, pandas, and seaborn |
santosjorge/cufflinks | Productivity Tools for Plotly + Pandas |
tdpetrou/Learn-Pandas | Tutorials on how to use pandas effectively to do data analysis |
tdpetrou/pandas_cub | Learn how to build a data analysis library from scratch |
tkrabel/bamboolib | bamboolib - a GUI for pandas dataframes. Stop googling pandas commands |
willhaslett/covid-19-growth | Daily COVID-19 epidemiological data, piped into friendly Pandas dataframes, functions for dataset construction |