[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-9846":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":47,"readmeContent":48,"aiSummary":49,"trendingCount":16,"starSnapshotCount":16,"syncStatus":18,"lastSyncTime":50,"discoverSource":51},9846,"Machine-Learning-with-Python","tirthajyoti\u002FMachine-Learning-with-Python","tirthajyoti","Practice and tutorial-style notebooks  covering wide variety of machine learning techniques","https:\u002F\u002Fmachine-learning-with-python.readthedocs.io\u002Fen\u002Flatest\u002F",null,"Jupyter Notebook",3316,1832,153,5,0,1,2,6,3,31.79,"BSD 2-Clause \"Simplified\" License",false,"master",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46],"artificial-intelligence","classification","clustering","data-science","decision-trees","deep-learning","dimensionality-reduction","flask","k-nearest-neighbours","machine-learning","matplotlib","naive-bayes","neural-network","numpy","pandas","pytest","random-forest","regression","scikit-learn","statistics","2026-06-12 02:02:13","[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-BSD%202--Clause-orange.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FBSD-2-Clause)\n[![GitHub forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002Ftirthajyoti\u002FMachine-Learning-with-Python.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fnetwork)\n[![GitHub 
stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Ftirthajyoti\u002FMachine-Learning-with-Python.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fstargazers)\n[![PRs Welcome](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg)](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fpulls)\n\n# Python Machine Learning Jupyter Notebooks ([ML website](https:\u002F\u002Fmachine-learning-with-python.readthedocs.io\u002Fen\u002Flatest\u002F))\n\n### Dr. Tirthajyoti Sarkar, Fremont, California ([Please feel free to connect on LinkedIn here](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Ftirthajyoti-sarkar-2127aa7))\n\n![ml-ds](https:\u002F\u002Fraw.githubusercontent.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fmaster\u002FImages\u002FML-DS-cycle-1.png)\n\n---\n\n## Also check out these super-useful Repos that I curated\n\n- ### [Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FPapers-Literature-ML-DL-RL-AI)\n\n- ### [Carefully curated resource links for data science in one place](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FData-science-best-resources)\n\n## Requirements\n* **Python 3.6+**\n* **NumPy (`pip install numpy`)**\n* **Pandas (`pip install pandas`)**\n* **Scikit-learn (`pip install scikit-learn`)**\n* **SciPy (`pip install scipy`)**\n* **Statsmodels (`pip install statsmodels`)**\n* **MatplotLib (`pip install matplotlib`)**\n* **Seaborn (`pip install seaborn`)**\n* **Sympy (`pip install sympy`)**\n* **Flask (`pip install flask`)**\n* **WTForms (`pip install wtforms`)**\n* **Tensorflow (`pip install tensorflow>=1.15`)**\n* **Keras (`pip install keras`)**\n* **pdpipe (`pip install pdpipe`)**\n\n---\n\nYou can start with this article that I wrote in Heartbeat magazine (on Medium platform): \n### [\"Some Essential 
Hacks and Tricks for Machine Learning with Python\"](https:\u002F\u002Fheartbeat.fritz.ai\u002Fsome-essential-hacks-and-tricks-for-machine-learning-with-python-5478bc6593f2)\n\u003Cimg src=\"https:\u002F\u002Fcookieegroup.com\u002Fwp-content\u002Fuploads\u002F2018\u002F10\u002F2-1.png\" width=\"450\" height=\"300\"\u002F>\n\n## Essential tutorial-type notebooks on Pandas and Numpy\nJupyter notebooks covering a wide range of functions and operations on the topics of NumPy, Pandas, Seaborn, Matplotlib etc.\n\n* [Detailed Numpy operations](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_operations.ipynb)\n* [Detailed Pandas operations](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FPandas_Operations.ipynb)\n* [Numpy and Pandas quick basics](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_Pandas_Quick.ipynb)\n* [Matplotlib and Seaborn quick basics](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FMatplotlib_Seaborn_basics.ipynb)\n* [Advanced Pandas operations](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FAdvanced%20Pandas%20Operations.ipynb)\n* [How to read various data sources](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FRead_data_various_sources\u002FHow%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb)\n* [PDF reading and table processing 
demo](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FRead_data_various_sources\u002FPDF%20table%20reading%20and%20processing%20demo.ipynb)\n* [How fast are Numpy operations compared to pure Python code?](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FHow%20fast%20are%20NumPy%20ops.ipynb) (Read my [article](https:\u002F\u002Ftowardsdatascience.com\u002Fwhy-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f) on Medium related to this topic)\n* [Fast reading from Numpy using .npy file format](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FPandas%20and%20Numpy\u002FNumpy_Reading.ipynb) (Read my [article](https:\u002F\u002Ftowardsdatascience.com\u002Fwhy-you-should-start-using-npy-file-more-often-df2a13cc0161) on Medium on this topic)\n\n## Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms\n\n### Regression\n* Simple linear regression with t-statistic generation\n\u003Cimg src=\"https:\u002F\u002Fslideplayer.com\u002Fslide\u002F6053182\u002F20\u002Fimages\u002F10\u002FSimple+Linear+Regression+Model.jpg\" width=\"400\" height=\"300\"\u002F>\n\n* [Multiple ways to perform linear regression in Python and their speed comparison](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FLinear_Regression_Methods.ipynb) ([check the article I wrote on freeCodeCamp](https:\u002F\u002Fmedium.freecodecamp.org\u002Fdata-science-with-python-8-ways-to-do-linear-regression-and-measure-their-speed-b5577d75f8b))\n\n* [Multi-variate regression with 
regularization](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FMulti-variate%20LASSO%20regression%20with%20CV.ipynb)\n\u003Cimg src=\"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002Fthumb\u002Ff\u002Ff8\u002FL1_and_L2_balls.svg\u002F300px-L1_and_L2_balls.svg.png\"\u002F>\n\n* Polynomial regression using ***scikit-learn pipeline feature*** ([check the article I wrote on *Towards Data Science*](https:\u002F\u002Ftowardsdatascience.com\u002Fmachine-learning-with-python-easy-and-robust-method-to-fit-nonlinear-data-19e8a1ddbd49))\n\n* [Decision trees and Random Forest regression](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRandom_Forest_Regression.ipynb) (showing how the Random Forest works as a robust\u002Fregularized meta-estimator rejecting overfitting)\n\n* [Detailed visual analytics and goodness-of-fit diagnostic tests for a linear regression problem](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRegression_Diagnostics.ipynb)\n\n* [Robust linear regression using `HuberRegressor` from Scikit-learn](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FRegression\u002FRobust%20Linear%20Regression.ipynb)\n\n-----\n\n### Classification\n* Logistic regression\u002Fclassification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FLogistic_Regression_Classification.ipynb))\n\u003Cimg src=\"https:\u002F\u002Fqph.fs.quoracdn.net\u002Fmain-qimg-914b29e777e78b44b67246b66a4d6d71\"\u002F>\n\n* _k_-nearest neighbor classification ([Here is the 
Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FKNN_Classification.ipynb))\n\n* Decision trees and Random Forest Classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FDecisionTrees_RandomForest_Classification.ipynb))\n\n* Support vector machine classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FSupport_Vector_Machine_Classification.ipynb)) (**[check the article I wrote in Towards Data Science on SVM and sorting algorithm](https:\u002F\u002Ftowardsdatascience.com\u002Fhow-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b))**\n\n\u003Cimg src=\"https:\u002F\u002Fdocs.opencv.org\u002F2.4\u002F_images\u002Foptimal-hyperplane.png\"\u002F>\n\n* Naive Bayes classification ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClassification\u002FNaive_Bayes_Classification.ipynb))\n\n---\n\n### Clustering\n\u003Cimg src=\"https:\u002F\u002Fi.ytimg.com\u002Fvi\u002FIJt62uaZR-M\u002Fmaxresdefault.jpg\" width=\"450\" height=\"300\"\u002F>\n\n* _K_-means clustering ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FK_Means_Clustering_Practice.ipynb))\n\n* Affinity propagation (showing its time complexity and the effect of damping factor) ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FAffinity_Propagation.ipynb))\n\n* Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery) 
([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FMean_Shift_Clustering.ipynb))\n\n* DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shapes, which k-means fails to do) ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FDBScan_Clustering.ipynb))\n\n* Hierarchical clustering with dendrograms, showing how to choose the optimal number of clusters ([Here is the Notebook](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FClustering-Dimensionality-Reduction\u002FHierarchical_Clustering.ipynb))\n\n\u003Cimg src=\"https:\u002F\u002Fwww.researchgate.net\u002Fprofile\u002FCarsten_Walther\u002Fpublication\u002F273456906\u002Ffigure\u002Ffig3\u002FAS:294866065084419@1447312956501\u002FExample-of-hierarchical-clustering-clusters-are-consecutively-merged-with-the-most.png\" width=\"700\" height=\"400\"\u002F>\n\n---\n\n### Dimensionality reduction\n* Principal component analysis\n\n\u003Cimg src=\"https:\u002F\u002Fi.ytimg.com\u002Fvi\u002FQP43Iy-QQWY\u002Fmaxresdefault.jpg\" width=\"450\" height=\"300\"\u002F>\n\n---\n\n### Deep Learning\u002FNeural Network\n* [Demo notebook to illustrate the superiority of a deep neural network for a complex nonlinear function approximation task](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FFunction%20Approximation%20by%20Neural%20Network\u002FPolynomial%20regression%20-%20linear%20and%20neural%20network.ipynb)\n* Step-by-step building of 1-hidden-layer and 2-hidden-layer dense networks using basic TensorFlow methods\n\n---\n\n### Random data generation using symbolic expressions\n* How to use [Sympy 
package](https:\u002F\u002Fwww.sympy.org\u002Fen\u002Findex.html) to generate random datasets using symbolic mathematical expressions.\n\n* Here is my article on Medium on this topic: [Random regression and classification problem generation with symbolic expression](https:\u002F\u002Ftowardsdatascience.com\u002Frandom-regression-and-classification-problem-generation-with-symbolic-expression-a4e190e37b8d)\n\n---\n\n### Synthetic data generation techniques\n* [Notebooks here](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FSynthetic_data_generation)\n\n### Simple deployment examples (serving ML models on web API)\n* [Serving a linear regression model through a simple HTTP server interface](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FDeployment\u002FLinear_regression). User needs to request predictions by executing a Python script. Uses `Flask` and `Gunicorn`.\n\n* [Serving a recurrent neural network (RNN) through a HTTP webpage](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FDeployment\u002Frnn_app), complete with a web form, where users can input parameters and click a button to generate text based on the pre-trained RNN model. 
Uses `Flask`, `Jinja`, `Keras`\u002F`TensorFlow`, `WTForms`.\n\n---\n\n### Object-oriented programming with machine learning\nImplementing some of the core OOP principles in a machine learning context by [building your own Scikit-learn-like estimator, and making it better](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FOOP_in_ML\u002FClass_MyLinearRegression.ipynb).\n\nSee my articles on Medium on this topic.\n\n* [Object-oriented programming for data scientists: Build your ML estimator](https:\u002F\u002Ftowardsdatascience.com\u002Fobject-oriented-programming-for-data-scientists-build-your-ml-estimator-7da416751f64)\n* [How a simple mix of object-oriented programming can sharpen your deep learning prototype](https:\u002F\u002Ftowardsdatascience.com\u002Fhow-a-simple-mix-of-object-oriented-programming-can-sharpen-your-deep-learning-prototype-19893bd969bd)\n\n---\n### Unit testing ML code with Pytest\nCheck the files and detailed instructions in the [Pytest](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FPytest) directory to understand how to write unit tests for machine learning models.\n\n---\n\n### Memory and timing profiling\n\nProfiling data science code and ML models for memory footprint and computing time is a critical but often overlooked area. 
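Here are a couple of Notebooks showing the ideas:\n\n* [Memory profiling using Scalene](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Ftree\u002Fmaster\u002FMemory-profiling\u002FScalene)\n* [Time-profiling data science code](https:\u002F\u002Fgithub.com\u002Ftirthajyoti\u002FMachine-Learning-with-Python\u002Fblob\u002Fmaster\u002FTime-profiling\u002FcProfile.ipynb)\n","This project provides a series of Jupyter Notebook-based tutorials and exercises covering a wide range of machine learning techniques. Core content includes implementations and applications of classification, clustering, regression, and deep learning algorithms, using Python libraries such as NumPy, Pandas, and Scikit-learn for data processing and analysis. The project also provides examples of data visualization (for example with Matplotlib and Seaborn) and of integration with the Flask web framework. It is well suited to beginners learning machine learning fundamentals through hands-on practice, and it also offers a rich set of reference material for data scientists with some experience.","2026-06-11 03:24:59","top_topic"]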