Understanding Function Caching in Python

Photo by Kaboompics .com on Pexels

Memoization is a technique for storing the results of previous function calls so that later calls can reuse them. If a function is called repeatedly with the same arguments, we can return the stored value instead of repeating the calculation. For functions that are expensive to evaluate and are often called with the same inputs, this can speed up a program considerably. In this post, we will use memoization to find factorials.

Let’s get started!

First, let’s define a recursive function that we can use to display the factorials of the integers up to n. If you are unfamiliar with recursion, check out this article: Recursion in Python.

As a reminder, the factorial is defined for…


Custom Python Classes for Generating Statistical Insights from Data

Photo by Max Fischer on Pexels

In computer programming, a class is a blueprint for a user-defined data type. Classes are defined in terms of attributes (data) and methods (functions). They are a great way to organize data and methods so that they are easy to reuse and extend in the future. In this post, we will define a Python class that will allow us to generate simple summary statistics and perform some exploratory data analysis (EDA) on data.
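To make the idea of attributes and methods concrete, here is a minimal sketch that uses only the standard library's statistics module rather than pandas. The class name SummaryStats and the sample values are hypothetical and are not taken from the FIFA 19 data.

```python
import statistics


class SummaryStats:
    """Hypothetical class: holds a list of numbers and summarizes it."""

    def __init__(self, values):
        # Attribute (data): the numbers we want to describe.
        self.values = list(values)

    # Methods (functions): operations on the stored data.
    def mean(self):
        return statistics.mean(self.values)

    def median(self):
        return statistics.median(self.values)

    def summary(self):
        return {
            "count": len(self.values),
            "min": min(self.values),
            "max": max(self.values),
            "mean": self.mean(),
            "median": self.median(),
        }


# Made-up sample values, for illustration only.
ages = SummaryStats([21, 25, 30, 34])
print(ages.summary())
```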

Let’s get started!

For our purposes we will be working with the FIFA 19 data set which can be found here.

To start, let’s import the pandas package:


EDA and Sentiment Analysis of Reddit Data

Photo by energepic.com on Pexels

Reddit WallStreetBets Posts is a data set available on the Kaggle website that contains posts from the WallStreetBets subreddit. WallStreetBets is a subreddit used for discussing stock and option trading. It is most notable for its role in the GameStop short squeeze that resulted in $70 billion in losses on short positions in US firms. In this post we will explore the Reddit WallStreetBets Posts data in Python. The data was scraped using the Python Reddit API Wrapper (PRAW) in compliance with Reddit’s rules around API usage. The data can be found here.

Let’s get started!

First, let’s read the data into a…


A Short Survey of Healthcare Cost Data

Photo by RF._.studio on Pexels

Healthcare spending in the US continues to grow rapidly as the population ages and disease prevalence increases. A study published in the Journal of the American Medical Association (JAMA) reported that healthcare spending in the US rose by almost $1 trillion between 1996 and 2015.

Healthcare cost and quality are often opaque to consumers because of a lack of transparency. If consumers had access to reliable information about cost and quality, they might have more agency over their healthcare services. For example, another study published in the American Heart Journal found that, after adjusting for patient risk and length…


Millennium Prize Problem: Yang-Mills and Mass Gap

Photo by Pixabay on Pexels

The Millennium Prize Problems are seven challenging problems in mathematics, each carrying a $1 million prize for a correct solution. In this post we will briefly discuss one of them, the Yang-Mills and Mass Gap problem.

All of the Millennium Prize Problems are listed on the Clay Mathematics Institute’s website here.

Yang-Mills and Mass Gap

The Yang-Mills theory describes elementary particles, which are particles with no substructure (quarks, leptons, Higgs boson), using algebraic objects. Specifically, non-abelian Lie groups are used to unify electromagnetic and weak forces. …


Building Machine Learning Models with Scikit-learn

Photo by Steve Johnson on Pexels

Scikit-learn is a powerful machine learning library for Python. It provides many tools for classification, regression and clustering tasks. In this post we will discuss some popular tools for building classification models using scikit-learn.

Let’s get started!

For our purposes we will be working with the Bank Churn Modeling data set. The data can be found here.

To start, let’s import the Pandas library, relax display limits and print the first five rows of data:

import pandas as pd
df = pd.read_csv("Bank_churn_modelling.csv")
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
print(df.head())


Data selection, aggregation and statistics with Bank Churn Modeling data

Photo by Burst on Pexels

Pandas is a Python library that is used for wrangling data, generating statistics, aggregating data and much more. In this post we will discuss how to perform data selection, aggregation and statistical analysis using the Pandas library.

Let’s get started!

For our purposes we will be working with the Bank Churn Modeling data set. The data can be found here.

To start, let’s import the Pandas library, relax display limits and print the first five rows of data:

import pandas as pd
df = pd.read_csv("Bank_churn_modelling.csv")
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
print(df.head())


Understanding Regular Expressions in Python

Photo by Iva Muškić on Pexels

Regular expressions are sequences of characters that define patterns, which can be used for tasks such as pattern matching and text searching. In this post, we will discuss how to use the search method in Python’s regular expressions module, re.

Let’s get started!

Consider the following sentence:

sentence1 = 'Python is great'

We can use the ‘search()’ function from the ‘re’ module to search for patterns in this text. The syntax for matching patterns at the beginning and end of a piece of text is as follows:

import re
result = re.search("^begin.*end$", text)
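Applied to the sentence defined earlier, the template might look like the sketch below. The specific patterns are illustrative choices; re.search returns a match object when the pattern is found and None otherwise.

```python
import re

sentence1 = 'Python is great'

# '^Python' anchors the match at the start of the text, 'great$' at the end,
# and '.*' allows any characters in between.
match = re.search("^Python.*great$", sentence1)
if match:
    print("Match found:", match.group())

# 'great' is present but not at the beginning, so this search returns None.
no_match = re.search("^great", sentence1)
print(no_match)
```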

The ‘^’ is the character we use for finding…


Understanding Class Inheritance in Python

Photo by Max Fischer on Pexels

Inheritance is a concept in object-oriented programming where a new class can reuse and extend the attributes and methods of an existing class. The existing class is called the base class and the new class is called the derived class. In this post, we will discuss class inheritance in Python.

Let’s get started!

The syntax of Python class inheritance is as follows:

class BaseClass:
    # body of BaseClass
    pass

class DerivedClass(BaseClass):
    # body of DerivedClass
    pass
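To see what a derived class gains, here is a small sketch that fills in the two bodies. The name attribute and the greet and shout methods are hypothetical, chosen only to show that the derived class can use everything defined on the base class.

```python
class BaseClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello, {self.name}"


class DerivedClass(BaseClass):
    # No __init__ here: the constructor is inherited from BaseClass.
    def shout(self):
        # Reuses the inherited greet() method and builds on it.
        return self.greet().upper() + "!"


obj = DerivedClass("Ada")
print(obj.greet())                 # method inherited from BaseClass
print(obj.shout())                 # method defined on DerivedClass
print(isinstance(obj, BaseClass))  # a DerivedClass instance is also a BaseClass
```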

For our example we will consider a SpotifyUser derived class that inherits from a FacebookUser base class. First, let’s define our FacebookUser class:

class FacebookUser:
    pass

Now let’s consider some attributes of a Facebook user…


String Formatting in Python

Photo by Chris F. on Pexels

Python offers a variety of methods for string formatting. In this post, we will review three of them: %-formatting, str.format() and f-strings.

Let’s get started!

Formatting Strings with %

First, we will consider ‘%’ formatting. Consider two variables that store a name and email address. We can write a function that takes the name and email and prints out a customized message using ‘%’ formatting:

def get_message_pct(name, email):
    print("%s's email is %s." % (name, email))

We can call this function with values for name and email:

get_message_pct('John', 'johnadams@gmail.com')
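For comparison, the sketch below writes the same message with each of the three methods this post covers. The functions return the string instead of printing it so the results are easy to compare, and the names get_message_format and get_message_fstring are hypothetical. f-strings require Python 3.6 or later.

```python
def get_message_pct(name, email):
    # %-formatting: each %s is replaced by the corresponding tuple element.
    return "%s's email is %s." % (name, email)


def get_message_format(name, email):
    # str.format(): each {} is replaced by the corresponding argument.
    return "{}'s email is {}.".format(name, email)


def get_message_fstring(name, email):
    # f-string: expressions inside {} are evaluated in place.
    return f"{name}'s email is {email}."


for func in (get_message_pct, get_message_format, get_message_fstring):
    print(func('John', 'johnadams@gmail.com'))
```

All three calls produce the same text: John's email is johnadams@gmail.com.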

Sadrach Pierre, Ph.D.

Data Scientist at WorldQuant Predictive. Writer for Built In & Towards Data Science. Cornell University Ph. D. in Chemical Physics.
