
The journey of a thousand miles...


"The journey of thousand miles begins with one step", this famous quote by Lao Tzu explains my current state of mind very well.And that journey, which I am embarking on, is the field of data science.This blog whose sole purpose is to share my growth from infancy to maturity , is going to be the testimony of my growth, my ups and downs and all the relevant experiences.
Wish me good luck! 


Popular posts from this blog

WeRateDogs Complete Project

wrangle_act

#Import libraries

In [1]:
import pandas as pd
import requests
import tweepy
from tweepy import OAuthHandler
from tweepy import API
from tweepy import Cursor
import time
import datetime as dt
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

#data gathering section

Do the following activities:
1. Read from archive file
2. Read from TSV file with URL
3. Read from Twitter via Twitter API

In [2]:
#Read CSV file into a dataframe using the pandas read_csv function.
archive_df = pd.read_csv('twitter-archive-enhanced.csv')

In [3]:
#Read TSV file from a URL using the requests library.
url = 'https://d17h27t6h515a5.cloudfront.net/topher/2017/August/599fd2ad_image-predictions/image-
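The "read a TSV file from a URL" step can be sketched roughly as follows. This is only a minimal illustration, not the code from the project: the helper names `tsv_text_to_df` and `read_tsv_from_url`, the `timeout` value, and the sample rows are all made up for this example, and the real URL is deliberately not filled in.

```python
import io

import pandas as pd


def tsv_text_to_df(tsv_text):
    """Parse tab-separated text into a DataFrame (hypothetical helper)."""
    return pd.read_csv(io.StringIO(tsv_text), sep="\t")


def read_tsv_from_url(url, timeout=30):
    """Download a TSV file and parse it (hypothetical helper)."""
    # Imported here so the parsing helper above also works offline.
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # stop early on HTTP errors such as 404
    return tsv_text_to_df(response.text)


# Offline demonstration with invented rows (not real project data).
sample = "id\tlabel\tscore\n1\tpug\t0.9\n2\tbeagle\t0.7\n"
sample_df = tsv_text_to_df(sample)
print(sample_df)
```

Splitting the download from the parsing makes it possible to try the parsing part in a notebook without a network connection.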

Pandas cheat sheet

In my learning so far, I have observed that NumPy, Pandas and matplotlib are core Python libraries for data analysis. Here are some of the Pandas code snippets which I tested myself in Jupyter notebook.

==> to figure out number of duplicate rows
df1['is_duplicated'] = df1.duplicated(['col1', 'Col2'.....'coln'])
print(df1['is_duplicated'].sum())

==> to figure out rows with missing values
sum(df1.apply(lambda x: sum(x.isnull().values), axis = 1)>0)

==> to figure out unique values for a column
np.unique(df1['column name'])

==> to drop a column
df.drop(['col_name'], axis=1, inplace=True)

==> to clean up column labels (strip whitespace, lowercase, replace spaces with underscores)
df_08.rename(columns=lambda x: x.strip().lower().replace(" ", "_"), inplace=True)

==> to rename a column
df = df.rename(columns={'old name': 'new name'})

==> replace spaces with underscore
df.rename(columns=lamb
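To see several of these snippets working together, here is a small sketch on a made-up DataFrame. The data and column names are invented for illustration. For the missing-value count it uses `isnull().any(axis=1)`, a shorter alternative to the `apply`-based version above that should give the same count.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
df1 = pd.DataFrame(
    {
        " First Name ": ["Ann", "Bob", "Ann", "Cy"],
        "Age": [30.0, np.nan, 30.0, 41.0],
        "Home City": ["Pune", "Delhi", "Pune", None],
    }
)

# Clean labels: strip whitespace, lowercase, spaces -> underscores.
df1 = df1.rename(columns=lambda x: x.strip().lower().replace(" ", "_"))

# Number of duplicate rows (later repeats of an earlier row).
n_duplicates = df1.duplicated(["first_name", "age", "home_city"]).sum()

# Number of rows with at least one missing value.
n_rows_missing = df1.isnull().any(axis=1).sum()

# Unique values in one column, in order of first appearance.
unique_names = df1["first_name"].unique()

# Rename one column, then drop another.
df1 = df1.rename(columns={"home_city": "city"})
df1 = df1.drop(columns=["age"])

print(n_duplicates, n_rows_missing, list(unique_names), list(df1.columns))
```

Assigning the result back (`df1 = df1.rename(...)`) instead of using `inplace=True` is a style choice here; both forms appear in the snippets above.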