Data Science Interview Questions and Answers
Question - 91 : - How, and by what methods, can data visualisations be used effectively?
Answer - 91 : - Data visualisation is especially helpful when creating reports. Reporting tools such as Tableau and QlikView use plots and graphs to convey the overall idea and the results of an analysis. Visualisations are also used in exploratory data analysis to get an overview of the data.
Question - 92 : - You are given a data set consisting of variables with more than 30 per cent missing values. How will you deal with them?
Answer - 92 : - If more than 30 per cent of the data is missing from a single column, we generally remove the column. If the column is too important to remove, we may impute the missing values instead. Several imputation methods can be used (for example the mean, median or mode), and the model should be evaluated with each one. We should keep the imputation method that gives the best results and generalises well to unseen data.
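As a minimal sketch of both options, the snippet below drops columns above the 30 per cent threshold and, alternatively, imputes with the column median. The toy dataframe, the column names and the choice of the median are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column "b" is 60% missing, column "a" is 20% missing.
df = pd.DataFrame({
    "a": [1.0, 2.0, np.nan, 4.0, 5.0],
    "b": [np.nan, np.nan, 3.0, np.nan, 1.0],
    "c": [10, 20, 30, 40, 50],
})

# Fraction of missing values per column.
missing_share = df.isna().mean()

# Option 1: drop every column with more than 30% missing values.
to_drop = missing_share[missing_share > 0.30].index
reduced = df.drop(columns=to_drop)

# Option 2: keep all columns and fill the gaps with each column's median.
imputed = df.fillna(df.median(numeric_only=True))

print(reduced.columns.tolist())
print(imputed)
```

Whichever imputation is chosen, it would still need to be compared against the alternatives by evaluating the downstream model on held-out data.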
Question - 93 : - What are a skewed distribution and a uniform distribution?
Answer - 93 : - A skewed distribution is an asymmetric distribution in which most data points cluster on one side of the centre while a longer tail extends to the other side. A right (positive) skew has a long right tail, and a left (negative) skew has a long left tail. A uniform distribution is a probability distribution in which all outcomes are equally likely.
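The difference can be checked numerically with the sample skewness. The sketch below assumes an exponential distribution as the example of a right-skewed one; the seed and sample size are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

# Samples from a uniform and from a right-skewed (exponential) distribution.
uniform = pd.Series(rng.uniform(0, 1, 10_000))
right_skewed = pd.Series(rng.exponential(1.0, 10_000))

# Skewness near 0 indicates symmetry; a clearly positive value
# indicates a long right tail.
print("uniform skew:", uniform.skew())
print("exponential skew:", right_skewed.skew())

# In a right-skewed sample the mean is pulled above the median.
print(right_skewed.mean() > right_skewed.median())
```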
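Question - 94 : - How to keep the top 2 most frequent values of a series as they are and replace everything else with 'Other'?
Answer - 94 : - Find the two most frequent values with value_counts, then replace every other entry with 'Other':
import numpy as np
import pandas as pd
s = pd.Series(np.random.randint(1, 5, 12))
print(s.value_counts())
top2 = s.value_counts().index[:2]  # the two most frequent values
s = s.where(s.isin(top2), 'Other')  # keep the top 2, replace everything else
print(s)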
Question - 95 : - How to convert the index of a series into a column of a dataframe?
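Answer - 95 : - For a series s, df = s.reset_index() returns a dataframe in which the former index is a regular column. The same method, df = df.reset_index(), moves the index of an existing dataframe into a column.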
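A short sketch of the series case follows. The series contents and the column name "key" are assumptions picked for illustration.

```python
import pandas as pd

# A named series with a string index.
s = pd.Series([10, 20, 30], index=["a", "b", "c"], name="value")

# reset_index turns the (unnamed) index into a column called "index";
# the values keep the series name as their column name.
df = s.reset_index()

# Optionally give the new column a more descriptive name.
df = df.rename(columns={"index": "key"})
print(df)
```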
Question - 96 : - Define GroupBy in Pandas?
Answer - 96 : - groupby is a pandas method that splits the rows of a dataframe into groups based on the values of one or more columns, typically categorical ones. An aggregation or transformation can then be applied to each group separately.
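To make the grouping step concrete, the sketch below groups a small, made-up sales table by city and inspects the resulting groups. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data: one categorical column and one numeric column.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi", "Pune"],
    "sales": [100, 150, 200, 50, 300],
})

grouped = df.groupby("city")

# Each group is a sub-dataframe holding the rows for one city.
for city, rows in grouped:
    print(city, rows["sales"].tolist())

# Number of rows in each group.
sizes = grouped.size()
print(sizes)
```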
Question - 97 : - Describe Data Operations in Pandas?
Answer - 97 : - Common data operations in pandas include data cleaning, preprocessing, transformation, standardisation, normalisation and aggregation.
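Two of these operations are easy to confuse, so here is a minimal sketch of them, assuming min-max scaling as the normalisation and the z-score as the standardisation. Other definitions exist, and the sample values are arbitrary.

```python
import pandas as pd

x = pd.Series([2.0, 4.0, 6.0, 8.0])

# Normalisation (min-max scaling): rescale the values to the range [0, 1].
normalised = (x - x.min()) / (x.max() - x.min())

# Standardisation (z-score): shift to mean 0 and scale to standard deviation 1.
# ddof=0 uses the population standard deviation.
standardised = (x - x.mean()) / x.std(ddof=0)

print(normalised.tolist())
print(standardised.tolist())
```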
Question - 98 : - What is Data Aggregation?
Answer - 98 : - Data aggregation is the process of applying aggregate functions to summarise data, typically to each group produced by a groupby. Common aggregation functions are sum, count, mean (average), max and min.
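As an illustration, several aggregate functions can be applied in one call with agg. The team and score data below are made up for the example.

```python
import pandas as pd

# Hypothetical data: scores for two teams.
df = pd.DataFrame({
    "team": ["x", "x", "y", "y", "y"],
    "score": [1, 3, 2, 4, 6],
})

# One row per team, one column per aggregate function.
summary = df.groupby("team")["score"].agg(["sum", "count", "mean", "max", "min"])
print(summary)
```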
Question - 99 : - How to convert a numpy array to a dataframe of given shape?
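Answer - 99 : - If matrix is the numpy array in question, df = pd.DataFrame(matrix) converts it into a dataframe. To get a given shape, reshape the array first, for example df = pd.DataFrame(matrix.reshape(rows, cols)), where rows * cols must equal the number of elements in matrix.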
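A small sketch of the reshape-then-convert approach, assuming a one-dimensional array of 12 elements and a target shape of 3 rows by 4 columns. The column names are arbitrary.

```python
import numpy as np
import pandas as pd

arr = np.arange(12)  # one-dimensional array: 0, 1, ..., 11

# Reshape to 3 rows x 4 columns, then wrap the result in a dataframe.
df = pd.DataFrame(arr.reshape(3, 4), columns=["a", "b", "c", "d"])
print(df)
print(df.shape)
```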
Question - 100 : - How to get frequency counts of unique items of a series?
Answer - 100 : - pandas.Series.value_counts returns the frequency of each unique item in a series, sorted from most to least frequent by default.
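For example, on a small made-up series, value_counts gives absolute counts, and passing normalize=True gives relative frequencies instead.

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Absolute frequency of each unique item, most frequent first.
counts = s.value_counts()

# Relative frequency (shares that sum to 1).
shares = s.value_counts(normalize=True)

print(counts)
print(shares)
```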