Tip 1 : Prepare the basics of ML, statistics and SQL properly.
Tip 2 : Go through all the previous interview experiences from Codestudio and Leetcode.
Tip 3 : Do at least 2 good projects and make sure you know every detail of them.
Tip 1 : Have at least 2 good projects explained briefly, with all important points covered.
Tip 2 : Every skill must be mentioned.
Tip 3 : Focus more on skills, projects and experience.
Technical interview round with questions mainly on ML.
What are the assumptions of a linear regression model?
There are four assumptions associated with a linear regression model:
1. Linearity: The relationship between X and the mean of Y is linear.
2. Homoscedasticity: The variance of the residuals is the same for any value of X.
3. Independence: Observations are independent of each other.
4. Normality: For any fixed value of X, Y is normally distributed (equivalently, the residuals are normally distributed).
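A rough way to sanity-check some of these assumptions is to fit the line and look at the residuals. The sketch below is only an illustration in plain Python: the toy data and the helper names (fit_simple_ols, residuals, variance_ratio) are assumptions made for this example, not part of the interview answer. It fits a one-predictor least-squares line, then compares the residual variance in the lower and upper half of X as a crude homoscedasticity check.

```python
from statistics import pvariance


def fit_simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Observed value minus fitted value for every observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def variance_ratio(x, res):
    """Residual variance in the upper half of X divided by the lower half.

    A ratio far from 1 hints at heteroscedasticity (a crude check only).
    """
    ordered = [r for _, r in sorted(zip(x, res))]
    half = len(ordered) // 2
    return pvariance(ordered[half:]) / pvariance(ordered[:half])


# Hypothetical toy data: y = 2x + 1 plus small alternating noise.
x = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
y = [2 * xi + 1 + e for xi, e in zip(x, noise)]

slope, intercept = fit_simple_ols(x, y)
res = residuals(x, y, slope, intercept)
print("slope:", slope, "intercept:", intercept)
print("upper/lower residual variance ratio:", variance_ratio(x, res))
```

In practice a residual-versus-fitted plot and a Q-Q plot of the residuals are the usual tools; the ratio here is just a numeric stand-in for what those plots show.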
What problems does multicollinearity cause in regression analysis?
1. The coefficient estimates can swing wildly based on which other independent variables are in the model. The coefficients become very sensitive to small changes in the model.
2. Multicollinearity reduces the precision of the estimated coefficients, which weakens the statistical power of your regression model. You might not be able to trust the p-values to identify independent variables that are statistically significant.
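One common way to quantify this is the variance inflation factor (VIF). The sketch below assumes a model with exactly two predictors, in which case the VIF reduces to 1 / (1 - r^2), where r is the Pearson correlation between them. The data and function names are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictors.
x1 = [1, 2, 3, 4, 5]
x2_collinear = [2.1, 3.9, 6.2, 7.8, 10.1]   # roughly 2 * x1
x2_uncorrelated = [2, -1, -2, -1, 2]        # zero correlation with x1

print("VIF with near-collinear predictor:", vif_two_predictors(x1, x2_collinear))
print("VIF with uncorrelated predictor:", vif_two_predictors(x1, x2_uncorrelated))
```

A VIF near 1 means the predictor's coefficient variance is not inflated; a frequently quoted rule of thumb treats values above 5 or 10 as a sign of problematic multicollinearity.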
What are the different measures used to check the performance of a classification model?
Confusion Matrix : A confusion matrix is a table with two dimensions, “Actual” and “Predicted”, whose cells count the “True Positives (TP)”, “True Negatives (TN)”, “False Positives (FP)” and “False Negatives (FN)”.
Precision : Precision = TP / (TP + FP)
Out of all the cases that were marked as positive, how many are actually positive.
Recall/ Sensitivity : Recall = TP / (TP + FN)
Out of all the actual real positive cases, how many were identified as positive.
Specificity : Specificity = TN/ (TN + FP)
Out of all the real negative cases, how many were identified as negative.
F1-Score : F1 score = 2* (Precision * Recall) / (Precision + Recall)
F1 score is the harmonic mean of Precision and Recall, which means equal importance is given to FP and FN.
AUC & ROC Curve : AUC or Area Under Curve is used in conjunction with the ROC Curve, which is the Receiver Operating Characteristic Curve. It plots the true positive rate against the false positive rate at different classification thresholds. AUC is the area under the ROC Curve.
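These measures can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions (in particular Recall = TP / (TP + FN)) and computes AUC through its rank interpretation: the probability that a randomly chosen positive is scored higher than a randomly chosen negative. The counts, labels and scores are made-up illustrative values, and the function names are assumptions for this example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, specificity and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical confusion-matrix counts.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Hypothetical labels and model scores.
auc = roc_auc(labels=[1, 1, 0, 0], scores=[0.9, 0.4, 0.5, 0.1])
print(f"auc: {auc:.2f}")
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small illustration; library implementations typically sort the scores instead.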
What are the disadvantages of logistic regression?
1. If the number of observations is smaller than the number of features, Logistic Regression should not be used, as it is likely to overfit.
2. It constructs linear boundaries.
3. The major limitation of Logistic Regression is the assumption of linearity between the log-odds of the dependent variable and the independent variables.
4. It can only be used to predict discrete outcomes. Hence, the dependent variable of Logistic Regression is restricted to a discrete set of classes.
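The linear-boundary point can be seen from the model itself: the predicted probability is a sigmoid of a linear score, so the log-odds equal that score and the 0.5 threshold falls exactly where the score is zero. The sketch below uses hypothetical, hand-picked weights (not fitted to any data) purely to illustrate this.

```python
from math import exp, log


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + exp(-z))


def linear_score(weights, bias, x):
    """Linear combination of the features: bias + w . x"""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def predict_proba(weights, bias, x):
    """Probability of the positive class under logistic regression."""
    return sigmoid(linear_score(weights, bias, x))


def predict_label(weights, bias, x, threshold=0.5):
    """Class label obtained by thresholding the predicted probability."""
    return 1 if predict_proba(weights, bias, x) >= threshold else 0


def log_odds(p):
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return log(p / (1.0 - p))


# Hypothetical weights and bias, chosen by hand for illustration.
weights, bias = [2.0, -1.0], -1.0

for point in ([1.0, 1.0], [2.0, 1.0], [0.0, 1.0]):
    p = predict_proba(weights, bias, point)
    print(point, "score:", linear_score(weights, bias, point),
          "probability:", round(p, 4), "label:", predict_label(weights, bias, point))
```

Points with a score of zero lie on the line 2*x1 - x2 - 1 = 0, which is the decision boundary for these weights; capturing a curved boundary would require adding transformed features or using a different model.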
