Let's explore logistic regression code written in Python today. We have a sample dataset from a telecom provider, with data on its customers who may or may not churn.
We have to predict churn on this dataset as accurately as possible.
Let's see how we can do that!
Yet another linear regression code example, this time for a US housing dataset.
The dataset was taken from: https://www.kaggle.com/huzaifsayyed/us-housing-data
Here is another easy-to-follow code example for multiple linear regression on housing data!
Here is sample code for simple linear regression that you can easily follow!
GitHub link: https://github.com/jeswinaugustine/machine_learning_code/blob/master/Linear%20regression/Simple_Linear_Regression.ipynb
I have compiled a basic Jupyter notebook with a short introduction to Python and its string operations. This is only for quick reference!
In the previous post, you saw some common interview questions on linear regression. Those questions mostly covered the essence of linear regression and its general concepts. This section covers common interview questions on the concepts learnt in multiple linear regression.
Q1. What is Multicollinearity? How does it affect the linear regression? How can you deal with it?
Multicollinearity occurs when some of the independent variables are
highly correlated (positively or negatively) with each other. This
causes a problem because it violates one of the basic assumptions of
linear regression. The presence of multicollinearity does not affect the
predictive capability of the model. So, if you just want predictions,
multicollinearity does not affect your output. However, it makes the
individual coefficient estimates unstable and hard to interpret, so
if you want to draw insights from the model and apply them in,
let's say, some business model, it may cause problems.
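One simple way to spot multicollinearity is to look at pairwise correlations between the predictors. The sketch below is only an illustration: the feature names, the data, and the 0.8 cut-off are all made-up assumptions, and in practice you may prefer other diagnostics.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical predictors: x2 is roughly 2 * x1, x3 is unrelated to both.
features = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2.1, 3.9, 6.2, 8.1, 9.8],
    "x3": [5, 3, 8, 1, 4],
}

THRESHOLD = 0.8  # assumed cut-off for "highly correlated", not a standard value

# Collect every pair of predictors whose absolute correlation exceeds the cut-off.
correlated_pairs = [
    (a, b)
    for a, b in combinations(features, 2)
    if abs(pearson(features[a], features[b])) > THRESHOLD
]
print(correlated_pairs)
```

Only the `x1`/`x2` pair is flagged here, which matches how the toy data was constructed.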
It is a common practice to test data science aspirants on linear
regression, as it is the first algorithm that almost everyone studies in
Data Science/Machine Learning. Aspirants are expected to possess an
in-depth knowledge of these algorithms. We consulted hiring managers and
data scientists from various organisations to learn about the typical
linear regression questions they ask in an interview. Based on
their extensive feedback, a set of questions and answers was prepared to
help students in their interview conversations.
Q1. What is linear regression?
In simple terms, linear regression is a method of finding the best
straight line fitting the given data, i.e. finding the best linear
relationship between the independent and dependent variables.
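As a rough illustration of "finding the best straight line", here is a minimal sketch of an ordinary least squares fit for a single predictor, using the standard closed-form formulas for slope and intercept. The function name `fit_line` and the data points are made up for this example.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for one predictor."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying close to the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

Libraries such as scikit-learn do the same job for many predictors at once; the hand-written version is only meant to show what "best fit" means.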
Q1. What is accuracy?
Accuracy is the proportion of correct predictions out of all predictions made.
$$\text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{Total Number of Predictions}}$$
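To make the definition concrete, here is a small sketch that counts true positives and true negatives from two label lists and divides by the total number of predictions. The labels are invented for illustration, with 1 as the positive class.

```python
def accuracy(y_true, y_pred):
    """(TP + TN) / total number of predictions, for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return (tp + tn) / len(y_true)


# Hypothetical actual and predicted labels for ten observations.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

acc = accuracy(y_true, y_pred)
print(acc)  # 3 true positives + 5 true negatives out of 10 predictions
```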
Q2. Why is accuracy not a good measure for classification problems?
Accuracy is not a good measure for classification problems because it
gives equal importance to both false positives and false negatives.
However, this may not be the case in most business problems. For example,
in the case of cancer prediction, declaring a cancer as benign is more
serious than wrongly informing a patient that they have cancer.
Accuracy gives equal importance to both cases and cannot differentiate
between them.
Q1. What is the Maximum Likelihood Estimator (MLE)?
MLE chooses the values of the unknown parameters (the estimates) that
maximise the likelihood function. The method to find the MLE is to use
calculus: set the derivative of the likelihood function (in practice,
usually its logarithm) with respect to each unknown parameter to zero
and solve. For a binomial model, this is easy, but for a logistic
model there is no closed-form solution and the calculations are complex.
Computer programs using iterative numerical methods are used to derive
the MLE for logistic models.
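For the binomial case, setting the derivative of the log-likelihood to zero gives the familiar estimate of successes divided by trials. The sketch below checks this numerically with a crude grid search; the counts (7 successes in 10 trials) and the grid resolution are arbitrary choices for illustration, and a grid search is not how real software fits models.

```python
from math import log


def log_likelihood(p, successes, trials):
    """Binomial log-likelihood, dropping the constant term that does not depend on p."""
    return successes * log(p) + (trials - successes) * log(1 - p)


successes, trials = 7, 10  # hypothetical data

# Evaluate the log-likelihood on a grid of candidate probabilities in (0, 1).
candidates = [i / 1000 for i in range(1, 1000)]
p_hat = max(candidates, key=lambda p: log_likelihood(p, successes, trials))

# Closed-form answer obtained from the calculus argument.
closed_form = successes / trials

print(p_hat, closed_form)
```

The grid maximum and the closed-form estimate agree. For a logistic model the same idea applies, but the maximisation has to be done iteratively.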
(Here’s another approach to answering the question.)
OSPF (Open Shortest Path First) is one of the most widely used and important interior gateway routing protocols today.
Most important features of OSPF:
- It is an open standard, not tied to any single vendor!
- Very fast convergence time (close to that of EIGRP)
- Link-state routing protocol
- Supports multiple, equal cost routes to the same destination
- Supports both IPv4 and IPv6
- Uses Dijkstra's algorithm to find the shortest path tree, then populates the routing table with the resulting best paths.
- Allows the creation of areas and autonomous systems
- Minimizes routing update traffic
- Supports VLSM/CIDR
- Unlimited hop count (unlike RIP)
- Supports multi-vendor deployment.
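
The Dijkstra step mentioned in the list above can be sketched in a few lines. This is only a toy illustration of the shortest path tree calculation, not how any router actually implements OSPF: the router names, link costs, and the function name `shortest_path_tree` are all made up.

```python
import heapq


def shortest_path_tree(links, root):
    """Dijkstra's algorithm: lowest total cost and parent node for each reachable router."""
    dist = {root: 0}
    parent = {}
    queue = [(0, root)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, link_cost in links[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                parent[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    return dist, parent


# Hypothetical topology: router -> {neighbour: link cost}.
links = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 2, "R4": 1},
    "R3": {"R1": 1, "R2": 2, "R4": 5},
    "R4": {"R2": 1, "R3": 5},
}

dist, parent = shortest_path_tree(links, "R1")
print(dist)    # lowest cost from R1 to every router
print(parent)  # previous hop on each best path, i.e. the tree edges
```

In this toy topology the direct R1 to R2 link (cost 10) loses to the path through R3 (cost 3), which is the kind of decision that ends up in the routing table.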