## ValueError: Found input variables with inconsistent numbers of samples: [2935848, 2935849]

## Solution 1

The error occurs because your two DataFrames (train and sales) have different lengths: the train dataset has 2935848 samples and the sales dataset has 2935849. **Both datasets must have the same length for this to work properly**. Find out why the lengths do not match, then add or drop a row so that they do.
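One way to track down the extra row is to compare the two frames directly. The sketch below is only an illustration: `train` and `sales` are tiny hypothetical stand-ins for your real DataFrames, the `item_id` column is made up, and it assumes rows are meant to correspond by index label. If your frames are matched some other way (for example by a key column), the comparison would need to use that instead.

```python
import pandas as pd

# Hypothetical stand-ins for the real frames; `sales` has one extra row.
train = pd.DataFrame({"item_id": [1, 2, 3, 4, 5]})
sales = pd.DataFrame({"item_id": [1, 2, 3, 4, 5, 6]})

# Confirm the mismatch before calling train_test_split.
print(len(train), len(sales))

# Index labels that exist in `sales` but not in `train`.
extra_in_sales = sales.index.difference(train.index)
print(extra_in_sales.tolist())

# Keep only the rows whose index label appears in both frames.
common = train.index.intersection(sales.index)
train_aligned = train.loc[common]
sales_aligned = sales.loc[common]
```

Dropping the unmatched row is only safe once you know why it is there (a stray header, a duplicate, a missing record on the other side), so inspect it before discarding it.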

Second, but no less important, make sure you understand what `train_test_split` does and what your goal is. The function takes X and Y as inputs and returns `X_train`, `X_test`, `y_train`, and `y_test`. Reading your code, you are passing two X inputs (`X_train` and `X_sales`) with the same 5 features. I hope you are doing this for a reason; just be aware of it.

X holds all the samples with their features, and Y holds the corresponding output values you want to predict. Check that this matches your data, and evaluate whether `train_test_split` is the function you are looking for.
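For reference, here is a minimal sketch of the usual X/Y call, using random toy data rather than your dataset. The array sizes, `test_size`, and `random_state` values are arbitrary choices for illustration. The last part reproduces the same kind of error by passing a Y that is one sample short.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 samples with 5 features each, and one target per sample.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 5))
y = rng.normal(size=10)

# X and y have the same number of samples, so the split succeeds.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)

# Passing inputs of different lengths raises the ValueError from the question.
message = ""
try:
    train_test_split(X, y[:-1])
except ValueError as err:
    message = str(err)
print(message)
```

Every array passed to `train_test_split` is split along its first axis with the same shuffled indices, which is why all inputs must have the same number of samples.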

