ML models not appearing correctly, and I don't know why

Hi, I'm building something similar to the Plotly ML examples.

This is the code I have so far:


@app.callback(
    Output("graph", "figure"), 
    Input('dropdown', "value"), 
    Input('dropdown2', "value"), 
    Input('dropdown3', "value"))
def train_and_display(name,drop1,drop2):
    #df = df # replace with your own data source 
    #df=df.columns
    X = np.array(df[drop1].astype(float).dropna())
    Y= np.array(df[drop2].astype(float).dropna())
    print("long de x",len(X))
    print("long de y",len(Y))
    newX=pd.DataFrame(X)
    newY=pd.DataFrame(Y)
    if len(newX)>len(newY):
        w=len(newX)-len(newY)
        newXnew=newX.drop(newX.index[:w])
        newXnewn = np.array(newXnew.astype(float).dropna()).reshape(-1,1)
        newYnewn= np.array(newY.astype(float).dropna())
        print(newXnewn)
    elif len(newX)<len(newY):
        w=len(newY)-len(newX)
        newYnew=newY.drop(newY.index[:w])
        newYnewn= np.array(newYnew.astype(float).dropna())
        newXnewn = np.array(newX.astype(float).dropna()).reshape(-1,1)
        print(newYnewn)
    elif len(newX)==len(newY):
        newXnewn = np.array(newX.astype(float).dropna()).reshape(-1,1)
        newYnewn= np.array(newY.astype(float).dropna())
    else: print("good")
    
    
    print("long de x",len(newXnewn))
    print("long de y",len(newYnewn))
    print(newXnewn)
    print(newYnewn)
    X_train, X_test, y_train, y_test = train_test_split(
        newXnewn, newYnewn)
    model = models[name]()


    model.fit(X_train, y_train)

    x_range = np.linspace(newXnewn.min(), newXnewn.max(), 100)
    y_range = model.predict(x_range.reshape(-1, 1))

But for some reason I can't figure out, the figure shows only one point for the training set, one for the test set, and one for the model's prediction. What can I change to make it work?
(The data has more than one point.)

Hi,

Could you share how you are generating the figure in the callback too?

fig = go.Figure([
        go.Scatter(x=X_train.squeeze(), y=y_train, 
                   name='train', mode='markers'),
        go.Scatter(x=X_test.squeeze(), y=y_test, 
                   name='test', mode='markers'),
        go.Scatter(x=x_range, y=y_range, 
                   name='prediction')
    ])
    return fig

I'm hitting lots of problems and can't manage to get a scatter plot. How can I do this?

I changed some things, but I'm still getting errors:


@app.callback(
    Output("graph", "figure"), 
    Input('dropdown', "value"), 
    Input('dropdown2', "value"), 
    Input('dropdown3', "value"))
def train_and_display(name,drop1,drop2):
    #df = df # replace with your own data source 
    #df=df.columns
    X = np.array(df[drop1].astype(float).dropna())#.reshape(-1,1)
    Y= np.array(df[drop2].astype(float).dropna())
    print("long de x",len(X))
    print("long de y",len(Y))
    newX=pd.DataFrame(X)#((X),index=df[drop1],columns=df[drop1])
    newY=pd.DataFrame(Y)#((Y),index=df[drop1],columns=df[drop1])
    if len(newX)>len(newY):
        w=len(newX)-len(newY)
        newXnew=newX.drop(newX.index[:w])
        newXnewn = np.array(newXnew.astype(float).dropna())#.reshape(-1,1)
        newYnewn= np.array(newY.astype(float).dropna())
        newXnewnn=newXnewn.reshape(len(newY),1,order='C')
        newYnewnn=newYnewn.reshape(len(newY),1,order='C')
        print(newXnewnn.shape)
        print(newYnewnn.shape)
        #print(newXnewn)
    elif len(newX)<len(newY):
        w=len(newY)-len(newX)
        newYnew=newY.drop(newY.index[:w])
        newYnewn= np.array(newYnew.astype(float).dropna())
        newXnewn = np.array(newX.astype(float).dropna())#.reshape(-1,1)
        newXnewnn=newXnewn.reshape(len(newX),1,order='C')
        newYnewnn=newYnewn.reshape(len(newX),1,order='C')
        print(newXnewnn.shape)
        print(newYnewnn.shape)
        #print(newYnewn)
    elif len(newX)==len(newY):
        newXnewn = np.array(newX.astype(float).dropna())#.reshape(-1,1)
        newYnewn= np.array(newY.astype(float).dropna())
        newXnewnn=newXnewn.reshape(-1,len(newX))
        newYnewnn=newYnewn.reshape(1,len(newX))
        print(newXnewnn.shape)
        print(newYnewnn.shape)
    else: print("good")
    
    newXnewnnn=np.ones(len(newX),float)*newXnewnn
    print("long de x",len(newXnewnnn))
    print("long de y",len(newYnewnn))
    print(newXnewnnn)
    print(newYnewnn)
    X_train, X_test, y_train, y_test = train_test_split( 
        newXnewnnn, newYnewnn, train_size=0.2, test_size=0.8, shuffle=True)
    model = models[name]()


    model.fit(X_train, y_train)

    x_range = np.linspace(newXnewnnn.min(), newXnewnnn.max(), 100)
    #print(x_range.shape)
    #print(x_range)
    #x_ranges=x_range*np.ones(len(x_range),float)
    #print(x_ranges)
    y_range = model.predict(x_range.reshape(-1, len(x_range)))

    fig = go.Figure([
        go.Scatter(x=X_train.squeeze(), y=y_train, 
                   name='train', mode='markers'),
        go.Scatter(x=X_test.squeeze(), y=y_test, 
                   name='test', mode='markers'),
        go.Scatter(x=x_range, y=y_range, 
                   name='prediction')
    ])
    return fig

With all the changes I've made, I still only see 3 points scattered. What do I need to change?


@app.callback(
    Output("graph", "figure"), 
    Input('dropdown', "value"), 
    Input('dropdown2', "value"), 
    Input('dropdown3', "value"))
def train_and_display(name,drop1,drop2):
    #df = df # replace with your own data source 
    #df=df.columns
    X = np.array(df[drop1].astype(float).dropna())#.reshape(-1,1)
    Y= np.array(df[drop2].astype(float).dropna())
    print("long de x",len(X))
    print("long de y",len(Y))
    newX=pd.DataFrame(X)#((X),index=df[drop1],columns=df[drop1])
    newY=pd.DataFrame(Y)#((Y),index=df[drop1],columns=df[drop1])
    if len(newX)>len(newY):
        w=len(newX)-len(newY)
        newXnew=newX.drop(newX.index[:w])
        newXnewn = np.array(newXnew.astype(float).dropna())#.reshape(-1,1)
        newYnewn= np.array(newY.astype(float).dropna())
        newXnewnn=newXnewn.reshape(-len(newY),1,order='C')
        newYnewnn=newYnewn.reshape(len(newY),1,order='C')
        print(newXnewnn.shape)
        print(newYnewnn.shape)
        #print(newXnewn)
    elif len(newX)<len(newY):
        w=len(newY)-len(newX)
        newYnew=newY.drop(newY.index[:w])
        newYnewn= np.array(newYnew.astype(float).dropna())
        newXnewn = np.array(newX.astype(float).dropna())#.reshape(-1,1)
        newXnewnn=newXnewn.reshape(-len(newX),1,order='C')
        newYnewnn=newYnewn.reshape(len(newX),1,order='C')
        print(newXnewnn.shape)
        print(newYnewnn.shape)
        #print(newYnewn)
    elif len(newX)==len(newY):
        newXnewn = np.array(newX.astype(float).dropna())#.reshape(-1,1)
        newYnewn= np.array(newY.astype(float).dropna())
        newXnewnn=newXnewn.reshape(-len(newX),1,order='C')
        newYnewnn=newYnewn.reshape(len(newX),1,order='C')
        print(newXnewnn.shape)
        print(newYnewnn.shape)
    else: print("good")
    
    #newXnewnnn=np.ones(len(newX),float)*newXnewnn
    newXnewnnn=np.array(newXnewnn.astype(float))
    print("long de x",len(newXnewnnn))
    print("long de y",len(newYnewnn))
    print(newXnewnnn)
    print(newYnewnn)
    X_train, X_test, y_train, y_test = train_test_split( 
        newXnewnnn, newYnewnn, train_size=0.2, test_size=0.8, shuffle=True)
    model = models[name]()


    model.fit(X_train, y_train)

    x_range = np.linspace(newXnewnnn.min(), newXnewnnn.max(), 100)
    print(x_range.shape)
    print(x_range)
    #x_ranges=x_range*np.ones(len(x_range),float)
    #print(x_ranges)
    y_range = model.predict(x_range.reshape(-len(x_range),1))

    fig = go.Figure([
        go.Scatter(x=X_train.squeeze(), y=y_train, 
                   name='train', mode='markers'),
        go.Scatter(x=X_test.squeeze(), y=y_test, 
                   name='test', mode='markers'),
        go.Scatter(x=x_range, y=y_range, 
                   name='prediction')
    ])
    return fig

Edit: I must be getting the rows and columns wrong in the reshapes, because I still don't understand how those functions work.

It is a bit difficult to read your code because of all the array transformations and reshaping.

That said, looking at your figure, it seems to me that the arrays you are passing to go.Scatter have the wrong shape. Try reshaping them so that each one is 1-D, with shape (N,).

Should all the Scatter traces have the same shape? y_range and x_range have shape (100,), while the earlier arrays have (20,).

What can I do?

Also, the earlier arrays have shape (20,1) and the ranges have shape (100,). Does that matter?

As long as x and y for the same Scatter trace have the same shape, it should be fine.
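To make that rule concrete, here is a minimal sketch of a check that could be run on each trace before building the figure. The helper name `check_trace` is hypothetical (it is not part of Plotly), and it also enforces the 1-D requirement suggested earlier in the thread.

```python
import numpy as np

def check_trace(name, x, y):
    """Hypothetical helper: raise unless x and y are both 1-D with equal length."""
    x, y = np.asarray(x), np.asarray(y)
    if x.ndim != 1 or y.ndim != 1 or x.shape != y.shape:
        raise ValueError(f"{name}: x has shape {x.shape}, y has shape {y.shape}")
    return x, y

# Different traces may have different lengths; only each x/y pair must match.
check_trace("train", np.zeros(20), np.ones(20))
check_trace("prediction", np.zeros(100), np.ones(100))

# A (20, 1) column paired with a (20,) vector is rejected.
try:
    check_trace("train", np.zeros((20, 1)), np.ones(20))
except ValueError as err:
    print(err)
```

So the (100,) prediction range next to the (20,) training data is not a problem in itself; the (20,1) arrays are the ones to look at.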

But something is still wrong, and I don't know what.

If you still have some arrays with shape (20,1), reshape them to (20, ).

OK, how do I reshape it so the 1 goes away?

x.squeeze() or x.reshape(x.shape[0], )
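A small sketch of what those two calls do on a (20, 1) array, with `ravel()` added for comparison. The array `col` is made-up data standing in for something like X_train.

```python
import numpy as np

col = np.arange(20, dtype=float).reshape(-1, 1)  # shape (20, 1)

a = col.squeeze()              # drops every axis of length 1
b = col.reshape(col.shape[0])  # explicit 1-D target shape
c = col.ravel()                # flattens to 1-D regardless of input shape

print(col.shape, a.shape, b.shape, c.shape)

# Edge case: with a single row, squeeze() collapses to a 0-d array,
# while ravel() still returns a 1-D array of length 1.
one = np.array([[3.0]])
print(one.squeeze().shape, one.ravel().shape)
```

For plotting, `ravel()` may be the safer choice if a split could ever contain just one sample.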

I reshaped it using only the first dimension, but a new error appeared (in model.fit):

ValueError: Expected 2D array, got 1D array instead:
array=[99.  0.  0.  0. 99.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
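The two reshapes named in that message mean different things, which this sketch illustrates with a made-up `x_range`. The last line mirrors the `x_range.reshape(-1, len(x_range))` call from the second version of the callback above.

```python
import numpy as np

x_range = np.linspace(0.0, 99.0, 100)    # shape (100,)

as_samples = x_range.reshape(-1, 1)      # 100 samples with 1 feature each
as_one_sample = x_range.reshape(1, -1)   # 1 sample with 100 features
like_second_version = x_range.reshape(-1, len(x_range))  # same as (1, -1) here

print(as_samples.shape, as_one_sample.shape, like_second_version.shape)
```

For a single input column, scikit-learn estimators expect the first form for both fit and predict. The second form asks for one prediction from a 100-feature sample, which would give a single predicted point.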

I reshaped again and it seems to work (using x.reshape(x.shape[0], ) gave me errors):


long de x 20
long de y 132
newXnewnn shape (20,)
newYnewnn shape (20,)
long de x 20
long de y 20
[ 0.  0.  0.  0.  0.  0. 99.  0.  0.  0.  0.  0. 99.  0.  0.  0.  0. 99.
 99.  0.]
[99. 99. 99. 99.  0. 50. 99. 99. 99. 99. 99.  0. 50. 25. 33.  0.  0.  0.
  0.  0.]
X_train shape (15,)
y_train shape (15,)
X_test shape (5,)
y_test shape (5,)
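The first two lines of that output (20 versus 132) come from calling dropna() on each column separately, so the trimming step afterwards pairs values that did not come from the same rows. A possible alternative, sketched here with a made-up DataFrame and a hypothetical helper `paired_columns`, is to keep only rows where both columns have a value:

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the thread's df: two numeric columns with gaps.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0, np.nan, 6.0],
    "b": [2.0, 4.0, np.nan, 8.0, 10.0, 12.0],
})

def paired_columns(frame, x_col, y_col):
    """Hypothetical helper: return X as (n, 1) and y as (n,), rows kept in pairs."""
    x = frame[x_col].astype(float)
    y = frame[y_col].astype(float)
    keep = x.notna() & y.notna()           # rows where both values exist
    return x[keep].to_numpy().reshape(-1, 1), y[keep].to_numpy()

X, y = paired_columns(df, "a", "b")
print(X.shape, y.shape)
```

This would replace the whole if/elif block in the callback, assuming the two dropdowns select columns of the same DataFrame as they do above.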

Do I reshape X_train and y_train to (n,1)?

My suggestion is to reshape the data that you are giving as x and y to go.Scatter, not the data before that.

OK, it worked. I needed to add a squeeze() to the y arrays. But it only works with the decision tree. What else could I be getting wrong? (When I choose KNN or regression, no prediction is shown.)

OK, solved. I just needed to add .squeeze() to all the x and y arrays.
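A likely explanation for why only the decision tree worked at first: when y is fitted as a column of shape (n, 1), the shape returned by predict can differ between estimators. The sketch below prints those shapes for three scikit-learn regressors. The data is random and only illustrative, and the exact behaviour may depend on the scikit-learn version, so treat it as an assumption to verify locally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 99, size=(20, 1))        # (20, 1): what fit expects for X
y_2d = 2 * X + rng.normal(size=(20, 1))     # (20, 1): y kept as a column, as in the thread
x_range = np.linspace(X.min(), X.max(), 100)

shapes = {}
for cls in (DecisionTreeRegressor, LinearRegression, KNeighborsRegressor):
    model = cls().fit(X, y_2d)
    pred = model.predict(x_range.reshape(-1, 1))
    shapes[cls.__name__] = pred.shape       # may be (100,) or (100, 1)
    y_plot = np.squeeze(pred)               # 1-D either way, safe to pass as y to go.Scatter
    assert y_plot.shape == x_range.shape

print(shapes)
```

Squeezing right before plotting, as suggested above, keeps the 2-D arrays for fit and predict and gives go.Scatter 1-D arrays whichever model is selected.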