Solving the problem of overlapping text labels in a scatterplot by manually assigning the position of each label

I’ve got a scatter plot with a state abbreviation for each text label. The trouble is overlapping text everywhere. So, I hard-coded a dictionary with a text position for all 51 points (I’m including D.C.) that will solve the problem. Example:
position_dict = {
    'DC': 'top center',
    'MS': 'top center',
    'LA': 'top center',
    'GA': 'middle left',

}

Question: How do I write a function that looks up each key in the dictionary and returns the value for that key?

        fig.update_traces(textposition=my_function_that_checks_the_dict_and_returns_the_right_value(df['StateAbb']))

Is this even possible? Or do I need to just kludge my way through this by manually sorting the dataframe so that my text position = an np.array of text positions that are ordered in sync with the order of the dataframe?


Hey @russmcb ,

Maybe this will give you some ideas:

#Dummy Data

name	lat	long	abb
state1	26	28	s1
state2	34	36	s2
state3	26	35	s3
state4	27	33	s4

#Check State Names And Return Proper Positions

def set_text_position(name):
    if name in ['state1', 'state2']:
        return 'bottom center'
    else:
        return 'middle left'

#Draw Figure

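import plotly.graph_objects as go

# `data` is assumed to be a DataFrame holding the dummy data above
fig = go.Figure()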

fig.add_trace(go.Scatter(x = data['long'], y = data['lat'],
                         mode = 'markers+text',
                         text = data['abb'],
                         textposition = list(map(set_text_position, data['name']))  #Call function here
                         )
              )

fig.show()
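The same idea can be driven by the dictionary from the original question instead of hard-coded lists. Below is a minimal sketch, not a definitive implementation: the helper name `positions_for` and the fallback value `default` are assumptions made for illustration, and `labels` stands in for a column such as `df['StateAbb']`.

```python
def positions_for(labels, position_dict, default="top center"):
    """Return one text position per label, in the same order as `labels`.

    Labels missing from `position_dict` fall back to `default`.
    """
    return [position_dict.get(label, default) for label in labels]


# Partial dictionary, as in the question
position_dict = {
    "DC": "top center",
    "MS": "top center",
    "LA": "top center",
    "GA": "middle left",
}

# Stand-in for df['StateAbb']; any iterable of labels works
labels = ["GA", "TX", "DC"]
positions = positions_for(labels, position_dict)
print(positions)
```

Because the lookup is by key, the result follows the row order of the data, so the dictionary itself does not need to be sorted. With a figure that holds a single scatter trace, the list could then be passed as `fig.update_traces(textposition=positions_for(df['StateAbb'], position_dict))`.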

#Example Output


Thanks @akroma!

I realized there was no way to avoid manually positioning every label based on where each dot fell on the scatterplot, so I kludged my way through: I made sure the order of the dictionary, position_dict, matched the row order of my dataframe, and then did:

    fig.update_traces(textposition=list(position_dict.values()))

This required going through all 51 labels and manually assigning each one a position, but it gave a better result than randomly cycling through text positions.
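Since this approach depends on the dictionary covering every label, a quick sanity check before plotting can catch gaps and typos. The following is a rough sketch under stated assumptions: `check_positions` is a hypothetical helper, and `VALID_POSITIONS` lists the nine `textposition` strings that Plotly scatter traces accept.

```python
# The nine textposition values accepted by Plotly scatter traces
VALID_POSITIONS = {
    "top left", "top center", "top right",
    "middle left", "middle center", "middle right",
    "bottom left", "bottom center", "bottom right",
}


def check_positions(labels, position_dict):
    """Return (missing, invalid).

    missing: labels that have no entry in position_dict
    invalid: dict entries whose value is not a valid text position
    """
    missing = [label for label in labels if label not in position_dict]
    invalid = {
        key: value
        for key, value in position_dict.items()
        if value not in VALID_POSITIONS
    }
    return missing, invalid


# Illustrative data only: one label is missing and one value is misspelled
labels = ["DC", "MS", "TX"]
position_dict = {"DC": "top center", "MS": "top centre"}
missing, invalid = check_positions(labels, position_dict)
print("missing:", missing)
print("invalid:", invalid)
```

If both results come back empty, every label has a usable position, and looking positions up by key rather than relying on dictionary order should be safe.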

Thanks for the help!


I did something similar to @russmcb, but with a bit of a shortcut.

I first checked which default position suits most of my data without any tweaking, by plotting the points labeled with the dataframe's unique identifier.

Then I added a text_position column to my dataframe with that default position as the default text position.

Then, using df.loc with a condition, I replaced the default text position for the overlapping labels only.

Example code: Let's say most of my data doesn’t overlap with ‘middle right’.

exp_df['text_position'] = 'middle right'
exp_df.loc[exp_df['exp'].isin(['RT05', 'RT08', 'RT12']), 'text_position'] = 'bottom right'

# in go.Scatter
fig.add_trace(go.Scatter(
        x=exp_df['...'],
        y=exp_df['...'],
        mode='markers+text',
        text=exp_df['...'],
        textposition=exp_df['text_position'],  # column created above
        marker=dict(
            color=exp_df['...'], 
            line_width=2,
            size=15, 
            colorscale='Viridis', 
            showscale=True
        ),
    )
)

Hope this helps.