
Getting the Trace that a Datapoint is Enclosed by in Ternary Plot

I have three regions on a ternary plot, drawn as trace1, trace2 and trace3. If I plot a series of datapoints on the ternary plot, one at a time in a for loop, how can I find which region a given datapoint falls in at the end of each iteration? In my example, the first tuple is in the top region (region1), the second is in the bottom-left region (region2), and the third is in the bottom-right region (region3). In reality, I have millions of datapoints whose regions I need to retrieve from a more complicated diagram, so I don’t need the plot to be rendered at the end of each iteration, and I would like to know how to clear the last datapoint before the next iteration.

import plotly.graph_objs as go

data_points = [(60, 20, 20), (20, 70, 10), (30, 20, 50)]

trace1 = go.Scatterternary( a=[40, 100, 40], b=[60, 0, 0], c=[0, 0, 60], fill='toself', name='region1' )
trace2 = go.Scatterternary( a=[0, 40, 40, 0], b=[100, 60, 30, 50], c=[0, 0, 30, 50], fill='toself', name='region2' )
trace3 = go.Scatterternary( a=[40, 0, 0, 40], b=[30, 50, 0, 0], c=[30, 50, 100, 60], fill='toself', name='region3' )

for point in data_points:
    data_point = go.Scatterternary( a=[point[0]], b=[point[1]], c=[point[2]], name='datapoint' )
    fig = go.Figure( data=[trace1, trace2, trace3, data_point] )
    fig.show( )

Below is the figure made by the third iteration showing the datapoint that is in region3. So, for example, I would like the for loop to print “region3” at the third iteration.

I am considering a workaround that uses the PIL library to load the plot as an image, so that I can look up the color value of any pixel. Since each color corresponds to a region, I can use a color index to map a pixel to its region. I cleared the ternary diagram of any additional markings so that it is pure color (below is the actual image). One issue I’m having is that the colors bleed together over 2-3 pixels at the borders of each region, which breaks the logic of my color index.

Hi @af2k15, welcome to the forum! Could you use scikit-image’s points_in_poly function? If you know the geometry of each domain, you can check whether your points lie inside it or not. I’m not sure it works with the ternary geometry; if not, you can convert to Cartesian coordinates by keeping only a and b.
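To illustrate the "keep only a and b" idea, here is a minimal sketch that tests the three example points against the three regions from the question, using (a, b) pairs as plain 2D coordinates. Dropping c is an affine change of coordinates, so the regions stay polygons and the inside/outside answer is unchanged. As a stand-in for scikit-image, the sketch uses a hand-written ray-casting test so that it runs without extra dependencies; `point_in_polygon`, `find_region` and `REGIONS` are hypothetical names introduced for illustration only.

```python
# Region outlines from the question, stored as (a, b) vertex pairs.
# c is dropped because a + b + c is constant (100 here).
REGIONS = {
    "region1": [(40, 60), (100, 0), (40, 0)],
    "region2": [(0, 100), (40, 60), (40, 30), (0, 50)],
    "region3": [(40, 30), (0, 50), (0, 0), (40, 0)],
}


def point_in_polygon(x, y, vertices):
    """Ray-casting test: count polygon edges crossed by a ray going right from (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_region(point):
    """Return the name of the first region containing the (a, b, c) point, or None."""
    a, b, _c = point
    for name, vertices in REGIONS.items():
        if point_in_polygon(a, b, vertices):
            return name
    return None


data_points = [(60, 20, 20), (20, 70, 10), (30, 20, 50)]
for point in data_points:
    print(point, find_region(point))  # region1, region2, region3 in that order
```

No figure is built here, so there is nothing to render or clear between iterations. Points that fall exactly on an edge or on a border shared by two regions are not handled specially in this sketch and may be assigned to either neighbor or to none, so they would need an explicit rule.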

Thanks @Emmanuelle for the suggestion to use scikit-image. The points_in_poly function doesn’t need an image, just an array of points and the polygon vertices, so I ended up not needing the Plotly image. However, I think rendering the Plotly image and looking up pixel colors in a color index with the PIL library is the more efficient method when you have millions of datapoints, because you don’t have to loop through the polygons to find the right one. I went with the scikit-image function because I ran into fewer special cases that made the code convoluted. Those special cases occurred when a datapoint fell on the edges of the triangle or on the borders between polygons.

Both scikit-image and PIL require Cartesian coordinates. I used:

x = 0.5*a + c
y = a

The height/width ratio of an equilateral triangle is √3/2, or about 866/1000. Since my inputs were percentages, I scaled x by 1000/100 and y by 866/100. For the Plotly method, I made the image 1000 pixels wide by 866 pixels high.
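The conversion and scaling described above can be sketched as a small helper. This is an illustration under stated assumptions, not the exact code used: the function name `ternary_to_cartesian` and its `width`/`height` parameters are hypothetical, and dividing by the sum of a, b and c is an added convenience so that inputs need not sum to exactly 100.

```python
import math

TRIANGLE_WIDTH = 1000  # base of the triangle in output units (pixels)
TRIANGLE_HEIGHT = 866  # approximately 1000 * sqrt(3) / 2


def ternary_to_cartesian(a, b, c, width=TRIANGLE_WIDTH, height=TRIANGLE_HEIGHT):
    """Map ternary (a, b, c) to Cartesian (x, y).

    The b corner maps to (0, 0), the c corner to (width, 0),
    and the a corner (the apex) to (width / 2, height).
    """
    total = a + b + c  # assumption: normalize so any positive sum works
    x = (0.5 * a + c) / total * width
    y = a / total * height
    return x, y


print(ternary_to_cartesian(100, 0, 0))  # apex
print(ternary_to_cartesian(0, 100, 0))  # bottom-left corner
print(ternary_to_cartesian(0, 0, 100))  # bottom-right corner
```

Here y grows upward from the base of the triangle. Image libraries such as PIL count pixel rows downward from the top, so for a pixel lookup the row would be roughly `height - y`.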

The code seems to be giving the expected output, but I’ll still be doing a little more testing.

Much appreciated.