How to highlight (set different colors for) specific words in long text? Tags do not work

Hello everyone!

I have been trying to highlight specific words in long text, but I have not succeeded. Is there a method for doing this in Dash?
I thought that I could add tags around the words I want to highlight and then use css to create a new class to assign any chosen color. For instance, if I wanted to highlight the words inflated and surf_thickness using a red font, I would use tags in the following way:

Executing commands from fv_snaps_<tag>surf_thickness</tag>_disptmpl_<tag>inflated</tag>_thresh_1.0-3.5_lh.txt

However, this does not work. Any suggestions about how to achieve this?

Thank you for your help! :slight_smile:

Hello @numam,

In order to change colors in a string, you’ll need to use span to wrap the text you want to change.
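As a minimal sketch of that idea (not necessarily the exact code from the original reply), the file name from the question can be cut into pieces, with each highlighted piece becoming an `html.Span` and the rest staying plain strings. The names `pieces` and `to_children` and the inline `style` are assumptions made for illustration; a `className` plus a CSS rule in the assets folder would work the same way.

```python
# (text, highlighted?) pieces of the file name from the question
pieces = [
    ("Executing commands from fv_snaps_", False),
    ("surf_thickness", True),
    ("_disptmpl_", False),
    ("inflated", True),
    ("_thresh_1.0-3.5_lh.txt", False),
]


def to_children(pieces, color="red"):
    """Turn the pieces into Dash children: plain strings plus html.Span for highlights."""
    # Imported here so the list above can be inspected without Dash installed.
    from dash import html

    return [
        html.Span(text, style={"color": color}) if highlighted else text
        for text, highlighted in pieces
    ]


# In a layout this would be used as: html.Div(to_children(pieces))
```

Dash accepts a list that mixes strings and components as `children`, which is what makes this approach work.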


edit: Rofl spam… gotta love autocorrect…

Thank you so much for your answer @jinnyzor!!! It did the trick. I owe you another one! You are the best! thank you so much!!! :slight_smile: :mechanical_arm:

I did it using a very small snippet of the data I need to highlight, which has several thousand lines of text (I am highlighting error logs of MRI data I am processing). When I try to apply the code to all the data, I get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/cl/sd_8grvj2b1d1pk5cb96kdp00000gn/T/ipykernel_2625/1762203074.py in <module>
      4 app.layout = dbc.Container([
      5     html.Div([
----> 6        eval(allHighlightedTXT)
      7     ], style={"color":"white", 'font-size': '12pt'})
      8 ])

ValueError: source code string cannot contain null bytes

Do you know how I can fix this? When I check the text (printing it in Jupyter Notebook), everything looks correct. I suspect the error is caused by whitespace or other invisible characters in the text that I cannot see in Jupyter. If anyone has seen this before and knows how to solve it, please let me know :slight_smile:
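The message itself comes from `eval()`: Python refuses to compile source text that contains NUL characters (`\x00`), and those are invisible when the text is printed. A minimal sketch for finding and removing them, assuming the log is already in a Python string; `sample` is a made-up stand-in, and the helper names are hypothetical. Where the NUL characters come from is a guess here; one common cause is reading a file with the wrong encoding.

```python
def find_null_bytes(text):
    """Return the index of every NUL character in text."""
    return [i for i, ch in enumerate(text) if ch == "\x00"]


def strip_null_bytes(text):
    """Return text with all NUL characters removed."""
    return text.replace("\x00", "")


sample = "Executing\x00 commands"   # made-up sample containing one NUL character

print(repr(sample))             # repr() makes the hidden character visible
print(find_null_bytes(sample))  # positions of the NUL characters
clean = strip_null_bytes(sample)
```

Using `repr()` instead of `print()` on a suspicious slice of the log is usually the quickest way to see what is really in it.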

Here is the sample code for how I did it, in case it benefits others. Maybe there is a better way, but this is how I applied your suggestion:

# NEGATIVE LOGS HIGHLIGHTING 
"""
First layer wraps keywords with '' and second layer wraps keywords with the html.Span component.
The Span components are categorized using className (positiveLogs and negativeLogs).
Use a CSS file in the assets folder to create the classes positiveLogs and negativeLogs so you can choose the color and text
effects for each set of keywords.
"""
textToBeHighlighted = "Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_rh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_rh.txtEExecuting commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt "

# NEGATIVE LOGS HIGHLIGHTING 
firstLayerKeywords = ["txt", "error", "failed", "can't", "doesn't","could not open", "not reading","unable", "not found","mgz", "mgh"]
nfirstLayerPos = find_MultipleKeywordsInLog(firstLayerKeywords, textToBeHighlighted)
nfirstLayerHighlights= highlightKeywords(nfirstLayerPos, textToBeHighlighted, start_highlight="''", end_highlight="''")

secondLayerKeywords = ["'txt'", "'error'", "'failed'", "'can't'", "'doesn't'","'could not open'", "'not reading'","'unable'", "'not found'","'mgz'", "'mgh'"]
nSecLayerPos = find_MultipleKeywordsInLog(secondLayerKeywords, nfirstLayerHighlights)
nSecLayerHighlights = highlightKeywords(nSecLayerPos, nfirstLayerHighlights, start_highlight=",html.Span(", end_highlight=",className= 'negativeLogs effect-shine'),")

# POSITIVE LOGS HIGHLIGHTING 
PositiveFirstLayerKeywords = ["snap", "disptmpl"]
pfirstLayerPos = find_MultipleKeywordsInLog(PositiveFirstLayerKeywords, nSecLayerHighlights)
pfirstLayerHighlights= highlightKeywords(pfirstLayerPos ,nSecLayerHighlights, start_highlight="''", end_highlight="''")

PositiveSecLayerKeywords = ["'snap'", "'disptmpl'"]
pSecLayerPos = find_MultipleKeywordsInLog(PositiveSecLayerKeywords, pfirstLayerHighlights)
pSecLayerHighlights = highlightKeywords(pSecLayerPos, pfirstLayerHighlights, start_highlight=",html.Span(", end_highlight=",className= 'positiveLogs effect-shine'),")

allHighlightedTXT = ''.join(("html.Div(['",pSecLayerHighlights,"'])"))

app = JupyterDash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP, dbc.icons.FONT_AWESOME], external_scripts=[
        {'src': 'assets/spanText.js'}])

app.layout = dbc.Container([
    html.Div([
       eval(allHighlightedTXT)
    ], style={"color":"white", 'font-size': '12pt'})
])

app.run_server(mode ="external", debug=True)

## FUNCTIONS USED ABOVE FOR FINDING AND HIGHLIGHTING THE KEYWORDS (I COULD NOT ATTACH THEM TO THIS POST, BUT HERE THEY ARE)

import regex as re 

def find_MultipleKeywordsInLog(keywords, text):
    # GET ALL THE INDXS FOR ALL THE KEYWORDS AND PLACE THEM IN LIST AS PAIRS
    kwPositions = []
    for kw in keywords:
        matches = re.finditer(kw, text, flags=re.I)
        for match in matches:
            start, end = match.span()
            kwPositions += [[start, end]]
    # SORT THE LIST IN ASCENDING ORDER
    kwPositions.sort(key=lambda y: y[0])
    
    # PUT THE INDICES IN ONE LIST 
    keywordIndxs = []
    for i in range(0,len(kwPositions)):
        keywordIndxs += kwPositions[i][0], kwPositions[i][1]
    return keywordIndxs


def highlightKeywords(kwPositions, text, start_highlight="<mark>", end_highlight="</mark>"):
    entireTextWithhighlights = ""
    for i, (start, end) in enumerate(zip([None] + kwPositions, kwPositions + [None])):
        if i % 2:  # odd segments are highlighted
            entireTextWithhighlights += start_highlight + text[start:end] + end_highlight
        else:      # even segments are not
            entireTextWithhighlights += text[start:end]
    return entireTextWithhighlights
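To see what these two helpers produce on their own, here is a condensed sketch with the default `<mark>` wrappers. It makes two assumptions that differ from the post: it uses the standard-library `re` module instead of `regex`, and it passes each keyword through `re.escape` so it is treated as literal text. The function names are shortened for illustration.

```python
import re


def find_keyword_positions(keywords, text):
    """Return a flat, sorted list [start1, end1, start2, end2, ...] of keyword matches."""
    spans = []
    for kw in keywords:
        # re.escape treats the keyword as literal text rather than as a pattern
        for match in re.finditer(re.escape(kw), text, flags=re.I):
            spans.append(match.span())
    spans.sort()
    return [index for span in spans for index in span]


def highlight_keywords(positions, text, start="<mark>", end="</mark>"):
    """Wrap every matched slice of text in the start/end markers."""
    out = []
    # Consecutive boundaries cut the text into slices; odd slices are the matches.
    for i, (a, b) in enumerate(zip([None] + positions, positions + [None])):
        piece = text[a:b]
        out.append(start + piece + end if i % 2 else piece)
    return "".join(out)


text = "fv_snaps_surf_thickness_disptmpl_inflated_lh.txt"
positions = find_keyword_positions(["snap", "disptmpl"], text)
result = highlight_keywords(positions, text)
print(result)
```

Note that the flat list of boundaries only alternates correctly when the matches do not overlap, so keywords that contain one another would need extra handling.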

Typically, you’ll want to stay away from “eval”, as it executes the string as Python code…

Say I pass a string of text:

import shutil
shutil.rmtree()

This would delete your files.

You are better off building a function that takes the string of text and replaces your strings.
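To make the risk concrete without deleting anything, here is a small sketch with a harmless stand-in. `eval()` evaluates a single expression, so a malicious string would typically reach for something like `__import__(...)`; the `untrusted` string below is made up for illustration. `ast.literal_eval` from the standard library only accepts plain literals and rejects the same string. It cannot build Dash components, so for the highlighting itself, constructing the components directly is still the way to go.

```python
import ast

untrusted = "__import__('os').getcwd()"   # harmless stand-in for something destructive

# eval() runs the expression: this really imports os and calls getcwd()
print(type(eval(untrusted)))

# ast.literal_eval() only accepts literals (strings, numbers, lists, dicts, ...)
try:
    ast.literal_eval(untrusted)
    rejected = False
except ValueError:
    rejected = True
print("rejected by literal_eval:", rejected)

# Plain data still parses fine
parsed = ast.literal_eval("['txt', 'error', 'failed']")
```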

Thank you for letting me know :grimacing: It is my first time dealing with having constructed a string that I want to execute. I did not know that using eval() is not good practice. I will find a different solution.

Give this a try and see if it works for you. :slight_smile:


from dash import html
import dash_bootstrap_components as dbc
from jupyter_dash import JupyterDash

# NEGATIVE LOGS HIGHLIGHTING
"""
Each keyword is wrapped in an html.Span component by splitting the text around it.
The Span components are categorized using className (positiveLogs and negativeLogs).
Use a CSS file in the assets folder to create the classes positiveLogs and negativeLogs so you can choose the color and text
effects for each set of keywords.
"""
textToBeHighlighted = ["Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_rh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_rh.txtEExecuting commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt "]

negativeWords = ["txt", "error", "failed", "can't", "doesn't", "could not open", "not reading", "unable",
                      "not found", "mgz", "mgh"]
positiveWords = ["snap", "disptmpl"]


def highlight_word(parts, word, class_name):
    # Split every plain-string part around `word` and put a Span between the pieces.
    # Parts that are already components (earlier highlights) pass through untouched.
    new_parts = []
    for part in parts:
        if isinstance(part, str):
            pieces = part.split(word)
            for n, piece in enumerate(pieces):
                new_parts.append(piece)
                if n != len(pieces) - 1:
                    new_parts.append(html.Span(word, className=class_name))
        else:
            new_parts.append(part)
    return new_parts


for word in negativeWords:
    textToBeHighlighted = highlight_word(textToBeHighlighted, word, 'negativeLogs effect-shine')

for word in positiveWords:
    textToBeHighlighted = highlight_word(textToBeHighlighted, word, 'positiveLogs effect-shine')

app = JupyterDash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP, dbc.icons.FONT_AWESOME], external_scripts=[
    {'src': 'assets/spanText.js'}])

app.layout = dbc.Container([
    html.Div(textToBeHighlighted, style={"color": "white", 'font-size': '12pt'})
])

app.run_server(mode="external", debug=True)
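One difference from the regex helpers earlier in the thread is that `str.split` is case-sensitive, so "Error" would not be caught by "error". A possible variant, sketched here as an assumption rather than as part of the original answer, does the splitting in a single case-insensitive pass with `re.split`. The names `split_with_classes`, `to_children`, and `classes` are hypothetical, and the input line is a made-up example.

```python
import re


def split_with_classes(text, class_by_word):
    """Return (piece, class_name) pairs; class_name is None for plain text.

    Keys of class_by_word are expected to be lowercase.
    """
    if not class_by_word:
        return [(text, None)] if text else []
    # Longest keywords first so a longer keyword wins over a shorter one it contains.
    words = sorted(class_by_word, key=len, reverse=True)
    pattern = re.compile("(" + "|".join(re.escape(w) for w in words) + ")", flags=re.I)

    pairs = []
    # With one capturing group, re.split returns [plain, match, plain, match, ..., plain].
    for n, piece in enumerate(pattern.split(text)):
        if n % 2:
            pairs.append((piece, class_by_word[piece.lower()]))
        elif piece:
            pairs.append((piece, None))
    return pairs


def to_children(pairs):
    """Convert the pairs into Dash children (strings and html.Span components)."""
    from dash import html  # imported lazily so the splitting can be tried without Dash

    return [piece if cls is None else html.Span(piece, className=cls) for piece, cls in pairs]


classes = {
    "snap": "positiveLogs effect-shine",
    "disptmpl": "positiveLogs effect-shine",
    "txt": "negativeLogs effect-shine",
}
pairs = split_with_classes("fv_SNAPs_disptmpl_lh.txt", classes)
```

The result of `to_children(pairs)` could then be passed to `html.Div` in place of `textToBeHighlighted`.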

@jinnyzor You just blew my mind :exploding_head: Such a succinct and elegant solution. You are my hero. I did not have a clue that I could do that with split!!! I always used it to split strings based on symbols and not characters :exploding_head: It worked!
But when I ran one of the log text files through the code, it broke down. I got an error: “Maximum call stack size exceeded” and the array does not look right. Here is a snippet:

['2', '0', '2', '2', '-', '1', '2', '-', '0', '3', ' ', '0', '1', ':', '4', '3', ':', '0', '7', ':', ' ', 'E', 'x', 't', 'r', 'a', 'c', 't', 'i', 'o', 'n', ' ', 'o', 'f', ' ', 'S', 'n', 'a', 'p', 's', 'h', 'o', 't', 's', ' ', 'S', 't', 'a', 'r', 't', 'e', 'd', ' ', '.', '.', '.', '\n', 'C', 'h', 'e', 'c', 'k', 'i', 'n', 'g', ' ', 'f', 'o', 'r', ' ', 's', 'u', 'b', 'j', 'e', 'c', 't', ' ', 'l', 'i', 's', 't', '.', '.', '.', '\n', 's', 'u', 'b', '-', 'D', 'Z', '1', '7', 'B', 'H', '-', 's', 'e', 's', '-', '1', '_', 'r', 'u', 'n', 'l', 'i', 's', 't', '4', 'Q', 'C', '.', 't', 'x', 't', ' ', 'S', 'u', 'b', 'j', 'e', 'c', 't', ' ', 'L', 'i', 's', 't', ' ', 's', 'u', 'c', 'c', 'e', 's', 'f', 'u', 'l', 'l', 'y', ' ', 'M', 'a', 'd', 'e', '\n', 'C', 'h', 'e', 'c', 'k', 'i', 'n', 'g', ' ', 'f', 'o', 'r', ' ', 'T', 'h', 'i', 'c', 'k', 'n', 'e', 's', 's', ' ', 'f', 'i', 'l', 'e', ' ', 'i', 'n', ' ', 'f', 's', 'a', 'v', 'e', 'r', 'a', 'g', 'e', ' ', 's', 'p', 'a', 'c', 'e', ' ', '.', '.', '.', '\n', 'E', 'x', 'i', 's', 't', 'i', 'n', 'g', ' ', 'R', 'H', ' ', 'f', 's', 'a', 'v', 'e', 'r', 'a', 'g', 'e', ' ', 't', 'h', 'i', 'c', 'k', 'n', 'e', 's', 's', ' ', 'f', 'i', 'l', 'e', ' ', 'f', 'o', 'r', ' ', 'r', 'u', 'n', '-', '7', ' ', 'w', 'a', 's', ' ', 'f', 'o', 'u', 'n', 'd', '.', ' ', 'I', 't', ' ', 'w', 'i', 'l', 'l', ' ', 'b', 'e', ' ', 'u', 's', 'e', 'd', ' ', 'f', 'o', 'r', ' ', 's', 'n', 'a', 'p', 's', 'h', 'o', 't', 's', '.', '\n', 'E', 'x', 'i', 's', 't', 'i', 'n', 'g', ' ', 'L', 'H', ' ', 'f', 's', 'a', 'v', 'e', 'r', 'a', 'g', 'e', ' ', 't', 'h', 'i', 'c', 'k', 'n', 'e', 's', 's', ' ', 'f', 'i', 'l', 'e', ' ', 'f', 'o', 'r', ' ', 'r', 'u', 'n', '-', '7', ' ', 'w', 'a', 's', ' ', 'f', 'o', 'u', 'n', 'd', '.', ' ', 'I', 't', ' ', 'w', 'i', 'l', 'l', ' ', 'b', 'e', ' ', 'u', 's', 'e', 'd', ' ', 'f', 'o', 'r', ' ', 's', 'n', 'a', 'p', 's', 'h', 'o', 't', 's', '.', '\n', 'C', 'h', 'e', 'c', 'k', 'i', 'n', 'g', ' ', 'f', 'o', 'r', ' ', 'T', 'h', 'i', 'c', 'k', 'n', 'e', 's', 
's', ' ', 'f', 'i', 'l', 'e', ' ', 'i', 'n', ' ', 'f', 's', 'a', 'v', 'e', 'r', 'a', 'g', 'e', ' ', 's', 'p', 'a', 'c', 'e', ' ', '.', '.', '.', '\n', 'E', 'x', 'i', 's', 't', 'i', 'n', 'g', ' ', 'R', 'H', ' ', 'f', 's', 'a', 'v', 'e', 'r', 'a', 'g', 'e', ' ', 't', 'h', 'i', 'c', 'k', 'n', 'e', 's', 's', ' ', 'f', 'i', 'l', 'e', ' ', 'f', 'o', 'r', ' ', 'r', 'u', 'n', '-', '5', ' ', 'w', 'a', 's', ' ', 'f', 'o', 'u', 'n', 'd', '.', ' ', 'I', 't', ' ', 'w', 'i', 'l', 'l', ' ', 'b', 'e', ' ', 'u', 's', 'e', 'd', ' ', 'f', 'o', 'r', ' ', 's', 'n', 'a', 'p', 's', 'h', 'o', 't', 's', '.', '\n', 'E', 'x', 'i', 's', 't', 'i', 'n', 'g', ' ', 'L']

I think I can troubleshoot this, but here is the text file I am trying to run through it, in case you would like to check what is going on: Box

You need to wrap the textfile as a list:

with open(file, 'r') as f:
     textToBeHighlighted = [f.read()]
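The reason this matters is that the highlighting code loops over `textToBeHighlighted`. A short sketch of the difference, with `log_text` as a made-up stand-in for `f.read()`:

```python
log_text = "snap ok\nerror"  # made-up stand-in for f.read()

# Iterating over a bare string yields one character at a time ...
from_string = [x for x in log_text]

# ... while iterating over a one-item list yields the whole text once.
from_list = [x for x in [log_text]]

print(from_string[:4])
print(from_list)
```

With a bare string, every character ends up as its own child in the layout, which matches the array of single characters shown above and is probably what led to the “Maximum call stack size exceeded” error in the browser.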

I came to tell you that doing that would make it work perfectly and you already knew!!! You are the best. Thank you so much!!

Here’s a simple way using dmc.Highlight if you only want to highlight words in one color.
docs: Dash Mantine Components - Highlight

import dash_mantine_components as dmc

text = """Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_rh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt Executing commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_rh.txtEExecuting commands from fv_snaps_surf_thickness_disptmpl_inflated_thresh_1.0-3.5_lh.txt"""

words_to_highlight = ["snap", "disptmpl"]

app.layout = dmc.Highlight(text, highlight=words_to_highlight, highlightColor="green")

Thank you for the reply @snehilvj ! I did not know this component existed :slight_smile: