[SOLVED] How to set a conditional color range using np.where in Python

Issue

I have a data frame that looks like:

print(file.head())

            miRNAs    baseMean  log2FoldChange          padj
0    hsa-miR-31-5p  221.442806       -7.037259  4.360127e-27
1   hsa-miR-337-5p   14.711123       -5.790422  4.556183e-01
2  hsa-miR-196b-5p  162.278255       -5.652917  1.365264e-03
3   hsa-miR-584-5p    6.430919       -5.554578  4.077578e-04
4  hsa-miR-196a-5p  455.152841       -5.361830  1.019622e-59

I want to color the points according to the value in the file['padj'] column, like this:

file['padj'] > 0.01 #gray
file['padj'] <= 0.01 #red
file['padj'] <= 0.0001 #orange

I tried np.where, but only managed to handle one of the conditions:

p = file.plot.scatter(x='baseMean',y='log2FoldChange',c=np.where(np.abs(file['padj'])>0.01, 'gray', 'b'), logx=True)

# specifying horizontal line type
plt.axhline(y = 0, color = 'r', linestyle = '-')
plt.show()

I could not work out how to pass multiple conditions to np.where. Any help is appreciated.

Solution

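You could make a color column like so. The order matters: each later assignment overwrites the earlier one, so the default goes first and the strictest threshold goes last: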

file['color'] = 'gray'
file.loc[file['padj']<=0.01, 'color'] = 'red'
file.loc[file['padj']<=0.0001, 'color'] = 'orange'

and plot like this:

plt.scatter(file['baseMean'], file['log2FoldChange'], color=file['color'])
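
Since the question asks about np.where specifically, here is a minimal sketch of two NumPy-only alternatives: np.select and a nested np.where. It is not part of the original answer. The small data frame below reuses only the padj values shown in the question, and the names conditions, choices, colors and nested are illustrative.

```python
import numpy as np
import pandas as pd

# padj values copied from the sample rows in the question.
file = pd.DataFrame(
    {"padj": [4.360127e-27, 4.556183e-01, 1.365264e-03, 4.077578e-04, 1.019622e-59]}
)

# np.select checks the conditions in order and the first match wins,
# so the strictest threshold has to come first.
conditions = [file["padj"] <= 0.0001, file["padj"] <= 0.01]
choices = ["orange", "red"]
colors = np.select(conditions, choices, default="gray")

# The same logic written as a nested np.where.
nested = np.where(
    file["padj"] <= 0.0001,
    "orange",
    np.where(file["padj"] <= 0.01, "red", "gray"),
)

print(colors.tolist())  # ['orange', 'gray', 'red', 'red', 'orange']
```

Either array can be passed to the c argument of file.plot.scatter or plt.scatter. np.select tends to stay readable as more thresholds are added, while nested np.where calls get harder to follow.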

Edit: a hacky way to generate a legend is to plot empty scatter series that carry only the color and label:

for label, color in zip(['>0.01', '<=0.01', '<=0.0001'], ['gray', 'red', 'orange']):
    plt.scatter([],[], c=color,label=label)
plt.legend()
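
As an alternative to the empty-scatter trick, one could plot each color group separately so that every group brings its own legend entry. The sketch below is an assumption-laden variant, not the answerer's code: it rebuilds the five sample rows from the question, and the labels dictionary and the fig and ax names are illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd

# The five sample rows from the question.
file = pd.DataFrame(
    {
        "baseMean": [221.442806, 14.711123, 162.278255, 6.430919, 455.152841],
        "log2FoldChange": [-7.037259, -5.790422, -5.652917, -5.554578, -5.361830],
        "padj": [4.360127e-27, 4.556183e-01, 1.365264e-03, 4.077578e-04, 1.019622e-59],
    }
)

# Color column built as in the answer: default first, strictest threshold last.
file["color"] = "gray"
file.loc[file["padj"] <= 0.01, "color"] = "red"
file.loc[file["padj"] <= 0.0001, "color"] = "orange"

# Hypothetical mapping from color to legend text.
labels = {"gray": "> 0.01", "red": "<= 0.01", "orange": "<= 0.0001"}

fig, ax = plt.subplots()
# One scatter call per color, so each group gets its own legend handle.
for color, group in file.groupby("color", sort=False):
    ax.scatter(group["baseMean"], group["log2FoldChange"], color=color, label=labels[color])

ax.set_xscale("log")                       # same effect as logx=True in the question
ax.axhline(y=0, color="r", linestyle="-")  # horizontal reference line from the question
ax.legend(title="padj")
# Call plt.show() or fig.savefig(...) to view the figure.
```

This costs one scatter call per color, but the legend always matches what is actually drawn, and groups with no points simply do not appear.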

Answered By – warped

Answer Checked By – Willingham (BugsFixing Volunteer)