Detecting RLO character in Python
At work, I learned about how Right-to-Left Override was being used to make actually malicious files to look harmless. For example,
a .exe
file was being made to appear as .doc
files. We didn’t want to allow uploading such files.
This meant that I nedded to detect the presence of the RLO
character in the filename.
Then, I came across this post, where I learned about unicode bidirectional class and Python’s bidirectional() method.
The final solution for detection looked like this:
import unicodedata
..
filename = 'arbitrary_filename.doc'
if 'RLO' in [unicodedata.bidirectional(c) for c in unicode(filename)]:
raise ValueError('Invalid character in one or more of the file names')
..