[SOLVED] Building an input text area to allow HTML but prevent security / script attacks

Issue

Platform: ASP.NET 4.0, MVC 4, C#, jQuery

Here’s what I want to do.

I’m building a simple forum for my product. I want to give users a text area to enter their posts or comments.

  1. I’d like to allow basic text-formatting HTML and links – like p, a, b, i
  2. I don’t want any other HTML styling – e.g. div, span, etc.
  3. I don’t want any scripting access

Is there a clever way to do this? I could, for example, accept unsafe text and examine it on the server side, but I doubt I’d be able to clean it up correctly, and I might open security holes.

Preferably, I want to avoid heavy-duty plugins.

Thanks!

(PS – my worst-case fallback is to allow safe text only, i.e. keep the ASP.NET security on, and then use a special markup for links and formatting – like [link] [b] [i])
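For reference, the bracket-markup fallback from the PS could look roughly like the sketch below. It is written in Python purely for illustration (the platform here is C#), and the exact syntax, `[b]…[/b]`, `[i]…[/i]` and `[link=URL]text[/link]`, is an assumption, since the question only names the tags. The idea is to HTML-escape everything first and only then expand the few bracket tags, so no user-supplied angle bracket ever reaches the page.

```python
import html
import re

# Hypothetical markup (not specified in the question):
#   [b]..[/b], [i]..[/i], [link=URL]text[/link]
# Only absolute http/https URLs are accepted; that restriction is an assumption.
_LINK = re.compile(r'\[link=(https?://[^\]\s"<>]+)\](.*?)\[/link\]', re.S)
_PAIR = re.compile(r"\[(b|i)\](.*?)\[/\1\]", re.S)


def render_markup(raw: str) -> str:
    """Escape all HTML, then expand the small set of bracket tags."""
    text = html.escape(raw, quote=True)
    text = _LINK.sub(r'<a href="\1">\2</a>', text)
    # Repeat so that nested pairs such as [b][i]x[/i][/b] are all expanded.
    previous = None
    while previous != text:
        previous = text
        text = _PAIR.sub(r"<\1>\2</\1>", text)
    return text
```

Unmatched tags such as a lone `[b]` are left as literal text, and a `[link=…]` with any other scheme (for example `javascript:`) is never turned into an anchor. Mis-nested tags are not repaired by this sketch.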

Solution

No matter what approach you use, you need to assume everything entered into the field is malicious, i.e. don’t trust any data.

I wouldn’t bother too much with client-side validation in JavaScript/jQuery. It would be complex, and it would have to be redone on the server side anyway, because anything checked in the browser can be bypassed.

On the server side you want to take a whitelist approach, i.e. if it’s not on the list, it’s invalid. You wouldn’t be able to use an XML processor, because the user’s text may not be well-formed XML; instead, you’d probably want to use a regular expression.

I would define a set of valid tags (you’ve said p, a, b and i, but I would be wary of the last two, as you’d almost never see them in ‘wild’ HTML). I would then define which attributes, if any, are valid for each of these tags. I’m guessing you’d want at the very least an href on the a tag.
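As a rough illustration of that whitelist, here is a minimal sketch in Python (used only for illustration; the same table could be a dictionary in C#). The names `ALLOWED_TAGS`, `ALLOWED_SCHEMES` and `is_allowed` are hypothetical, and limiting href to absolute http/https URLs is an added assumption, made so that `javascript:` links are rejected.

```python
from urllib.parse import urlsplit

# Tag name -> attributes permitted on that tag. Anything absent is invalid.
ALLOWED_TAGS = {
    "p": frozenset(),
    "a": frozenset({"href"}),
    "b": frozenset(),
    "i": frozenset(),
}
# Assumption: only absolute http/https links are acceptable.
ALLOWED_SCHEMES = frozenset({"http", "https"})


def is_allowed(tag: str, attrs: dict) -> bool:
    """Return True only if the tag and every one of its attributes is listed."""
    allowed_attrs = ALLOWED_TAGS.get(tag.lower())
    if allowed_attrs is None:
        return False
    for name, value in attrs.items():
        name = name.lower()
        if name not in allowed_attrs:
            return False
        if name == "href":
            try:
                scheme = urlsplit(value.strip()).scheme.lower()
            except ValueError:  # malformed URL: treat as invalid
                return False
            if scheme not in ALLOWED_SCHEMES:
                return False
    return True
```

With this table, `is_allowed("a", {"href": "https://example.com"})` passes, while an extra `onclick` attribute, a `javascript:` href, or a `div` tag is rejected. Extracting the tag name and attributes from the user's text is a separate step that this sketch does not cover.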

You could then strip any tag that doesn’t match. My regex skills aren’t great, but the pattern below appears to find all the tags you want to keep; it would need to be inverted so that everything it does not match is removed.

<a\shref="[^"]+">|</?[abip]\s?>
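One way to do the “inversion” without writing a negated regex is to walk the matches: keep every match verbatim and HTML-escape whatever lies between them. The sketch below does that in Python (for illustration only; .NET’s Regex class offers a comparable match-iteration API). Two things differ from the answer and are assumptions of this sketch: unmatched markup is escaped rather than stripped, and the href is narrowed to http/https so that `javascript:` URLs do not slip through.

```python
import html
import re

# Tags to keep. Stricter than the pattern above: href must be an absolute
# http/https URL (an assumption added for this sketch).
ALLOWED_TAG = re.compile(
    r'<a\shref="https?://[^"<>\s]+">|</?[abip]\s?>',
    re.IGNORECASE,
)


def sanitize(raw: str) -> str:
    """Keep whitelisted tags as-is and HTML-escape everything else."""
    out = []
    pos = 0
    for match in ALLOWED_TAG.finditer(raw):
        out.append(html.escape(raw[pos:match.start()]))  # text between tags
        out.append(match.group(0))                        # allowed tag, verbatim
        pos = match.end()
    out.append(html.escape(raw[pos:]))                    # trailing text
    return "".join(out)
```

Anything the pattern does not recognise, including `<script>`, `<div>` and an `<a>` with extra attributes, ends up as visible escaped text instead of live markup. The sketch does not balance tags, so a stray `</a>` or an unclosed `<b>` would pass through unchanged.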

Answered By – joocer

Answer Checked By – Candace Johnson (BugsFixing Volunteer)
