If you allow user-contributed content on your site, you run into the problem of handling user-supplied HTML safely. The most secure approach, of course, is to strip or escape all HTML from user input fields. Unfortunately, there are many situations where it would be nice to allow a large subset of HTML input while blocking anything potentially dangerous.
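
For reference, the strip-or-escape baseline needs nothing beyond PHP's built-in functions. The following is a minimal sketch; the function names escapeAll and stripAll are made up for illustration, while htmlspecialchars and strip_tags are standard PHP.

```php
<?php
// Escape: markup is shown to readers as literal text instead of being rendered.
// escapeAll is a hypothetical wrapper name, not part of any library.
function escapeAll(string $input): string
{
    return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
}

// Strip: tags are removed, but the text between them is kept.
// Note that the contents of a <script> element survive as plain text.
// stripAll is likewise a hypothetical wrapper name.
function stripAll(string $input): string
{
    return strip_tags($input);
}
```

Either function is safe for output in an HTML body, but both throw away all formatting, which is exactly the limitation that motivates a filter that lets some tags through.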

SafeHTML is a lightweight PHP user input sanitizer that does just that. Run any input field through the SafeHTML filter and any JavaScript, object tags, or layout-breaking tags will be stripped from the supplied text. It also does a reasonable job of correcting gnarly, malformed markup, which is another common problem with user-contributed data.

Using it is easy. Just instantiate the SafeHTML object and call its parse method:


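// Plain assignment: "=& new" is deprecated since PHP 5.3 and a syntax error in PHP 7+
$safehtml = new SafeHTML();

if ( isset( $_POST["inputfield"] ) ) {
  // Read from $_POST; a bare $inputfield only exists with register_globals on
  $cleaninput = $safehtml->parse( $_POST["inputfield"] );
}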

This takes the posted “inputfield” parameter, strips any baddies, XHTMLifies what’s left, and stores the result in the $cleaninput variable. It’s a simple addition to your code, and a lot more straightforward than trying to roll your own.
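
If several fields need the same treatment, the isset check and the parse call can be kept together in a small helper. This is only a sketch: cleanField is a hypothetical name, and the only thing it assumes about the sanitizer object is the parse method used in the snippet above.

```php
<?php
// Hypothetical helper (not part of SafeHTML): returns the sanitized value
// of a form field, or null when the field was not submitted as a string.
// $sanitizer is assumed to expose parse($html), as in the snippet above.
function cleanField(array $source, string $field, $sanitizer): ?string
{
    if (!isset($source[$field]) || !is_string($source[$field])) {
        return null; // missing field, or an array such as inputfield[]=x
    }
    return $sanitizer->parse($source[$field]);
}
```

With the SafeHTML class loaded, the call would look something like cleanField($_POST, "inputfield", $safehtml).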

My only beef with the package is that it’s written with a default-allow policy: it strips tags listed in its deleteTags array but lets essentially everything else through. If you’d rather let through only the tags you specifically want to allow, I’d recommend adding an allowTags array and adjusting the _openHandler method by adding the following after the deleteTags check:

// Default deny: skip any tag that is not explicitly allowed
if ( ! in_array($name, $this->allowTags)) {
  return true;
}

You’ll need to fill allowTags with everything you know to be safe and welcome. You may miss a few tags that people legitimately want to use, but that is easily corrected, and a default-deny policy is much safer in the long run.
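
To see how the two checks combine, here is a stand-alone sketch of the decision, written outside the class so it can be run on its own. The function name isTagDropped and the example lists are invented for illustration; inside SafeHTML the same logic would use $this->deleteTags and the new $this->allowTags, and I’m assuming, as in the snippet above, that returning true means the tag is skipped.

```php
<?php
// Hypothetical stand-alone version of the tag check.
// Returns true when the tag should be dropped.
function isTagDropped(string $name, array $deleteTags, array $allowTags): bool
{
    $name = strtolower($name); // compare tag names case-insensitively

    if (in_array($name, $deleteTags, true)) {
        return true; // existing deny-list check
    }
    if (!in_array($name, $allowTags, true)) {
        return true; // added check: anything not explicitly allowed is dropped
    }
    return false;
}

// Example lists, invented for illustration only.
$deleteTags = ['script', 'object', 'iframe'];
$allowTags  = ['p', 'a', 'b', 'i', 'em', 'strong', 'ul', 'ol', 'li', 'blockquote'];
```

With these lists, a script tag is dropped by the deny list, a marquee tag is dropped because it appears on neither list, and a p tag passes. The second case is the one that a default-allow filter would have let through.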

SafeHTML – an anti-XSS HTML parser, written in PHP