Where all can JavaScript (and plugins) live/hide in a webpage?

Asked by user: (120 points)

I've written an intercepting proxy, and I'm adding a feature that removes all JavaScript from the webpages that pass through it. The same goes for VBScript and any plugin content (Flash, Silverlight, ActiveX components, and Java applets) that can run arbitrary, if sandboxed, code that interacts with the page. If removing the page content is "the easy part", getting an exhaustive list of places where such things can hide might be "the hard part".

Of course, JavaScript can include other JavaScript in various ways, such as adding a new script element to the DOM. This doesn't concern me: if I head off every way in which the HTML and/or CSS can pull in JavaScript, I don't need to worry that JavaScript can pull in other JavaScript. (Or that Flash can pull in JavaScript, or that JavaScript can pull in Flash content, etc.)

The ways I've identified in which a page can include JavaScript (or VBScript) are:

  1. A script tag, either inline or with a "src=" attribute for an external file.
  2. An "onload=", "onclick=", "onmouseover=", or other "on<event>=" attribute.
  3. "javascript:" URLs in "href=" attributes on <a> tags.
  4. There's the CSS 'background-image: url("javascript: ...")' trick (though it doesn't work in all browsers).
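
The four vectors listed above can be checked for mechanically. Below is a minimal detection sketch in Python using only the standard library's `html.parser`. The class and function names are hypothetical, and the checks are deliberately naive: a plain prefix test on "href=" will miss obfuscated "javascript:" URLs, and a lenient parser may not tokenize malformed markup the way a browser does. Treat it as an illustration of the list, not as a filter.

```python
from html.parser import HTMLParser


class ScriptVectorFinder(HTMLParser):
    """Records occurrences of the four vectors from the list above."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.findings = []       # list of (kind, tag) tuples
        self._in_style = False   # True while inside a <style> element

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag == "script":
            self.findings.append(("script-element", tag))
        if tag == "style":
            self._in_style = True
        for name, value in attrs:
            value = (value or "").strip().lower()
            if name.startswith("on"):
                # Any on<event>= attribute, not just the three named ones.
                self.findings.append(("event-handler", tag))
            elif name == "href" and value.startswith("javascript:"):
                self.findings.append(("javascript-url", tag))
            elif name == "style" and "javascript:" in value:
                self.findings.append(("css-url", tag))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Text inside <style> is CSS; look for the url("javascript:...") trick.
        if self._in_style and "javascript:" in data.lower():
            self.findings.append(("css-url", "style"))


def find_script_vectors(html):
    finder = ScriptVectorFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Calling `find_script_vectors(page_source)` returns one tuple per hit, so an empty list means none of the four listed vectors was seen. It does not mean the page is free of script.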

One question I have is whether I need to worry about "javascript:" URLs in places other than "href=" attributes on <a> tags. I know some older browsers (such as IE6) execute the JavaScript in a "javascript:" URL in the "src=" attribute of an img tag. That doesn't concern me, simply because I'm not writing my intercepting proxy for use with browsers as old as IE6. But if there are other "tricky" places in which "javascript:" URLs might be executed, that would be good to know. (Maybe in frame and iframe "src=" attributes?)
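
Whichever attributes turn out to honor "javascript:" URLs, the scheme test itself needs care. Browsers ignore leading and trailing spaces and control characters, and drop tabs and newlines anywhere inside a URL, before parsing it. A minimal Python sketch of such a test follows. The function names and the allowlist are illustrative assumptions, and it expects attribute values that have already been entity-decoded (as `html.parser` does).

```python
import re

# Assumption: an illustrative allowlist; tune it to what the proxy should permit.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

# A scheme is a letter followed by letters, digits, "+", "-" or ".", then ":".
_SCHEME_RE = re.compile(r"^([a-z][a-z0-9+.\-]*):")

# Characters U+0000 through U+0020 (C0 controls and space).
_C0_AND_SPACE = "".join(map(chr, range(0x21)))


def url_scheme(value):
    """Return the lowercased scheme of a URL attribute value, or None if relative."""
    # Strip leading/trailing control characters and spaces, as browsers do...
    cleaned = value.strip(_C0_AND_SPACE)
    # ...then remove tabs and newlines wherever they occur in the URL.
    cleaned = re.sub(r"[\t\n\r]", "", cleaned)
    match = _SCHEME_RE.match(cleaned.lower())
    return match.group(1) if match else None


def is_allowed_url(value):
    """Relative URLs pass; absolute URLs pass only with an allowlisted scheme."""
    scheme = url_scheme(value)
    return scheme is None or scheme in ALLOWED_SCHEMES
```

Applying `is_allowed_url` to every URL-valued attribute (not only "href=") avoids having to enumerate which tag and attribute combinations a given browser treats as executable.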

And, of course, any other ways JavaScript might be executed that I haven't considered.

The ways I've identified in which a page can include other (sandboxed) code are:

  1. embed tags.
  2. object tags.
  3. applet tags.

One question I have here (and this may be considered "future-proofing") is how WebAssembly bytecode is loaded. My cursory googling hasn't turned up exactly how that works.

While PDFs can be loaded in most modern browsers without extensions, I consider them outside the scope of this question: they can include arbitrary (sandboxed) code, but that code cannot interact with the webpage.

I also consider things such as HTML5 video to be outside the scope of this question because they don't run arbitrary code.

I'm also not concerned with any ways in which JavaScript or plugins might execute which require the presence of browser extensions unless said extensions are extremely widely used.

I do, however, want this script- and plugin-stripping feature to be ironclad. If somebody could use tricky measures to slip code past this feature, those measures are relevant to this question as far as I'm concerned.

Ways to include code that work only in some browsers and not others are also in scope.

Are there any other kinds of plugins or languages that can be included in a webpage that I haven't considered above?

Comment by user: (100 points)
Use a whitelist of tags, attributes, and protocols instead of a blacklist, first off - and be very careful with it. (Comments, for example, aren't safe; IE has conditional comments. data: isn't safe either.)
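
A rough sketch of the whitelist approach in Python, using only the standard library: the page is re-serialized from parsed tokens, and anything not explicitly allowed is dropped. Every name and every list entry below is an illustrative assumption, not a vetted policy. A lenient parser like `html.parser` may also tokenize malformed markup differently from a browser, so this shows the shape of the idea rather than a hardened filter.

```python
import re
from html import escape
from html.parser import HTMLParser

# Assumption: tiny illustrative allowlists.
ALLOWED_TAGS = {"a", "b", "em", "i", "p", "strong", "ul", "ol", "li", "br"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
URL_ATTRS = {"href"}
VOID_TAGS = {"br"}
DROP_CONTENT = {"script", "style"}  # drop these tags and the text inside them


def _scheme_ok(value):
    # Stricter than a browser: removes ALL spaces and control characters first.
    cleaned = re.sub(r"[\x00-\x20]", "", value).lower()
    m = re.match(r"([a-z][a-z0-9+.\-]*):", cleaned)
    return m is None or m.group(1) in ALLOWED_SCHEMES


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            value = value or ""
            if name in URL_ATTRS and not _scheme_ok(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
            return
        if not self._skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data, quote=False))

    # Comments (including IE conditional comments), doctypes and processing
    # instructions are not overridden, so the base class discards them.


def sanitize(html):
    parser = AllowlistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, `sanitize('<p onclick="x()">Hi <a href="javascript:alert(1)">link</a><script>alert(2)</script></p>')` returns `<p>Hi <a>link</a></p>`. Rebuilding the output from tokens, rather than deleting matches from the original text, means unknown constructs are dropped by default.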
Comment by user: (100 points)
Oh, and you should make sure, for each whitelisted object, that there's no way to force an HTML context (because otherwise it's impossible to tell what to filter). And make sure no software on the devices can force an HTML context. It might be better just to check that JavaScript is disabled in-browser, if possible.
