JavaScript RegEx Replace All Characters Not Within HTML Tags
Looking for a bit of help, my regex is a bit rusty... I'm trying to replace all characters not within HTML tags in JavaScript with a single character. For example replace those characters b
Solution 1:
Using jQuery:
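var html = '<div class="test">Lorem Ipsum <br/> Dolor Sit Amet</div>';
// Wrap the markup so that its top-level elements are matched by find('*')
var node = $("<div>" + html + "</div>");
node.find('*').contents().each(function() {
    // nodeType 3 is a text node
    if (this.nodeType == 3) {
        // Array(n + 1).join('-') yields n dashes; Array(n) would yield n - 1
        this.nodeValue = Array(this.nodeValue.length + 1).join('-');
    }
});
console.log(node.html());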
(I don't have IE7 at hand, let me know if this works).
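One detail to watch when building the dash string with Array(...).join('-'): join() only puts the separator between slots, so the count is easy to get wrong by one. A minimal sketch of two ways to build a string of n dashes; the helper names dashesViaJoin and dashesViaRepeat are made up for illustration:

```javascript
// Array(n) has n empty slots, so join() inserts only n - 1 separators.
// Asking for n + 1 slots therefore gives exactly n dashes.
function dashesViaJoin(n) {
  return Array(n + 1).join('-');
}

// String.prototype.repeat is the ES2015 alternative (not available in old IE).
function dashesViaRepeat(n) {
  return '-'.repeat(n);
}

console.log(Array(5).join('-').length); // 4
console.log(dashesViaJoin(5).length);   // 5
```

The join version is the one to use if old browsers such as IE7 matter.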
If you prefer regular expressions, it goes like this:
html = html.replace(/<[^<>]+>|./g, function($0) {
    // charAt(0) instead of $0[0]: IE7 does not support [] indexing on strings
    return $0.charAt(0) == '<' ? $0 : '-';
});
Basically, we replace tags with themselves and every character outside a tag with a dash. Note that . does not match line breaks, so those are left as they are.
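The same idea can be wrapped in a reusable function. This is only a sketch: the name maskOutsideTags and its ch parameter are hypothetical, and it uses [\s\S] in place of . on the assumption that line breaks should be masked as well.

```javascript
// Hypothetical helper: masks every character outside a tag with `ch`.
function maskOutsideTags(html, ch) {
  ch = ch || '-';
  // First alternative consumes a whole tag, second consumes any one character.
  // [\s\S] also matches line breaks, which . would skip.
  return html.replace(/<[^<>]+>|[\s\S]/g, function (match) {
    // A tag match is at least 3 characters long ("<a>"); a stray "<" is not a tag.
    return match.length > 1 && match.charAt(0) === '<' ? match : ch;
  });
}

var input = '<div class="test">Lorem Ipsum <br/> Dolor Sit Amet</div>';
console.log(maskOutsideTags(input));      // tags kept, text replaced by dashes
console.log(maskOutsideTags(input, '*')); // same, with asterisks
```

Like any regex over HTML, this is fragile: a > inside an attribute value ends the tag match early, and comments or script contents are not treated specially. The DOM-based approaches do not have that problem.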
Solution 2:
Instead of using a regex-only approach, you can find all text nodes within the document and replace their content with hyphens.
Using the TreeWalker API:
var tree = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
while (tree.nextNode()) {
    var textNode = tree.currentNode;
    textNode.nodeValue = textNode.nodeValue.replace(/./g, '-');
}
A recursive solution:
function findTextNodes(node, fn) {
    for (node = node.firstChild; node; node = node.nextSibling) {
        if (node.nodeType === Node.TEXT_NODE) fn(node);
        else if (node.nodeType === Node.ELEMENT_NODE && node.nodeName !== 'SCRIPT') findTextNodes(node, fn);
    }
}
findTextNodes(document.body, function (node) {
    node.nodeValue = node.nodeValue.replace(/./g, '-');
});
The predicate node.nodeName !== 'SCRIPT' is required to prevent the function from replacing any script content within the body.
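If other elements should be left alone too, the single SCRIPT check can be generalized to a skip list. A rough sketch under a few assumptions: the names SKIPPED, eachTextNode and maskNode are invented here, the choice of STYLE and TEXTAREA is only an example, and numeric node type values (1 for element, 3 for text) stand in for Node.ELEMENT_NODE and Node.TEXT_NODE so the function has no dependency on the global Node object.

```javascript
// Numeric values of Node.ELEMENT_NODE and Node.TEXT_NODE.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical skip list: elements whose text content is left untouched.
var SKIPPED = ['SCRIPT', 'STYLE', 'TEXTAREA'];

// Calls fn on every text node below root, without descending into skipped elements.
function eachTextNode(root, fn) {
  for (var child = root.firstChild; child; child = child.nextSibling) {
    if (child.nodeType === TEXT_NODE) {
      fn(child);
    } else if (child.nodeType === ELEMENT_NODE &&
               SKIPPED.indexOf(child.nodeName.toUpperCase()) === -1) {
      eachTextNode(child, fn);
    }
  }
}

// Replaces every character except line breaks with a dash.
function maskNode(node) {
  node.nodeValue = node.nodeValue.replace(/./g, '-');
}
```

In a browser this would be called as eachTextNode(document.body, maskNode).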