Clover coverage report - dom4j - 1.5
Coverage timestamp: Fri Sep 3 2004 20:47:03 GMT+01:00
file stats: LOC: 757   Methods: 32
NCLOC: 267   Classes: 2
 
Source file      Conditionals  Statements  Methods  TOTAL
HTMLWriter.java  29.5%         37%         37.5%    35.5%
 1    /*
 2    * Copyright 2001-2004 (C) MetaStuff, Ltd. All Rights Reserved.
 3    *
 4    * This software is open source.
 5    * See the bottom of this file for the licence.
 6    *
 7    * $Id: HTMLWriter.java,v 1.19 2004/06/25 08:03:36 maartenc Exp $
 8    */
 9   
 10    package org.dom4j.io;
 11   
 12    import java.io.IOException;
 13    import java.io.OutputStream;
 14    import java.io.StringWriter;
 15    import java.io.UnsupportedEncodingException;
 16    import java.io.Writer;
 17    import java.util.HashSet;
 18    import java.util.Iterator;
 19    import java.util.Set;
 20   
 21    import org.dom4j.Document;
 22    import org.dom4j.DocumentHelper;
 23    import org.dom4j.Element;
 24    import org.dom4j.Entity;
 25    import org.dom4j.Node;
 26    import org.xml.sax.SAXException;
 27   
 28    /** <p><code>HTMLWriter</code> takes a DOM4J tree and formats it to a
 29    * stream as HTML.
 30    * This formatter is similar to XMLWriter but it outputs the text of CDATA
 31    * and Entity sections rather than the serialised format as in XML,
 32    * it has an XHTML mode, it retains whitespace in certain elements such as &lt;PRE&gt;,
 33    * and it supports certain elements which have no corresponding close tag such
 34    * as &lt;BR&gt; and &lt;P&gt;.
 35    *
 36    * <p> The OutputFormat passed in to the constructor is checked for isXHTML() and isExpandEmptyElements().
 37    * See {@link OutputFormat OutputFormat} for details. Here are the rules for
 38    * <b>this class</b> based on an OutputFormat, "format", passed in to the constructor:<br/><br/>
 39    * <ul>
 40    * <li>If an element is in {@link #getOmitElementCloseSet() getOmitElementCloseSet}, then it is treated specially:</li>
 41    * <ul>
 42    * <li>It never expands, since some browsers treat this as two separate Horizontal Rules: &lt;HR&gt;&lt;/HR&gt;</li>
 43    * <li>If {@link org.dom4j.io.OutputFormat#isXHTML() format.isXHTML()}, then it has a space before the closing single-tag slash, since Netscape 4.x- treats this: &lt;HR /&gt; as
 44    * an element named "HR" with an attribute named "/", but that's better than when it refuses to recognize this: &lt;hr/&gt;
 45    * which it thinks is an element named "HR/". </li>
 46    * </ul>
 47    * <li>If {@link org.dom4j.io.OutputFormat#isXHTML() format.isXHTML()}, all elements must have
 48    * either a close element, or be a closed single tag.</li>
 49    * <li>If {@link org.dom4j.io.OutputFormat#isExpandEmptyElements() format.isExpandEmptyElements()} is true,
 50    * all elements are expanded except as above.</li>
 51    * </ul>
 52    * <b>Examples</b>
 53    *
 54    * <table border="1" cellpadding="0" cellspacing="0">
 55    * <tr>
 56    * <th colspan="3" align="left">isXHTML == true</th>
 57    * </tr>
 58    * <tr>
 59    * <td width="25">&#160;</td>
 60    * <th align="left">isExpandEmptyElements == true</th>
 61    * <td><code>
 62    * &lt;td&gt;&lt;/td&gt;<br />
 63    * &lt;br&#160;/&gt;<br />
 64    * &lt;foo&gt;&lt;/foo&gt;</code>
 65    * </td>
 66    * </tr>
 67    * <tr>
 68    * <td width="25">&#160;</td>
 69    * <th align="left">isExpandEmptyElements == false</th>
 70    * <td><code>
 71    * &lt;td/&gt;<br />
 72    * &lt;br&#160;/&gt;<br />
 73    * &lt;foo/&gt;</code>
 74    * </td>
 75    * </tr>
 76    * <tr>
 77    * <th colspan="3" align="left">isXHTML == false</th>
 78    * </tr>
 79    * <tr>
 80    * <td width="25">&#160;</td>
 81    * <th align="left">isExpandEmptyElements == true</th>
 82    * <td><code>
 83    * &lt;td&gt;&lt;/td&gt;<br />
 84    * &lt;br&gt;<br />
 85    * &lt;foo&gt;&lt;/foo&gt;</code>
 86    * </td>
 87    * </tr>
 88    * <tr>
 89    * <td width="25">&#160;</td>
 90    * <th align="left">isExpandEmptyElements == false</th>
 91    * <td><code>
 92    * &lt;td/&gt;<br />
 93    * &lt;br&gt;<br />
 94    * &lt;foo/&gt;</code>
 95    * </td>
 96    * </tr>
 97    * </table>
 98    * <p>
 99    * <p>
 100    * If isXHTML == true, CDATA sections look like this:
 101    * <PRE>
 102    * <b>&lt;myelement&gt;&lt;![CDATA[My data]]&gt;&lt;/myelement&gt;</b>
 103    * </PRE>
 104    * Otherwise, they look like this:
 105    * <PRE>
 106    * <b>&lt;myelement&gt;My data&lt;/myelement&gt;</b>
 107    * </PRE>
 108    * </p>
 109    *
 110    * Basically, {@link org.dom4j.io.OutputFormat#isXHTML() OutputFormat.isXHTML()} == true will produce valid XML,
 111    * while {@link org.dom4j.io.OutputFormat#isExpandEmptyElements() format.isExpandEmptyElements()}
 112    * determines whether empty elements are expanded
 113    * if isXHTML is true, excepting the special HTML single tags.
 114    * </p>
 115    *
 116    *
 117    * <p>Also, HTMLWriter handles tags whose contents should be preformatted, that is, whitespace-preserved.
 118    * By default, this set includes the tags &lt;PRE&gt;, &lt;SCRIPT&gt;, &lt;STYLE&gt;, and &lt;TEXTAREA&gt;, case insensitively.
 119    * It does not include &lt;IFRAME&gt;.
 120    * Other tags, such as &lt;CODE&gt;, &lt;KBD&gt;, &lt;TT&gt;, &lt;VAR&gt;, are usually rendered in a different font in most browsers,
 121    * but don't preserve whitespace, so they also don't appear in the default list. HTML Comments
 122    * are always whitespace-preserved. However, the parser you use may store comments with linefeed-only
 123    * text nodes (\n) even if your platform uses another line.separator character, and HTMLWriter outputs
 124    * Comment nodes exactly as the DOM is set up by the parser.
 125    * See examples and discussion here: {@link #setPreformattedTags(java.util.Set) setPreformattedTags}</p>
 126    *
 127    * <p><b>Examples</b></p>
 128    * <blockquote>
 129    * <p><b>Pretty Printing</b></p>
 130    * <p>This example shows how to pretty print a string containing a valid HTML document to a string.
 131    * You can also just call the static methods of this class:<br/>
 132    * {@link #prettyPrintHTML(String) prettyPrintHTML(String)}
 133    * or<br/>
 134    * {@link #prettyPrintHTML(String,boolean,boolean,boolean,boolean) prettyPrintHTML(String,boolean,boolean,boolean,boolean)}
 135    * or, <br/>
 136    * {@link #prettyPrintXHTML(String) prettyPrintXHTML(String)} for XHTML (note the X)
 137    * </p>
 138    * <pre>
 139    * String testPrettyPrint(String html){
 140    * StringWriter sw = new StringWriter();
 141    * org.dom4j.io.OutputFormat format = org.dom4j.io.OutputFormat.createPrettyPrint();
 142    * <font color='green'>//These are the default formats from createPrettyPrint, so you needn't set them:</font>
 143    * <font color='green'>// format.setNewlines(true);</font>
 144    * <font color='green'>// format.setTrimText(true);</font>
 145    * format.setXHTML(true); <font color='green'>//Default is false, this produces XHTML</font>
 146    * org.dom4j.io.HTMLWriter writer = new org.dom4j.io.HTMLWriter(sw, format);
 147    * org.dom4j.Document document = org.dom4j.DocumentHelper.parseText(html);
 148    * writer.write(document);
 149    * writer.flush();
 150    * return sw.toString();
 151    * }
 152    * </pre>
 153    *
 154    * <p>This example shows how to create a "squeezed" document, but one that will work in browsers
 155    * even if the browser line length is limited. No newlines are included, no extra whitespace
 156    * at all, except where it is required by {@link #setPreformattedTags(java.util.Set) setPreformattedTags}.
 157    * </p>
 158    * <pre>
 159    * String testCrunch(String html){
 160    * StringWriter sw = new StringWriter();
 161    * org.dom4j.io.OutputFormat format = org.dom4j.io.OutputFormat.createPrettyPrint();
 162    * format.setNewlines(false);
 163    * format.setTrimText(true);
 164    * format.setIndent("");
 165    * format.setXHTML(true);
 166    * format.setExpandEmptyElements(false);
 167    * format.setNewLineAfterNTags(20); <font color='green'>//print a line every so often.</font>
 168    * org.dom4j.io.HTMLWriter writer = new org.dom4j.io.HTMLWriter(sw, format);
 169    * org.dom4j.Document document = org.dom4j.DocumentHelper.parseText(html);
 170    * writer.write(document);
 171    * writer.flush();
 172    * return sw.toString();
 173    * }
 174    * </pre>
 175    *
 176    * </blockquote>
 177    *
 178    * </p>
 179    *
 180    * @author <a href="mailto:james.strachan@metastuff.com">James Strachan</a> (james.strachan@metastuff.com)
 181    * @author Laramie Crocker
 182    * @version $Revision: 1.19 $
 183    */
 184    public class HTMLWriter extends XMLWriter {
 185   
 186  2 public HTMLWriter(Writer writer) {
 187  2 super( writer, defaultHtmlFormat );
 188    }
 189   
 190  10 public HTMLWriter(Writer writer, OutputFormat format) {
 191  10 super( writer, format );
 192    }
 193   
 194  0 public HTMLWriter() throws UnsupportedEncodingException {
 195  0 super( defaultHtmlFormat );
 196    }
 197   
 198  0 public HTMLWriter(OutputFormat format) throws UnsupportedEncodingException {
 199  0 super( format );
 200    }
 201   
 202  0 public HTMLWriter(OutputStream out) throws UnsupportedEncodingException {
 203  0 super( out, defaultHtmlFormat );
 204    }
 205   
 206  0 public HTMLWriter(OutputStream out, OutputFormat format) throws UnsupportedEncodingException {
 207  0 super( out, format );
 208    }
 209   
 210   
 211    //Allows us to store the current state of the format in this struct on the m_formatStack.
 212    private class FormatState {
 213  0 public FormatState(boolean newLines, boolean trimText, String indent){
 214  0 this.m_Newlines = newLines;
 215  0 this.m_TrimText = trimText;
 216  0 this.m_indent = indent;
 217    }
 218    private boolean m_Newlines = false;
 219  0 public boolean isNewlines(){return m_Newlines;}
 220    private boolean m_TrimText = false;
 221  0 public boolean isTrimText(){return m_TrimText;}
 222    private String m_indent = "";
 223  0 public String getIndent(){return m_indent;}
 224    }
 225   
 226   
 227   
 228    private java.util.Stack m_formatStack = new java.util.Stack();
 229   
 230    private static String m_lineSeparator = System.getProperty("line.separator");
 231   
 232    private String m_lastText = "";
 233   
 234    private int m_tagsOuput = 0;
 235   
 236    private int m_newLineAfterNTags = -1; //legal values are 0+, but -1 signifies lazy initialization.
 237   
 238    protected static final HashSet defaultPreformattedTags;
 239   
 240    static {
 241    //If you change this list, update the javadoc examples, above in the class javadoc,
 242    // in writeElement, and in setPreformattedTags().
 243  2 defaultPreformattedTags = new HashSet();
 244  2 defaultPreformattedTags.add("PRE");
 245  2 defaultPreformattedTags.add("SCRIPT");
 246  2 defaultPreformattedTags.add("STYLE");
 247  2 defaultPreformattedTags.add("TEXTAREA");
 248    }
 249   
 250    private HashSet preformattedTags = defaultPreformattedTags;
 251   
 252    protected static final OutputFormat defaultHtmlFormat;
 253   
 254    static {
 255  2 defaultHtmlFormat = new OutputFormat( " ", true );
 256  2 defaultHtmlFormat.setTrimText( true );
 257  2 defaultHtmlFormat.setSuppressDeclaration( true );
 258    }
 259   
 260    /** Used to store the qualified element names which
 261    * should have no close element tag
 262    */
 263    private HashSet omitElementCloseSet; //keep as a HashSet, but only show as a Set when asked for by getOmitElementCloseSet().
 264   
 265  0 public void startCDATA() throws SAXException {
 266    }
 267   
 268  0 public void endCDATA() throws SAXException {
 269    }
 270   
 271    // Overridden methods
 272   
 273    // laramiec 3/21/2002 added isXHTML() stuff so you get the CDATA brackets if you desire.
 274  2 protected void writeCDATA(String text) throws IOException {
 275    // XXX: Should we escape entities?
 276    // writer.write( escapeElementEntities( text ) );
 277  2 if ( getOutputFormat().isXHTML() ) {
 278  0 super.writeCDATA(text);
 279    } else {
 280  2 writer.write( text );
 281    }
 282  2 lastOutputNodeType = Node.CDATA_SECTION_NODE;
 283    }
 284   
 285  0 protected void writeEntity(Entity entity) throws IOException {
 286  0 writer.write(entity.getText());
 287  0 lastOutputNodeType = Node.ENTITY_REFERENCE_NODE;
 288    }
 289   
 290  6 protected void writeDeclaration() throws IOException {
 291    }
 292   
 293  8 protected void writeString(String text) throws IOException {
 294    //The DOM stores \n at the end of text nodes that are newlines. This is significant if
 295    // we are in a PRE section. However, we only want to output the system line.separator, not \n.
 296    // This is a little brittle, but this function appears to be called with these line separators
 297    // as a separate TEXT_NODE. If we are in a preformatted section, output the right line.separator;
 298    // otherwise drop it. If the text is not a single \n character, delegate to the superclass
 299    // to output the text.
 300    // Also, we store the last text that was not a \n, since writeElement in this class may use it to
 301    // line up preformatted tags.
 302  8 if ( text.equals("\n")){
 303  0 if ( ! m_formatStack.empty() ) {
 304  0 super.writeString(m_lineSeparator);
 305    }
 306  0 return;
 307    }
 308  8 m_lastText = text;
 309  8 if ( m_formatStack.empty() ) {
 310  8 super.writeString(text.trim());
 311    } else {
 312  0 super.writeString(text);
 313    }
 314    }
 315   
 316    /** Overridden method to not close certain element names to avoid
 317    * weird behaviour from browsers for versions up to 5.x
 318    */
 319  0 protected void writeClose(String qualifiedName) throws IOException {
 320  0 if ( ! omitElementClose( qualifiedName ) ) {
 321  0 super.writeClose(qualifiedName);
 322    }
 323    }
 324   
 325  4 protected void writeEmptyElementClose(String qualifiedName) throws IOException {
 326  4 if (getOutputFormat().isXHTML()){
 327    //xhtml, always check with format object whether to expand or not.
 328  0 if ( omitElementClose(qualifiedName) ) {
 329    // it was a special omit tag, do it the XHTML way: "<br />", ignoring the expansion option,
 330    // since <br></br> is OK XML, but produces twice the linefeeds desired in the browser.
 331    // Write a space before the close slash for Netscape 4.7; other browsers are fine with it.
 332  0 writer.write(" />");
 333    } else {
 334  0 super.writeEmptyElementClose(qualifiedName);
 335    }
 336    } else {
 337    //html, not xhtml
 338  4 if ( omitElementClose(qualifiedName) ) {
 339    // it was a special omit tag, do it the old html way: "<br>".
 340  2 writer.write(">");
 341    } else {
 342    // it was NOT a special omit tag, check with format object whether to expand or not.
 343  2 super.writeEmptyElementClose(qualifiedName);
 344    }
 345    }
 346    }
 347   
 348  4 protected boolean omitElementClose( String qualifiedName ) {
 349  4 return internalGetOmitElementCloseSet().contains( qualifiedName.toUpperCase() );
 350    }
 351   
 352  4 private HashSet internalGetOmitElementCloseSet() {
 353  4 if (omitElementCloseSet == null) {
 354  4 omitElementCloseSet = new HashSet();
 355  4 loadOmitElementCloseSet(omitElementCloseSet);
 356    }
 357  4 return omitElementCloseSet;
 358    }
 359   
 360    //If you change this, change the javadoc for getOmitElementCloseSet.
 361  4 protected void loadOmitElementCloseSet(Set set) {
 362  4 set.add( "AREA" );
 363  4 set.add( "BASE" );
 364  4 set.add( "BR" );
 365  4 set.add( "COL" );
 366  4 set.add( "HR" );
 367  4 set.add( "IMG" );
 368  4 set.add( "INPUT" );
 369  4 set.add( "LINK" );
 370  4 set.add( "META" );
 371  4 set.add( "P" );
 372  4 set.add( "PARAM" );
 373    }
 374   
 375    //let the people see the set, but not modify it.
 376    /** A clone of the Set of elements that can have their close-tags omitted. By default it
 377    * should be
 378    * "AREA",
 379    * "BASE",
 380    * "BR",
 381    * "COL",
 382    * "HR",
 383    * "IMG",
 384    * "INPUT",
 385    * "LINK",
 386    * "META",
 387    * "P",
 388    * "PARAM"
 389    * @return A clone of the Set.
 390    */
 391  0 public Set getOmitElementCloseSet(){
 392  0 return (Set)(internalGetOmitElementCloseSet().clone());
 393    }
 394   
 395    /** To use the empty set, pass an empty Set, or null:
 396    * <pre>
 397    * setOmitElementCloseSet(new HashSet());
 398    * or
 399    * setOmitElementCloseSet(null);
 400    * </pre>
 401    */
 402  0 public void setOmitElementCloseSet(Set newSet){
 403  0 omitElementCloseSet = new HashSet(); //resets, and safely empties it out if newSet is null.
 404  0 if (newSet != null){
 405  0 omitElementCloseSet = new HashSet();
 406  0 Object aTag;
 407  0 Iterator iter = newSet.iterator();
 408  0 while ( iter.hasNext() ) {
 409  0 aTag = iter.next();
 410  0 if (aTag != null){
 411  0 omitElementCloseSet.add(aTag.toString().toUpperCase());
 412    }
 413    }
 414   
 415    }
 416    }
 417   
 418    /** @see #setPreformattedTags(java.util.Set) setPreformattedTags
 419    */
 420  0 public Set getPreformattedTags(){
 421  0 return (Set)(preformattedTags.clone());
 422    }
 423   
 424    /**
 425    * <p>Override the default set, which includes PRE, SCRIPT, STYLE, and TEXTAREA, case insensitively.</p>
 426    *
 427    * <p><b>Setting Preformatted Tags</b></p>
 428    *
 429    *
 430    * <p>Pass in a Set of Strings, one for each tag name that should be treated like a PRE tag.
 431    * You may pass in null or an empty Set to assign the empty set, in which case no tags
 432    * will be treated as preformatted, except that HTML Comments will continue to be preformatted.
 433    * If a tag is included in the set of preformatted tags, all whitespace within the tag will be preserved,
 434    * including whitespace on the same line preceding the close tag. This will generally make the close tag
 435    * not line up with the start tag, but it preserves the intention of the whitespace within the tag.
 436    * </p>
 437    * <p>The browser considers leading whitespace before the close tag to be significant,
 438    * but leading whitespace before the open tag to be insignificant.
 439    * For example, if the HTML author doesn't put the close TEXTAREA tag flush to the left margin,
 440    * then the TEXTAREA control in the browser will have spaces on the last line inside the control. This may be
 441    * the HTML author's intent. Similarly, in a PRE, the browser treats a flush-left close PRE tag as different from
 442    * a close tag with leading whitespace. Again, this must be left up to the HTML author.</p>
 443    *
 444    * <p><b>Examples</b></p>
 445    * <blockquote>
 446    * <p>
 447    * Here is an example of how you can set the PreformattedTags list using setPreformattedTags
 448    * to include IFRAME, as well as the default set,
 449    * if you have an instance of this class named myHTMLWriter:
 450    * <pre>
 451    * Set current = myHTMLWriter.getPreformattedTags();
 452    * current.add("IFRAME");
 453    * myHTMLWriter.setPreformattedTags(current);
 454    *
 455    * <font color='green'>//The set is now <b>{PRE, SCRIPT, STYLE, TEXTAREA, IFRAME}</b></font>
 456    * </pre>
 457    *
 458    * Similarly, you can simply replace it with your own:
 459    * <pre>
 460    * HashSet newset = new HashSet();
 461    * newset.add("PRE");
 462    * newset.add("TEXTAREA");
 463    * myHTMLWriter.setPreformattedTags(newset);
 464    *
 465    * <font color='green'>//The set is now <b>{PRE, TEXTAREA}</b></font>
 466    * </pre>
 467    *
 468    * You can remove all tags from the preformatted tags list, with an empty set, like this:
 469    * <pre>
 470    * myHTMLWriter.setPreformattedTags(new HashSet());
 471    *
 472    * <font color='green'>//The set is now <b>{}</b></font>
 473    * </pre>
 474    *
 475    * or with null, like this:
 476    * <pre>
 477    * myHTMLWriter.setPreformattedTags(null);
 478    *
 479    * <font color='green'>//The set is now <b>{}</b></font>
 480    * </pre>
 481    *
 482    * </blockquote>
 483    *
 484    */
 485  0 public void setPreformattedTags(Set newSet){
 486    // no fancy merging, just set it, assuming they did a getPreformattedTags()
 487    // first if they wanted to preserve the default set.
 488  0 preformattedTags = new HashSet(); //resets, and safely empties it out if newSet is null.
 489  0 if ( newSet != null ) {
 490  0 Object aTag;
 491  0 Iterator iter = newSet.iterator();
 492  0 while ( iter.hasNext() ) {
 493  0 aTag = iter.next();
 494  0 if (aTag != null){
 495  0 preformattedTags.add(aTag.toString().toUpperCase());
 496    }
 497    }
 498    }
 499    }
 500   
 501   
 502    /**
 503    * @return true if the qualifiedName passed in matched (case-insensitively)
 504    * a tag in the preformattedTags set,
 505    * or false if not found or if the set is empty or null.
 506    * @see #setPreformattedTags(java.util.Set) setPreformattedTags
 507    */
 508  18 public boolean isPreformattedTag(String qualifiedName){
 509    //setPreformattedTags(null) assigns an empty set rather than null, so the null check is
 510    //purely defensive; an empty set means no tags are preformatted.
 511  18 return (preformattedTags != null) && (preformattedTags.contains(qualifiedName.toUpperCase()));
 512    }
 513   
 514    /** This override handles any elements that should not remove whitespace,
 515    * such as &lt;PRE&gt;, &lt;SCRIPT&gt;, &lt;STYLE&gt;, and &lt;TEXTAREA&gt;.
 516    * Note: the close tags won't line up with the open tag, but we can't alter that.
 517    * See javadoc note at setPreformattedTags.
 518    *
 519    * @see #setPreformattedTags(java.util.Set) setPreformattedTags
 520    * @throws java.io.IOException When the stream could not be written to.
 521    *
 522    */
 523  18 protected void writeElement(Element element) throws IOException {
 524  18 if ( m_newLineAfterNTags == -1 ) { //lazy initialization check
 525  6 lazyInitNewLinesAfterNTags();
 526    }
 527  18 if ( m_newLineAfterNTags > 0 ) {
 528  0 if ( (m_tagsOuput>0) && (m_tagsOuput % m_newLineAfterNTags == 0)) {
 529  0 super.writer.write(m_lineSeparator);
 530    }
 531    }
 532  18 m_tagsOuput++;
 533   
 534  18 String qualifiedName = element.getQualifiedName();
 535  18 String saveLastText = m_lastText;
 536  18 int size = element.nodeCount();
 537  18 if ( isPreformattedTag(qualifiedName) ) {
 538  0 OutputFormat currentFormat = getOutputFormat();
 539  0 boolean saveNewlines = currentFormat.isNewlines();
 540  0 boolean saveTrimText = currentFormat.isTrimText();
 541  0 String currentIndent = currentFormat.getIndent();
 542    //You could have nested PREs, or SCRIPTS within PRE... etc., therefore use push and pop.
 543  0 m_formatStack.push(new FormatState(saveNewlines, saveTrimText, currentIndent));
 544  0 try {
 545  0 super.writePrintln(); //do this manually, since it won't be done while outputting the tag.
 546  0 if ( saveLastText.trim().length() == 0 && currentIndent != null && currentIndent.length()>0) {
 547    //We are indenting, but we want to line up with the close tag.
 548    //m_lastText was the indent (whitespace, no \n) before the preformatted start tag.
 549    //So write it out instead of the current indent level. This makes it line up with its
 550    //close tag.
 551  0 super.writer.write(justSpaces(saveLastText));
 552    }
 553  0 currentFormat.setNewlines(false);//actually, newlines are handled in this class by writeString, depending on if the stack is empty.
 554  0 currentFormat.setTrimText(false);
 555  0 currentFormat.setIndent("");
 556    //This line is the recursive one:
 557  0 super.writeElement(element);
 558    } finally {
 559  0 FormatState state = (FormatState)m_formatStack.pop();
 560  0 currentFormat.setNewlines(state.isNewlines());
 561  0 currentFormat.setTrimText(state.isTrimText());
 562  0 currentFormat.setIndent(state.getIndent());
 563    }
 564    } else {
 565  18 super.writeElement(element);
 566    }
 567    }
 568   
 569  0 private String justSpaces(String text){
 570  0 int size = text.length();
 571  0 StringBuffer res = new StringBuffer(size);
 572  0 char c;
 573  0 for (int i=0; i < size; i++) {
 574  0 c = text.charAt(i);
 575  0 switch ( c ) {
 576  0 case '\r':
 577  0 case '\n':
 578  0 continue;
 579  0 default:
 580  0 res.append(c);
 581    }
 582    }
 583  0 return res.toString();
 584    }
 585   
 586  6 private void lazyInitNewLinesAfterNTags(){
 587  6 if ( getOutputFormat().isNewlines() ) {
 588  4 m_newLineAfterNTags = 0; //don't bother, newlines are going to happen anyway.
 589    } else {
 590  2 m_newLineAfterNTags = getOutputFormat().getNewLineAfterNTags();
 591    }
 592    }
 593   
 594    // Convenience methods, static, with bunch-o-defaults
 595   
 596    /** Convenience method to just get a String result.
 597    *
 598    * @return a pretty printed String from the source string,
 599    * preserving whitespace in the defaultPreformattedTags set,
 600    * and leaving the close tags off of the default omitElementCloseSet set.
 601    *
 602    * Use one of the write methods if you want stream output.
 603    * @throws java.io.IOException
 604    * @throws java.io.UnsupportedEncodingException
 605    * @throws org.dom4j.DocumentException
 606    */
 607  0 public static String prettyPrintHTML(String html)
 608    throws java.io.IOException, java.io.UnsupportedEncodingException, org.dom4j.DocumentException {
 609  0 return prettyPrintHTML(html, true, true, false, true);
 610    }
 611   
 612    /** Convenience method to just get a String result, but <b>As XHTML</b>.
 613    *
 614    * @return a pretty printed String from the source string,
 615    * preserving whitespace in the defaultPreformattedTags set,
 616    * but conforming to XHTML: no close tags are omitted (though if empty, they will
 617    * be converted to XHTML empty tags: &lt;HR /&gt;).
 618    *
 619    * Use one of the write methods if you want stream output.
 620    * @throws java.io.IOException
 621    * @throws java.io.UnsupportedEncodingException
 622    * @throws org.dom4j.DocumentException
 623    */
 624  0 public static String prettyPrintXHTML(String html)
 625    throws java.io.IOException, java.io.UnsupportedEncodingException, org.dom4j.DocumentException {
 626  0 return prettyPrintHTML(html, true, true, true, false);
 627    }
 628   
 629    /** @return a pretty printed String from the source string,
 630    * preserving whitespace in the defaultPreformattedTags set,
 631    * and leaving the close tags off of the default omitElementCloseSet set.
 632    * This override allows you to specify various formatter options.
 633    * Use one of the write methods if you want stream output.
 634    * @throws java.io.IOException
 635    * @throws java.io.UnsupportedEncodingException
 636    * @throws org.dom4j.DocumentException
 637    */
 638  0 public static String prettyPrintHTML(String html,
 639    boolean newlines,
 640    boolean trim,
 641    boolean isXHTML,
 642    boolean expandEmpty)
 643    throws java.io.IOException, java.io.UnsupportedEncodingException, org.dom4j.DocumentException {
 644  0 StringWriter sw = new StringWriter();
 645  0 OutputFormat format = OutputFormat.createPrettyPrint();
 646  0 format.setNewlines(newlines);
 647  0 format.setTrimText(trim);
 648  0 format.setXHTML(isXHTML);
 649  0 format.setExpandEmptyElements(expandEmpty);
 650  0 HTMLWriter writer = new HTMLWriter(sw, format);
 651  0 Document document = DocumentHelper.parseText(html);
 652  0 writer.write(document);
 653  0 writer.flush();
 654  0 return sw.toString();
 655    }
 656   
 657    }
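
The empty-element rules from the class javadoc table and from writeEmptyElementClose can be tried out without dom4j. The class below is a standalone sketch that uses only the JDK. The names EmptyTagRules and emptyElement are hypothetical, it returns the whole tag rather than only the closing part, and the branch for ordinary tags assumes that XMLWriter writes "/>" or "></name>" depending on isExpandEmptyElements(), as the javadoc table suggests.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Standalone model of the empty-element rules; NOT part of dom4j. */
class EmptyTagRules {

    // Same tag names as loadOmitElementCloseSet in the listing above.
    static final Set<String> OMIT_CLOSE = new HashSet<>(Arrays.asList(
            "AREA", "BASE", "BR", "COL", "HR", "IMG",
            "INPUT", "LINK", "META", "P", "PARAM"));

    /**
     * Returns the text of an empty element with the given name.
     * Hypothetical helper: the real writer emits only the closing part.
     */
    static String emptyElement(String name, boolean xhtml, boolean expandEmpty) {
        // Locale.ROOT is a choice made for this sketch; the listing calls toUpperCase().
        boolean omit = OMIT_CLOSE.contains(name.toUpperCase(Locale.ROOT));
        if (omit) {
            // Special tags never expand: "<br />" in XHTML mode, "<br>" in HTML mode.
            return xhtml ? "<" + name + " />" : "<" + name + ">";
        }
        // Assumed XMLWriter behaviour for every other tag.
        return expandEmpty ? "<" + name + "></" + name + ">" : "<" + name + "/>";
    }

    public static void main(String[] args) {
        String[] names = {"td", "br", "foo"};
        for (boolean xhtml : new boolean[] {true, false}) {
            for (boolean expand : new boolean[] {true, false}) {
                StringBuilder row = new StringBuilder(
                        "isXHTML=" + xhtml + " expandEmpty=" + expand + ":");
                for (String name : names) {
                    row.append(' ').append(emptyElement(name, xhtml, expand));
                }
                System.out.println(row);
            }
        }
    }
}
```

Running main prints one row per combination of the two flags, which can be compared against the four rows of the javadoc table.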
 658   
 659    //==================== Test xml file: ================================
 660    /*
 661    <html>
 662    <head>
 663    <title>My Title</title>
 664    <style>
 665    .foo {
 666    text-align: Right;
 667    }
 668    </style>
 669    <script>
 670    function mojo(){
 671    return "bar";
 672    }
 673    </script>
 674    <script language="JavaScript">
 675    <!--
 676    //this is the canonical javascript hiding.
 677    function foo(){
 678    return "foo";
 679    }
 680    //-->
 681    </script>
 682    </head>
 683    <!-- this
 684    is a comment
 685    -->
 686    <body bgcolor="#A4BFDD" mojo="&amp;">
 687    entities: &#160; &amp; &quot; &lt; &gt; %23
 688    <p></p>
 689    <mojo></mojo>
 690    <foo />
 691    <table border="1">
 692    <tr>
 693    <td>
 694    <pre>line0
 695    <hr />
 696    line1
 697    <b>line2, should line up, indent-wise</b>
 698    line 3
 699    line 4
 700    </pre>
 701    </td>
 702    <td></td>
 703    </tr>
 704    </table>
 705    <myCDATAElement><![CDATA[My data]]></myCDATAElement>
 706    </body>
 707    </html>
 708    */
 709    //====================================================
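
The test file above contains a pre block, so its output depends on the branch that writeString takes for each text node. A minimal JDK-only sketch of that branch selection follows. TextNodeRules and render are hypothetical names, the inPreformatted flag stands in for a non-empty m_formatStack, and the sketch does not model whatever XMLWriter.writeString does with the text afterwards.

```java
/** Standalone model of the branch selection in HTMLWriter.writeString; NOT part of dom4j. */
class TextNodeRules {

    /**
     * @param text           content of the text node
     * @param inPreformatted true when an enclosing tag is in the preformatted set
     * @param lineSeparator  the platform line separator to emit
     * @return the text that would be passed on for output
     */
    static String render(String text, boolean inPreformatted, String lineSeparator) {
        if (text.equals("\n")) {
            // A newline-only node becomes the platform separator inside a
            // preformatted section and is dropped everywhere else.
            return inPreformatted ? lineSeparator : "";
        }
        // Other text is kept verbatim inside a preformatted section and trimmed outside.
        return inPreformatted ? text : text.trim();
    }

    public static void main(String[] args) {
        String sep = System.getProperty("line.separator");
        System.out.println("[" + render("  line 3  ", true, sep) + "]");
        System.out.println("[" + render("  entities  ", false, sep) + "]");
    }
}
```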
 710   
 711   
 712   
 713   
 714    /*
 715    * Redistribution and use of this software and associated documentation
 716    * ("Software"), with or without modification, are permitted provided
 717    * that the following conditions are met:
 718    *
 719    * 1. Redistributions of source code must retain copyright
 720    * statements and notices. Redistributions must also contain a
 721    * copy of this document.
 722    *
 723    * 2. Redistributions in binary form must reproduce the
 724    * above copyright notice, this list of conditions and the
 725    * following disclaimer in the documentation and/or other
 726    * materials provided with the distribution.
 727    *
 728    * 3. The name "DOM4J" must not be used to endorse or promote
 729    * products derived from this Software without prior written
 730    * permission of MetaStuff, Ltd. For written permission,
 731    * please contact dom4j-info@metastuff.com.
 732    *
 733    * 4. Products derived from this Software may not be called "DOM4J"
 734    * nor may "DOM4J" appear in their names without prior written
 735    * permission of MetaStuff, Ltd. DOM4J is a registered
 736    * trademark of MetaStuff, Ltd.
 737    *
 738    * 5. Due credit should be given to the DOM4J Project -
 739    * http://www.dom4j.org
 740    *
 741    * THIS SOFTWARE IS PROVIDED BY METASTUFF, LTD. AND CONTRIBUTORS
 742    * ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT
 743    * NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
 744    * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
 745    * METASTUFF, LTD. OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
 746    * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 747    * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 748    * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 749    * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 750    * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 751    * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 752    * OF THE POSSIBILITY OF SUCH DAMAGE.
 753    *
 754    * Copyright 2001-2004 (C) MetaStuff, Ltd. All Rights Reserved.
 755    *
 756    * $Id: HTMLWriter.java,v 1.19 2004/06/25 08:03:36 maartenc Exp $
 757    */