There are many kinds of unsuitable HTML that Another HTML-lint 5 is unable to detect. The following are some of them.

Note: This page was written in 1998 and 1999, so some of the contents here may be out of date.

  1. Use double-byte spaces for indentation

    This is typically used to lay out itemized text, as below (□ stands for a double-byte space).

      There are some reasons.<BR>
      □□Because it is .....<BR>
      □□Because it is .....<BR>

      Date:□□□August 15th 1999<BR>
      At:□□□ABC park<BR>
      Contact:□□XYZ corporation, Mr. XXX<BR>

    HTML cannot guarantee the exact indentation you want, and the examples above may look odd in a narrow window. You can simply rewrite them as below; a stylesheet is needed for stricter layout control.

      <DL>
      <DT>There are some reasons.
      <DD>Because it is .....
      <BR>Because it is .....
      </DL>

      <TABLE>
      <TR><TD>Date:<TD>August 15th 1999</TR>
      <TR><TD>At:<TD>ABC park</TR>
      <TR><TD>Contact:<TD>XYZ corporation, Mr. XXX</TR>
      </TABLE>

    It is safe to assume that many people use their browsers with the default settings, which means that Japanese font widths and the like have not been configured.
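
    For the stricter layout mentioned above, a stylesheet can take over the indentation that the double-byte spaces were providing. The following is only a sketch: the class names "reasons" and "event" are hypothetical, and the lengths are arbitrary values to adjust.

    ```html
    <!-- Goes inside HEAD. "reasons" and "event" are hypothetical class names. -->
    <STYLE type="text/css">
      /* Indent each reason by roughly two character widths */
      DL.reasons DD { margin-left: 2em }
      /* Leave a gap between the label column and the value column */
      TABLE.event TD { padding-right: 1em; vertical-align: top }
    </STYLE>

    <!-- Goes inside BODY -->
    <DL class="reasons">
    <DT>There are some reasons.
    <DD>Because it is .....
    <BR>Because it is .....
    </DL>

    <TABLE class="event">
    <TR><TD>Date:<TD>August 15th 1999</TR>
    <TR><TD>At:<TD>ABC park</TR>
    <TR><TD>Contact:<TD>XYZ corporation, Mr. XXX</TR>
    </TABLE>
    ```

    Because the lengths are given in em, they scale with whatever font the reader's browser uses instead of assuming a particular character width.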

  2. Enclose everything with <PRE></PRE>

    In many cases, <PRE> seems to be used unnecessarily. This may be because the authors believe an unfounded theory such as "a double-byte character must be exactly twice as wide as a single-byte character". There are, of course, cases where <PRE> is used for a sound reason, but you need to think carefully before you write it. Small text with forced line breaks looks a bit silly on a high-resolution screen.
    Also, some people use <PRE> just to line up characters in columns.

      <PRE>
      □□TEL□03-5678-9XXX
      □□FAX □03-8765-4XXX
      </PRE>

    Even if it is displayed as intended in your environment, that does not mean it looks the same in other environments.

    In addition, enclosing a fixed-width document in <PRE> is a quick and easy shortcut when converting it to HTML.
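
    If the aim is only to line up labels and values, as in the TEL/FAX example, one possible rewrite follows the table approach from item 1 and leaves the alignment to the browser. This is a sketch of one option, not the only way to do it.

    ```html
    <!-- Each label and value gets its own cell, so no padding characters are needed -->
    <TABLE>
    <TR><TD>TEL<TD>03-5678-9XXX</TR>
    <TR><TD>FAX<TD>03-8765-4XXX</TR>
    </TABLE>
    ```

    The columns stay aligned whatever font widths the reader's environment uses, which is what the double-byte spaces inside <PRE> could not guarantee.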

  3. Ignore punctuation marks and enclose the whole text in <P></P>

    A certain foundation's website used to be plain text with no HTML tags, with a new line started every 40 characters. Since line breaks are simply ignored in HTML, the whole text was displayed as a single run. After a while they seem to have noticed this, and every line was enclosed in <P> and </P>. They probably did not know enough to use <PRE> instead. A <HEAD> tag was also added. This is not invalid HTML, but it is still improper.

  4. Create a big <TABLE> for indentation

    Some pages are laid out in columns using <TABLE>. This is not an incorrect use of <TABLE>, but it puts a heavy load on browsers: such pages are displayed only after the whole table has been processed.
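
    If the table exists only to produce left and right margins, a stylesheet may achieve a similar effect without it in browsers that support stylesheets. A minimal sketch, assuming a hypothetical class name "main" and arbitrary 10% margins:

    ```html
    <!-- Goes inside HEAD. "main" is a hypothetical class name. -->
    <STYLE type="text/css">
      /* Margins take the place of the empty table columns */
      DIV.main { margin-left: 10%; margin-right: 10% }
    </STYLE>

    <!-- Goes inside BODY -->
    <DIV class="main">
    <P>Body text goes here.
    </DIV>
    ```

    A browser without stylesheet support simply shows the text without margins, so the content remains readable either way.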

  5. A page containing nothing but a clickable map

    This is just terrible.

  6. Insert line breaks for no reason

    You probably often use a text editor to write HTML. Some editors cannot handle long lines, so you have to break a line in the middle of a sentence. In HTML, a line break is treated as a space, so the sentence is displayed with unexpected spaces, as below.

      There is a long sent
      ence including unexpected spaces.

    As you can see, there is a space between "sent" and "ence". This mistake may be specific to Japanese, but please remember to insert line breaks only in proper places.
    You need to be especially careful if you use a text-based browser such as Lynx. In text-based browsers the font width and the number of display columns are always fixed, so you might insert line breaks to match the display width without realizing it.
    However, RFC 2070 says that user agents (that is, WWW browsers) should handle line breaks appropriately for the script (for example, English or Japanese). Since Japanese does not put spaces between words, browsers should ignore such unnecessary line breaks. In reality, only a few browsers do this. When you write HTML, please consider the people whose browsers do display those unnecessary line breaks as spaces.
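
    To make the advice above concrete, the first paragraph below breaks the line in the middle of a word, and the second breaks it only where a space is acceptable. The renderings noted in the comments assume a browser that treats a line break as a space, as described above.

    ```html
    <!-- Line break inside a word: displayed as "sent ence" -->
    <P>There is a long sent
    ence including unexpected spaces.

    <!-- Line break between words: the resulting space is harmless -->
    <P>There is a long sentence
    including no unexpected spaces.
    ```

    Japanese text has no such safe position between words, so the closest equivalent is not to break the source line inside a sentence at all.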

    Please refer to Keisuke Yano's "About line breaks in HTML" as well. However, I disagree with the theory that a new line should be started every 30 to 35 characters. It should be the reader, not the writer, who decides where a new line starts.

    1. Japanese does not include spaces between words?

      For English, it is quite natural to replace a line break with a space.
      Ignoring line breaks is based on the premise that Japanese does not include spaces between words. In reality, however, spaces are used in particular cases.
      Therefore, simply ignoring line breaks is not a good idea.
      The idea of ignoring line breaks while keeping the spaces within a sentence might be a better solution, but it would still be specific to certain environments.

      For mixed-script text, using the LANG attribute is recommended.

        Some Japanese text includes <SPAN lang="en">English</SPAN> as well.

      Here, spaces are placed before and after SPAN.

        Some Japanese text includes
        <SPAN lang="en">English</SPAN> as well.

      In this case, the sentence is broken across lines. Alternatively, the spaces could be placed inside SPAN.

        Some Japanese text includes<SPAN lang="en"> English </SPAN>as well.

      However, HTML 4.0 does not recommend writing it this way.

    2. 35 characters per line is the best?

      This theory applies to printed media such as books, but a computer display is different. You can still follow the theory in HTML if you really want to.
      In addition, some writers prefer text with line breaks and others do not, so nobody can say that it is wrong to write 1000 characters on one line. However, reusing text that contains a lot of line breaks requires deleting all the unnecessary ones, which means reading the whole text to tell which line breaks are unnecessary.

    3. It is not my job to adjust the HTML, because it is the browsers' fault?

      That is true, but selfish. A small effort on your part benefits a lot of people, so why not make it? Just sitting and waiting for future browsers is not very smart, as it may take a long time before they are fixed.
      HTML writers often make an effort for the sake of text-based browsers or disabled people. Please do not forget to consider ordinary environments and visitors as well.
      Simply concluding that this is a matter of browser specifications is the same as restricting which browsers can be used.
      FYI: XML 1.0 treats a line break as a space.

    Do not put unnecessary line breaks in text. Most recent editors do not insert line breaks in the middle of a sentence. Visitors will adjust the window width themselves if they want to limit a line to 35 characters. Fortunately, many of the editors also apply Japanese line-breaking rules (kinsoku), so punctuation marks never end up at the beginning of a line by mistake.

  7. Specify font family in a stylesheet

    Although many WWW browsers do not fully support stylesheets yet, creating web pages with stylesheets is becoming popular. Those who edit such pages must be struggling with the differences in behavior between browsers. Even so, it still seems too early to specify a font family for Japanese. There are of course a few browsers that handle it well, and surely there will be more. However, the current Macintosh Mozilla does not display a Japanese font family as specified. Instead, it treats it as a Western (European or American) font, which makes the text indecipherable. MSIE 4.0 handles it correctly, but website editors should still be aware of this.
    This content does not apply to recent (2002) Mozilla.
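
    As an illustration of the point above, the first rule below names a Japanese font family explicitly, which is the kind of declaration described as going wrong in some browsers. The second names only a CSS generic family and leaves the choice of the actual font to the browser. Both are sketches: the font name is just an example, and whether the generic family avoids the problem in a given browser is an assumption that needs testing.

    ```html
    <!-- Goes inside HEAD -->
    <STYLE type="text/css">
      /* Explicit Japanese font family (example name): the risky case */
      H1 { font-family: "ＭＳ 明朝", serif }

      /* Generic family only: the browser chooses the font */
      P { font-family: serif }
    </STYLE>
    ```

    Leaving out font-family altogether is the most conservative choice, since the reader's default font is then used as it is.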