HTML 5

Draft Recommendation — 7 July 2008

4.8 Tabular data

4.8.1 Introduction

This section is non-normative.

...examples, how to write tables accessibly, a brief mention of the table model, etc...

4.8.2 The table element

Categories
Flow content.
Contexts in which this element may be used:
Where flow content is expected.
Content model:
In this order: optionally a caption element, followed by zero or more colgroup elements, followed optionally by a thead element, followed optionally by a tfoot element, followed by either zero or more tbody elements or one or more tr elements, followed optionally by a tfoot element (but there can only be one tfoot element child in total).
Element-specific attributes:
None.
DOM interface:
interface HTMLTableElement : HTMLElement {
           attribute HTMLTableCaptionElement caption;
  HTMLElement createCaption();
  void deleteCaption();
           attribute HTMLTableSectionElement tHead;
  HTMLElement createTHead();
  void deleteTHead();
           attribute HTMLTableSectionElement tFoot;
  HTMLElement createTFoot();
  void deleteTFoot();
  readonly attribute HTMLCollection tBodies;
  HTMLElement createTBody();
  readonly attribute HTMLCollection rows;
  HTMLElement insertRow(in long index);
  void deleteRow(in long index);
};

The table element represents data with more than one dimension (a table).

we need some editorial text on how layout tables are bad practice and non-conforming

The children of a table element must be, in order:

  1. Zero or one caption elements.

  2. Zero or more colgroup elements.

  3. Zero or one thead elements.

  4. Zero or one tfoot elements, if the last element in the table is not a tfoot element.

  5. Either zero or more tbody elements, or one or more tr elements.

  6. Zero or one tfoot element, if there are no other tfoot elements in the table.
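The ordering rules above can be checked mechanically. The following is a minimal sketch, not a normative definition: it assumes the children of a table are given as a plain list of element names, and the names `_TABLE_CHILDREN` and `children_in_valid_order` are hypothetical.

```python
import re

# Hypothetical helper: encodes the ordering rules as a regular expression
# over a sequence of child element names, each followed by one space.
_TABLE_CHILDREN = re.compile(
    r"^(caption )?"        # 1. zero or one caption
    r"(colgroup )*"        # 2. zero or more colgroup
    r"(thead )?"           # 3. zero or one thead
    r"(tfoot )?"           # 4. optional tfoot before the body
    r"((tbody )*|(tr )+)"  # 5. tbody elements, or one or more tr
    r"(tfoot )?$"          # 6. optional tfoot after the body
)

def children_in_valid_order(names):
    """Return True if the list of child element names follows the order."""
    text = "".join(name + " " for name in names)
    # There can only be one tfoot child in total.
    return names.count("tfoot") <= 1 and bool(_TABLE_CHILDREN.match(text))
```

For instance, `["caption", "thead", "tbody", "tfoot"]` passes, while `["tbody", "tr"]` and `["tfoot", "tbody", "tfoot"]` do not.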

The table element takes part in the table model.

The caption DOM attribute must return, on getting, the first caption element child of the table element, if any, or null otherwise. On setting, if the new value is a caption element, the first caption element child of the table element, if any, must be removed, and the new value must be inserted as the first node of the table element. If the new value is not a caption element, then a HIERARCHY_REQUEST_ERR DOM exception must be raised instead.

The createCaption() method must return the first caption element child of the table element, if any; otherwise a new caption element must be created, inserted as the first node of the table element, and then returned.

The deleteCaption() method must remove the first caption element child of the table element, if any.

The tHead DOM attribute must return, on getting, the first thead element child of the table element, if any, or null otherwise. On setting, if the new value is a thead element, the first thead element child of the table element, if any, must be removed, and the new value must be inserted immediately before the first element in the table element that is neither a caption element nor a colgroup element, if any, or at the end of the table otherwise. If the new value is not a thead element, then a HIERARCHY_REQUEST_ERR DOM exception must be raised instead.

The createTHead() method must return the first thead element child of the table element, if any; otherwise a new thead element must be created and inserted immediately before the first element in the table element that is neither a caption element nor a colgroup element, if any, or at the end of the table otherwise, and then that new element must be returned.

The deleteTHead() method must remove the first thead element child of the table element, if any.

The tFoot DOM attribute must return, on getting, the first tfoot element child of the table element, if any, or null otherwise. On setting, if the new value is a tfoot element, the first tfoot element child of the table element, if any, must be removed, and the new value must be inserted immediately before the first element in the table element that is neither a caption element, a colgroup element, nor a thead element, if any, or at the end of the table if there are no such elements. If the new value is not a tfoot element, then a HIERARCHY_REQUEST_ERR DOM exception must be raised instead.

The createTFoot() method must return the first tfoot element child of the table element, if any; otherwise a new tfoot element must be created and inserted immediately before the first element in the table element that is neither a caption element, a colgroup element, nor a thead element, if any, or at the end of the table if there are no such elements, and then that new element must be returned.

The deleteTFoot() method must remove the first tfoot element child of the table element, if any.

The tBodies attribute must return an HTMLCollection rooted at the table node, whose filter matches only tbody elements that are children of the table element.

The createTBody() method must create a new tbody element, insert it immediately after the last tbody element in the table element, if any, or at the end of the table element if the table element has no tbody element children, and then must return the new tbody element.

The rows attribute must return an HTMLCollection rooted at the table node, whose filter matches only tr elements that are either children of the table element, or children of thead, tbody, or tfoot elements that are themselves children of the table element. The elements in the collection must be ordered such that those elements whose parent is a thead are included first, in tree order, followed by those elements whose parent is either a table or tbody element, again in tree order, followed finally by those elements whose parent is a tfoot element, still in tree order.
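The ordering of the rows collection can be illustrated with a small model. This sketch assumes the table's children are given as `(tag, payload)` pairs in tree order; the function name `table_rows` and the data layout are hypothetical and stand in for real DOM nodes.

```python
def table_rows(children):
    """Return row labels in the order required for the rows collection.

    children: list of (tag, payload) pairs in tree order. For "thead",
    "tbody" and "tfoot" the payload is a list of row labels; for a "tr"
    child of the table the payload is the row label itself. Any other
    child (caption, colgroup, and so on) is ignored.
    """
    head, body, foot = [], [], []
    for tag, payload in children:
        if tag == "thead":
            head.extend(payload)
        elif tag == "tbody":
            body.extend(payload)
        elif tag == "tr":
            body.append(payload)
        elif tag == "tfoot":
            foot.extend(payload)
    # thead rows first, then table/tbody rows, then tfoot rows.
    return head + body + foot
```

Note that a tfoot placed before the tbody elements in the markup still contributes its rows last.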

The behavior of the insertRow(index) method depends on the state of the table. When it is called, the method must act as required by the first item in the following list of conditions that describes the state of the table and the index argument:

If index is less than −1 or greater than the number of elements in the rows collection:
The method must raise an INDEX_SIZE_ERR exception.
If the rows collection has zero elements in it, and the table has no tbody elements in it:
The method must create a tbody element, then create a tr element, then append the tr element to the tbody element, then append the tbody element to the table element, and finally return the tr element.
If the rows collection has zero elements in it:
The method must create a tr element, append it to the last tbody element in the table, and return the tr element.
If index is equal to −1 or equal to the number of items in the rows collection:
The method must create a tr element, and append it to the parent of the last tr element in the rows collection. Then, the newly created tr element must be returned.
Otherwise:
The method must create a tr element, insert it immediately before the indexth tr element in the rows collection, in the same parent, and finally must return the newly created tr element.
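The case analysis for insertRow() can be sketched as follows. This is a simplified model under stated assumptions: every row lives inside a thead, tbody, or tfoot section (tr children of the table itself are not modelled), sections are plain dictionaries, and `IndexSizeError`, `ordered_rows`, and `insert_row` are hypothetical names.

```python
class IndexSizeError(Exception):
    """Stand-in for the INDEX_SIZE_ERR DOM exception."""

def ordered_rows(sections):
    """Return (section, position) pairs in rows-collection order."""
    rank = {"thead": 0, "tbody": 1, "tfoot": 2}
    # sorted() is stable, so tree order is kept within each kind of section.
    ranked = sorted(sections, key=lambda s: rank[s["tag"]])
    return [(s, i) for s in ranked for i in range(len(s["rows"]))]

def insert_row(sections, index, new_row):
    """Apply the first matching condition, in the order listed above."""
    rows = ordered_rows(sections)
    if index < -1 or index > len(rows):
        raise IndexSizeError(index)
    bodies = [s for s in sections if s["tag"] == "tbody"]
    if not rows and not bodies:
        # No rows and no tbody: create a tbody and append it to the table.
        sections.append({"tag": "tbody", "rows": [new_row]})
    elif not rows:
        bodies[-1]["rows"].append(new_row)
    elif index == -1 or index == len(rows):
        # Append to the parent of the last row in the collection.
        section, _ = rows[-1]
        section["rows"].append(new_row)
    else:
        # Insert before the index-th row, in that row's own parent.
        section, position = rows[index]
        section["rows"].insert(position, new_row)
    return new_row
```

Because the new row goes into the parent of an existing row, inserting at the index of the first tfoot row places the new row in the tfoot, not in the preceding tbody.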

When the deleteRow(index) method is called, the user agent must run the following steps:

  1. If index is equal to −1, then index must be set to the number of items in the rows collection, minus one.

  2. Now, if index is less than zero, or greater than or equal to the number of elements in the rows collection, the method must instead raise an INDEX_SIZE_ERR exception, and these steps must be aborted.

  3. Otherwise, the method must remove the indexth element in the rows collection from its parent.
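The three steps above amount to a small index normalization. The sketch below assumes the rows collection is modelled as a flat Python list, so "removing from its parent" becomes removing from that list; `IndexSizeError` and `delete_row` are hypothetical names.

```python
class IndexSizeError(Exception):
    """Stand-in for the INDEX_SIZE_ERR DOM exception."""

def delete_row(rows, index):
    """Remove the index-th row, treating -1 as 'the last row'."""
    if index == -1:
        index = len(rows) - 1          # step 1
    if index < 0 or index >= len(rows):
        raise IndexSizeError(index)    # step 2
    del rows[index]                    # step 3
```

On an empty collection, an index of −1 becomes −1 again after step 1 and is therefore rejected in step 2.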

4.8.3 The caption element

Categories
None.
Contexts in which this element may be used:
As the first element child of a table element.
Content model:
Phrasing content.
Element-specific attributes:
None.
DOM interface:
Uses HTMLElement.

The caption element represents the title of the table that is its parent, if it has a parent and that is a table element.

The caption element takes part in the table model.

4.8.4 The colgroup element

Categories
None.
Contexts in which this element may be used:
As a child of a table element, after any caption elements and before any thead, tbody, tfoot, and tr elements.
Content model:
Zero or more col elements.
Element-specific attributes:
span
DOM interface:
interface HTMLTableColElement : HTMLElement {
           attribute unsigned long span;
};

The colgroup element represents a group of one or more columns in the table that is its parent, if it has a parent and that is a table element.

If the colgroup element contains no col elements, then the element may have a span content attribute specified, whose value must be a valid non-negative integer greater than zero.

The colgroup element and its span attribute take part in the table model.

The span DOM attribute must reflect the content attribute of the same name. The value must be limited to only positive non-zero numbers.

4.8.5 The col element

Categories
None.
Contexts in which this element may be used:
As a child of a colgroup element that doesn't have a span attribute.
Content model:
Empty.
Element-specific attributes:
span
DOM interface:
HTMLTableColElement, same as for colgroup elements. This interface defines one member, span.

If a col element has a parent and that is a colgroup element that itself has a parent that is a table element, then the col element represents one or more columns in the column group represented by that colgroup.

The element may have a span content attribute specified, whose value must be a valid non-negative integer greater than zero.

The col element and its span attribute take part in the table model.

The span DOM attribute must reflect the content attribute of the same name. The value must be limited to only positive non-zero numbers.

4.8.6 The tbody element

Categories
None.
Contexts in which this element may be used:
As a child of a table element, after any caption, colgroup, and thead elements, but only if there are no tr elements that are children of the table element.
Content model:
Zero or more tr elements.
Element-specific attributes:
None.
DOM interface:
interface HTMLTableSectionElement : HTMLElement {
  readonly attribute HTMLCollection rows;
  HTMLElement insertRow(in long index);
  void deleteRow(in long index);
};

The HTMLTableSectionElement interface is also used for thead and tfoot elements.

The tbody element represents a block of rows that consist of a body of data for the parent table element, if the tbody element has a parent and it is a table.

The tbody element takes part in the table model.

The rows attribute must return an HTMLCollection rooted at the element, whose filter matches only tr elements that are children of the element.

The insertRow(index) method must, when invoked on a table section element, act as follows:

If index is less than −1 or greater than the number of elements in the rows collection, the method must raise an INDEX_SIZE_ERR exception.

If index is equal to −1 or equal to the number of items in the rows collection, the method must create a tr element, append it to the table section element, and return the newly created tr element.

Otherwise, the method must create a tr element, insert it as a child of the table section element, immediately before the indexth tr element in the rows collection, and finally must return the newly created tr element.

The deleteRow(index) method must remove the indexth element in the rows collection from its parent. If index is less than zero or greater than or equal to the number of elements in the rows collection, the method must instead raise an INDEX_SIZE_ERR exception.
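For table sections the two methods reduce to list operations. This is a minimal sketch in which a section is modelled as a class holding a list of rows; `TableSection` and `IndexSizeError` are hypothetical names, not part of the interface defined above.

```python
class IndexSizeError(Exception):
    """Stand-in for the INDEX_SIZE_ERR DOM exception."""

class TableSection:
    """Hypothetical stand-in for a thead, tbody, or tfoot element."""

    def __init__(self):
        self.rows = []

    def insert_row(self, index, new_row):
        if index < -1 or index > len(self.rows):
            raise IndexSizeError(index)
        if index == -1 or index == len(self.rows):
            self.rows.append(new_row)
        else:
            self.rows.insert(index, new_row)
        return new_row

    def delete_row(self, index):
        # As the text is written, -1 is not special here: any index
        # below zero is out of range for a section's deleteRow().
        if index < 0 or index >= len(self.rows):
            raise IndexSizeError(index)
        del self.rows[index]
```

The sketch makes one difference from the table-level method visible: as worded, a section's deleteRow() does not map −1 to the last row.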

4.8.7 The thead element

Categories
None.
Contexts in which this element may be used:
As a child of a table element, after any caption and colgroup elements and before any tbody, tfoot, and tr elements, but only if there are no other thead elements that are children of the table element.
Content model:
Zero or more tr elements.
Element-specific attributes:
None.
DOM interface:
HTMLTableSectionElement, as defined for tbody elements.

The thead element represents the block of rows that consist of the column labels (headers) for the parent table element, if the thead element has a parent and it is a table.

The thead element takes part in the table model.

4.8.8 The tfoot element

Categories
None.
Contexts in which this element may be used:
As a child of a table element, after any caption, colgroup, and thead elements and before any tbody and tr elements, but only if there are no other tfoot elements that are children of the table element.
As a child of a table element, after any caption, colgroup, thead, tbody, and tr elements, but only if there are no other tfoot elements that are children of the table element.
Content model:
Zero or more tr elements.
Element-specific attributes:
None.
DOM interface:
HTMLTableSectionElement, as defined for tbody elements.

The tfoot element represents the block of rows that consist of the column summaries (footers) for the parent table element, if the tfoot element has a parent and it is a table.

The tfoot element takes part in the table model.

4.8.9 The tr element

Categories
None.
Contexts in which this element may be used:
As a child of a thead element.
As a child of a tbody element.
As a child of a tfoot element.
As a child of a table element, after any caption, colgroup, and thead elements, but only if there are no tbody elements that are children of the table element.
Content model:
Zero or more td or th elements.
Element-specific attributes:
None.
DOM interface:
interface HTMLTableRowElement : HTMLElement {
  readonly attribute long rowIndex;
  readonly attribute long sectionRowIndex;
  readonly attribute HTMLCollection cells;
  HTMLElement insertCell(in long index);
  void deleteCell(in long index);
};

The tr element represents a row of cells in a table.

The tr element takes part in the table model.

The rowIndex attribute must, if the element has a parent table element, or a parent tbody, thead, or tfoot element and a grandparent table element, return the index of the tr element in that table element's rows collection. If there is no such table element, then the attribute must return −1.

The sectionRowIndex attribute must, if the element has a parent table, tbody, thead, or tfoot element, return the index of the tr element in the parent element's rows collection (for tables, that is the rows collection of HTMLTableElement; for table sections, that is the rows collection of HTMLTableSectionElement). If there is no such parent element, then the attribute must return −1.

The cells attribute must return an HTMLCollection rooted at the tr element, whose filter matches only td and th elements that are children of the tr element.

The insertCell(index) method must act as follows:

If index is less than −1 or greater than the number of elements in the cells collection, the method must raise an INDEX_SIZE_ERR exception.

If index is equal to −1 or equal to the number of items in the cells collection, the method must create a td element, append it to the tr element, and return the newly created td element.

Otherwise, the method must create a td element, insert it as a child of the tr element, immediately before the indexth td or th element in the cells collection, and finally must return the newly created td element.

The deleteCell(index) method must remove the indexth element in the cells collection from its parent. If index is less than zero or greater than or equal to the number of elements in the cells collection, the method must instead raise an INDEX_SIZE_ERR exception.

4.8.10 The td element

Categories
Sectioning root.
Contexts in which this element may be used:
As a child of a tr element.
Content model:
Flow content.
Element-specific attributes:
colspan
rowspan
headers
DOM interface:
interface HTMLTableDataCellElement : HTMLTableCellElement {
           attribute DOMString headers;
};

The td element represents a data cell in a table.

The td element may have a headers content attribute specified. The headers attribute, if specified, must contain a string consisting of an unordered set of unique space-separated tokens, each of which must have the value of an ID of a th element taking part in the same table as the td element (as defined by the table model).

The exact effect of the attribute is described in detail in the algorithm for assigning header cells to data cells, which user agents must apply to determine the relationships between data cells and header cells.
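The authoring requirement on the headers value can be sketched as a simple check. This assumes the IDs of the th elements in the same table are already available as a set; `headers_value_is_valid` is a hypothetical name, and Python's `str.split()` is used as an approximation of splitting on spaces.

```python
def headers_value_is_valid(value, th_ids):
    """Check a headers attribute value against the rule stated above.

    value:  the raw attribute string, e.g. "name price".
    th_ids: set of IDs of th elements taking part in the same table.
    """
    # Assumption: str.split() stands in for "space-separated"; it also
    # splits on other whitespace such as tabs and newlines.
    tokens = value.split()
    tokens_are_unique = len(tokens) == len(set(tokens))
    return tokens_are_unique and all(token in th_ids for token in tokens)
```

Order does not matter, since the tokens form an unordered set: `"price name"` and `"name price"` are equivalent.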

The td element and its colspan and rowspan attributes take part in the table model.

The headers DOM attribute must reflect the content attribute of the same name.

4.8.11 The th element

Categories
None.
Contexts in which this element may be used:
As a child of a tr element.
Content model:
Phrasing content.
Element-specific attributes:
colspan
rowspan
scope
DOM interface:
interface HTMLTableHeaderCellElement : HTMLTableCellElement {
           attribute DOMString scope;
};

The th element represents a header cell in a table.

The th element may have a scope content attribute specified. The scope attribute is an enumerated attribute with five states, four of which have explicit keywords:

The row keyword, which maps to the row state
The row state means the header cell applies to all the remaining cells in the row.
The col keyword, which maps to the column state
The column state means the header cell applies to all the remaining cells in the column.
The rowgroup keyword, which maps to the row group state
The row group state means the header cell applies to all the remaining cells in the row group.
The colgroup keyword, which maps to the column group state
The column group state means the header cell applies to all the remaining cells in the column group.
The auto state
The auto state makes the header cell apply to a set of cells selected based on context.

The scope attribute's missing value default is the auto state.

The exact effect of these values is described in detail in the algorithm for assigning header cells to data cells, which user agents must apply to determine the relationships between data cells and header cells.
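The mapping from scope keywords to states can be written as a lookup table. This is a sketch only: the text above defines the missing value default, but case-insensitive matching and the fallback for unrecognized values are assumptions made here, and `SCOPE_STATES` and `scope_state` are hypothetical names.

```python
# Keyword -> state, as listed for the scope attribute.
SCOPE_STATES = {
    "row": "row",
    "col": "column",
    "rowgroup": "row group",
    "colgroup": "column group",
}

def scope_state(value):
    """Map a scope attribute value (None if absent) to a state name."""
    if value is None:
        return "auto"  # missing value default, as stated in the text
    # Assumption: keywords match case-insensitively and unrecognized
    # values fall back to auto; the text above specifies neither.
    return SCOPE_STATES.get(value.lower(), "auto")
```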

The th element and its colspan and rowspan attributes take part in the table model.

The scope DOM attribute must reflect the content attribute of the same name.

4.8.12 Attributes common to td and th elements

The td and th elements may have a colspan content attribute specified, whose value must be a valid non-negative integer greater than zero.

The td and th elements may also have a rowspan content attribute specified, whose value must be a valid non-negative integer.

The td and th elements implement interfaces that inherit from the HTMLTableCellElement interface:

interface HTMLTableCellElement : HTMLElement {
           attribute long colSpan;
           attribute long rowSpan;
  readonly attribute long cellIndex;
};

The colSpan DOM attribute must reflect the content attribute of the same name. The value must be limited to only positive non-zero numbers.

The rowSpan DOM attribute must reflect the content attribute of the same name. Its default value, which must be used if parsing the attribute as a non-negative integer returns an error, is 1.

The cellIndex DOM attribute must, if the element has a parent tr element, return the index of the cell's element in the parent element's cells collection. If there is no such parent element, then the attribute must return 0.

4.8.13 Processing model

The various table elements and their content attributes together define the table model.

A table consists of cells aligned on a two-dimensional grid of slots with coordinates (x, y). The grid is finite, and is either empty or has one or more slots. If the grid has one or more slots, then the x coordinates are always in the range 0 ≤ x < xwidth, and the y coordinates are always in the range 0 ≤ y < yheight. If one or both of xwidth and yheight are zero, then the table is empty (has no slots). Tables correspond to table elements.

A cell is a set of slots anchored at a slot (cellx, celly), and with a particular width and height such that the cell covers all the slots with coordinates (x, y) where cellx ≤ x < cellx+width and celly ≤ y < celly+height. Cells can either be data cells or header cells. Data cells correspond to td elements, and have zero or more associated header cells. Header cells correspond to th elements.
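The set of slots covered by a cell follows directly from the two inequalities. A minimal sketch, with `covered_slots` as a hypothetical helper name:

```python
def covered_slots(cell_x, cell_y, width, height):
    """Return every (x, y) with cell_x <= x < cell_x + width
    and cell_y <= y < cell_y + height."""
    return {
        (x, y)
        for x in range(cell_x, cell_x + width)
        for y in range(cell_y, cell_y + height)
    }
```

A cell anchored at (1, 2) with width 2 and height 1 covers exactly the slots (1, 2) and (2, 2).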

A row is a complete set of slots from x=0 to x=xwidth-1, for a particular value of y. Rows correspond to tr elements.

A column is a complete set of slots from y=0 to y=yheight-1, for a particular value of x. Columns can correspond to col elements, but in the absence of col elements are implied.

A row group is a set of rows anchored at a slot (0, groupy) with a particular height such that the row group covers all the slots with coordinates (x, y) where 0 ≤ x < xwidth and groupy ≤ y < groupy+height. Row groups correspond to tbody, thead, and tfoot elements. Not every row is necessarily in a row group.

A column group is a set of columns anchored at a slot (groupx, 0) with a particular width such that the column group covers all the slots with coordinates (x, y) where groupx ≤ x < groupx+width and 0 ≤ y < yheight. Column groups correspond to colgroup elements. Not every column is necessarily in a column group.

Row groups cannot overlap each other. Similarly, column groups cannot overlap each other.

A cell cannot cover slots that are from two or more row groups. It is, however, possible for a cell to be in multiple column groups. All the slots that form part of one cell are part of zero or one row groups and zero or more column groups.

In addition to cells, columns, rows, row groups, and column groups, tables can have a caption element associated with them. This gives the table a heading, or legend.

A table model error is an error with the data represented by table elements and their descendants. Documents must not have table model errors.

4.8.13.1. Forming a table

To determine which elements correspond to which slots in a table associated with a table element, to determine the dimensions of the table (xwidth and yheight), and to determine if there are any table model errors, user agents must use the following algorithm:

  1. Let xwidth be zero.

  2. Let yheight be zero.

  3. Let pending tfoot elements be a list of tfoot elements, initially empty.

  4. Let the table be the table represented by the table element. The xwidth and yheight variables give the table's dimensions. The table is initially empty.

  5. If the table element has no children elements, then return the table (which will be empty), and abort these steps.

  6. Associate the first caption element child of the table element with the table. If there are no such children, then it has no associated caption element.

  7. Let the current element be the first element child of the table element.

    If a step in this algorithm ever requires the current element to be advanced to the next child of the table when there is no such next child, then the user agent must jump to the step labeled end, near the end of this algorithm.

  8. While the current element is not one of the following elements, advance the current element to the next child of the table: colgroup, thead, tbody, tfoot, tr.

  9. If the current element is a colgroup, follow these substeps:

    1. Column groups: Process the current element according to the appropriate case below:

      If the current element has any col element children

      Follow these steps:

      1. Let xstart have the value of xwidth.

      2. Let the current column be the first col element child of the colgroup element.

      3. Columns: If the current column col element has a span attribute, then parse its value using the rules for parsing non-negative integers.

        If the result of parsing the value is not an error or zero, then let span be that value.

        Otherwise, if the col element has no span attribute, or if trying to parse the attribute's value resulted in an error, then let span be 1.

      4. Increase xwidth by span.

      5. Let the last span columns in the table correspond to the current column col element.

      6. If current column is not the last col element child of the colgroup element, then let the current column be the next col element child of the colgroup element, and return to the step labeled columns.

      7. Let all the last columns in the table from x=xstart to x=xwidth-1 form a new column group, anchored at the slot (xstart, 0), with width xwidth-xstart, corresponding to the colgroup element.

      If the current element has no col element children
      1. If the colgroup element has a span attribute, then parse its value using the rules for parsing non-negative integers.

        If the result of parsing the value is not an error or zero, then let span be that value.

        Otherwise, if the colgroup element has no span attribute, or if trying to parse the attribute's value resulted in an error, then let span be 1.

      2. Increase xwidth by span.

      3. Let the last span columns in the table form a new column group, anchored at the slot (xwidth-span, 0), with width span, corresponding to the colgroup element.

    2. Advance the current element to the next child of the table.

    3. While the current element is not one of the following elements, advance the current element to the next child of the table: colgroup, thead, tbody, tfoot, tr.

    4. If the current element is a colgroup element, jump to the step labeled column groups above.

  10. Let ycurrent be zero.

  11. Let the list of downward-growing cells be an empty list.

  12. Rows: While the current element is not one of the following elements, advance the current element to the next child of the table: thead, tbody, tfoot, tr.

  13. If the current element is a tr, then run the algorithm for processing rows, advance the current element to the next child of the table, and return to the step labeled rows.

  14. Run the algorithm for ending a row group.

  15. If the current element is a tfoot, then add that element to the list of pending tfoot elements, advance the current element to the next child of the table, and return to the step labeled rows.

  16. The current element is either a thead or a tbody.

    Run the algorithm for processing row groups.

  17. Advance the current element to the next child of the table.

  18. Return to the step labeled rows.

  19. End: For each tfoot element in the list of pending tfoot elements, in tree order, run the algorithm for processing row groups.

  20. If there exists a row or column in the table containing only slots that do not have a cell anchored to them, then this is a table model error.

  21. Return the table.

The algorithm for processing row groups, which is invoked by the set of steps above for processing thead, tbody, and tfoot elements, is:

  1. Let ystart have the value of yheight.

  2. For each tr element that is a child of the element being processed, in tree order, run the algorithm for processing rows.

  3. If yheight > ystart, then let all the last rows in the table from y=ystart to y=yheight-1 form a new row group, anchored at the slot with coordinate (0, ystart), with height yheight-ystart, corresponding to the current element.

  4. Run the algorithm for ending a row group.

The algorithm for ending a row group, which is invoked by the set of steps above when starting and ending a block of rows, is:

  1. While ycurrent is less than yheight, follow these steps:

    1. Run the algorithm for growing downward-growing cells.

    2. Increase ycurrent by 1.

  2. Empty the list of downward-growing cells.

The algorithm for processing rows, which is invoked by the set of steps above for processing tr elements, is:

  1. If yheight is equal to ycurrent, then increase yheight by 1. (ycurrent is never greater than yheight.)

  2. Let xcurrent be 0.

  3. Let current cell be the first td or th element in the tr element being processed.

  4. Run the algorithm for growing downward-growing cells.

  5. Cells: While xcurrent is less than xwidth and the slot with coordinate (xcurrent, ycurrent) already has a cell assigned to it, increase xcurrent by 1.

  6. If xcurrent is equal to xwidth, increase xwidth by 1. (xcurrent is never greater than xwidth.)

  7. If the current cell has a colspan attribute, then parse that attribute's value, and let colspan be the result.

    If parsing that value failed, or returned zero, or if the attribute is absent, then let colspan be 1, instead.

  8. If the current cell has a rowspan attribute, then parse that attribute's value, and let rowspan be the result.

    If parsing that value failed or if the attribute is absent, then let rowspan be 1, instead.

  9. If rowspan is zero, then let cell grows downward be true, and set rowspan to 1. Otherwise, let cell grows downward be false.

  10. If xwidth < xcurrent+colspan, then let xwidth be xcurrent+colspan.

  11. If yheight < ycurrent+rowspan, then let yheight be ycurrent+rowspan.

  12. Let the slots with coordinates (x, y) such that xcurrent ≤ x < xcurrent+colspan and ycurrent ≤ y < ycurrent+rowspan be covered by a new cell c, anchored at (xcurrent, ycurrent), which has width colspan and height rowspan, corresponding to the current cell element.

    If the current cell element is a th element, let this new cell c be a header cell; otherwise, let it be a data cell. To establish what header cells apply to a data cell, use the algorithm for assigning header cells to data cells described in the next section.

    If any of the slots involved already had a cell covering them, then this is a table model error. Those slots now have two cells overlapping.

  13. If cell grows downward is true, then add the tuple {c, xcurrent, colspan} to the list of downward-growing cells.

  14. Increase xcurrent by colspan.

  15. If current cell is the last td or th element in the tr element being processed, then increase ycurrent by 1, abort this set of steps, and return to the algorithm above.

  16. Let current cell be the next td or th element in the tr element being processed.

  17. Return to step 5 (cells).

When the algorithms above require the user agent to run the algorithm for growing downward-growing cells, the user agent must, for each {cell, cellx, width} tuple in the list of downward-growing cells, if any, extend the cell cell so that it also covers the slots with coordinates (x, ycurrent), where cellx ≤ x < cellx+width.
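The row-processing steps, together with the handling of downward-growing cells, can be sketched for a single run of rows. This is an illustration under simplifying assumptions, not the full algorithm: column groups, multiple row groups, and deferred tfoot elements are left out, colspan and rowspan are assumed to be already parsed into integers, cell names are assumed unique, and `form_table` is a hypothetical name.

```python
def form_table(rows):
    """Lay out cells on the slot grid.

    rows: list of rows; each row is a list of (name, colspan, rowspan).
    Returns (slots, x_width, y_height), where slots maps (x, y) to the
    list of cell names covering that slot. More than one name in a slot
    means two cells overlap, which is a table model error.
    """
    slots = {}
    x_width = y_height = y_current = 0
    growing = []  # (name, cell_x, width) tuples for rowspan=0 cells

    def grow():
        # Extend each downward-growing cell into row y_current.
        for name, cell_x, width in growing:
            for x in range(cell_x, cell_x + width):
                slots.setdefault((x, y_current), []).append(name)

    for row in rows:
        if y_height == y_current:
            y_height += 1
        x_current = 0
        grow()
        for name, colspan, rowspan in row:
            # Skip slots already taken by cells spanning from above.
            while x_current < x_width and (x_current, y_current) in slots:
                x_current += 1
            if x_current == x_width:
                x_width += 1
            colspan = colspan or 1       # a colspan of zero becomes 1
            grows_downward = rowspan == 0
            if grows_downward:
                rowspan = 1
            x_width = max(x_width, x_current + colspan)
            y_height = max(y_height, y_current + rowspan)
            for x in range(x_current, x_current + colspan):
                for y in range(y_current, y_current + rowspan):
                    slots.setdefault((x, y), []).append(name)
            if grows_downward:
                growing.append((name, x_current, colspan))
            x_current += colspan
        y_current += 1

    # Ending the row group: fill the remaining rows, then the list of
    # downward-growing cells would be emptied.
    while y_current < y_height:
        grow()
        y_current += 1
    return slots, x_width, y_height
```

For example, with `[[("A", 1, 0), ("B", 1, 3)], [("C", 1, 1)]]`, cell A has rowspan 0 and keeps growing until the end of the run of rows, so it ends up covering (0, 0), (0, 1), and (0, 2), while C is pushed to column 2.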

4.8.13.2. Forming relationships between data cells and header cells

Each data cell can be assigned zero or more header cells. The algorithm for assigning header cells to data cells is as follows.

  1. For each header cell in the table, in tree order, run these substeps:

    1. Let (headerx, headery) be the coordinate of the slot to which the header cell is anchored.

    2. Let headerwidth be the width of the header cell.

    3. Let headerheight be the height of the header cell.

    4. Let data cells be a list of data cells, initially empty.

    5. Examine the scope attribute of the th element corresponding to the header cell, and, based on its state, apply the appropriate substep:

      If it is in the row state

      Add all the data cells that cover slots with coordinates (slotx, sloty), where headerx+headerwidth ≤ slotx < xwidth and headery ≤ sloty < headery+headerheight, to the data cells list.

      If it is in the column state

      Add all the data cells that cover slots with coordinates (slotx, sloty), where headerx ≤ slotx < headerx+headerwidth and headery+headerheight ≤ sloty < yheight, to the data cells list.

      If it is in the row group state

      If the header cell is not in a row group, then do nothing.

      Otherwise, let (0, groupy) be the slot at which the row group is anchored, let height be the number of rows in the row group, and add all the data cells that cover slots with coordinates (slotx, sloty), where headerx ≤ slotx < xwidth and headery ≤ sloty < groupy+height, to the data cells list.

      If it is in the column group state

      If the header cell is not anchored in a column group, then do nothing.

      Otherwise, let (groupx, 0) be the slot at which that column group is anchored, let width be the number of columns in the column group, and add all the data cells that cover slots with coordinates (slotx, sloty), where headerx ≤ slotx < groupx+width and headery ≤ sloty < yheight, to the data cells list.

      Otherwise, it is in the auto state

      Run these steps:

      1. If the header cell is equivalent to a wide cell, let headerwidth equal xwidth-headerx. [UNICODE]

      2. Let x equal headerx+headerwidth.

      3. Horizontal: If x is equal to xwidth, then jump down to the step below labeled vertical.

      4. If there is a header cell anchored at (x, headery) with height headerheight, then jump down to the step below labeled vertical.

      5. Add all the data cells that cover slots with coordinates (slotx, sloty), where slotx = x and headery ≤ sloty < headery+headerheight, to the data cells list.

      6. Increase x by 1.

      7. Jump up to the step above labeled horizontal.

      8. Vertical: Let y equal headery+headerheight.

      9. If y is equal to yheight, then jump to the step below labeled end.

      10. If there is a header cell cell anchored at (headerx, y), then follow these substeps:

        1. If the header cell cell is equivalent to a wide cell, then let width be xwidth-headerx. Otherwise, let width be the width of the header cell cell.

        2. If width is equal to headerwidth, then jump to the step below labeled end.

      11. Add all the data cells that cover slots with coordinates (slotx, sloty), where headerx ≤ slotx < headerx+headerwidth and sloty = y, to the data cells list.

      12. Increase y by 1.

      13. Jump up to the step above labeled vertical.

      14. End: Coalesce all the duplicate entries in the data cells list, so that each data cell is only present once, in tree order.

    6. Assign the header cell to all the data cells in the data cells list that correspond to td elements that do not have a headers attribute specified.

  2. For each data cell in the table, in tree order, run these substeps:

    1. If the data cell corresponds to a td element that does not have a headers attribute specified, then skip these substeps and move on to the next data cell (if any).

    2. Otherwise, take the value of the headers attribute and split it on spaces, letting id list be the list of tokens obtained.

    3. For each token in the id list, run the following steps:

      1. Let id be the token.

      2. If there is a header cell in the table whose corresponding th element has an ID that is equal to the value of id, then assign that header cell to the data cell.
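The row and column substeps of step 1.5 can be illustrated with a short JavaScript sketch. It is non-normative and rests on an assumed table model that this specification does not define: `grid[y][x]` holds the cell object covering the slot (x, y), or null for an empty slot, and every cell has a `kind` of "data" or "header". The names `dataCellsFor`, `grid`, and the cell object shape are hypothetical. The result is in row-major slot order, which only approximates tree order.

```javascript
// Hypothetical model: grid[y][x] is the cell covering slot (x, y), or null.
// A cell is { kind: "data" | "header", name: string }.
// header is { x, y, width, height }: the anchor slot and size of the header cell.
function dataCellsFor(grid, header, scope) {
  const yheight = grid.length;
  const xwidth = yheight > 0 ? grid[0].length : 0;
  let x0, x1, y0, y1;
  if (scope === "row") {
    // headerx+headerwidth <= slotx < xwidth, headery <= sloty < headery+headerheight
    x0 = header.x + header.width;
    x1 = xwidth;
    y0 = header.y;
    y1 = header.y + header.height;
  } else if (scope === "column") {
    // headerx <= slotx < headerx+headerwidth, headery+headerheight <= sloty < yheight
    x0 = header.x;
    x1 = header.x + header.width;
    y0 = header.y + header.height;
    y1 = yheight;
  } else {
    throw new Error("only the row and column states are sketched here");
  }
  // A Set keeps each data cell once, even if it covers several slots.
  const found = new Set();
  for (let y = y0; y < y1; y++) {
    for (let x = x0; x < x1; x++) {
      const cell = grid[y][x];
      if (cell !== null && cell.kind === "data") found.add(cell);
    }
  }
  return [...found];
}
```

Under these assumptions, a header cell anchored at (1, 0) with width 1 and height 1 in the column state collects the data cells covering slots (1, 1), (1, 2), and so on down to the last row.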

A header cell anchored at (headerx, headery) with width headerwidth and height headerheight is said to be equivalent to a wide cell if all the slots with coordinates (slotx, sloty), where headerx+headerwidth ≤ slotx < xwidth and headery ≤ sloty < headery+headerheight, are all either empty or covered by empty data cells.

A data cell is said to be an empty data cell if it contains no elements and its text content, if any, consists only of characters in the Unicode character class Zs. [UNICODE]

User agents may remove empty data cells when analyzing data in a table.
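The definition of an empty data cell can be checked with a Unicode property escape. The following non-normative JavaScript sketch assumes a hypothetical cell summary object with an `elementCount` (the number of elements the cell contains) and a `textContent` string; neither name comes from this specification.

```javascript
// Hypothetical cell shape: { elementCount: number, textContent: string }.
function isEmptyDataCell(cell) {
  // \p{Zs} matches characters in the Unicode general category
  // "Space_Separator" (for example U+0020 and U+00A0).
  return cell.elementCount === 0 && /^\p{Zs}*$/u.test(cell.textContent);
}
```

Note that tab and line feed characters are not in class Zs, so under this definition a cell containing only a newline is not an empty data cell.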

4.9 Forms

This section will contain definitions of the form element and so forth.

This section will be a rewrite of the HTML4 Forms and Web Forms 2.0 specifications, with hopefully no normative changes.

4.9.1 The form element

4.9.2 The fieldset element

4.9.3 The input element

4.9.4 The button element

4.9.5 The label element

4.9.6 The select element

4.9.7 The datalist element

4.9.8 The optgroup element

4.9.9 The option element

4.9.10 The textarea element

4.9.11 The output element

4.9.12 Processing model

See WF2 for now

4.9.12.1. Form submission

See WF2 for now

If a form is in a browsing context whose sandboxed forms browsing context flag is set, it must not be submitted.

4.10 Scripting

Scripts allow authors to add interactivity to their documents.

Authors are encouraged to use declarative alternatives to scripting where possible, as declarative mechanisms are often more maintainable, and many users disable scripting.

For example, instead of using script to show or hide a section to show more details, the details element could be used.

Authors are also encouraged to make their applications degrade gracefully in the absence of scripting support.

For example, if an author provides a link in a table header to dynamically resort the table, the link could also be made to function without scripts by requesting the sorted table from the server.

4.10.1 The script element

Categories
Metadata content.
Phrasing content.
Contexts in which this element may be used:
Where metadata content is expected.
Where phrasing content is expected.
Content model:
If there is no src attribute, depends on the value of the type attribute.
If there is a src attribute, the element must be empty.
Element-specific attributes:
src
async
defer
type
charset
DOM interface:
interface HTMLScriptElement : HTMLElement {
           attribute DOMString src;
           attribute boolean async;
           attribute boolean defer;
           attribute DOMString type;
           attribute DOMString charset;
           attribute DOMString text;
};

The script element allows authors to include dynamic script and script data in their documents.

When used to include dynamic scripts, the scripts may either be embedded inline or may be imported from an external file using the src attribute. If the language is not that described by "text/javascript", then the type of the script's language must be given using the type attribute.

When used to include script data, the script data must be embedded inline, the format of the data must be given using the type attribute, and the src attribute must not be specified.

The type attribute gives the language of the script or format of the script data. If the attribute is present, its value must be a valid MIME type, optionally with parameters. The charset parameter must not be specified. (The default, which is used if the attribute is absent, is "text/javascript".) [RFC2046]

The src attribute, if specified, gives the address of the external script resource to use. The value of the attribute must be a valid URL identifying a script resource of the type given by the type attribute, if the attribute is present, or of the type "text/javascript", if the attribute is absent.

The charset attribute gives the character encoding of the external script resource. The attribute must not be specified if the src attribute is not present. If the attribute is set, its value must be a valid character encoding name, and must be the preferred name for that encoding. [IANACHARSET]

The encoding specified must be the encoding used by the script resource. If the charset attribute is omitted, the character encoding of the document will be used. If the script resource uses a different encoding than the document, then the attribute must be specified.

The async and defer attributes are boolean attributes that indicate how the script should be executed.

There are three possible modes that can be selected using these attributes. If the async attribute is present, then the script will be executed asynchronously, as soon as it is available. If the async attribute is not present but the defer attribute is present, then the script is executed when the page has finished parsing. If neither attribute is present, then the script is downloaded and executed immediately, before the user agent continues parsing the page. The exact processing details for these attributes are described below.

The defer attribute may be specified even if the async attribute is specified, to cause legacy Web browsers that only support defer (and not async) to fall back to the defer behavior instead of the synchronous blocking behavior that is the default.
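The way the two boolean attributes select a mode can be summarized in a few lines. This JavaScript sketch is non-normative; the function name `executionMode` and the mode labels are hypothetical, and the input is a plain object standing in for the presence of the content attributes.

```javascript
// attrs: { async?: boolean, defer?: boolean }, true meaning the attribute is present.
function executionMode(attrs) {
  if (attrs.async) return "async";   // async wins, whether or not defer is also present
  if (attrs.defer) return "defer";   // executed when the page has finished parsing
  return "blocking";                 // downloaded and executed before parsing continues
}
```

With both attributes present the sketch returns "async", which mirrors the fallback case described above: a user agent that understands async ignores defer, while a legacy browser that only knows defer would behave as in the "defer" case.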

Changing the src, type, charset, async, and defer attributes dynamically has no direct effect; these attributes are only used at specific times described below (namely, when the element is inserted into the document).

script elements have four associated pieces of metadata. The first is a flag indicating whether or not the script block has been "already executed". Initially, script elements must have this flag unset (script blocks, when created, are not "already executed"). When a script element is cloned, the "already executed" flag, if set, must be propagated to the clone when it is created. The second is a flag indicating whether the element was "parser-inserted". This flag is set by the HTML parser and is used to handle document.write() calls. The third and fourth pieces of metadata are the script's type and the script's character encoding. They are determined when the script is run, based on the attributes on the element at that time.

Running a script: When a script block is inserted into a document, the user agent must act as follows:

  1. If the script element has a type attribute and its value is the empty string, or if the script element has no type attribute but it has a language attribute and that attribute's value is the empty string, or if the script element has neither a type attribute nor a language attribute, let the script's type for this script element be "text/javascript".

    Otherwise, if the script element has a type attribute, let the script's type for this script element be the value of that attribute.

    Otherwise, the element has a language attribute; let the script's type for this script element be the concatenation of the string "text/" followed by the value of the language attribute.

  2. If the script element has a charset attribute, then let the script's character encoding for this script element be the encoding given by the charset attribute.

    Otherwise, let the script's character encoding for this script element be the same as the encoding of the document itself.

  3. If the script element is without script, or if the script element was created by an XML parser that itself was created as part of the processing of the innerHTML attribute's setter, or if the user agent does not support the scripting language given by the script's type for this script element, or if the script element has its "already executed" flag set, then the user agent must abort these steps at this point. The script is not executed.

  4. The user agent must set the element's "already executed" flag.

  5. If the element has a src attribute, then a load for the specified content must be started.

    Later, once the load has completed, the user agent will have to complete the steps described below.

    For performance reasons, user agents may start loading the script as soon as the attribute is set, instead, in the hope that the element will be inserted into the document. Either way, once the element is inserted into the document, the load must have started. If the UA performs such prefetching, but the element is never inserted in the document, or the src attribute is dynamically changed, then the user agent will not execute the script, and the load will have been effectively wasted.

  6. Then, the first of the following options that describes the situation must be followed:

    If the document is still being parsed, and the element has a defer attribute, and the element does not have an async attribute
    The element must be added to the end of the list of scripts that will execute when the document has finished parsing. The user agent must begin the next set of steps when the script is ready. This isn't compatible with IE for inline deferred scripts, but then what IE does is pretty hard to pin down exactly. Do we want to keep this like it is? Be more compatible?
    If the element has an async attribute and a src attribute
    The element must be added to the end of the list of scripts that will execute asynchronously. The user agent must jump to the next set of steps once the script is ready.
    If the element has an async attribute but no src attribute, and the list of scripts that will execute asynchronously is not empty
    The element must be added to the end of the list of scripts that will execute asynchronously.
    If the element has a src attribute and has been flagged as "parser-inserted"
    The element is the pending external script. (There can only be one such script at a time.)
    If the element has a src attribute
    The element must be added to the end of the list of scripts that will execute as soon as possible. The user agent must jump to the next set of steps when the script is ready.
    Otherwise
    The user agent must immediately execute the script, even if other scripts are already executing.
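The "first option that matches" structure of step 6 can be sketched as an ordered chain of tests. This is a non-normative JavaScript illustration: the object shapes, the function name `classifyScript`, and the returned labels are all hypothetical, and the sketch only classifies the element without performing any loading or execution.

```javascript
// el:    { async: boolean, defer: boolean, src: string|null, parserInserted: boolean }
// state: { parsing: boolean, asyncListLength: number }
function classifyScript(el, state) {
  const hasSrc = el.src !== null && el.src !== undefined;
  // The order of these tests matters: the first matching case is used.
  if (state.parsing && el.defer && !el.async) return "when-finished-parsing";
  if (el.async && hasSrc) return "asynchronously";
  if (el.async && !hasSrc && state.asyncListLength > 0) return "asynchronously";
  if (hasSrc && el.parserInserted) return "pending-external-script";
  if (hasSrc) return "as-soon-as-possible";
  return "execute-immediately";
}
```

One consequence visible in the sketch is that an inline script with an async attribute is executed immediately when the list of scripts that will execute asynchronously is empty.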

When a script completes loading: If a script whose element was added to one of the lists mentioned above completes loading while the document is still being parsed, then the parser handles it. Otherwise, when a script completes loading, the UA must run the following steps as soon as any other scripts that may be executing have finished executing:

If the script's element was added to the list of scripts that will execute when the document has finished parsing:
  1. If the script's element is not the first element in the list, then do nothing yet. Stop going through these steps.

  2. Otherwise, execute the script (that is, the script associated with the first element in the list).

  3. Remove the script's element from the list (i.e. shift out the first entry in the list).

  4. If there are any more entries in the list, and if the script associated with the element that is now the first in the list is already loaded, then jump back to step two to execute it.

If the script's element was added to the list of scripts that will execute asynchronously:
  1. If the script's element is not the first element in the list, then do nothing yet. Stop going through these steps.

  2. Execute the script (the script associated with the first element in the list).

  3. Remove the script's element from the list (i.e. shift out the first entry in the list).

  4. If there are any more scripts in the list, and the element now at the head of the list had no src attribute when it was added to the list, or had one, but its associated script has finished loading, then jump back to step two to execute the script associated with this element.

If the script's element was added to the list of scripts that will execute as soon as possible:
  1. Execute the script.

  2. Remove the script's element from the list.

If the script is the pending external script:

The script will be handled when the parser resumes.
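The ordering guarantee for the list of scripts that will execute when the document has finished parsing can be sketched as a queue whose head must be loaded before anything runs. This JavaScript illustration is non-normative; the entry shape, the function name `onDeferredScriptLoaded`, and the `execute` callback are hypothetical, and the sketch assumes it is only called once no other script is executing.

```javascript
// list:    array of entries { name: string, loaded: boolean }, in insertion order.
// entry:   the entry whose script has just completed loading.
// execute: callback invoked with each entry whose script should now run.
function onDeferredScriptLoaded(list, entry, execute) {
  entry.loaded = true;
  // Step 1: if this is not the first entry in the list, do nothing yet.
  if (list[0] !== entry) return;
  // Steps 2 to 4: execute the head, shift it out, and repeat for as long
  // as the new head of the list has already finished loading.
  while (list.length > 0 && list[0].loaded) {
    execute(list[0]);
    list.shift();
  }
}
```

If three scripts a, b, and c are queued in that order and finish loading in the order b, c, a, nothing runs until a has loaded, at which point all three execute in list order.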

The download of an external script must delay the load event.

Executing a script block: When the steps above require that the script be executed, the user agent must act as follows:

If the load resulted in an error (for example a DNS error, or an HTTP 404 error)

Executing the script must just consist of firing an error event at the element.

If the load was successful
  1. If the script element's Document is the active document in its browsing context, the user agent must execute the script:

    If the script is from an external file

    That file must be used as the file to execute.

    The file must be interpreted using the character encoding given by the script's character encoding, regardless of any metadata given by the file's Content-Type metadata.

    This means that a UTF-16 document will always assume external scripts are UTF-16...? This applies, e.g., to documents created using createDocument()... It also means changing document.charSet will affect the character encoding used to interpret scripts, is that really what happens?

    If the script is inline

    For scripting languages that consist of pure text, user agents must use the value of the DOM text attribute (defined below) as the script to execute, and for XML-based scripting languages, user agents must use all the child nodes of the script element as the script to execute.

    In any case, the user agent must execute the script according to the semantics defined by the language associated with the script's type (see the scripting languages section below).

    The script execution context of the script must be the Window object of that browsing context.

    The script document context of the script must be the Document object that owns the script element.

    The element's attributes' values might have changed between when the element was inserted into the document and when the script has finished loading, as may its other attributes; similarly, the element itself might have been taken back out of the DOM, or had other changes made. These changes do not in any way affect the above steps; only the values of the attributes at the time the script element is first inserted into the document matter.

  2. Then, the user agent must fire a load event at the script element.

The DOM attributes src, type, charset, async, and defer, each must reflect the respective content attributes of the same name.

The DOM attribute text must return a concatenation of the contents of all the text nodes that are direct children of the script element (ignoring any other nodes such as comments or elements), in tree order. On setting, it must act the same way as the textContent DOM attribute.
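The getter behavior of the text attribute can be sketched over a simplified list of child nodes. This JavaScript example is non-normative: it uses plain objects with a `nodeType` and `data` property in place of real DOM nodes, and the function name `scriptText` is hypothetical. The node type values 1, 3, and 8 are the DOM constants for element, text, and comment nodes.

```javascript
// childNodes: array of { nodeType: number, data?: string } in tree order.
// Only direct text node children (nodeType 3) contribute to the result.
function scriptText(childNodes) {
  return childNodes
    .filter((node) => node.nodeType === 3)
    .map((node) => node.data)
    .join("");
}
```

Comments and elements between the text nodes are skipped, so their contents never appear in the returned string.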

4.10.1.1. Scripting languages

A user agent is said to support the scripting language if the script's type matches the MIME type of a scripting language that the user agent implements.

The following lists some MIME types and the languages to which they refer:

text/javascript
ECMAScript. [ECMA262]
text/javascript;e4x=1
ECMAScript with ECMAScript for XML. [ECMA357]

User agents may support other MIME types and other languages.

When examining types to determine if they support the language, user agents must not ignore unknown MIME parameters — types with unknown parameters must be assumed to be unsupported.
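Determining the script's type (step 1 of the algorithm for running a script) and then testing it for support can be sketched together. This JavaScript example is non-normative. The function names, the `SUPPORTED_TYPES` set, and the use of plain objects for attributes are hypothetical; comparing after trimming and lowercasing is an assumption of the sketch, and matching against a fixed set of exact strings is a simplification that happens to treat every type with an unlisted parameter as unsupported.

```javascript
// attrs: content attributes as a plain object; an absent attribute is undefined.
function scriptTypeFor(attrs) {
  const { type, language } = attrs;
  if (type === "" ||
      (type === undefined && language === "") ||
      (type === undefined && language === undefined)) {
    return "text/javascript";
  }
  if (type !== undefined) return type;
  // Otherwise the element has a language attribute.
  return "text/" + language;
}

// Types this hypothetical user agent implements. A user agent that also
// implemented E4X might add "text/javascript;e4x=1" here.
const SUPPORTED_TYPES = new Set(["text/javascript"]);

function supportsScriptType(scriptType) {
  // Assumption: compare case-insensitively and ignore surrounding whitespace.
  return SUPPORTED_TYPES.has(scriptType.trim().toLowerCase());
}
```

In this sketch a type such as "text/javascript;version=9" is rejected, because the unknown parameter is not ignored.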

4.10.2 The noscript element

Categories
Metadata content.
Phrasing content.
Contexts in which this element may be used:
In a head element of an HTML document, if there are no ancestor noscript elements.
Where phrasing content is expected in HTML documents, if there are no ancestor noscript elements.
Content model:
Without script, in a head element: in any order, zero or more link elements, zero or more style elements, and zero or more meta elements.
Without script, not in a head element: transparent, but there must be no noscript element descendants.
With script: text that conforms to the requirements given in the prose.
Element-specific attributes:
None.
DOM interface:
Uses HTMLElement.

The noscript element does not represent anything. It is used to present different markup to user agents that support scripting and those that don't support scripting, by affecting how the document is parsed.

The noscript element must not be used in XML documents.

The noscript element is only effective in the HTML serialization; it has no effect in the XML serialization.

When used in HTML documents, the allowed content model is as follows:

In a head element, if the noscript element is without script, then the content model of a noscript element must contain only link, style, and meta elements. If the noscript element is with script, then the content model of a noscript element is text, except that invoking the HTML fragment parsing algorithm with the noscript element as the context element and the text contents as the input must result in a list of nodes that consists only of link, style, and meta elements.

Outside of head elements, if the noscript element is without script, then the content model of a noscript element is transparent, with the additional restriction that a noscript element must not have a noscript element as an ancestor (that is, noscript can't be nested).

Outside of head elements, if the noscript element is with script, then the content model of a noscript element is text, except that the text must be such that running the following algorithm results in a conforming document with no noscript elements and no script elements, and such that no step in the algorithm causes an HTML parser to flag a parse error:

  1. Remove every script element from the document.
  2. Make a list of every noscript element in the document. For every noscript element in that list, perform the following steps:
    1. Let the parent element be the parent element of the noscript element.
    2. Take all the children of the parent element that come before the noscript element, and call these elements the before children.
    3. Take all the children of the parent element that come after the noscript element, and call these elements the after children.
    4. Let s be the concatenation of all the text node children of the noscript element.
    5. Set the innerHTML attribute of the parent element to the value of s. (This, as a side-effect, causes the noscript element to be removed from the document.)
    6. Insert the before children at the start of the parent element, preserving their original relative order.
    7. Insert the after children at the end of the parent element, preserving their original relative order.

The noscript element has no other requirements. In particular, children of the noscript element are not exempt from form submission, scripting, and so forth, even when the element is with script.

All these contortions are required because, for historical reasons, the noscript element is handled differently by the HTML parser based on whether scripting was enabled or not when the parser was invoked. The element is not allowed in XML, because in XML the parser is not affected by such state, and thus the element would not have the desired effect.

The noscript element interacts poorly with the designMode feature. Authors are encouraged to not use noscript elements on pages that will have designMode enabled.

4.10.3 The event-source element

Categories
Metadata content.
Phrasing content.
Contexts in which this element may be used:
Where metadata content is expected.
Where phrasing content is expected.
Content model:
Empty.
Element-specific attributes:
src
DOM interface:
interface HTMLEventSourceElement : HTMLElement {
           attribute DOMString src;
};

The event-source element represents a target for events generated by a remote server.

The src attribute, if specified, must give a valid URL identifying a resource that uses the text/event-stream format.

When an event-source element with a src attribute specified is inserted into the document, and when an event-source element that is already in the document has a src attribute added, the user agent must run the add declared event source algorithm.

While an event-source element is in a document, if its src attribute is mutated, the user agent must run the remove declared event source algorithm followed by the add declared event source algorithm.

When an event-source element with a src attribute specified is removed from a document, and when an event-source element that is in a document with a src attribute specified has its src attribute removed, the user agent must run the remove declared event source algorithm.

When it is created, an event-source element must have its current declared event source set to "undefined".

The add declared event source algorithm is as follows:

  1. Resolve the URL specified by the event-source element's src attribute.
  2. If that fails, then set the element's current declared event source to "undefined" and abort these steps.
  3. Otherwise, act as if the addEventSource() method on the event-source element had been invoked with the resulting absolute URL.
  4. Let the element's current declared event source be that absolute URL.

The remove declared event source algorithm is as follows:

  1. If the element's current declared event source is "undefined", abort these steps.
  2. Otherwise, act as if the removeEventSource() method on the event-source element had been invoked with the element's current declared event source.
  3. Let the element's current declared event source be "undefined".
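The two algorithms above amount to a small state machine around the current declared event source. The following JavaScript sketch is non-normative: the class name `DeclaredEventSource`, its `log` array (a stand-in for the calls to addEventSource() and removeEventSource()), and the `baseURL` argument are hypothetical, and URL resolution is approximated with the `URL` constructor, which throws when resolution fails.

```javascript
class DeclaredEventSource {
  constructor(baseURL) {
    this.baseURL = baseURL;  // hypothetical: base used to resolve src
    this.current = undefined; // the current declared event source
    this.log = [];            // records the method calls that would be made
  }

  // The add declared event source algorithm.
  add(src) {
    let absolute;
    try {
      absolute = new URL(src, this.baseURL).href; // step 1: resolve
    } catch (e) {
      this.current = undefined;                   // step 2: resolution failed
      return;
    }
    this.log.push(["addEventSource", absolute]);  // step 3
    this.current = absolute;                      // step 4
  }

  // The remove declared event source algorithm.
  remove() {
    if (this.current === undefined) return;               // step 1
    this.log.push(["removeEventSource", this.current]);   // step 2
    this.current = undefined;                             // step 3
  }

  // Mutating src while in a document: remove, then add.
  mutate(src) {
    this.remove();
    this.add(src);
  }
}
```

Because the remove algorithm uses the stored absolute URL rather than the attribute's new value, a mutation always removes exactly the source that was previously added.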

There can be more than one event-source element per document, but authors should take care to avoid opening multiple connections to the same server as HTTP recommends a limit to the number of simultaneous connections that a user agent can open per server.

The src DOM attribute must reflect the content attribute of the same name.