1.0 The Example
Immediately following the Table header is a StringPool chunk.
0000000 02 00 0c 00 64 04 00 00 01 00 00 00 01 00 1c 00
0000010 d0 00 00 00 06 00 00 00 00 00 00 00 00 01 00 00
0000020 34 00 00 00 00 00 00 00 00 00 00 00 1d 00 00 00
0000030 3a 00 00 00 57 00 00 00 6d 00 00 00 8f 00 00 00
0000040 1a 1a 72 65 73 2f 64 72 61 77 61 62 6c 65 2d 6c
0000050 64 70 69 2f 69 63 6f 6e 2e 70 6e 67 00 1a 1a 72
0000060 65 73 2f 64 72 61 77 61 62 6c 65 2d 6d 64 70 69
0000070 2f 69 63 6f 6e 2e 70 6e 67 00 1a 1a 72 65 73 2f
0000080 64 72 61 77 61 62 6c 65 2d 68 64 70 69 2f 69 63
0000090 6f 6e 2e 70 6e 67 00 13 13 72 65 73 2f 6c 61 79
00000a0 6f 75 74 2f 6d 61 69 6e 2e 78 6d 6c 00 1f 1f 48
00000b0 65 6c 6c 6f 20 57 6f 72 6c 64 2c 20 50 65 6e 64
00000c0 72 61 67 6f 6e 41 63 74 69 76 69 74 79 21 00 09
00000d0 09 50 65 6e 64 72 61 67 6f 6e 00 00 00 02 1c 01
00000e0 88 03 00 00 7f 00 00 00 78 00 70 00 65 00 72 00
00000f0 2e 00 72 00 65 00 73 00 6f 00 75 00 72 00 63 00
0000100 65 00 73 00 2e 00 70 00 65 00 6e 00 64 00 72 00
0000110 61 00 67 00 6f 00 6e 00 00 00 00 00 00 00 00 00
0000120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
The StringPool chunk header occupies bytes 0x0c to 0x27 of the dump, and the StringPool chunk body bytes 0x28 to 0xdb.
2.0 The StringPool Chunk Header
The format of a StringPool chunk header is defined by the following C++ struct (see frameworks/base/include/ResourceTypes.h lines 382-410):
struct ResStringPool_header
{
    struct ResChunk_header header;

    // Number of strings in this pool (number of uint32_t indices that follow
    // in the data).
    uint32_t stringCount;

    // Number of style span arrays in the pool (number of uint32_t indices
    // follow the string indices).
    uint32_t styleCount;

    // Flags.
    enum {
        // If set, the string index is sorted by the string values (based
        // on strcmp16()).
        SORTED_FLAG = 1<<0,

        // String pool is encoded in UTF-8
        UTF8_FLAG = 1<<8
    };
    uint32_t flags;

    // Index from header of the string data.
    uint32_t stringsStart;

    // Index from header of the style data.
    uint32_t stylesStart;
};
2.1 header
The header field is a struct ResChunk_header instance.
The header.type field is always 0x0001 (RES_STRING_POOL_TYPE).
The header.headerSize field is always 0x001c.
2.2 stringCount
The stringCount field specifies the number of strings in the StringPool.
2.3 styleCount
The styleCount field specifies the number of strings which have associated style data in the body of this chunk. This field may be, and in fact usually is, zero.
2.4 flags
The flags field holds neither, one, or both of the bit-flags SORTED_FLAG and UTF8_FLAG.
2.5 stringsStart
The stringsStart field specifies the offset from the start of the StringPool chunk to the start of the string data in the body of this chunk.
2.6 stylesStart
The stylesStart field specifies the offset from the start of the StringPool chunk to the start of the string style data in the body of this chunk. This field will be zero if the styleCount field is zero.
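To make the field layout concrete, here is a minimal sketch that decodes the header from the 28 bytes at offsets 0x0c to 0x27 of the example. The names StringPoolHeader, readU16, readU32 and parseStringPoolHeader are illustrative and do not come from the Android sources. The sketch assumes the fields are stored little-endian, as they are in the example, and flattens the three ResChunk_header fields (type, header size, chunk size) into the one struct.

```cpp
#include <cstdint>

// Hypothetical flattened view of ResStringPool_header (not the Android type).
struct StringPoolHeader {
    uint16_t type;         // header.type
    uint16_t headerSize;   // header.headerSize
    uint32_t size;         // chunk size, header included
    uint32_t stringCount;
    uint32_t styleCount;
    uint32_t flags;
    uint32_t stringsStart;
    uint32_t stylesStart;
};

// Read little-endian values byte by byte, so the sketch does not depend on
// the host's endianness or alignment rules.
inline uint16_t readU16(const uint8_t* p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

inline uint32_t readU32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) |
           (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
}

// Decode the 28-byte header at the start of a StringPool chunk.
inline StringPoolHeader parseStringPoolHeader(const uint8_t* chunk) {
    StringPoolHeader h;
    h.type         = readU16(chunk + 0);
    h.headerSize   = readU16(chunk + 2);
    h.size         = readU32(chunk + 4);
    h.stringCount  = readU32(chunk + 8);
    h.styleCount   = readU32(chunk + 12);
    h.flags        = readU32(chunk + 16);
    h.stringsStart = readU32(chunk + 20);
    h.stylesStart  = readU32(chunk + 24);
    return h;
}

// Bytes 0x0c to 0x27 of the example file.
static const uint8_t kExampleHeader[28] = {
    0x01, 0x00, 0x1c, 0x00, 0xd0, 0x00, 0x00, 0x00,
    0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x01, 0x00, 0x00, 0x34, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00
};
```

For the example bytes this gives a stringCount of 6, a styleCount of 0, flags of 0x100 (UTF8_FLAG only) and a stringsStart of 0x34.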
3.0 The StringPool Chunk Body
The StringPool chunk body comprises either four sections
- a table of string indices
- a table of style indices
- the string data
- the style data
if string style data is present, or two sections
- a table of string indices
- the string data
if it is not.
3.1 The String Indices
Immediately following the StringPool chunk header are stringCount 32-bit integers. Each integer specifies the start of the data defining a string as an offset from the start of the string data section.
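The arithmetic involved is simple but easy to get wrong by one base, so here is a small sketch of it. The function names are made up for illustration, and chunkStart is assumed to be the file offset of the StringPool chunk (0x0c in the example).

```cpp
#include <cstdint>

// File offset of entry i in the string index table, which starts directly
// after the chunk header.
inline uint32_t stringIndexEntryOffset(uint32_t chunkStart,
                                       uint32_t headerSize,
                                       uint32_t i) {
    return chunkStart + headerSize + 4 * i;
}

// File offset of a string's data, given the value stored in its index entry.
// The entry is relative to the string data section, which itself starts
// stringsStart bytes after the start of the chunk.
inline uint32_t stringDataOffset(uint32_t chunkStart,
                                 uint32_t stringsStart,
                                 uint32_t indexValue) {
    return chunkStart + stringsStart + indexValue;
}
```

In the example the entry for string 1 sits at 0x0c + 0x1c + 4 = 0x2c and holds 0x1d, so the data for string 1 starts at 0x0c + 0x34 + 0x1d = 0x5d.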
3.2 The String Data
Despite the following comment in frameworks/base/include/ResourceTypes.h (lines 370-375)
    "At stringsStart are all of the UTF-16 strings concatenated together; each starts with a uint16_t of the string's length and each ends with a 0x0000 terminator. If a string is > 32767 characters, the high bit of the length is set meaning to take those 15 bits as a high word and it will be followed by another uint16_t containing the low word."
the string data can be in two different formats.
If the UTF8_FLAG is not set in the flags field then the string data format is as described in the comment; otherwise it is in a UTF-8 format.
3.2.1 The 16-bit Format
The 16-bit format is as described in the comment.
The data for each string comprises
- the length of the string in characters
- the 16-bit characters
- a trailing 16-bit zero
The length is encoded as either one or two 16-bit integers as per the comment.
The length does not include the trailing zero.
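The length encoding described in the comment can be sketched as follows. DecodedLength and decodeLength16 are illustrative names, and the sketch assumes the 16-bit units have already been read from the file as little-endian values.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: the decoded length and how many 16-bit units
// the length itself occupied.
struct DecodedLength {
    uint32_t length;
    size_t unitsConsumed;
};

// Decode a 16-bit-format string length starting at p.
inline DecodedLength decodeLength16(const uint16_t* p) {
    if ((p[0] & 0x8000) == 0) {
        // High bit clear: the length fits in one unit.
        return {p[0], 1};
    }
    // High bit set: the remaining 15 bits are the high word and the next
    // unit is the low word.
    uint32_t high = static_cast<uint32_t>(p[0] & 0x7fff);
    uint32_t low = static_cast<uint32_t>(p[1]);
    return {(high << 16) | low, 2};
}
```

A length of 5 is therefore stored as the single unit 0x0005, while a length of 65536 would be stored as the two units 0x8001 0x0000.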
3.2.2 UTF-8 Format
The UTF-8 data format can be determined by examining the code used to write it (see frameworks/base/tools/aapt/StringPool.cpp lines 233-277) or the code used to read it (see frameworks/base/include/ResourceTypes.cpp lines 545-564).
The data for each string comprises
- the length of the string in characters
- the length of the UTF-8 encoding of the string in bytes
- the UTF-8 encoded string
- a trailing 8-bit zero
The lengths are encoded in the same way as for the 16-bit format but using 8-bit rather than 16-bit integers: if the high bit of the first byte is set, its remaining 7 bits are the high byte of the length and the following byte is the low byte.
The lengths do not include the trailing zero.
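As a sketch of how one UTF-8 format entry might be read, consider the following. The names decodeLength8 and readUtf8Entry are made up for illustration and this is not the Android implementation. It assumes the entry is well formed and does no bounds checking.

```cpp
#include <cstdint>
#include <string>

// Decode one 8-bit-format length and advance p past it.
inline uint32_t decodeLength8(const uint8_t*& p) {
    uint32_t len = *p++;
    if (len & 0x80) {
        // High bit set: low 7 bits are the high byte, next byte is the low byte.
        len = ((len & 0x7f) << 8) | *p++;
    }
    return len;
}

// Read one string entry: character count, byte count, then the UTF-8 bytes.
// The character count is returned through charCount if it is not null.
inline std::string readUtf8Entry(const uint8_t* p, uint32_t* charCount) {
    uint32_t chars = decodeLength8(p);
    uint32_t bytes = decodeLength8(p);
    if (charCount != nullptr) {
        *charCount = chars;
    }
    return std::string(reinterpret_cast<const char*>(p), bytes);
}
```

Applied to the bytes 09 09 50 65 6e 64 72 61 67 6f 6e 00 from the example, this returns "Pendragon" with a character count of 9.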
3.2.3 Padding
Irrespective of the format the string data section is always padded with zero bytes so that it ends on a 32-bit boundary. This ensures that 32-bit integer fields that follow in this chunk or in following chunks are correctly aligned.
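Rounding a byte count up to the next 32-bit boundary is usually written as below. The function name padTo32Bits is illustrative.

```cpp
#include <cstdint>

// Round byteCount up to the next multiple of four.
inline uint32_t padTo32Bits(uint32_t byteCount) {
    return (byteCount + 3u) & ~3u;
}
```

In the example the string data runs from 0x40 to 0xda inclusive, which is 0x9b bytes. Padded to 0x9c bytes, it ends at 0xdc, which is where the next chunk starts.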
An Aside
If you create an Android project using the Eclipse ADT plugin the string data in the StringPool chunks in the Resource Table will be in the UTF-8 format.
If you create an Android project using the android command line tool and then build it from the command line using ant the string data in the StringPool chunks in the Resource Table will be in the 16-bit format.
Strange but true.
3.3 The Style Indices
If present, the style indices section immediately follows the string indices section. It comprises styleCount 32-bit integers. Each integer specifies the start of the style data for a string as an offset from the start of the style data section.
The string and style indices are paired. The style data specified by the entry at index i in the style indices is for the string specified by the entry at index i in the string indices.
3.4 The Style Data
When present, the style data section comprises styleCount pieces of individual string style data.
The style data for an individual string comprises a sequence of instances of the C++ struct ResStringPool_span, which is defined as follows (see frameworks/base/include/ResourceTypes.h lines 416-429):
struct ResStringPool_span
{
    enum {
        END = 0xFFFFFFFF
    };

    // This is the name of the span -- that is, the name of the XML
    // tag that defined it. The special value END (0xFFFFFFFF) indicates
    // the end of an array of spans.
    ResStringPool_ref name;

    // The range of characters in the string that this span applies to.
    uint32_t firstChar, lastChar;
};
The sequence of ResStringPool_span instances for an individual string is terminated by a 32-bit integer with the value END (0xFFFFFFFF).
The style data section itself is terminated by two further 32-bit integers, each with the value END (0xFFFFFFFF).
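Reading the spans for one string amounts to consuming three 32-bit values at a time until END is seen. The following is a minimal sketch. Span, kSpanEnd and readSpans are illustrative names, and it assumes that a ResStringPool_ref occupies a single 32-bit value holding a string index and that the values have already been read from the file as little-endian integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-data view of a ResStringPool_span.
struct Span {
    uint32_t name;       // index of the tag name in the string pool
    uint32_t firstChar;
    uint32_t lastChar;
};

static const uint32_t kSpanEnd = 0xFFFFFFFFu;

// Collect the spans for one string. words points at the start of that
// string's style data and count is the number of 32-bit values available.
inline std::vector<Span> readSpans(const uint32_t* words, size_t count) {
    std::vector<Span> spans;
    size_t i = 0;
    while (i < count && words[i] != kSpanEnd) {
        if (i + 2 >= count) {
            break;  // truncated span: stop rather than read past the end
        }
        spans.push_back(Span{words[i], words[i + 1], words[i + 2]});
        i += 3;
    }
    return spans;
}
```

An unstyled string has style data consisting of END alone, for which this returns an empty vector.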
4.0 The Example Annotated
This is the annotated version of the StringPool chunk immediately following the Table chunk header from the example.
...
0000000c 01 00 // type [STRING_POOL]
0000000e 1c 00 // header size
00000010 d0 00 00 00 // chunk size
--------------------
00000014 06 00 00 00 // stringCount
00000018 00 00 00 00 // styleCount
0000001c 00 01 00 00 // flags
00000020 34 00 00 00 // stringsStart (address 00000040)
00000024 00 00 00 00 // stylesStart (zero: no style data)
++++++++++++++++++++
00000028 00 00 00 00 // string[0]
0000002c 1d 00 00 00 // string[1]
00000030 3a 00 00 00 // string[2]
00000034 57 00 00 00 // string[3]
00000038 6d 00 00 00 // string[4]
0000003c 8f 00 00 00 // string[5]
00000040 1a 1a 72 65 // [0] "res/drawable-ldpi/icon.png"
00000044 73 2f 64 72
00000048 61 77 61 62
0000004c 6c 65 2d 6c
00000050 64 70 69 2f
00000054 69 63 6f 6e
00000058 2e 70 6e 67
0000005c 00 1a 1a 72 // [1] "res/drawable-mdpi/icon.png"
00000060 65 73 2f 64
00000064 72 61 77 61
00000068 62 6c 65 2d
0000006c 6d 64 70 69
00000070 2f 69 63 6f
00000074 6e 2e 70 6e
00000078 67 00 1a 1a // [2] "res/drawable-hdpi/icon.png"
0000007c 72 65 73 2f
00000080 64 72 61 77
00000084 61 62 6c 65
00000088 2d 68 64 70
0000008c 69 2f 69 63
00000090 6f 6e 2e 70
00000094 6e 67 00 13 // [3] "res/layout/main.xml"
00000098 13 72 65 73
0000009c 2f 6c 61 79
000000a0 6f 75 74 2f
000000a4 6d 61 69 6e
000000a8 2e 78 6d 6c
000000ac 00 1f 1f 48 // [4] "Hello World, PendragonActivity!"
000000b0 65 6c 6c 6f
000000b4 20 57 6f 72
000000b8 6c 64 2c 20
000000bc 50 65 6e 64
000000c0 72 61 67 6f
000000c4 6e 41 63 74
000000c8 69 76 69 74
000000cc 79 21 00 09 // [5] "Pendragon"
000000d0 09 50 65 6e
000000d4 64 72 61 67
000000d8 6f 6e 00 00
==================== [End of STRING_POOL]
...
5.0 Styled Strings: An Example
For the Table’s StringPool chunk to contain any style data there must be at least one Resource which is a styled string.
If we create a vanilla Android project using ADT in Eclipse and then modify the generated strings.xml file to look like this
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <string name="hello"><b>Hello</b> <u>World</u>, <i>TintagelActivity!</i></string>
    <string name="app_name">Tintagel</string>
</resources>
then the resulting Resource Table’s StringPool chunk looks like this
...
0000000c 01 00 // type [STRING_POOL]
0000000e 1c 00 // header size
00000010 3c 01 00 00 // chunk size
--------------------
00000014 09 00 00 00 // stringCount
00000018 05 00 00 00 // styleCount
0000001c 00 01 00 00 // flags
00000020 54 00 00 00 // stringsStart (address 00000060)
00000024 fc 00 00 00 // stylesStart (address 00000108)
++++++++++++++++++++
00000028 00 00 00 00 // string[0]
0000002c 1d 00 00 00 // string[1]
00000030 3a 00 00 00 // string[2]
00000034 57 00 00 00 // string[3]
00000038 6d 00 00 00 // string[4]
0000003c 8e 00 00 00 // string[5]
00000040 99 00 00 00 // string[6]
00000044 9d 00 00 00 // string[7]
00000048 a1 00 00 00 // string[8]
0000004c 00 00 00 00 // style[0]
00000050 04 00 00 00 // style[1]
00000054 08 00 00 00 // style[2]
00000058 0c 00 00 00 // style[3]
0000005c 10 00 00 00 // style[4]
00000060 1a 1a 72 65 // [0] "res/drawable-ldpi/icon.png"
00000064 73 2f 64 72
00000068 61 77 61 62
0000006c 6c 65 2d 6c
00000070 64 70 69 2f
00000074 69 63 6f 6e
00000078 2e 70 6e 67
0000007c 00 1a 1a 72 // [1] "res/drawable-mdpi/icon.png"
00000080 65 73 2f 64
00000084 72 61 77 61
00000088 62 6c 65 2d
0000008c 6d 64 70 69
00000090 2f 69 63 6f
00000094 6e 2e 70 6e
00000098 67 00 1a 1a // [2] "res/drawable-hdpi/icon.png"
0000009c 72 65 73 2f
000000a0 64 72 61 77
000000a4 61 62 6c 65
000000a8 2d 68 64 70
000000ac 69 2f 69 63
000000b0 6f 6e 2e 70
000000b4 6e 67 00 13 // [3] "res/layout/main.xml"
000000b8 13 72 65 73
000000bc 2f 6c 61 79
000000c0 6f 75 74 2f
000000c4 6d 61 69 6e
000000c8 2e 78 6d 6c
000000cc 00 1e 1e 48 // [4] "Hello World, TintagelActivity!"
000000d0 65 6c 6c 6f
000000d4 20 57 6f 72
000000d8 6c 64 2c 20
000000dc 54 69 6e 74
000000e0 61 67 65 6c
000000e4 41 63 74 69
000000e8 76 69 74 79
000000ec 21 00 08 08 // [5] "Tintagel"
000000f0 54 69 6e 74
000000f4 61 67 65 6c
000000f8 00 01 01 62 // [6] "b"
000000fc 00 01 01 75 // [7] "u"
00000100 00 01 01 69 // [8] "i"
00000104 00 00 00 00
00000108 ff ff ff ff // [0] END
0000010c ff ff ff ff // [1] END
00000110 ff ff ff ff // [2] END
00000114 ff ff ff ff // [3] END
00000118 06 00 00 00 // [4][0] name
0000011c 00 00 00 00 // [4][0] firstChar
00000120 04 00 00 00 // [4][0] lastChar
00000124 07 00 00 00 // [4][1] name
00000128 06 00 00 00 // [4][1] firstChar
0000012c 0a 00 00 00 // [4][1] lastChar
00000130 08 00 00 00 // [4][2] name
00000134 0d 00 00 00 // [4][2] firstChar
00000138 1d 00 00 00 // [4][2] lastChar
0000013c ff ff ff ff // [4] END
00000140 ff ff ff ff //
00000144 ff ff ff ff //
==================== [End of STRING_POOL]
...
The first four strings in the StringPool are not styled and hence have no style data, but empty entries must be present for them so that the style data for the fifth string, which is styled, can be represented.
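The span values in the dump can be checked against the string they apply to. The sketch below assumes, judging by the values in the example, that lastChar is inclusive, and it assumes one byte per character, which holds for this ASCII string. The function name spannedText is illustrative.

```cpp
#include <cstdint>
#include <string>

// Return the part of text covered by a span, treating lastChar as inclusive.
// Returns an empty string if the range does not fit the text.
inline std::string spannedText(const std::string& text,
                               uint32_t firstChar,
                               uint32_t lastChar) {
    if (firstChar > lastChar || lastChar >= text.size()) {
        return std::string();
    }
    return text.substr(firstChar, lastChar - firstChar + 1);
}
```

For "Hello World, TintagelActivity!" the three spans (0 to 4, 6 to 10 and 13 to 29) select "Hello", "World" and "TintagelActivity!", matching the b, u and i elements in strings.xml.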
Another Aside
A curious thing about styled strings is that you can style them any way you like.
The example above actually works.
[Figure: screenshot of the resulting styled text]
However, changing the strings.xml file above to look like this
<?xml version="1.0" encoding="utf-8"?>
<resources>
<string name="hello"><bold>Hello</bold> <underline>World</underline>, <italic>TintagelActivity!</italic></string>
<string name="app_name">Tintagel</string>
</resources>
still results in a StringPool chunk with style data; the styling just has no effect whatsoever.
Copyright (c) 2011 By Simon Lewis. All Rights Reserved.