summaryrefslogtreecommitdiff
path: root/tiff/html/internals.html
blob: 3cc967312542bc5f5abeac621d31d87d94d1488a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
<HTML>
<HEAD>
<TITLE>
Modifying The TIFF Library
</TITLE>
</HEAD>
<BODY BGCOLOR=white> 
<FONT FACE="Arial, Helvetica, Sans">
<H1>
<IMG SRC=images/dave.gif WIDTH=107 HEIGHT=148 BORDER=2 ALIGN=left HSPACE=6>
Modifying The TIFF Library
</H1>


<P>
This chapter provides information about the internal structure of
the library, how to control the configuration when building it, and
how to add new support to the library.
The following sections are found in this chapter:

<UL>
<LI><A HREF=#Config>Library Configuration</A>
<LI><A HREF=#Portability>General Portability Comments</A>
<LI><A HREF="#Types">Types and Portability</A>
<LI><A HREF="addingtags.html">Adding New Tags</A>
<LI><A HREF=#AddingCODECS>Adding New Builtin Codecs</A>
<LI><A HREF="addingtags.html#AddingCODECTags">Adding New Codec-private Tags</A>
<LI><A HREF=#Other>Other Comments</A>
</UL>


<A NAME="Config"><P><HR WIDTH=65% ALIGN=right><H3>Library Configuration</H3></A>

Information on compiling the library is given
<A HREF=build.html>elsewhere in this documentation</A>.
This section describes the low-level mechanisms used to control
the optional parts of the library that are configured at build
time.   Control is based on
a collection of C defines that are specified either on the compiler
command line or in a configuration file such as <TT>port.h</TT>
(as generated by the <TT>configure</TT> script for UNIX systems)
or <B>tiffconf.h</B>.

<P>
Configuration defines are split into three areas:
<UL>
<LI>those that control which compression schemes are
    configured as part of the builtin codecs,
<LI>those that control support for groups of tags that
    are considered optional, and
<LI>those that control operating system or machine-specific support.
</UL>

<P>
If the define <TT>COMPRESSION_SUPPORT</TT> is <STRONG>not defined</STRONG>
then a default set of compression schemes is automatically
configured:
<UL>
<LI>CCITT Group 3 and 4 algorithms (compression codes 2, 3, 4, and 32771),
<LI>the Macintosh PackBits algorithm (compression 32773),
<LI>a 4-bit run-length encoding scheme from ThunderScan (compression 32809),
<LI>a 2-bit encoding scheme used by NeXT (compression 32766), and
<LI>two experimental schemes intended for images with high dynamic range
(compression 34676 and 34677).
</UL>

<P>

To override the default compression behaviour define
<TT>COMPRESSION_SUPPORT</TT> and then one or more additional defines
to enable configuration of the appropriate codecs (see the table
below); e.g.

<UL><PRE>
#define	COMPRESSION_SUPPORT
#define	CCITT_SUPPORT
#define	PACKBITS_SUPPORT
</PRE></UL>

Several other compression schemes are configured separately from
the default set because they depend on ancillary software
packages that are not distributed with <TT>libtiff</TT>.

<P>
Support for JPEG compression is controlled by <TT>JPEG_SUPPORT</TT>.
The JPEG codec that comes with <TT>libtiff</TT> is designed for
use with release 5 or later of the Independent JPEG Group's freely
available software distribution.
This software can be retrieved from the directory
<A HREF=ftp://ftp.uu.net/graphics/jpeg>ftp.uu.net:/graphics/jpeg/</A>.


<P>
<IMG SRC="images/info.gif" ALT="NOTE: " ALIGN=left HSPACE=8>
<EM>Enabling JPEG support automatically enables support for
the TIFF 6.0 colorimetry and YCbCr-related tags.</EM>

<P>
Experimental support for the deflate algorithm is controlled by
<TT>DEFLATE_SUPPORT</TT>.
The deflate codec that comes with <TT>libtiff</TT> is designed
for use with version 0.99 or later of the freely available
<TT>libz</TT> library written by Jean-loup Gailly and Mark Adler.
The data format used by this library is described
in the files
<A HREF=ftp://ftp.uu.net/pub/archiving/zip/doc/zlib-3.1.doc>zlib-3.1.doc</A>,
and
<A HREF=ftp://ftp.uu.net/pub/archiving/zip/doc/deflate-1.1.doc>deflate-1.1.doc</A>,
available in the directory
<A HREF=ftp://ftp.uu.net/pub/archiving/zip/doc>ftp.uu.net:/pub/archiving/zip/doc</A>.</EM>
The library can be retried from the directory
<A HREF=ftp://ftp.uu.net/pub/archiving/zip/zlib/>ftp.uu.net:/pub/archiving/zip/zlib/</A>
(or try <A HREF=ftp://quest.jpl.nasa.gov/beta/zlib/>quest.jpl.nasa.gov:/beta/zlib/</A>).

<P>
<IMG SRC="images/warning.gif" ALT="NOTE: " ALIGN=left HSPACE=8 VSPACE=6>
<EM>The deflate algorithm is experimental.  Do not expect
to exchange files using this compression scheme;
it is included only because the similar, and more common,
LZW algorithm is claimed to be governed by licensing restrictions.</EM>


<P>
By default <B>tiffconf.h</B> defines
<TT>COLORIMETRY_SUPPORT</TT>, 
<TT>YCBCR_SUPPORT</TT>,
and 
<TT>CMYK_SUPPORT</TT>.

<P>
<TABLE BORDER CELLPADDING=3>

<TR><TH ALIGN=left>Define</TH><TH ALIGN=left>Description</TH></TR>

<TR>
<TD VALIGN=top><TT>CCITT_SUPPORT</TT></TD>
<TD>CCITT Group 3 and 4 algorithms (compression codes 2, 3, 4,
    and 32771)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>PACKBITS_SUPPORT</TT></TD>
<TD>Macintosh PackBits algorithm (compression 32773)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>LZW_SUPPORT</TT></TD>
<TD>Lempel-Ziv & Welch (LZW) algorithm (compression 5)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>THUNDER_SUPPORT</TT></TD>
<TD>4-bit
run-length encoding scheme from ThunderScan (compression 32809)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>NEXT_SUPPORT</TT></TD>
<TD>2-bit encoding scheme used by NeXT (compression 32766)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>OJPEG_SUPPORT</TT></TD>
<TD>obsolete JPEG scheme defined in the 6.0 spec (compression 6)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>JPEG_SUPPORT</TT></TD>
<TD>current JPEG scheme defined in TTN2 (compression 7)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>ZIP_SUPPORT</TT></TD>
<TD>experimental Deflate scheme (compression 32946)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>PIXARLOG_SUPPORT</TT></TD>
<TD>Pixar's compression scheme for high-resolution color images (compression 32909)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>SGILOG_SUPPORT</TT></TD>
<TD>SGI's compression scheme for high-resolution color images (compression 34676 and 34677)</TD>
</TR>

<TR>
<TD VALIGN=top><TT>COLORIMETRY_SUPPORT</TT></TD>
<TD>support for the TIFF 6.0 colorimetry tags</TD>
</TR>

<TR>
<TD VALIGN=top><TT>YCBCR_SUPPORT</TT></TD>
<TD>support for the TIFF 6.0 YCbCr-related tags</TD>
</TR>

<TR>
<TD VALIGN=top><TT>CMYK_SUPPORT</TT></TD>
<TD>support for the TIFF 6.0 CMYK-related tags</TD>
</TR>

<TR>
<TD VALIGN=top><TT>ICC_SUPPORT</TT></TD>
<TD>support for the ICC Profile tag; see
<I>The ICC Profile Format Specification</I>,
Annex B.3 "Embedding ICC Profiles in TIFF Files";
available at
<A HREF=http://www.color.org>http://www.color.org</A>
</TD>
</TR>

</TABLE>


<A NAME="Portability"><P><HR WIDTH=65% ALIGN=right><H3>General Portability Comments</H3></A>

This software is developed on Silicon Graphics UNIX
systems (big-endian, MIPS CPU, 32-bit ints,
IEEE floating point). 
The <TT>configure</TT> shell script generates the appropriate
include files and make files for UNIX systems.
Makefiles exist for non-UNIX platforms that the
code runs on -- this work has mostly been done by other people.

<P>
In general, the code is guaranteed to work only on SGI machines.
In practice it is highly portable to any 32-bit or 64-bit system and much
work has been done to insure portability to 16-bit systems.
If you encounter portability problems please return fixes so
that future distributions can be improved.

<P>
The software is written to assume an ANSI C compilation environment.
If your compiler does not support ANSI function prototypes, <TT>const</TT>,
and <TT>&lt;stdarg.h&gt;</TT> then you will have to make modifications to the
software.  In the past I have tried to support compilers without <TT>const</TT>
and systems without <TT>&lt;stdarg.h&gt;</TT>, but I am
<EM>no longer interested in these
antiquated environments</EM>.  With the general availability of
the freely available GCC compiler, I
see no reason to incorporate modifications to the software for these
purposes.

<P>
An effort has been made to isolate as many of the
operating system-dependencies
as possible in two files: <B>tiffcomp.h</B> and
<B>libtiff/tif_&lt;os&gt;.c</B>.  The latter file contains
operating system-specific routines to do I/O and I/O-related operations.
The UNIX (<B>tif_unix.c</B>),
Macintosh (<B>tif_apple.c</B>),
and VMS (<B>tif_vms.c</B>)
code has had the most use;
the MS/DOS support (<B>tif_msdos.c</B>) assumes
some level of UNIX system call emulation (i.e.
<TT>open</TT>,
<TT>read</TT>,
<TT>write</TT>,
<TT>fstat</TT>,
<TT>malloc</TT>,
<TT>free</TT>).

<P>
Native CPU byte order is determined on the fly by
the library and does not need to be specified.
The <TT>HOST_FILLORDER</TT> and <TT>HOST_BIGENDIAN</TT>
definitions are not currently used, but may be employed by
codecs for optimization purposes.

<P>
The following defines control general portability:

<P>
<TABLE BORDER CELLPADDING=3 WIDTH=100%>

<TR>
<TD VALIGN=top><TT>BSDTYPES</TT></TD>
<TD>Define this if your system does NOT define the
		usual BSD typedefs: <TT>u_char</TT>,
		<TT>u_short</TT>, <TT>u_int</TT>, <TT>u_long</TT>.</TD>
</TR>

<TR>
<TD VALIGN=top><TT>HAVE_IEEEFP</TT></TD>
<TD>Define this as 0 or 1 according to the floating point
		format suported by the machine.  If your machine does
		not support IEEE floating point then you will need to
		add support to tif_machdep.c to convert between the
		native format and IEEE format.</TD>
</TR>

<TR>
<TD VALIGN=top><TT>HAVE_MMAP</TT></TD>
<TD>Define this if there is <I>mmap-style</I> support for
mapping files into memory (used only to read data).</TD>
</TR>

<TR>
<TD VALIGN=top><TT>HOST_FILLORDER</TT></TD>
<TD>Define the native CPU bit order: one of <TT>FILLORDER_MSB2LSB</TT>
 or <TT>FILLORDER_LSB2MSB</TT></TD>
</TR>

<TR>
<TD VALIGN=top><TT>HOST_BIGENDIAN</TT></TD>
<TD>Define the native CPU byte order: 1 if big-endian (Motorola)
 or 0 if little-endian (Intel); this may be used
 in codecs to optimize code</TD>
</TR>
</TABLE>

<P>
On UNIX systems <TT>HAVE_MMAP</TT> is defined through the running of
the <TT>configure</TT> script; otherwise support for memory-mapped
files is disabled.
Note that <B>tiffcomp.h</B> defines <TT>HAVE_IEEEFP</TT> to be
1 (<TT>BSDTYPES</TT> is not defined).


<A NAME="Types"><P><HR WIDTH=65% ALIGN=right><H3>Types and Portability</H3></A>

The software makes extensive use of C typedefs to promote portability.
Two sets of typedefs are used, one for communication with clients
of the library and one for internal data structures and parsing of the
TIFF format.  There are interactions between these two to be careful
of, but for the most part you should be able to deal with portability
purely by fiddling with the following machine-dependent typedefs:


<P>
<TABLE BORDER CELLPADDING=3 WIDTH=100%>

<TR>
<TD>uint8</TD>
<TD>8-bit unsigned integer</TD>
<TD>tiff.h</TD>
</TR>

<TR>
<TD>int8</TD>
<TD>8-bit signed integer</TD>
<TD>tiff.h</TD>
</TR>

<TR>
<TD>uint16</TD>
<TD>16-bit unsigned integer</TD>
<TD>tiff.h</TD>
</TR>

<TR>
<TD>int16</TD>
<TD>16-bit signed integer</TD>
<TD>tiff.h</TD>
</TR>

<TR>
<TD>uint32</TD>
<TD>32-bit unsigned integer</TD>
<TD>tiff.h</TD>
</TR>

<TR>
<TD>int32</TD>
<TD>32-bit signed integer</TD>
<TD>tiff.h</TD>
</TR>

<TR>
<TD>dblparam_t</TD>
<TD>promoted type for floats</TD>
<TD>tiffcomp.h</TD>
</TR>

</TABLE>

<P>
(to clarify <TT>dblparam_t</TT>, it is the type that float parameters are
promoted to when passed by value in a function call.)

<P>
The following typedefs are used throughout the library and interfaces
to refer to certain objects whose size is dependent on the TIFF image
structure:


<P>
<TABLE BORDER CELLPADDING=3 WIDTH=100%>

<TR>
<TD WIDTH=25%>typedef unsigned int ttag_t;</TD>	<TD>directory tag</TD>
</TR>

<TR>
<TD>typedef uint16 tdir_t;</TD>		<TD>directory index</TD>
</TR>

<TR>
<TD>typedef uint16 tsample_t;</TD>	<TD>sample number</TD>
</TR>

<TR>
<TD>typedef uint32 tstrip_t;</TD>	<TD>strip number</TD>
</TR>

<TR>
<TD>typedef uint32 ttile_t;</TD>		<TD>tile number</TD>
</TR>

<TR>
<TD>typedef int32 tsize_t;</TD>		<TD>i/o size in bytes</TD>
</TR>

<TR>
<TD>typedef void* tdata_t;</TD>		<TD>image data ref</TD>
</TR>

<TR>
<TD>typedef void* thandle_t;</TD>	<TD>client data handle</TD>
</TR>

<TR>
<TD>typedef int32 toff_t;</TD>		<TD>file offset (should be off_t)</TD>
</TR>

<TR>
<TD>typedef unsigned char* tidata_t;</TD> <TD>internal image data</TD>
</TR>

</TABLE>

<P>
Note that <TT>tstrip_t</TT>, <TT>ttile_t</TT>, and <TT>tsize_t</TT>
are constrained to be
no more than 32-bit quantities by 32-bit fields they are stored
in in the TIFF image.  Likewise <TT>tsample_t</TT> is limited by the 16-bit
field used to store the <TT>SamplesPerPixel</TT> tag.  <TT>tdir_t</TT>
constrains
the maximum number of IFDs that may appear in an image and may
be an arbitrary size (without penalty).  <TT>ttag_t</TT> must be either
<TT>int</TT>, <TT>unsigned int</TT>, pointer, or <TT>double</TT>
because the library uses a varargs
interface and ANSI C restricts the type of the parameter before an
ellipsis to be a promoted type.  <TT>toff_t</TT> is defined as
<TT>int32</TT> because
TIFF file offsets are (unsigned) 32-bit quantities.  A signed
value is used because some interfaces return -1 on error (sigh).
Finally, note that <TT>tidata_t</TT> is used internally to the library to
manipulate internal data.  User-specified data references are
passed as opaque handles and only cast at the lowest layers where
their type is presumed.


<P><HR WIDTH=65% ALIGN=right><H3>General Comments</H3></A>

The library is designed to hide as much of the details of TIFF from
applications as
possible.  In particular, TIFF directories are read in their entirety
into an internal format.  Only the tags known by the library are
available to a user and certain tag data may be maintained that a user
does not care about (e.g. transfer function tables).

<A NAME=AddingCODECS><P><HR WIDTH=65% ALIGN=right><H3>Adding New Builtin Codecs</H3></A>

To add builtin support for a new compression algorithm, you can either
use the "tag-extension" trick to override the handling of the
TIFF Compression tag (see <A HREF=addingtags.html>Adding New Tags</A>), 
or do the following to add support directly to the core library:

<OL>
<LI>Define the tag value in <B>tiff.h</B>.
<LI>Edit the file <B>tif_codec.c</B> to add an entry to the
   _TIFFBuiltinCODECS array (see how other algorithms are handled).
<LI>Add the appropriate function prototype declaration to
   <B>tiffiop.h</B> (close to the bottom).
<LI>Create a file with the compression scheme code, by convention files
   are named <B>tif_*.c</B> (except perhaps on some systems where the
   tif_ prefix pushes some filenames over 14 chars.
<LI>Edit <B>Makefile.in</B> (and any other Makefiles)
   to include the new source file.
</OL>

<P>
A codec, say <TT>foo</TT>, can have many different entry points:

<PRE>
TIFFInitfoo(tif, scheme)/* initialize scheme and setup entry points in tif */
fooSetupDecode(tif)	/* called once per IFD after tags has been frozen */
fooPreDecode(tif, sample)/* called once per strip/tile, after data is read,
			    but before the first row is decoded */
fooDecode*(tif, bp, cc, sample)/* decode cc bytes of data into the buffer */
    fooDecodeRow(...)	/* called to decode a single scanline */
    fooDecodeStrip(...)	/* called to decode an entire strip */
    fooDecodeTile(...)	/* called to decode an entire tile */
fooSetupEncode(tif)	/* called once per IFD after tags has been frozen */
fooPreEncode(tif, sample)/* called once per strip/tile, before the first row in
			    a strip/tile is encoded */
fooEncode*(tif, bp, cc, sample)/* encode cc bytes of user data (bp) */
    fooEncodeRow(...)	/* called to decode a single scanline */
    fooEncodeStrip(...)	/* called to decode an entire strip */
    fooEncodeTile(...)	/* called to decode an entire tile */
fooPostEncode(tif)	/* called once per strip/tile, just before data is written */
fooSeek(tif, row)	/* seek forwards row scanlines from the beginning
			   of a strip (row will always be &gt;0 and &lt;rows/strip */
fooCleanup(tif)		/* called when compression scheme is replaced by user */
</PRE>

<P>
Note that the encoding and decoding variants are only needed when
a compression algorithm is dependent on the structure of the data.
For example, Group 3 2D encoding and decoding maintains a reference
scanline.  The sample parameter identifies which sample is to be
encoded or decoded if the image is organized with <TT>PlanarConfig</TT>=2
(separate planes).  This is important for algorithms such as JPEG.
If <TT>PlanarConfig</TT>=1 (interleaved), then sample will always be 0.

<A NAME=Other><P><HR WIDTH=65% ALIGN=right><H3>Other Comments</H3></A>

The library handles most I/O buffering.  There are two data buffers
when decoding data: a raw data buffer that holds all the data in a
strip, and a user-supplied scanline buffer that compression schemes
place decoded data into.  When encoding data the data in the
user-supplied scanline buffer is encoded into the raw data buffer (from
where it is written).  Decoding routines should never have to explicitly
read data -- a full strip/tile's worth of raw data is read and scanlines
never cross strip boundaries.  Encoding routines must be cognizant of
the raw data buffer size and call <TT>TIFFFlushData1()</TT> when necessary.
Note that any pending data is automatically flushed when a new strip/tile is
started, so there's no need do that in the tif_postencode routine (if
one exists).  Bit order is automatically handled by the library when
a raw strip or tile is filled.  If the decoded samples are interpreted
by the decoding routine before they are passed back to the user, then
the decoding logic must handle byte-swapping by overriding the
<TT>tif_postdecode</TT>
routine (set it to <TT>TIFFNoPostDecode</TT>) and doing the required work
internally.  For an example of doing this look at the horizontal
differencing code in the routines in <B>tif_predict.c</B>.

<P>
The variables <TT>tif_rawcc</TT>, <TT>tif_rawdata</TT>, and
<TT>tif_rawcp</TT> in a <TT>TIFF</TT> structure
are associated with the raw data buffer.  <TT>tif_rawcc</TT> must be non-zero
for the library to automatically flush data.  The variable
<TT>tif_scanlinesize</TT> is the size a user's scanline buffer should be.  The
variable <TT>tif_tilesize</TT> is the size of a tile for tiled images.  This
should not normally be used by compression routines, except where it
relates to the compression algorithm.  That is, the <TT>cc</TT> parameter to the
<TT>tif_decode*</TT> and <TT>tif_encode*</TT>
routines should be used in terminating
decompression/compression.  This ensures these routines can be used,
for example, to decode/encode entire strips of data.

<P>
In general, if you have a new compression algorithm to add, work from
the code for an existing routine.  In particular,
<B>tif_dumpmode.c</B>
has the trivial code for the "nil" compression scheme,
<B>tif_packbits.c</B> is a
simple byte-oriented scheme that has to watch out for buffer
boundaries, and <B>tif_lzw.c</B> has the LZW scheme that has the most
complexity -- it tracks the buffer boundary at a bit level.
Of course, using a private compression scheme (or private tags) limits
the portability of your TIFF files.

<P>
<HR>

Last updated: $Date: 2004/09/10 14:47:31 $

</BODY>

</HTML>