A system has been built that embeds arbitrary digital data in an iconic representation of a text image. For encoding, a page image containing text is analyzed to locate its text regions. A highly reduced image of the page is then generated, in which the text regions are replaced by an iconic version of the text that encodes an input data stream. The data is encoded as modulations of rectangular iconic representations of text words, where the length, height and vertical position of each rectangle, as well as the spacing between rectangles, can all be varied independently. No correspondence need be maintained between the words in the document and the word icons. Word icons or other marks on each line can be used for identifying, calibrating and justifying the iconic text. Decoding proceeds by finding the iconic lines and determining the sizes and locations of the iconic words. Word icons printed at 8x reduction are reliably decoded from 300 ppi binary scans. One application is to present iconified first pages of many documents on a single sheet of paper, with the URL of each document encoded in its icon. Scanning the sheet and selecting an icon then allows retrieval of the full document. Another use is to print an icon on every page of a document, containing meta-information about the document or the specific page, such as the version, revision history, keywords, authorization, or a signed hash of the full image for authentication.
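As a rough illustration of the modulation idea described above, the following Python sketch maps a byte stream onto four independently varied word-icon parameters and recovers it by snapping measured values to the nearest level. It is not the system's actual coding scheme: the level tables (`LENGTHS`, `HEIGHTS`, `OFFSETS`, `GAPS`), the 6-bits-per-icon layout, and the names `WordIcon`, `encode` and `decode` are all assumptions made for this example, and line finding, calibration marks and justification are omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical quantization levels in pixels (assumed, not from the source).
# Levels are spaced so that a +/-1 pixel measurement error is unambiguous.
LENGTHS = [12, 16, 20, 24]  # rectangle length: 4 levels = 2 bits
HEIGHTS = [4, 8]            # rectangle height: 2 levels = 1 bit
OFFSETS = [0, 3]            # vertical shift from the baseline: 1 bit
GAPS = [4, 8, 12, 16]       # space before the next icon: 2 bits
BITS_PER_ICON = 6


@dataclass
class WordIcon:
    length: int
    height: int
    offset: int
    gap: int


def encode(data: bytes) -> List[WordIcon]:
    """Turn bytes into a sequence of word icons, 6 bits per icon."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % BITS_PER_ICON)  # zero-pad the last icon
    icons = []
    for i in range(0, len(bits), BITS_PER_ICON):
        chunk = bits[i:i + BITS_PER_ICON]
        icons.append(WordIcon(
            length=LENGTHS[int(chunk[0:2], 2)],
            height=HEIGHTS[int(chunk[2], 2)],
            offset=OFFSETS[int(chunk[3], 2)],
            gap=GAPS[int(chunk[4:6], 2)],
        ))
    return icons


def nearest(levels: Sequence[int], value: float) -> int:
    """Index of the quantization level closest to a measured value."""
    return min(range(len(levels)), key=lambda k: abs(levels[k] - value))


def decode(icons: Sequence[WordIcon], n_bytes: int) -> bytes:
    """Recover n_bytes from measured icons by snapping to the nearest level."""
    bits = ""
    for ic in icons:
        bits += f"{nearest(LENGTHS, ic.length):02b}"
        bits += f"{nearest(HEIGHTS, ic.height):01b}"
        bits += f"{nearest(OFFSETS, ic.offset):01b}"
        bits += f"{nearest(GAPS, ic.gap):02b}"
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))
```

Under these assumed levels, a 10-byte string such as `b"http://a.b"` occupies 14 word icons (80 bits padded to 84). Because each parameter is quantized separately, a small measurement error in one dimension does not disturb the bits carried by the others, which is the practical benefit of varying the four properties independently.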