Concepts of Plain and Formatted Text

In multimedia presentations, text can be combined with other media in a powerful way to present information and express moods. Text comes in several types: plain text, which consists of fixed-size characters with essentially the same appearance; formatted text, whose appearance can be changed using font parameters; and hypertext, which links different electronic documents and lets the user jump from one to another in a non-linear way.

Internally, text is represented by binary codes as defined in the ASCII table. The ASCII table is, however, quite limited in scope, and a newer standard called Unicode has been developed to eventually replace it. Unicode is capable of representing characters from languages throughout the world. Text can also be generated automatically from a scanned paper document or image using Optical Character Recognition (OCR) software.
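
The difference in scope between ASCII and Unicode can be illustrated with a short Python sketch. The helper name describe is made up for this example, and UTF-8 is used here only as one common way of storing Unicode characters as bytes; the text above does not prescribe a particular encoding.

```python
def describe(ch: str) -> dict:
    """Return the code point of ch plus its ASCII and UTF-8 bytes.

    'describe' is a hypothetical helper for illustration only.
    """
    try:
        ascii_bytes = ch.encode("ascii")
    except UnicodeEncodeError:
        ascii_bytes = None  # the character is outside the ASCII table
    return {
        "char": ch,
        "code_point": ord(ch),
        "ascii": ascii_bytes,
        "utf8": ch.encode("utf-8"),  # assumption: UTF-8 as the Unicode encoding
    }


# "A" is in ASCII; U+00E9 (e with acute accent) and U+0905 (Devanagari letter A) are not.
for ch in ["A", "\u00e9", "\u0905"]:
    print(describe(ch))
```

Characters inside the ASCII range keep the same single-byte code in UTF-8, while characters outside it need more than one byte.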

There are three types of text that can be used to produce pages of a document:

Unformatted text
Formatted text
Hypertext

1. Unformatted Text

Also known as plain text, this comprises fixed-size characters from a limited character set. The most widely used such character set is ASCII, short for American Standard Code for Information Interchange. It is essentially a table in which each character is represented by a unique 7-bit binary code, giving 128 codes in total. The characters include a to z, A to Z, 0 to 9, and punctuation and other symbols such as parentheses, the ampersand, single and double quotes, and mathematical operators. All the characters are of the same height. In addition, the ASCII character set includes a number of control characters, such as BS (backspace), LF (linefeed), CR (carriage return), DEL (delete), ESC (escape) and FF (form feed), as well as the space character SP.
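
As a minimal sketch of the 7-bit codes described above, the following Python snippet prints the binary code of a few printable and control characters. The function name ascii_code and the CONTROL dictionary are assumptions made for this illustration.

```python
def ascii_code(ch: str) -> str:
    """Return the 7-bit binary ASCII code of a single character as a string."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")  # zero-padded to 7 binary digits


# A few printable characters
for ch in "aZ9&":
    print(repr(ch), ord(ch), ascii_code(ch))

# A few of the control characters named in the text (hypothetical lookup table)
CONTROL = {
    "BS": "\b",
    "LF": "\n",
    "CR": "\r",
    "ESC": "\x1b",
    "FF": "\f",
    "DEL": "\x7f",
}
for name, ch in CONTROL.items():
    print(name, ord(ch), ascii_code(ch))
```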

2. Formatted Text

Formatted text is text in which, apart from the actual alphanumeric characters, additional control codes are used to change the appearance of the characters, e.g. bold, underline, italics, and varying shapes, sizes and colors. Most text processing software uses such formatting options to change text appearance. Formatted text is also used extensively in the publishing sector for the preparation of papers, books, magazines, journals, and so on.

3. Hypertext

The term hypertext refers to certain extra capabilities imparted to normal or standard text. Like normal text, a hypertext document can be read sequentially, but it can also link multiple documents in such a way that the user can navigate non-sequentially from one document to another for cross-references. These links are called hyperlinks. The underlined text string on which the user clicks the mouse is called an anchor, and the document that opens as a result of the click is called the target document. On the web, target documents are specified by a Web site address, technically known as a Uniform Resource Locator (URL).
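
The relationship between an anchor and its target URL can be sketched in Python. This example assumes the hypertext is written in HTML, which the text above does not name explicitly; the class name AnchorCollector, the sample page string and the example.com address are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorCollector(HTMLParser):
    """Collect (anchor text, target URL) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []    # list of (anchor text, target URL)
        self._href = None  # target URL of the anchor currently being read
        self._text = []    # pieces of the current anchor's text

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical page fragment with one hyperlink
page = 'See the <a href="https://example.com/unicode.html">Unicode notes</a> for details.'

collector = AnchorCollector()
collector.feed(page)
for anchor, url in collector.links:
    parts = urlparse(url)
    print(anchor, "->", parts.scheme, parts.netloc, parts.path)
```

Here "Unicode notes" plays the role of the anchor, and the URL identifies the target document by its scheme, host and path.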

FONT

In traditional typography, a font is a particular size, weight and style of a typeface. Each font was a matched set of type, one piece (called a “sort”) for each glyph, and a typeface comprised a range of fonts that shared an overall design.

In modern usage, with the advent of digital typography, “font” is frequently synonymous with “typeface”, although the two terms do not necessarily mean the same thing.

In particular, the use of “vector” or “outline” fonts means that different sizes of a typeface can be dynamically generated from one design. Each style may still be in a separate “font file”—for instance, the typeface “Bulmer” may include the fonts “Bulmer roman”, “Bulmer italic”, “Bulmer bold” and “Bulmer extended”—but the term “font” might be applied either to one of these alone or to the whole typeface.

Text can be inserted in a document using a variety of methods. These are:

1. Using a keyboard

The most common way of inserting text into a digital document is by typing it on an input device such as a keyboard. Usually, text editing software such as Microsoft Word is used to control the appearance of the text; it lets the user adjust parameters such as font, size, style and color.

2. Copying and Pasting

Another way of inserting text into a document is by copying it from a pre-existing digital document. The existing document is opened using the corresponding text processing program, and portions of the text are selected using the keyboard or mouse. The Copy command copies the selected text to the clipboard. The Paste command then copies the text from the clipboard into the target document.

3. Using OCR Software

A third way of inserting text into a digital document is by scanning it from a paper document. Text in paper documents such as books, newspapers, magazines and letterheads can be converted into electronic form using a device called a scanner. The electronic representation of the paper document, which is an image, can then be saved as a file on the hard disk of the computer.

To be able to edit the text, it needs to be converted from the image format into an editable text format using Optical Character Recognition (OCR) software. OCR software traditionally works by a method called pattern matching, in which each scanned character is compared against stored character patterns. More recent research on OCR is based on another technique called feature extraction. Using this method, the software attempts to extract the core features of the characters and compare them to a table stored within itself for recognition.
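
The pattern matching idea can be illustrated with a toy Python sketch. It is not how any particular OCR product works: the 3x5 pixel templates, the three-letter alphabet and the function names distance and recognize are all assumptions made for this example. Each scanned glyph is compared pixel by pixel with every stored template, and the closest template wins.

```python
# Hypothetical 3x5 pixel templates ("1" = ink, "0" = blank) for three letters
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}


def distance(a, b):
    """Count the pixels that differ between two glyphs of the same size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(glyph):
    """Return the template letter with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda name: distance(glyph, TEMPLATES[name]))


# A scanned "T" with one noisy pixel in the fourth row
noisy_t = ["111", "010", "010", "110", "010"]
print(recognize(noisy_t))
```

Feature extraction differs in that it would compare derived properties of the glyph, such as strokes or corners, rather than raw pixels; that step is not shown here.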
