The Wide, Wide World Digital Edition explores how Susan Warner's The Wide, Wide World shaped and was shaped by transatlantic literary culture during the novel’s primary dates of publication: 1850-1950. In addition to Susan Warner's authorship, therefore, the project is concerned with the fans, reading communities, reviewers, editors, publishers, illustrators, and book designers who played a role in how the book was interpreted during this period. The project benefits from John Bryant’s work on Melville’s Typee and his theory of editorial practice as offered in The Fluid Text. Bryant explains that the collaborative, conflict-driven development of a text over time is the rule rather than the exception. Bryant’s work indicates that rather than giving readers a singular edition, the apparatus of a fluid edition must engage them with a text historically by exposing them to “the distances between multiple versions” and providing “a library with ‘reading rooms’ for the inspection of intended and social texts as well as cultural adaptations.”
In order to visualize The Wide, Wide World’s history of textual and visual fluidity, The Wide, Wide World Digital Edition is organized as a series of interpretive and comparative galleries that put the novel’s revisions into cultural and scholarly context. The project provides access to a wide range of materials, including fan letters, prize plates, reviews, translations, and, of course, Warner's manuscript and publishers’ resulting versions of the novel. In addition to these materials, the project includes a comprehensive encyclopedia on major reprint publishers, a guide to printing technologies during the period, mapped explorations of the novel's international reception, and critical galleries that consider the iconic visual elements of the novel's reprints.
Access to the Novel and its Reprints
In order to map the novel's transatlantic publication and reception, the project edits and compares the 141 versions of the novel, derived from fifty-three sets of variant printers’ plates. We provide high-quality facsimile images of each variant text along with electronically edited transcriptions that allow for maximum searchability within and across editions. Because comparison is an essential framework of the project's methodology, we are developing an Omeka plug-in for the NINES textual comparison tool Juxta. This will eventually provide users with the ability to compare the entire textual corpus using a heat map that identifies the level of textual variation at particular points in the text and to compare any two texts in the corpus side-by-side. Each edition in Juxta will be accompanied by an introduction and annotations of textual variants. Visit the Compare Editions gallery on this site for more information.
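The heat-map idea described above can be illustrated with a minimal sketch. It is not Juxta's own collation algorithm: it uses Python's standard `difflib` module, fixes one base text purely for simplicity, and runs on invented sample sentences rather than on text from the novel. The function name `variation_heat` is hypothetical.

```python
from difflib import SequenceMatcher


def variation_heat(base: str, witnesses: list[str]) -> list[int]:
    """For each word of `base`, count how many witnesses differ at that word."""
    base_words = base.split()
    heat = [0] * len(base_words)
    for witness in witnesses:
        matcher = SequenceMatcher(a=base_words, b=witness.split(), autojunk=False)
        touched = set()  # base positions this witness changes (counted once each)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            if tag == "equal":
                continue
            if i1 == i2:
                # Pure insertion: no base word is covered, so charge it to the
                # nearest preceding base word (or the first word).
                if base_words:
                    touched.add(min(max(i1 - 1, 0), len(base_words) - 1))
            else:
                touched.update(range(i1, i2))
        for index in touched:
            heat[index] += 1
    return heat


# Invented sample readings, for illustration only.
sample_base = "Ellen looked out of the window"
sample_witnesses = [
    "Ellen looked out of the window",
    "Ellen gazed out of the window",
    "Ellen gazed out at the window",
]
sample_heat = variation_heat(sample_base, sample_witnesses)
for word, count in zip(sample_base.split(), sample_heat):
    print(f"{word:10s} {'#' * count}")
```

Higher counts mark the points where more witnesses depart from the chosen base, which is the kind of signal a heat map would color more intensely.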
In addition to appearing in the Juxta plug-in, all editions are also available individually. We encode each text without regularization, preserving end-of-line hyphens, running headers, and misspellings, so that our edition provides the most comprehensive access to accurate representations of the work of reprinters. Because the marks that readers left behind are important evidence for understanding the book’s readership, marginal markings are included in annotations and displayed on facsimile pages.
The multitude of visual icons across the novel’s corpus, including over 140 cover designs and forty-seven sets of illustrations, is just as significant as the textual variants and is essential for understanding the novel’s import to transatlantic literary culture. We have developed a comprehensive controlled vocabulary for describing these visual elements. We display all decorations, bindings, and illustrations in the edited text and as separate searchable items in the project’s database. These multiple access points allow readers to search for illustrations according to their own scholarly interests.
Preparing the Editions
Work on each edition of the novel begins when we obtain high-resolution digital images. Eighty-eight editions of the novel were scanned at Southern Illinois University Edwardsville using an archival book scanner with two mounted DSLR cameras. We captured these images in RAW format and then saved them as TIFF files with a resolution of 600 dots per inch (dpi). We processed the bindings using a flatbed scanner to capture 800 dpi TIFF files. We treat TIFF as our archival format because, unlike JPEG, which uses lossy compression and loses detail each time an image is re-saved, TIFF files can be stored without loss and do not degrade in quality when edited and saved again. Saving our files in this format ensures that we will always have a strong dataset from which to create new image derivatives.
Other images of the novel’s paratexts have been collected from repositories at over seventeen institutions using the same archival-quality standards. Other than cropping the images and saving them as JPEG files for web viewing, the images have not been re-touched in any way. We have added each item to our database using Dublin Core metadata standards, so that users have access to information about its source, the image, and its quality.
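As a rough illustration of what an item record described with Dublin Core might look like, the sketch below checks a record against the fifteen elements of the Dublin Core Metadata Element Set. The helper `check_record` and every value in the sample record are placeholders invented for this example; they do not reproduce an actual item from the project's database.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def check_record(record: dict[str, str]) -> tuple[set[str], set[str]]:
    """Return (fields that are not Dublin Core elements, fields left blank)."""
    unknown = set(record) - DUBLIN_CORE_ELEMENTS
    blank = {name for name, value in record.items() if not value.strip()}
    return unknown, blank


# Placeholder values only; a real record would be filled in by a cataloger.
sample_record = {
    "title": "Front cover (placeholder title)",
    "creator": "Unknown binder (placeholder)",
    "source": "Holding institution (placeholder)",
    "format": "image/tiff",
    "description": "Cloth binding scanned on a flatbed scanner (placeholder)",
}

unknown_fields, blank_fields = check_record(sample_record)
print("Not Dublin Core:", sorted(unknown_fields))
print("Left blank:", sorted(blank_fields))
```

A check like this is only a convenience for catching typos in field names; Omeka's own item forms already present the Dublin Core fields directly.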
After collecting image files for each edition, we begin the work of transcription. We use the optical character recognition (OCR) software ABBYY FineReader to obtain our initial machine-readable text. A pair of staff members then works together to double-check the transcription for spacing, hyphenation, and spelling against the digital facsimile. One staff member reads aloud directly from the digital image (including all punctuation) while the other corrects the transcription. A senior staff member then reviews and corrects the transcription by proofreading against the digital facsimile.
After ensuring that we have an accurate transcription, a staff member encodes each edition using the Text Encoding Initiative’s standards (TEI) for encoding digital texts in XML (Extensible Markup Language). The TEI allows us to record features of the physical text, including all front and back matter, page breaks, line breaks, epigraphs, and running headers. The staff members also use TEI’s naming capabilities to link all mentions of specific characters. After the TEI is validated against the project’s customized schema, a senior staff member re-checks the encoding and accounts for any salient features not considered in the initial process. Next, a member of the editorial board reviews the encoded edition with a senior staff member, completing a final proofreading and resolving any remaining encoding issues. Each edition is then published as a part of the project’s Omeka collections and as a viewable full-text edition. Our schema is available for download along with project templates and documentation at our project blog.
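To make the encoded features concrete, the following sketch parses a small TEI fragment and counts page breaks, line breaks, running headers, and linked character mentions. The element names (`pb`, `lb`, `fw`, `persName`) are standard TEI, but the fragment itself, its wording, and the `#ellen` and `#mrs_montgomery` identifiers are invented for illustration and are not taken from the project's customized schema or files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment: the text and the ref identifiers are placeholders.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <pb n="9"/>
      <fw type="header">SAMPLE RUNNING HEADER</fw>
      <p><persName ref="#ellen">Ellen</persName> sat by the win-<lb/>dow
      while <persName ref="#mrs_montgomery">Mrs. Montgomery</persName> rested.<lb/>
      <persName ref="#ellen">Ellen</persName> did not speak.</p>
    </body>
  </text>
</TEI>"""


def summarize(tei_xml: str) -> dict:
    """Count a few physical and naming features in a TEI document."""
    root = ET.fromstring(tei_xml)
    mentions = Counter(
        element.get("ref")
        for element in root.iterfind(".//tei:persName", TEI_NS)
    )
    return {
        "page_breaks": len(root.findall(".//tei:pb", TEI_NS)),
        "line_breaks": len(root.findall(".//tei:lb", TEI_NS)),
        "running_headers": len(root.findall(".//tei:fw[@type='header']", TEI_NS)),
        "mentions": mentions,
    }


summary = summarize(SAMPLE_TEI)
print(summary)
```

Because each mention carries a `ref` pointer, every reference to a character can be gathered regardless of how the name is printed in a given edition.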
After all of the encoded editions are completed, staff members will begin the process of identifying and annotating textual variants across the full corpus using Juxta. At this time, the project uses Juxta internally as a comparison tool, but Juxta will eventually be integrated into the site design so that users can access these textual variants and their contextual annotations. Although the editorial team considered encoding the variants using TEI, Juxta is the more practical choice for the project because the tool does not require our users to settle on a single base text for comparison. Also, because there are 2,809 possible comparison scenarios, encoding those variants using the TEI would not have been feasible in either cost or time. Juxta allows us to provide and annotate these variant sites for our users without limiting their viewing options.
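The figure of 2,809 is consistent with pairing each of the fifty-three sets of variant plates with each of the fifty-three as a possible base text (53 × 53). That reading is an assumption on our part for the purposes of this worked example; the short sketch below simply checks the arithmetic and, for contrast, shows the count of unordered pairs of distinct plate sets.

```python
from itertools import combinations, product

PLATE_SETS = 53  # number of variant plate sets reported above

# Assumed reading: every (base, witness) pairing, including a set with itself.
ordered_pairings = sum(1 for _ in product(range(PLATE_SETS), repeat=2))

# For contrast: unordered pairs of two different plate sets.
distinct_pairs = sum(1 for _ in combinations(range(PLATE_SETS), 2))

print(ordered_pairings)  # 53 * 53
print(distinct_pairs)    # 53 * 52 / 2
```

Under either count, hand-encoding every comparison would multiply the editorial work far beyond the encoding of the editions themselves.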
Editing Reviews & Letters
The project first obtains digital facsimiles of all letters and reviews using the archival standards listed above. If digital copies are not accessible, then we work from microfilm or photocopies. Staff members transcribe reviews and letters by hand. We provide access to the digital images on the site when they are available. In our digital renditions of the letters and reviews, we do not attempt to exactly mimic the original line breaks, hyphenation, or ornamentation, but we do encode all insertions and deletions. All edited letters and reviews are double-checked by an editorial board member against the digital image.
We are using a TEI schema customized by the Women Writers Project's Cultures of Reception, an NEH-funded study of the transatlantic reception of women authors. Their schema emphasizes standards for encoding names and places so that letters and reviews will be searchable across digital projects. Because we are analyzing the novel’s transatlantic importance, the reviews and letters are carefully encoded for spatial and temporal data that we then contextualize for our users with mapping and timeline software.
Visual Images and a Controlled Vocabulary
Images, perhaps even more than text, are subjective in their meaning. When searching in a text, a user might try keywords that the author actually used such as “Bible,” “Christian,” or “Evangelical,” making it less challenging to find thematic sites of interest without the outside influence of the person who encoded the edition. This isn’t possible with images; a scholarly editor must first write a detailed description of a visual image before a user can search for it within a large collection. As a result, images are more subject to the influences of the encoder.
In 2007, the Database of Mid-Victorian Illustration (DMVI) attempted to develop a controlled vocabulary for iconographic illustration that would solve this dilemma. Its editors write:
What if users wish to search using words like 'love' or 'death'? Fitting such terms into a hierarchical system is problematic in itself, but even more difficult is trying to ensure that such subjective ideas are deployed consistently across the whole collection of images. 'Death' could be used in several ways: when a dead body is present, when a coffin or grave appears, when a death-bed scene is depicted, for a courtroom scene in which a criminal is sentenced to hang, when characters are shown in mourning dress…. The response of DMVI to these kinds of questions has been that it is better to display too many images than too few, and it should be up to the user to make the final decision of relevance.
This passage shows just how open to interpretation image tagging can be, and therefore how much it is guided by the hand of the editor. In this case, the editors of DMVI have determined to tag everything broadly and let users do what they will with the output. We have largely made the same decision (with some caveats) for The Wide, Wide World Digital Edition. Our editorial policy for encoding and describing images includes the following rationales.
1) We have developed a controlled vocabulary for tagging illustrations and covers by combining the DMVI’s iconographic taxonomy, a list of bibliographic terminology derived from Philip Gaskell’s work, and terms specific to The Wide, Wide World’s corpus. This vocabulary is part of our Omeka database and is searchable by our users. Please visit our project blog for further information regarding how items are added to Omeka.
2) Because it is important for the images to be accessible both as individual items and within the context of their own editions, we have used this same controlled vocabulary to describe the in-text illustrations as a part of our TEI encoding process.
3) Rather than limiting our approach to images to our own editorial interests, the project takes DMVI’s principles one step further and invites users of the Omeka database to add to the rich list of tags already provided for a specific image. All users who sign up for an account with The Wide, Wide World Digital Edition will be able to tag images themselves using our controlled vocabulary. Although Omeka will not accept tags from users that are not already a part of the overarching list, users may also contact the Editorial Board to have relevant terms added to the official vocabulary for future tagging.
4) The Wide, Wide World’s story is one of consistent repurposing and reimagining on the part of a plethora of readers, publishers, and editors with little to no concern for the intentions or authority of the author. The novel's adaptations are inherently collaborative and contain the work of illustrators, woodcutters, printers, and binders. These multivocal revision sites actively resist a stable meaning. As such, we have envisioned the site as a place where the cultural remix that so characterized The Wide, Wide World’s history can continue. Members of the editorial board have developed their own reading galleries that analyze and put the Edition's content into historical context. We also encourage registered users to create their own posters exploring the site’s larger corpus of covers and illustrations. In the future, the site will include an image comparison tool that will allow users to select up to twenty images for side-by-side comparison and annotation with the option of sharing their work with others.
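The tagging rule in item 3 above (tags are accepted only if they already belong to the controlled vocabulary, and new terms can be proposed to the Editorial Board) can be sketched as follows. This is an illustrative model only, not Omeka's implementation; the class `TagRegistry`, its methods, and the sample terms are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TagRegistry:
    """Hypothetical model of a controlled vocabulary with a proposal queue."""

    vocabulary: set[str]  # approved terms, stored in lowercase
    proposals: list[str] = field(default_factory=list)

    def tag(self, image_tags: set[str], term: str) -> bool:
        """Apply `term` to an image only if it is an approved vocabulary term."""
        normalized = term.strip().lower()
        if normalized in self.vocabulary:
            image_tags.add(normalized)
            return True
        # Not in the vocabulary: hold the term for editorial review instead.
        if normalized not in self.proposals:
            self.proposals.append(normalized)
        return False

    def approve(self, term: str) -> None:
        """Editorial Board action: admit a proposed term to the vocabulary."""
        normalized = term.strip().lower()
        self.vocabulary.add(normalized)
        if normalized in self.proposals:
            self.proposals.remove(normalized)


# Sample terms are placeholders, not the project's actual vocabulary.
registry = TagRegistry(vocabulary={"bible", "window"})
cover_tags: set[str] = set()

print(registry.tag(cover_tags, " Bible "))  # approved term is applied
print(registry.tag(cover_tags, "teacup"))   # unknown term is queued
registry.approve("teacup")
print(registry.tag(cover_tags, "teacup"))   # applied once approved
```

Keeping proposals separate from the approved list preserves the consistency a controlled vocabulary is meant to provide while still leaving room for users' own interests.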
The Context Galleries
In addition to our work with the primary text, The Wide, Wide World Digital Edition is also a primary access point for scholarship about reprinters and book design during the nineteenth century. We provide encyclopedic histories with timelines showing when book technologies were in use, publishers were in operation, and illustrators were working. In each instance, we provide a detailed bibliography so that site users can trace the same research paths that we have followed in amassing this information. The project’s editorial staff members work with assistant staff members to collect the research and write narratives for these portions of the site. Another senior staff member double-checks the source materials and proofreads the exhibits prior to publication.
We use Omeka, an open-source publishing platform and content management system that was designed by George Mason University’s Roy Rosenzweig Center for History and New Media. Omeka is specifically designed with the needs of digital humanities projects in mind. Omeka is written in PHP, stores its data in a MySQL database, and uses Dublin Core metadata fields for describing each item added to the project. Dublin Core is a widely adopted metadata standard used by libraries, archives, and museums. Using Omeka's Dublin Core fields ensures that we have extensive, reliable metadata. Omeka also provides us with useful methods for organizing and searching collections and a suite of software plug-ins that allow us to map and contextualize our content.
All of The Wide, Wide World Digital Edition’s data undergoes incremental backups to magnetic tape four times a day. Southern Illinois University Edwardsville’s Information Technology Service is responsible for the backup of the project’s server, and all magnetic tapes are also stored at an off-site location.
Herman Melville, Typee, ed. John Bryant (Charlottesville: Rotunda, 2002). John Bryant, The Fluid Text: A Theory of Revision and Editing for Book and Screen (Ann Arbor: U of Michigan P, 2002).
John Bryant, The Fluid Text, 123-124.