Methodology for Information Management in Historical Research
The research process for a large paper or book on a historical topic involves the collection of an enormous amount of data. The researcher needs to organize that information in such a way that obscure items are easily found and complex relationships understood. As the digital age provides so much more access, it also creates the challenge of information management. In a 2003 essay, digital historian Roy Rosenzweig writes about the problem of information overload as more and more information is digitized and made readily available. He encourages historians to become involved in developing a new system for managing the problem of abundance. Logically, digital tools should provide an enormous advantage to historians hoping to manage the data they collect and the resources they reference effectively and efficiently. Most modern historians are comfortable using word processors and electronic spreadsheets. Digitally savvy historians may be experienced in using another well-established tool, the relational database. New tools, such as Zotero, are emerging to help with information management. Still, many historians are reluctant to embrace these tools, preferring the ‘tried and true’ methods they used when they first learned to research.
Many of today’s digital history projects involve advanced technological tools that require a major collaborative effort among professionals from several disciplines. Learning to use digital tools to manage information while conducting research in graduate school or while researching for the thesis or dissertation will position the aspiring historian to work successfully on one of these larger projects in their future. In addition, research management skills will position the researcher to effectively manage information for larger-scale book projects.
This paper proposes a methodology for managing research information in a small-scale project, such as a thesis, conducted by an individual. By developing a research methodology for my master’s thesis project, I will test and refine a proposal for a broader methodology for managing the research process with digital tools. While these tools may also apply in collaborative environments, this project is focused on methods to help the individual manage their research information.
The project process involved determining current methods used by students and historians, analyzing a set of tools, selecting a tool, and developing a design and methodology. A small group of graduate students was surveyed to discover their current methods for managing research projects. A review of scholarly work was conducted to discover information about the work of historians who have used digital tools to manage and organize research data. After a set of potential tools was identified and analyzed, the tool Zotero, a project of the Roy Rosenzweig Center for History and New Media, was selected for managing both project references and project data. A simple design and structure were developed, based on my thesis project, which will analyze how rural women helped to create community on the frontier at the end of the nineteenth century. In addition, a research methodology guideline was written that describes how to implement this design for any project, with examples based on the design for my thesis project.
It is probable that many students in history graduate programs have not considered developing a system for research management using digital tools. Survey results from five graduate students in a digital history class at the University of Wisconsin-Milwaukee revealed that these students had not considered creating a system for conducting research, whether based on paper methods or computer methods. Few had written a paper longer than forty pages or considered using digital tools for managing resources. Filing systems included a combination of electronic files, physical stacks, and file folders. Annotation of sources ranged from not creating notes, to handwriting them, to writing on sticky notes, to creating electronic files. One student had just recently started using the note feature of Zotero. The students had also not developed a system for managing citations, mostly creating them from scratch as they were writing or after they finished composing their papers. When asked which tools they used or planned to use, only two mentioned Zotero. The results of this survey indicate that students are not thinking about developing a process that will be helpful when writing long papers or books, and that they do not regard digital tools as essential for managing research in the future. Some historians still write in longhand, and others maintain a manual system or no system at all. Even so, a systematic methodology that is easy to apply, though it requires a certain amount of discipline, can help students and historians manage research more effectively over time by saving time, preventing duplication of effort, and, in some cases, providing new ways to analyze information and discover new truths.
Research on the work of historians using digital tools in the research process focused on seeking examples of researchers writing about managing data and references for individual historical research projects. The research yielded few articles discussing a data management methodology for quantitative papers, but the articles that were found provided insights regarding design, as well as discussion of the pros and cons of different kinds of tools. In the six most relevant papers, the digital tools that were featured included the markup language XML, the relational database tools MySQL and FileMaker Pro, and the reference management tool Zotero. Only two of the scholarly articles were written at institutions located in the United States. The papers did highlight important items to consider when designing a method to manage historical information. In 2002, Franco Niccolucci, University of Florence, warns about the difficulty of designing relational databases to record and compare fuzzy dates, such as changes in types of calendars, date ranges, partially known dates and estimated dates. In addition, he advises that the design must consider how to handle variants in name and place. Niccolucci proposes the use of XML to encode historical text, rather than trying to fit imprecise data in a relational database. Agreeing with Niccolucci in 2004, Urs Dietrich-Felber, University of Bern, prefers XML files to relational databases, explaining, “For those who gather data, the effective use of such possibilities as data exchange, compatibility, and simplicity of survey and the reuse of data in other contexts and platforms becomes increasingly important.” He argues relational databases are insufficient because the stored data loses too much of the original source. Seven years later, Jean Bauer, University of Virginia, discusses a relational database she developed using MySQL.
Bauer believes historians have given up on relational databases too soon, but admits the most useful applications are those that involve “discrete pieces of information with clear connections between them.” Bauer believes the use of the relational database helps her to find new and interesting connections in her research. Bauer considers the design process useful, as it requires the researcher to answer important questions about the key elements of the topic being studied and the relationships between these elements. In 2012, Ansley T. Erickson, Columbia University, developed a relational database, using FileMaker Pro, for the specific purpose of note-taking and managing references. Erickson discusses the flexibility of keywords, which allow sources to be grouped and regrouped to reveal new insights that are not easily achieved in a manual system. In 2010, Johann Peter Murmann, University of New South Wales, had also designed a solution using FileMaker Pro. Murmann explains that the first task in database design is to determine the “fundamental unit of analysis.” For efficiency and flexibility, Murmann recommends a unit composed of an agent, an event and a date. The agent can be applied to people, groups, organizations, etc. as well as academic articles, books, elections, etc. In a 2012 blog post, Alex Hope explains how he uses Zotero to manage his research library. He explains his process for identifying and storing references across multiple platforms, as well as integrating this product with other digital tools. These articles highlighted the differences between using a mark-up language like XML and using a relational database for managing research data. Both approaches are valuable, and the researcher should consider their project goals and determine the best tool.
These authors provided useful insights regarding issues to ponder during the design process, including fuzzy dates, imprecise data, flexibility, compatibility, relationship between key elements of research, use of keywords, and determining key unit of observation.
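Two of these design issues, fuzzy dates and the agent, event and date unit that Murmann recommends, can be made concrete with a small sketch. The Python below is illustrative only and is not taken from any of the cited authors. The class names FuzzyDate and Observation, their fields, and the sample record are all assumptions made for this example. A fuzzy date is stored as an inclusive range, so a partially known or estimated date can still be compared with other dates.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FuzzyDate:
    """A date known only to fall between two bounds (inclusive)."""
    earliest: date
    latest: date
    note: str = ""  # free-text qualifier, e.g. "estimated" or "year only"

    def __post_init__(self):
        if self.earliest > self.latest:
            raise ValueError("earliest must not be after latest")

    @classmethod
    def from_year(cls, year: int, note: str = "year only") -> "FuzzyDate":
        """Build a range covering a whole calendar year."""
        return cls(date(year, 1, 1), date(year, 12, 31), note)

    def definitely_before(self, other: "FuzzyDate") -> bool:
        """True only when every possible day precedes the other range."""
        return self.latest < other.earliest

    def may_overlap(self, other: "FuzzyDate") -> bool:
        """True when the two ranges share at least one possible day."""
        return self.earliest <= other.latest and other.earliest <= self.latest


@dataclass(frozen=True)
class Observation:
    """One unit of analysis: an agent, an event, and a (possibly fuzzy) date."""
    agent: str    # a person, group, organization, book, etc.
    event: str
    when: FuzzyDate
    source: str   # where the observation was found


# Invented sample data, for illustration only.
joined = Observation(
    agent="Hypothetical Person A",
    event="joined a church society",
    when=FuzzyDate.from_year(1887),
    source="hypothetical membership roll",
)
```

Storing both bounds keeps the uncertainty visible instead of forcing an exact day, which is the loss of information that Niccolucci warns about when imprecise dates are fitted into rigid fields.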
Project Design and Tool Selection
Informed by the experiences of other historians seeking to manage the research process, I developed a set of criteria for selecting a tool to manage the research for my thesis project and to provide an example of a research methodology. The digital product should provide flexibility in constructing data elements, be platform independent, and provide the ability to track and annotate sources. In addition, the product should provide the ability to collect and relate information on individuals, organizations and events. A short learning curve and minimal start-up effort are desired.
The XML mark-up language was rejected, as it appeared more suitable for a text-based project than for one seeking to find relationships between different agents under study. In addition, I have no experience with this tool, and it would require a long learning curve and start-up time. While I am familiar with two relational database tools, Microsoft Access and FileMaker Pro, they were rejected because of start-up effort and lack of flexibility. In considering these tools, I designed the relationship tables, but found them cumbersome for managing kinship relations. Both of these tools would also require a time-consuming effort to develop a number of custom reports. The tool RefWorks is designed to manage reference sources, but does not provide as many features or the same level of flexibility as the tool Zotero. The reference management tool Zotero was chosen because it best meets all the criteria. The tool is already designed to run on most platforms, including tablets, to track sources and to manage notes. By re-purposing some of its structure and taking advantage of the feature that allows the user to relate multiple sources to a given source, it is possible to collect and relate information on individuals, organizations and events. Keywords provide additional flexibility. The structure and reporting functions already exist, drastically reducing the start-up effort that would be required for a relational database solution. A great advantage is that sources and data can be managed with one digital tool. While the design attempts to fit individuals, organizations and events into a structure designed for another purpose, it appears to create the potential for a new way of applying a popular digital tool that may provide value to other historians.
The methodology is designed to provide guidelines for a systematic research process, from the initial survey of likely sources, to saving and annotating resources, to manually filing printed sources, to creating a classification system for easy retrieval of notes, on-line sources, bibliographies and manually filed sources. The methodology provides guidelines for using several features in Zotero. The collection feature of Zotero provides a hierarchical structure for organizing sources, also allowing a source to be filed in more than one collection. This methodology uses the collection feature to indicate the progress of note taking, moving a source from temporary collection folders to ones separating primary and secondary sources. In addition, one collection folder is used to identify all sources listed in the working bibliography. A set of pre-defined keywords is used to tag items and notes according to research categories. Additional keywords can be added at any time. Individual notes are tagged to indicate type of note, such as summary, analysis, or quote. If a physical representation of the source exists, a specific field in the Zotero item for that source will indicate how the item has been manually filed. The item type called ‘Report’ will be specially configured to record information about people, events and organizations by redefining the purposes of some metadata fields. Each source that provides information regarding people, events and organizations will be linked through the feature of Zotero that allows the user to identify related sources. In addition, kinship relationships will be documented by relating people to people using this related feature. The Zotero timeline feature will produce a timeline based on dates recorded for events in the ‘Report’ item type.
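The structure described above can be sketched as a small data model. The Python below does not call Zotero or reproduce its internal format; it is a plain in-memory stand-in, written only to show how tags, collections, related links and a re-purposed ‘Report’ item fit together. The names Item and Library, the use of tags such as "person" and "event" to tell ‘Report’ items apart, and the treatment of related links as two-way are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    key: str
    item_type: str   # e.g. "book", "letter", or the re-purposed "report"
    title: str
    date: str = ""   # ISO-style text such as "1887" or "1887-05-14"
    tags: set = field(default_factory=set)         # research keywords
    collections: set = field(default_factory=set)  # an item may sit in several
    related: set = field(default_factory=set)      # keys of related items


class Library:
    """Hypothetical in-memory stand-in for a reference library."""

    def __init__(self):
        self.items = {}

    def add(self, item):
        self.items[item.key] = item
        return item

    def relate(self, key_a, key_b):
        """Link two items in both directions (source to person, kin to kin)."""
        self.items[key_a].related.add(key_b)
        self.items[key_b].related.add(key_a)

    def tagged(self, tag):
        """Keys of all items carrying the given keyword, in sorted order."""
        return sorted(k for k, it in self.items.items() if tag in it.tags)

    def related_with_tag(self, key, tag):
        """Keys of items related to `key` that also carry `tag`."""
        return sorted(
            k for k in self.items[key].related if tag in self.items[k].tags
        )

    def timeline(self):
        """Dated event reports as (date, title) pairs, ordered by date text."""
        events = [
            it for it in self.items.values()
            if it.item_type == "report" and "event" in it.tags and it.date
        ]
        return [(it.date, it.title) for it in sorted(events, key=lambda it: it.date)]
```

With this model, kinship is recorded by calling relate on two person reports, and a source is tied to the people and events it documents in the same way. Ordering by ISO-style date text works because such text sorts chronologically as plain strings.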
The manual filing system will use the metadata field Lib Catalog to indicate where the source can be found, such as the user’s personal library or a public library or a specific archive. The metadata field Extra will indicate whether this is a book or the name of a file folder, manual or electronic. This proposed methodology will be tested and modified as I write my thesis in the coming year.
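As a rough illustration of how these two fields support retrieval, the sketch below groups sources by where they are filed. It assumes the sources have already been exported into plain Python dictionaries; the keys "lib_catalog" and "extra" are hypothetical names standing in for the Lib Catalog and Extra fields, and the function name finding_list is invented for this example.

```python
from collections import defaultdict


def finding_list(sources):
    """Group source titles by (location, container) for manual retrieval."""
    grouped = defaultdict(list)
    for src in sources:
        # Fall back to placeholders when a source has not been filed yet.
        location = src.get("lib_catalog") or "unfiled"
        container = src.get("extra") or "unspecified"
        grouped[(location, container)].append(src["title"])
    # Sort both the places and the titles so the list is stable to read.
    return {place: sorted(titles) for place, titles in sorted(grouped.items())}
```

A list of this kind lets the researcher pull every item from one archive folder or shelf in a single visit.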
This methodology is intended to provide researchers with a system that makes efficient use of their time and helps them organize sources that can be easily incorporated in future projects. While this methodology requires planning at the front end of the project and discipline in following the system throughout the research phase, the ability to organize, retrieve, and analyze the results should be greatly enhanced. If this methodology works as proposed, this organizational system for taking notes, identifying relationships and managing sources should provide the researcher with an elegant and efficient tool for managing resource information. It can help the researcher identify complex connections and relationships in research data, locate source information quickly, and manage notes to facilitate the writing process. An effective information management system will prepare the researcher for future collaborative projects or a large-scale book project. This methodology is an attempt to successfully meet the challenge of information management in the digital age.
“Bamboo Dirt.” Accessed February 24, 2014. http://dirt.projectbamboo.org/.
Cohen, Daniel, and Roy Rosenzweig. “Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web.” Accessed January 28, 2014. http://chnm.gmu.edu/digitalhistory/index.php.
Cooper, Joshua, and Anne James. “Challenges for Database Management in the Internet of Things.” IETE Technical Review 26, no. 5 (September 2009): 320–329. doi:10.4103/0256-4602.55275.
Dietrich-Felber, Urs. “Using Java and XML in Interdisciplinary Research.” Historical Methods 37, no. 4 (Fall 2004): 174–185. https://ezproxy.lib.uwm.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=tfh&AN=15537500&site=ehost-live&scope=site.
Hall, Gwendolyn Midlo. “Africa and Africans in the African Diaspora: The Uses of Relational Databases.” American Historical Review 115, no. 1 (February 2010): 136–150. https://ezproxy.lib.uwm.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=mth&AN=48376542&site=ehost-live&scope=site.
“HASTAC: Digital History.” HASTAC. Accessed March 4, 2014. http://www.hastac.org/groups/digital-history.
Hope, Alex. “How to Manage a Research Library with Zotero.” The Impact of Social Sciences (blog). LSE Public Policy Group, July 6, 2012. Accessed April 28, 2014. http://blogs.lse.ac.uk/impactofsocialsciences/2012/07/06/manage-a-research-library-with-zotero/.
Katz, Stanley N. “Why Technology Matters: The Humanities in the Twenty-First Century.” Interdisciplinary Science Reviews 30, no. 2 (June 2005): 105–118. doi:10.1179/030801805X25909.
Murmann, Johann Peter. “Constructing Relational Databases to Study Life Histories on Your PC or Mac.” Historical Methods 43, no. 3 (Summer 2010): 109–123. https://ezproxy.lib.uwm.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=tfh&AN=52617055&site=ehost-live&scope=site.
Nawrotzki, Kristen, and Jack Dougherty. “Writing History in the Digital Age.” Accessed January 28, 2014. http://www.digitalculture.org/books/writing-history-in-the-digital-age/.
Niccolucci, Franco. “XML and the Future of Humanities Computing.” SIGAPP Appl. Comput. Rev. 10, no. 1 (April 2002): 43–47. doi:10.1145/568235.568244.
Rosenzweig, Roy. “Scarcity or Abundance? Preserving the Past in a Digital Era.” American Historical Review 108, no. 3 (June 2003): 735–62. http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=a9h&AN=10104028&site=ehost-live&scope=site.
Weaver, Chris, David Fyfe, Anthony Robinson, Deryck Holdsworth, Donna Peuquet, and Alan M. MacEachren. “Visual Exploration and Analysis of Historic Hotel Visits.” Information Visualization 6, no. 1 (Spring 2007): 89–103. doi:10.1057/palgrave.ivs.9500145.
“Writing History in the Digital Age » Fielding History (Bauer) Fall 2011.” Accessed March 26, 2014. http://writinghistory.trincoll.edu/data/fielding-history-bauer/.
“Writing History in the Digital Age » Reflections on 10,000 Notecards (Erickson).” Accessed March 26, 2014. http://writinghistory.trincoll.edu/data/erickson-2012-spring/.
 Roy Rosenzweig, “Scarcity or Abundance? Preserving the Past in a Digital Era,” American Historical Review 108, no. 3 (June 2003): 738, http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=a9h&AN=10104028&site=ehost-live&scope=site.
 Franco Niccolucci, “XML and the Future of Humanities Computing,” SIGAPP Appl. Comput. Rev. 10, no. 1 (April 2002): 43–44, doi:10.1145/568235.568244.
 Urs Dietrich-Felber, “Using Java and XML in Interdisciplinary Research,” Historical Methods 37, no. 4 (Fall 2004): 174, https://ezproxy.lib.uwm.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=tfh&AN=15537500&site=ehost-live&scope=site.
 “Writing History in the Digital Age » Fielding History (Bauer), Fall 2011,” accessed March 26, 2014, http://writinghistory.trincoll.edu/data/fielding-history-bauer/.
 “Writing History in the Digital Age » Reflections on 10,000 Notecards (Erickson),” 21, accessed March 26, 2014, http://writinghistory.trincoll.edu/data/erickson-2012-spring/.
 Johann Peter Murmann, “Constructing Relational Databases to Study Life Histories on Your PC or Mac,” Historical Methods 43, no. 3 (Summer 2010): 112, https://ezproxy.lib.uwm.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&AuthType=ip,uid&db=tfh&AN=52617055&site=ehost-live&scope=site.
 Ibid., 112, 122.
 Alex Hope, “How to Manage a Research Library with Zotero,” The Impact of Social Sciences (blog), LSE Public Policy Group, July 6, 2012, http://blogs.lse.ac.uk/impactofsocialsciences/2012/07/06/manage-a-research-library-with-zotero/.