Choose the Right Tool

Text Analysis 

  • Examples from our past/current projects (if extant) 

Methods 

  • Topic Modeling 

  • Information Retrieval  

  • Text Classification  

  • Sentiment Analysis 

  • Word Frequency Analysis 

  • Named Entity Recognition 

  • Collocation 

  • Word Embeddings 

  • Transformer Models 

  • Concordancing 
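
Two of the methods above, word frequency analysis and concordancing, can be sketched in a few lines of standard-library Python. This is a minimal illustration, not how any tool listed below implements them; the function names, the simple regex tokenizer, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most common tokens with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def concordance(text, keyword, window=3):
    """Return keyword-in-context lines with `window` tokens on each side."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, used only for illustration.
sample = (
    "The whale surfaced. The crew watched the whale dive, "
    "and the sea closed over the whale."
)

print(word_frequencies(sample, top_n=2))
for line in concordance(sample, "whale", window=2):
    print(line)
```

Dedicated tools such as AntConc or Voyant Tools add stop-word handling, sorting, and visualization on top of this basic idea.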

Tools 

  • Voyant Tools- web-based reading and analysis environment for digital texts  

  • MALLET- Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text 

  • WordSeer 4- text analysis environment that combines visualization, information retrieval, sensemaking and natural language processing to make the contents of text navigable, accessible, and useful 

  • Orange Text Mining- open-source machine learning and data visualization for novices and experts. Interactive data analysis workflows with a large toolbox 

  • AntConc- A freeware corpus analysis toolkit for concordancing and text analysis. 

  • Lexos- text analysis tool out of Wheaton College that offers both web-app and local installation options. 

  • Constellate- A tool that allows you to create corpora from JSTOR's collections, with a built-in platform and tutorials for doing analysis in Python 

  • HathiTrust Research Center Analytics- Supports large-scale computational analysis of the works in the HathiTrust Digital Library to facilitate non-profit and educational research 

  • HTRC Algorithms- a set of tools for assembling collections of digitized text from the HathiTrust corpus and performing text analysis on them. Includes copyrighted items for ALL USERS 

  • Extracted Features Dataset- a dataset allowing non-consumptive analysis on specific features extracted from the full text of the HathiTrust corpus. Includes copyrighted items for ALL USERS 

  • HathiTrust + Bookworm- a tool for visualizing and analyzing word usage trends in the HathiTrust corpus. Includes copyrighted items for ALL USERS 

  • HTRC Data Capsule- a secure computing environment for researcher-driven text analysis on the HathiTrust corpus. All users may access public domain items. Access to copyrighted items is available ONLY to member-affiliated researchers 

  • Google Books Ngram Viewer- a tool to graph the usage of terms/phrases over a set period 

  • TAPoR 3- a directory of tools used in sophisticated text analysis and retrieval  

Visual Analysis 

  • Examples from our past/current projects (if extant) 

Tools 

  • IIIF- a way to standardize the delivery of images and audio/visual files from servers to different environments on the Web, where they can then be viewed and interacted with in many ways; it includes a variety of APIs, such as Presentation, which allows you to annotate images. 

  • Loris- A IIIF image server written in Python 

  • Cantaloupe- An open-source dynamic image server for on-demand generation of derivatives of high-resolution source images, written in Java. 

  • Mirador 3- Open-source, web-based, multi-window image viewing platform with the ability to zoom, display, compare, and annotate images from around the world. 

  • Universal Viewer- a community-developed open-source tool that allows the viewing of a variety of file types 

  • CatchPy- an annotation server originally developed at HarvardX, to store annotations on MCH IIIF image assets 

  • Tropy- Designed for researchers working in archives, Tropy helps users organize and annotate photos of archival resources 

  • CVAT- an open-source tool for image and video annotation  

Data Visualizations  

  • Examples from our past/current projects (if extant) 

Tools 

  • Tableau- query relational databases, online analytical processing cubes, cloud databases, and spreadsheets to generate graph-type data visualizations. The software can also extract, store, and retrieve data from an in-memory data engine 

  • Flourish- create stunning charts, maps, and interactive content that engage and inspire – instantly. No coding is required. 

  • Datawrapper- online tool to create interactive, responsive & beautiful data visualizations 

  • Google Looker Studio- an online tool for converting data into customizable, informative reports and dashboards 

  • Infogr.am- Free basic account with optional fee-based infographic service 

  • Piktochart- a web-based editor for creating infographics, presentations, and reports 

  • Canva- Canva’s infographic maker includes hundreds of free design elements, allowing you to experiment with data visualization like a pro 

  • Easel.ly- website that features thousands of free infographic templates and design objects that users can customize to create and share their visual ideas online 

  • D3.js- The JavaScript library for bespoke data visualization. 

Digital Annotation 

  • Examples from our past/current projects (if extant) 

Tools 

  • Hypothes.is- An open and intuitive annotation software that runs through a Chrome browser extension. Use Hypothesis right now to hold discussions, read socially, organize your research, and take personal notes. 

  • Tropy- Designed for researchers working in archives, Tropy helps users organize and annotate photos of archival resources 

  • Recogito- Annotate documents and photographs with a simple-to-use web-app 

  • Annotation Studio- A suite of collaborative web-based annotation tools from MIT HyperStudio 

  • Neatline- A suite of add-on tools for Omeka, Neatline offers image and map annotation capabilities 

  • Scalar- Primarily a scholarly publishing software, Scalar also allows users to annotate video, images, source code, audio, and text with built-in annotation tools. 

Spatial Analysis and Web Mapping 

  • Examples from our past/current projects (if extant) 

Tools 

  • QGIS- cross-platform free and open-source desktop geographic information system application that provides data viewing, editing, and analysis capabilities  

  • CARTO- software as a service (SaaS) spatial analysis platform that provides GIS, web mapping, data visualization, spatial analytics, and spatial data science features 

  • Esri ArcGIS Online- client for using, creating, and sharing ArcGIS maps online 

  • Neatline- allows scholars, students, and curators to tell stories with maps and timelines. As a suite of add-on tools for Omeka, it opens new possibilities for hand-crafted, interactive spatial and temporal interpretation 

  • Google Maps API- Create real-world, real-time experiences with the latest Maps, Routes, and Places features from Google Maps Platform. Built by the Google team for developers everywhere 

  • OpenLayers- makes it easy to put a dynamic map in any web page. It can display map tiles, vector data, and markers loaded from any source 

  • Mapbox- use Mapbox APIs and SDKs, ready-made map styles, and live updating data to build customizable maps for web, mobile, automotive, and AR 

  • Story Maps- Esri Story Maps lets you combine authoritative maps with narrative text, images, and multimedia content. They make it easy to harness the power of maps and geography to tell your story 

  • Palladio- visualize complex historical data with ease 

  • Clio- educational website and mobile application developed at Marshall University that uses GPS to connect users to the history that surrounds them 

  • Leaflet- JavaScript library for interactive maps  

  • Tilegrams- an online tool that allows you to create tiled cartograms 

  • MapAList- tool for creating customized Google maps from lists of addresses 

Network Analysis 

  • Examples from our past/current projects (if extant) 

Tools 

  • Gephi- visualization and exploration software for all kinds of graphs and networks 

  • Net.Create- an open-source analysis tool that offers simultaneous multi-user network data entry that accommodates duplicate and ambiguous network data, provides live visualizations of up-to-the-minute entries from other team members, and is structured around clear citational and interpretive practices 

  • Palladio- visualize complex historical data with ease 

  • Cytoscape- an open-source software platform for visualizing complex networks and integrating these with any type of attribute data 

  • NodeXL- a Microsoft Excel plugin that makes it easy to collect, store, analyze, visualize, and report on network data at the click of a button 

  • NetworkX- a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. 

  • igraph- a collection of network analysis tools with an emphasis on efficiency, portability, and ease of use 

  • visNetwork- an R package for network visualization, using the vis.js JavaScript library 

  • D3.js- JavaScript library for bespoke data visualization 

 

Timeline and Temporal Analysis 

Tools 

  • TimelineJS- an open-source tool that enables anyone to build visually rich, interactive timelines 

  • Chronos Timeline- render interactive timelines in your Obsidian notes from simple Markdown.   

  • Neatline- allows scholars, students, and curators to tell stories with maps and timelines. As a suite of add-on tools for Omeka, it opens new possibilities for hand-crafted, interactive spatial and temporal interpretation 

  • TimeGlider- A web-based timeline builder 

  • TimeToast- A tool for creating timelines that can be added to a website or blog 

  • Viewshare- A free platform for generating and customizing views, such as interactive maps and timelines 

 

Machine Learning 

  • Examples from our past/current projects (if extant) 

Tools 

  • ChatGPT 

  • Google Gemini 

  • Microsoft Copilot 

  • Check out our AI Toolkit! 

Database Development 

  • Examples from our past/current projects (if extant) 

Tools 

  • FileMaker Pro- a cross-platform relational database application 

  • PostgreSQL- powerful, open-source object-relational database system 

  • MySQL- an open-source relational database management system 

  • MongoDB- a source-available, cross-platform, document-oriented database program 

  • Elasticsearch- open source distributed, RESTful search and analytics engine, scalable data store, and vector database capable of addressing a growing number of use cases. 

  • Solr- open source, multi-modal search platform built on the full-text, vector, and geospatial search capabilities of Apache Lucene 

  • AWS DynamoDB- a serverless, NoSQL database service that allows you to develop modern applications at any scale 

  • Neo4j- a graph database management system 

  • DataGrip- cross-platform tool for relational and NoSQL databases 

  • Postico- The native Mac app for PostgreSQL 

  • SQL Server Management Studio- a software application developed by Microsoft that is used for configuring, managing, and administering all components within Microsoft SQL Server. 

  • Corpora- a platform that serves as a database, a REST API, a data collection/curation interface, and a Python-powered asynchronous task queue, all wrapped into one 

Data Cleaning 

  • Examples from our past/current projects (if extant) 

Tools 

  • OpenRefine- an open-source desktop application for data cleanup and transformation to other formats, an activity commonly known as data wrangling 

  • Tidyverse- a collection of R packages for data science; its tidyr package provides a set of functions that help you get to tidy data, that is, data with a consistent form 

  • Pandas- open source data analysis and manipulation tool, built on top of the Python programming language 

Project Management  

Tools 

  • Trello- a web-based, kanban-style, list-making application developed by Atlassian. 

  • GitHub Projects- an adaptable spreadsheet, task-board, and roadmap that integrates with your issues and pull requests on GitHub to help you plan and track your work effectively 

  • Asana- a web and mobile "work management" platform designed to help teams organize, track, and manage their work 

  • Monday.com- an adaptable project management software 

  • Airtable- a cloud-based spreadsheet-database platform that offers a variety of project management templates 

Citation Management 

Tools 

  • Zotero- a free, easy-to-use tool to help you collect, organize, cite, and share research 

  • EndNote- a commercial reference management software package, used to manage bibliographies and references when writing essays, reports, and articles. 

  • Mendeley- reference management software 

Digital Collections 

  • Examples from our past/current projects (if extant) 

Tools 

  • Omeka- free, flexible, and open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions 

  • Scalar- free, open source authoring and publishing platform that’s designed to make it easy for authors to write long-form, born-digital scholarship online 

  • Story Maps- combine authoritative maps with narrative text, images, and multimedia content. They make it easy to harness the power of maps and geography to tell your story 

  • Mukurtu- content management system that allows people to build archives and share information in a culturally relevant way. 

  • Neatline- allows scholars, students, and curators to tell stories with maps and timelines. As a suite of add-on tools for Omeka, it opens new possibilities for hand-crafted, interactive spatial and temporal interpretation 

Digital Publishing 

  • Examples from our past/current projects (if extant) 

Tools 

  • Juxta- an open-source tool for comparing and collating multiple witnesses to a single textual work 

  • Oxygen- a comprehensive suite of XML authoring and development tools 

  • Manifold Scholarship- open-source, free publication software 

  • eScholarship- scholarly publishing and repository services that enable departments, research units, publishing programs, and individual scholars associated with the University of California to have direct control over the creation and dissemination of the full range of their scholarship. 

  • UCLA Library- Open Access Research Guide 

  • UCLA Library- Open Access & Publishing 

  • Open Access Monograph via Luminos or TOME, with publication fees covered by the library for UCLA faculty authors, with support from Arcadia.  

  • UCLA faculty authors can submit monograph proposals to the relevant UC Press Editor.  

Web Development 

  • Examples from our past/current projects (if extant) 

Tools 

  • Drupal- free, open-source content management system 

  • WordPress- web content management system 

  • Google Sites- free, easy-to-use website builder 

  • GitHub Pages 

Data Curation and Management 

  • Examples from our past/current projects (if extant) 

Tools 

  • Git- a distributed version control system that tracks versions of files. It is often used to control source code by programmers who are developing software collaboratively 

  • Dataverse- Open source research data repository software 

  • GitHub- a proprietary developer platform that allows developers to create, store, manage, and share their code. 

Programming Languages and Packages 

Tools 

  • Jupyter Notebooks- Free software, open standards, and web services for interactive computing across all programming languages 

Python  

  • Natural Language Toolkit (NLTK)- leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.  

  • spaCy- free, open-source library for advanced Natural Language Processing (NLP) in Python. 

  • Gensim- free open-source Python library for representing documents as semantic vectors, as efficiently (computer-wise) and painlessly (human-wise) as possible.  

  • Matplotlib- comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. 

  • Seaborn- Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. 

  • Plotly- open source graphing library 

Coding 

Transcription 

Tools 

  • ABBYY FineReader 

  • Scripto- open-source tool that permits registered users to view digital files and transcribe them with an easy-to-use toolbar, rendering that text searchable 

  • oTranscribe- Free web-based audio transcription interface that offers an audio file player together with a word processing interface 

  • eScriptorium- a Digital Text Production Pipeline for Print and Handwritten Texts using machine learning techniques. 

  • Transkribus- an AI platform that supports your work with historical documents. Transkribus enables you to automatically recognize text, layout, and structure in your documents with the power of AI 

  • Amazon Textract- a machine learning (ML) service that automatically extracts text, handwriting, layout elements, and data from scanned documents 

  • Google Document AI- Create document processors that help automate tedious tasks, improve data extraction, and gain deeper insights from unstructured or structured document information.