Coding Analysis Toolkit (CAT)

Posted: December 1st, 2010 | Filed under: research tools

Today I attended a data collection and analysis workshop led by UMass Amherst professor Stuart Shulman.  The workshop focused on two web-based tools developed by Dr. Shulman: an older one (CAT) and a newer one (DiscoverText).

Here’s a thumbnail sketch of CAT and some of its potential uses.

Coding Analysis Toolkit (CAT)

CAT is a web-based system into which you can upload text files for team-based qualitative analysis.  It is intended primarily for Atlas.ti users who are working on collaborative projects involving a number of coders.  The idea is that you upload your Atlas.ti HUs (“hermeneutic units,” which is just a complicated name for “projects”) into CAT, and then run reliability tests on your coders’ work.  CAT users are probably asking questions like these:

  • Is there consistency in how my project’s coders are labeling, categorizing, or otherwise coding particular data?
  • Is there consistency across codes?
  • Is there consistency across coded excerpts?
  • What are the codes or excerpts for which there is a strong pattern of disagreement?
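
To make the idea of a reliability test concrete, here is a minimal sketch in Python of one common agreement measure for two coders, Cohen's kappa, along with a tally of where they disagree. This is my own illustration and not necessarily the statistic CAT computes; the function names and the sample codes ("praise", "complaint", "question") are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must have coded the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same code independently.
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def disagreements(coder_a, coder_b):
    """Count each (code from A, code from B) pair where the coders differed."""
    return Counter((a, b) for a, b in zip(coder_a, coder_b) if a != b)


# Hypothetical codes assigned by two coders to the same six excerpts.
coder_a = ["praise", "complaint", "question", "praise", "complaint", "praise"]
coder_b = ["praise", "complaint", "praise", "praise", "question", "praise"]

print(percent_agreement(coder_a, coder_b))
print(cohens_kappa(coder_a, coder_b))
print(disagreements(coder_a, coder_b))
```

The disagreement tally speaks to the last question in the list: code pairs that show up repeatedly are the ones worth discussing with your coders.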

If you do not have Atlas.ti data, but are looking for a platform for team/collaborative coding, CAT could also be useful to you.  The key thing here is that CAT is well suited to quickly coding very large batches of texts that are short and highly consistent.

Let me explain.

In a program like Atlas.ti you highlight and code small bits of text (for example, a word, sentence, paragraph, exchange, etc.) that are contained within a larger piece of text (such as an interview, an interaction transcript, an article or news story, etc.).  You are constantly highlighting and “tagging” contextualized data.  You are also able to see visual representations of the codes present in whichever file you are working on.

CAT, on the other hand, seeks to do away with the clicks and drags of selecting, highlighting, and tagging data with your keyboard and mouse.  Instead, CAT allows you to import broken-up (or “demarcated”) data, which then gets separated into “pages” on the UI.  That is, instead of seeing one long interview transcript on the screen, I see just the first paragraph.  In one open field I type in a code (or multiple codes) for that paragraph.  Alternatively, I can select a code from my list.  Once this paragraph is coded, a single click takes me to the next paragraph.  Again, instead of seeing the whole interview (or article, or transcript, etc.) I just see one piece of it at a time.  It’s like flipping through a book in which each “page” is a small piece of data.
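
As a rough illustration of what “demarcating” data means, the Python sketch below splits a transcript into paragraph-sized units on blank lines and then attaches codes to each unit, one “page” at a time. It is only a sketch of the idea: the blank-line rule, the function names, and the toy coding rule are my own assumptions, and it says nothing about the import format CAT actually expects.

```python
import re


def demarcate(text):
    """Split raw text into units wherever a blank line occurs, dropping empty pieces."""
    units = [unit.strip() for unit in re.split(r"\n\s*\n", text)]
    return [unit for unit in units if unit]


def code_units(units, assign):
    """Step through the units in order and record the codes chosen for each one."""
    return [
        {"page": number, "text": unit, "codes": assign(unit)}
        for number, unit in enumerate(units, start=1)
    ]


# Hypothetical two-paragraph transcript.
transcript = """Interviewer: How did you first hear about the program?

Participant: A friend mentioned it, and I was curious."""


def toy_rule(unit):
    # Stand-in for a human coder's judgment; purely illustrative.
    return ["question"] if unit.endswith("?") else ["response"]


for page in code_units(demarcate(transcript), toy_rule):
    print(page["page"], page["codes"], page["text"])
```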

This sort of approach would be well suited to examining archives of Twitter posts, Facebook status updates, memos, or interviews; in short, any type of text that is limited in size OR can easily be broken up into smaller pieces, and that has a consistent format.

As you can imagine, this would probably not be the ideal tool for you if you needed or wanted to keep your data embedded in its larger context as you were analyzing it.  CAT seems to be less suited to fine-grained analysis than Atlas.ti or other similar programs, but I can certainly see how it would be useful for doing concerted, rapid, first-run group analyses of very large data sets.

Using CAT is free, but you do need to create accounts for yourself as well as your coders.  You can also assign the coders on your project various permissions.

For more, see this introduction and this overview.

Next post:  DiscoverText

4 Comments on “Coding Analysis Toolkit (CAT)”

  1. Tabitha Hart » Blog Archive » DiscoverText for Facebook, Twitter, YouTube said at 1:36 pm on March 10th, 2011:

    […] DiscoverText is a relatively new tool used for scraping and analyzing textual data from Facebook, Twitter, YouTube, blogs, RSS feeds, etc.  It was created by Dr. Stuart Shulman, the same person behind Coding Analysis Toolkit (CAT).  (See my previous post on CAT here.) […]

  2. Vincent said at 1:28 pm on June 5th, 2011:

    Hi Tabitha,

    I’m having difficulty in working with CAT and have some questions.

    Is there a “help desk” or something similar that is ‘out there’ that I could go to for assistance?

    Thank you,


  3. Tabitha Hart said at 5:56 pm on June 8th, 2011:

    Hi Vincent,

    I think your best bet is to try the CAT help wiki at

  4. Tabitha Hart » Blog Archive » Collecting Tweets: TwapperKeeper said at 12:40 pm on September 30th, 2011:

    […] become interested in tools for collecting and analyzing Tweets.  I know that DiscoverText, which I’ve mentioned before, can be used for these purposes, and I’ve just begun experimenting with TwapperKeeper. […]