ToDo or Toodledo. That is the Question. Again.

One of my most popular posts at my other blog is a comparison of ToDo and ToodleDo for the iPhone. The original post was written a while ago, and both apps have had several significant revisions since then. So, I'm refreshing the post here: I've gone over the presentation, updated the information to reflect the current versions of these two apps, and tweaked the data to reflect my most current thinking.

I like PDAs because they help me manage the things I have to do – and I’m all about the todo lists.   I don’t know if I’ve become dependent on lists because I have a bad memory, or if my memory is failing because I use lists for everything.  Still, there it is.

Over the past year or so, a number of task manager apps have come out for my beloved iPhone, and I’ve been trying most of them.   It’s surprising how I keep coming back to the same two apps, and equally surprising (to me) that after months of playing around with them, I still can’t quite decide which one I prefer.

The two apps are Appigo’s ToDo and ToodleDo for the iPhone. Both cost only a few dollars, and both are very well-rated by the public at large.

So, I figured, let's use some design analysis tools to evaluate the two apps, and see what the numbers say.

I’m going to use two tools: pairwise comparison, and a weighted decision matrix. These tools aren’t only useful for analyzing designs – they’re basic decision-making tools, and they’ve always done right by me to evaluate designs, conceptual or otherwise.

Both tools depend on having a good set of criteria against which the two apps will be compared. You might not know what decision to make, but you need to know how you’ll know you’ve made the right one.  In our case here: How do I know when I’ve found a good task manager app?

The formal term for what I’m doing here is qualitative, multi-criterion decision-making. It generally involves four tasks, which in my case are:
  1. Figure out criteria that apply to any “best” task manager.
  2. Rank the criteria by importance, because some criteria will affect my decision more than others.
  3. Develop a rating scale to rate each app.
  4. Rate the apps with the rating scale and the weights.

Here are my criteria, in no particular order of importance, based on years of using other task management tools:
  • Fast.  No long delays when telling the app to do something.
  • Easy.  Minimal clicking (e.g. not having to hit “accept” or "save" for everything, or burrow into deeply nested forms and subforms).
  • Start dates.  A task shouldn't appear on any standard task list until its start date (if given).
  • Due dates.  Obviously, but not mandatory on all tasks.
  • Repeats.  Repeating tasks at regular intervals.
  • Priorities.  At least three levels of priority for tasks.
  • Sync.  Easy syncing to some fairly robust remote service, using standard formats, that lets me access my tasks from other devices.
  • Groups.  Group tasks by tag or folder or project or whatever.
  • Sorting.  Multiple ways to sort tasks.
  • Hotlist.  Some overview page showing only near-term, important tasks; preferably customizable in terms of how I define "important."
  • Restart.  Picks up next time I run it where I left off last time (oddly, not every iPhone app does this).
  • Recovery.  Be able to uncheck tasks that were accidentally checked off.
  • Subtasks. Treat a single task as if it were a group/project/folder.
  • Checklists. A degenerate case of a task is just an item in a checklist.  Not every "task" really deserves all the attributes.  Checklists that can be used as templates (i.e. copied over and over again) would be even better.
  • Conditional deadlines.  Due dates based on due dates of other items (e.g. task B is due two weeks after task A is completed).
  • Backlinks. Given a task, one-tap access to the group/project/folder in which the task lives.
Oddly, not a single iPhone app I’ve checked out so far meets all my requirements.   In particular, I’ve not even heard of an app that even tries to meet the last two requirements. I say “oddly” because I don’t think these requirements are excessive or bizarre, and I do think they'd be immensely useful.  Still, there it is.

Next, we have to develop weights to assign relative importance to the criteria.  The word relative is key here; we’re not going to say that one criterion is certainly and universally more important than any other.  What I want is to know how important each is with respect to the others and my own experience.  Remember, one size never fits all.

This is where pairwise comparison comes in. Details on how this works are given in another web page (it isn’t hard). The chart below is just the end result. In each cell is the criterion that I thought was more important of the pair given by that cell’s row and column. Since it doesn’t make sense to compare something to itself, and since these comparisons are symmetric (comparing A and B is the same as comparing B and A), I only need to fill in a little less than half of the whole chart. If you’re thinking this took a long time, you’d be wrong. It took me about 30 minutes to fill in the whole thing.

                    A   B   C   D   E   F   H   I   J   K   L   M   N   O   P   Q
A Fast              -   B   A   D   E   F   H   I   J   K   L   M   N   A   P   A
B Easy                  -   C   D   E   B   B   B   J   K   B   B   B   B   P   Q
C Start Dates               -   D   E   F   H   I   J   C   C   M   N   C   P   C
D Due Dates                     -   DE  D   D   I   D   D   D   D   D   D   D   D
E Repeats                           -   EF  E   I   E   E   E   E   E   E   E   E
F Priorities                            -   H   I   J   F   F   M   N   O   P   Q
H Sync                                      -   H   J   K   L   H   N   O   H   H
I Groups                                        -   J   I   I   I   I   I   I   I
J Sorting                                           -   J   J   J   J   O   J   J
K Hotlist                                               -   L   K   N   K   P   Q
L Restart                                                   -   M   N   L   P   Q
M Recovery                                                      -   N   O   P   M
N Subtasks                                                          -   O   P   N
O Checklists                                                            -   P   O
P Cond. Deadlines                                                           -   P
Q Backlinks                                                                     -


This leads to the following weights:

Fast 2.46%
Easy 6.56%
Start Dates 4.10%
Due Dates 11.48%
Repeats 11.48%
Priorities 4.10%
Sync 5.74%
Groups 9.84%
Sorting 9.84%
Hotlist 4.10%
Restart 3.28%
Recovery 4.10%
Subtasks 6.56%
Checklists 4.92%
Cond. Deadlines 8.20%
Backlinks 3.28%

So this tells me, for instance, that having due dates and repeating tasks are the two most important criteria.  Task grouping and sorting are a close second.  And so on.
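
For anyone who wants to check the arithmetic, here is a minimal Python sketch that turns the chart into weights. The rows are transcribed from the chart above; the names `LABELS`, `NAMES`, `ROWS` and `pairwise_weights` are made up for this illustration. One assumption is worth flagging: the two-letter cells (DE and EF) are treated as ties that give one point to each side. The post doesn't state that rule, but it does reproduce the published percentages (122 points in total across the 120 comparisons among 16 criteria).

```python
# Letters used in the chart (note that G is skipped) and their criteria.
LABELS = "ABCDEFHIJKLMNOPQ"
NAMES = ["Fast", "Easy", "Start Dates", "Due Dates", "Repeats", "Priorities",
         "Sync", "Groups", "Sorting", "Hotlist", "Restart", "Recovery",
         "Subtasks", "Checklists", "Cond. Deadlines", "Backlinks"]

# Upper triangle of the chart, one string per row (row A first, row P last).
ROWS = [
    "B A D E F H I J K L M N A P A",
    "C D E B B B J K B B B B P Q",
    "D E F H I J C C M N C P C",
    "DE D D I D D D D D D D D",
    "EF E I E E E E E E E E",
    "H I J F F M N O P Q",
    "H J K L H N O H H",
    "J I I I I I I I",
    "J J J J O J J",
    "L K N K P Q",
    "M N L P Q",
    "N O P M",
    "O P N",
    "P O",
    "P",
]


def pairwise_weights(labels, rows):
    """Count how often each criterion wins and normalise the counts to fractions."""
    wins = {label: 0 for label in labels}
    for i, row in enumerate(rows):
        cells = row.split()
        # Row i compares labels[i] against every criterion that follows it.
        assert len(cells) == len(labels) - i - 1
        for offset, cell in enumerate(cells):
            pair = {labels[i], labels[i + 1 + offset]}
            for winner in cell:
                assert winner in pair, f"{winner} is not part of {sorted(pair)}"
                # Assumption: a two-letter cell is a tie, one point to each side.
                wins[winner] += 1
    total = sum(wins.values())
    return {label: wins[label] / total for label in labels}


weights = pairwise_weights(LABELS, ROWS)
for label, name in zip(LABELS, NAMES):
    print(f"{name:<16}{100 * weights[label]:6.2f}%")
```

The percentages are just win counts divided by the total, so a criterion that never wins a comparison would get a weight of zero under this scheme.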

The point of this process is that the human mind is not good at juggling a bunch of variables, but it is very good at comparing one thing against another. Take the trivial case of choosing between three alternatives, A, B, and C. If you prefer A to B, and B to C, then you should accept the logic that A is the most preferred item. To do otherwise just isn’t rational. That’s exactly what pairwise comparison does. And there’s good evidence that this technique actually works.

The next step is to choose a rating scale. This scale will be used to rate each app with respect to each criterion.
There’s a variety of scales I could use, and a great deal of research into qualitative measurement scales has been done. The scale that works best for me – and seems to be the most general – is a five-point scale from -2 to +2, where 0 means “neutral,” -2 means “horrible,” +2 means “excellent,” and -1 and +1 are in-between values. If you prefer something a little finer, you can use a 7-point scale from -3 to +3. I think it’s important to have a zero value to indicate neutrality, and I find it meaningful to have negative numbers stand for bad things and positive numbers for good things.

It’s interesting to note that in some industries (e.g. aerospace), I’ve noticed a tendency to use an exponential scale – something like (0, 1, 3, 9). This is because aerospace people tend to be extremely conservative (for reasons both technical and otherwise), so they tend to underrate the goodness of things. This scale inflates any reasonable rating to make up for that conservatism.

But I’m neither an aerospace engineer nor particularly conservative, so I’ll use the -2 to +2 scale.

Now we can do the weighted decision matrix. The gory details are given elsewhere. The weights come from the pairwise comparison above. In a decision matrix, we rate each alternative against some well-defined reference or base item. We need a reference because we need a fixed point against which to measure things. For this comparison, I'll use the task manager that I am actually using these days, Pocket Informant (PI) for the iPhone, as the reference.

I worked up a weighted decision matrix comparing ToodleDo to ToDo. Here it is:

                           Ref (PI)       ToodleDo          ToDo
Criterion           Wgt    R      S      R       S      R       S
Fast               2.46    0      0      0       0      0       0
Easy               6.56    0      0     -1   -6.56      1    6.56
Start Dates        4.10    0      0      0       0     -2   -8.20
Due Dates         11.48    0      0      1   11.48      1   11.48
Repeats           11.48    0      0      1   11.48      1   11.48
Priorities         4.10    0      0     -1   -4.10      1    4.10
Sync               5.74    0      0      0       0      0       0
Groups             9.84    0      0      0       0      0       0
Sorting            9.84    0      0      1    9.84      0       0
Hotlist            4.10    0      0      1    4.10      1    4.10
Restart            3.28    0      0      0       0      0       0
Recovery           4.10    0      0      0       0      0       0
Subtasks           6.56    0      0      0       0      0       0
Checklists         4.92    0      0      0       0      1    4.92
Cond. Deadlines    8.20    0      0      0       0      0       0
Backlinks          3.28    0      0      0       0      0       0
Total            100.04           0          26.24          34.44

This table might not look like much, but it tells a bit of a story.  The column marked Wgt is the weight of that criterion taken from the pairwise comparison.  Each of the three apps gets two columns.  The R column is the rating I gave it; PI is the reference, so it gets zeros in every category.  That way, if another app does better than the reference, it gets a positive rating, and if it does worse than the reference, it gets a negative rating.  The S column is the actual score, which is the rating multiplied by the weight for that criterion.  The numbers at the bottom of the S columns are just the arithmetic sums of the individual scores.
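
The same bookkeeping can be sketched in a few lines of Python. The weights and ratings below are copied from the table; `CRITERIA` and `weighted_score` are names invented for this example, and the reference app is left out because its ratings are all zero.

```python
# (criterion, weight in %, ToodleDo rating, ToDo rating), copied from the table.
CRITERIA = [
    ("Fast",             2.46,  0,  0),
    ("Easy",             6.56, -1,  1),
    ("Start Dates",      4.10,  0, -2),
    ("Due Dates",       11.48,  1,  1),
    ("Repeats",         11.48,  1,  1),
    ("Priorities",       4.10, -1,  1),
    ("Sync",             5.74,  0,  0),
    ("Groups",           9.84,  0,  0),
    ("Sorting",          9.84,  1,  0),
    ("Hotlist",          4.10,  1,  1),
    ("Restart",          3.28,  0,  0),
    ("Recovery",         4.10,  0,  0),
    ("Subtasks",         6.56,  0,  0),
    ("Checklists",       4.92,  0,  1),
    ("Cond. Deadlines",  8.20,  0,  0),
    ("Backlinks",        3.28,  0,  0),
]


def weighted_score(weights, ratings):
    """Sum of rating * weight over all criteria (the bottom of an S column)."""
    return sum(w * r for w, r in zip(weights, ratings))


weights = [w for _, w, _, _ in CRITERIA]
toodledo = weighted_score(weights, [r for _, _, r, _ in CRITERIA])
todo = weighted_score(weights, [r for _, _, _, r in CRITERIA])

print(f"ToodleDo: {toodledo:.2f}")  # 26.24
print(f"ToDo:     {todo:.2f}")      # 34.44
```

Because the reference is rated zero everywhere, each total reads directly as "points better (or worse) than the reference."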

If you look at the ratings for ToDo, you see that it’s a bit better than ToodleDo on some points, and a bit worse on others. But the +1's don’t actually cancel out the -1's because of the weights. ToDo's leads (on Easy, Priorities, and Checklists) add up to 26.24 points, while ToodleDo's leads (on Start Dates and Sorting) add up to only 18.04. That makes ToDo noticeably better than ToodleDo, by 8.2 points.

It's interesting to note that this version has me preferring ToDo over ToodleDo, whereas my original post had it the other way around.  This is because of all the updates to both apps since I first compared them.  Even though there are some things about ToodleDo that really turn my crank, ToDo is the better app, because it does better on things that I think are more important.

And that jibes nicely with my intuition.  I started with ToDo, then switched to ToodleDo (just before I did my first comparison).  But now, given the improvements to ToDo, it's taken the lead again.  If it weren't for the decision matrix, I'd only have a "gut feeling" telling me which was better.  But now, having done the comparison twice, I understand and can explain why I preferred one, then the other, then the one again.

One might ask, then, why I'm using Pocket Informant since both ToodleDo and ToDo beat PI.  The answer is simple: appointments.  PI integrates appointments sync'd with Google Calendar right into the app.  Not having that is an absolute deal-breaker for me: it's just too useful for me to have my appointments and tasks all available under one roof, so to speak.  If I'd've added appointments as a criterion, both ToDo and ToodleDo would have lost to PI.

Back during my first comparison, I ran into a problem with ToodleDo that - though it has been corrected since - remains noteworthy with respect to doing these kinds of comparisons.

The problem was this: ToodleDo used to generate the next in a series of repeating events only when it sync'd with the ToodleDo service.  ToDo, on the other hand, handled repeating events internally.

This was a problem for me when I travelled. I had gone to Berlin for a conference. And I didn’t have a data plan for my iPhone (that’s a whole separate story), so I couldn’t sync either app. But that meant ToodleDo couldn’t roll repeating items over properly.  So before I went to Berlin, I sync’d up ToDo and used it while I was gone.  And when I came back I switched back to ToodleDo.  I did that whenever I travelled.

Does the evaluation consider that? No, it doesn’t, because I didn’t. The evaluation is only as good as the evaluator. When I evaluated the two apps, I was nestled snugly at home, WiFi at the ready – and sync’ing either ToDo or ToodleDo was a non-issue. If I’d've done the evaluation in Berlin, I’m sure I’d've gotten different numbers, because the repeating events problem would have been right there in my face, irritating the hell out of me.

So this underscores a limit with the evaluation method – indeed, a limit with any method: it’s only as good as the situation you’re in when you use it. Some people might say a method is only as good as the information you use, but it’s more than that. My situation, in this case, includes me, my goals (at the time), my experiences, all the information I have handy, constraints, and anything else that can possibly influence my decisions at the time.

The problem, then, is that a method depends on the situation when it’s used. But that situation may be different for the person doing the evaluation than for the person(s) who will have to live with the decision being made. Indeed, it’s virtually guaranteed that the situations will be different, if for no other reason than the implications of a decision will only occur later.

Does this put the kibosh on these kinds of methods?

Not at all.  It just means that we must be vigilant and diligent in their application.  If I had done the evaluation in Berlin, ToDo would have won, because in that situation, ToodleDo would have scored poorly on repeating events.  This is as it should be.  That means that in the two different situations, the method worked.  The problem is that in any one given situation, there’s no way to take into account any other situations.
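
To see how much a single situation-dependent rating can move the result, here is a small what-if sketch in Python. It is only an illustration: it borrows the weights and ratings from the current table (the Berlin trip actually belongs to the earlier comparison), and the -2 given to ToodleDo for Repeats is a hypothetical "Berlin" rating, not one recorded anywhere. The names `WEIGHTS`, `score` and `rerate` are likewise invented here.

```python
# Weights (in %) for the criteria on which at least one app has a non-zero rating.
WEIGHTS = {
    "Easy": 6.56, "Start Dates": 4.10, "Due Dates": 11.48, "Repeats": 11.48,
    "Priorities": 4.10, "Sorting": 9.84, "Hotlist": 4.10, "Checklists": 4.92,
}

# Non-zero ratings from the current table; criteria left out are rated 0.
TOODLEDO = {"Easy": -1, "Due Dates": 1, "Repeats": 1,
            "Priorities": -1, "Sorting": 1, "Hotlist": 1}
TODO = {"Easy": 1, "Start Dates": -2, "Due Dates": 1, "Repeats": 1,
        "Priorities": 1, "Hotlist": 1, "Checklists": 1}


def score(ratings):
    """Weighted score: sum of weight * rating over the rated criteria."""
    return sum(WEIGHTS[name] * rating for name, rating in ratings.items())


def rerate(ratings, criterion, new_rating):
    """Return a copy of the ratings with one criterion re-rated."""
    return {**ratings, criterion: new_rating}


# Hypothetical: without sync, ToodleDo's repeating tasks rate -2 instead of +1.
toodledo_berlin = rerate(TOODLEDO, "Repeats", -2)

print(f"ToodleDo at home:   {score(TOODLEDO):.2f}")         # 26.24
print(f"ToodleDo in Berlin: {score(toodledo_berlin):.2f}")  # -8.20
print(f"ToDo:               {score(TODO):.2f}")             # 34.44
```

Because Repeats carries one of the two largest weights, moving that one rating by three steps shifts the total by 34.44 points, which is more than enough to change the ranking in a closer contest.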

Happily, there is fruitful and vigorous research concerned exactly with this. Some people call it situated cognition; others call it situated reasoning. We’ve not yet figured out how to treat situations reliably, but I think it’s only a matter of time before we do.

In the meantime, there is at least one other possible way to treat other situations. A popular technique to help set up a design problem is the use case (or what I call a usage scenario). These are either textual or visual descriptions of the interactions involved in using the thing you’ll design. They can be quite complex and detailed. Usage scenarios try to capture a specific situation other than the one that includes the designers during the design process. So it’s at least possible that usage scenarios could help designers evaluate designs and products better.

One final caveat: this evaluation is particular to me. It is unlikely that anyone will agree completely with my evaluation, because their situations are different from mine. So I’m not saying ToDo “is better” than ToodleDo. I’m just saying it seems to be better for me.

As they say: your mileage may vary.
